MAINT: updates for numpy2 closes #509 #510
Conversation
@MuellerSeb can you provide a minimal working example that would exercise the broadcasting issue in #509 (comment)?
Here it is (works with np 1.* but fails with np 2.0.0rc1):

```python
import emcee as mc
import numpy as np


def ln_pdf(x):
    return np.log(np.sqrt(np.pi) * np.exp(-((x / 2.0) ** 2)))


init = np.random.rand(50, 1)
sampler = mc.EnsembleSampler(50, 1, ln_pdf)
burn_in_state = sampler.run_mcmc(initial_state=init, nsteps=20)
```

This fails with the broadcasting error from #509.
In backend.py at line 227, I added some print statements to look at the shapes:

```python
if state.blobs is not None:
    print(self.blobs.shape)
    print(state.blobs.shape)
    self.blobs[self.iteration, :] = state.blobs
```

With np 1.x the printed shapes of `self.blobs` and `state.blobs` are compatible, but with np 2 they are not, so the assignment fails.
And this seems to come from line 179 in backend.py:

```python
if blobs is not None:
    dt = np.dtype((blobs.dtype, blobs.shape[1:]))
    a = np.empty((i, self.nwalkers), dtype=dt)
```

Here, `blobs.shape[1:]` is `(0,)`, so `dt` ends up as a sub-array dtype with a zero-sized shape.
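To see what such a sub-array dtype does, here is a small standalone sketch (not part of emcee); the zero-sized case at the end is the one whose behaviour the comments below say differs between NumPy 1.x and 2:

```python
import numpy as np

# A sub-array dtype: each element is itself an array of the given shape.
dt = np.dtype((np.float64, (3,)))
a = np.empty((4, 2), dtype=dt)
# NumPy folds the sub-array shape into the array shape,
# so `a` has shape (4, 2, 3) and a plain float64 dtype.
print(a.shape, a.dtype)

# The problematic case from backend.py: a zero-sized sub-array shape.
dt0 = np.dtype((np.float64, (0,)))
a0 = np.empty((4, 2), dtype=dt0)
# What gets printed here is reportedly where NumPy 1.x and NumPy 2 differ
# (float64 with an extra 0-sized axis vs. a void dtype).
print(a0.shape, a0.dtype)
```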
So I guess the special dtype is the root of this error.
And that is maybe related to recent changes in numpy's handling of sub-array dtypes.
Here is what ChatGPT says about the problem:

**The Problem**

In NumPy 1.x, creating an array with a dtype shaped like `(datatype, (0,))` and then attempting to concatenate it along a new axis could lead to an array that still retains the original datatype (float64 in your case), but with an additional dimension of size 0. This effectively results in an array whose shape includes a 0-sized dimension, which might not be what was intended but works due to how NumPy 1.x handles such cases.

In NumPy 2, the behavior seems to have changed to interpret the dtype more strictly, leading to the creation of an array where the dtype is V (void) or similar, representing a block of memory of the specified size without a direct numerical datatype associated with it. This change reflects a stricter interpretation of the dtype specification, particularly around structured arrays and zero-sized dimensions.

**The Solution**

To address this issue and ensure compatibility with both NumPy 1 and 2, you should revise the approach to creating and manipulating these arrays. If the goal is to have `self.blobs` be a structured array with dimensions that match `blobs` but starting empty, you need to reconsider how to define `dt` and how to initialize `a`. If `blobs.shape[1:]` results in a `(0,)` shape and you still wish to use this structure for consistency, consider explicitly handling this case or rethinking the necessity of the zero-sized dimension. It might be more straightforward to avoid creating a dtype that includes a zero-sized shape unless it's essential for your application.

For applications that don't strictly require the structured array with a zero-sized dimension, you can simplify the dtype definition. If `blobs` is expected to have additional dimensions that are meaningful, use those directly. If the zero-sized dimension is not essential, you might avoid including it in your dtype definition. Here's a conceptual fix, assuming you want to preserve the non-zero dimensions of `blobs`:

```python
if blobs is not None:
    # If blobs.shape[1:] is (0,), this needs handling or reconsideration
    if blobs.shape[1:]:  # This checks if there are meaningful dimensions beyond the first
        dt = np.dtype((blobs.dtype, blobs.shape[1:]))
    else:
        # Adjust the dtype definition to suit your needs when blobs.shape[1:] is (0,)
        dt = blobs.dtype  # Or another appropriate adjustment
    a = np.empty((i, self.nwalkers), dtype=dt)
    if self.blobs is None:
        self.blobs = a
    else:
        self.blobs = np.concatenate((self.blobs, a), axis=0)
```

This adjustment ensures you're consciously deciding on the dtype structure based on the presence of meaningful dimensions in `blobs`. If handling zero-sized dimensions is crucial for your application, you may need to include conditional logic to manage these cases explicitly, aligning with the behaviors of NumPy 2. Given the migration to NumPy 2 and its stricter behaviors, it's also beneficial to consult the NumPy 2.0 migration guide for specific guidance on handling dtype and structured array changes.
This did the trick for me:

```python
if not np.any(blobs.shape[1:]):
    dt = blobs.dtype
    a = np.empty((i, self.nwalkers) + blobs.shape[1:], dtype=dt)
else:
    dt = np.dtype((blobs.dtype, blobs.shape[1:]))
    a = np.empty((i, self.nwalkers), dtype=dt)
```

It should work with both np 1.x and np 2.
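A quick standalone illustration (not from the PR) of why the `np.any` check is used here rather than ChatGPT's truthiness test on the shape tuple:

```python
import numpy as np

shape_tail = (0,)
print(bool(shape_tail))          # True: a non-empty tuple is truthy,
                                 # so `if blobs.shape[1:]:` would NOT catch this case
print(bool(np.any(shape_tail)))  # False: np.any inspects the values and catches the 0
```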
Maybe it is sufficient to always use:

```python
a = np.empty((i, self.nwalkers) + blobs.shape[1:], dtype=blobs.dtype)
```

so as not to depend on this custom dtype at all.
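For context, a sketch of how that simplification would slot into the grow logic quoted earlier (same `blobs`, `i`, `self.nwalkers`, and `self.blobs` names; an illustration only, not necessarily what the merged PR does):

```python
if blobs is not None:
    # Put any trailing blob dimensions directly into the array shape
    # instead of hiding them inside a sub-array dtype.
    a = np.empty((i, self.nwalkers) + blobs.shape[1:], dtype=blobs.dtype)
    if self.blobs is None:
        self.blobs = a
    else:
        self.blobs = np.concatenate((self.blobs, a), axis=0)
```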
I would only implement compat fixes to support numpy 1 and 2 and not touch the minimum required Python version. The added check job in CI is a good idea until numpy 2 is out.
```diff
@@ -16,7 +16,7 @@ jobs:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        python-version: ["3.7", "3.8", "3.9", "3.10"]
+        python-version: ["3.9", "3.10", "3.11", "3.12"]
```
I would only add 3.11 and 3.12 and not remove 3.7 (or maybe drop that one because it's EOL) and 3.8.
The `leading_edge` job only runs on py 3.12 either way, and numpy 2 won't be available for py 3.8 and lower.
I'll go with what @dfm decides.
I'm happy to test 3.9-3.12 as implemented in this PR. Thanks!!
See: andyfaff#1
Thanks for figuring this all out so quickly, @andyfaff! Much appreciated!
This issue is still unsolved, and I think my suggested fix works with numpy 1 and 2. @dfm, what do you think? Or did I use emcee in the wrong way?
The fixes applied here addressed the original issue (for np 1 and 2). Your comments around dtypes were unclear to me. Was there another bug you found? I tried out the fixes here by returning a blob, and the blobs were stored correctly in the state.
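For reference, a minimal sketch of what "returning a blob" looks like with emcee's blobs interface, adapted from the MWE earlier in the thread (the exact check used for this PR is not shown in the conversation):

```python
import emcee as mc
import numpy as np


def ln_pdf(x):
    lp = float(np.log(np.sqrt(np.pi) * np.exp(-((x / 2.0) ** 2)))[0])
    # Any extra return value is treated by emcee as a "blob"
    # and stored alongside the chain.
    return lp, float(x[0])


init = np.random.rand(50, 1)
sampler = mc.EnsembleSampler(50, 1, ln_pdf)
state = sampler.run_mcmc(initial_state=init, nsteps=20)
# For a scalar blob this should have one entry per step and walker.
print(sampler.get_blobs().shape)
```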
@andyfaff sorry! I misinterpreted the
@dfm could you create a new release including these fixes?
Starts to address #509
Changes `np.VisibleDeprecationWarning` to `np.exceptions.VisibleDeprecationWarning`.
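As an aside, a hedged sketch of one way to reference that warning so it resolves on both NumPy 1 and 2 (the PR itself may simply switch to the new location; this shim is only an illustration):

```python
import numpy as np

# NumPy 2 removed np.VisibleDeprecationWarning from the top-level namespace;
# it lives in np.exceptions (available in NumPy 1.25+).
try:
    VisibleDeprecationWarning = np.exceptions.VisibleDeprecationWarning
except AttributeError:  # older NumPy without np.exceptions
    VisibleDeprecationWarning = np.VisibleDeprecationWarning
```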