Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove flyback #304

Merged
merged 13 commits into from
Mar 27, 2024
Merged

Remove flyback #304

merged 13 commits into from
Mar 27, 2024

Conversation

ercius
Copy link
Collaborator

@ercius ercius commented Nov 6, 2023

Update from_hdf5 to allow the user to crop the flyback column during load. For large data sets this can reduce the time needed in processing later significantly. The keyword is set to True to be compatible with older code.

Need to fully test once the CI builds are available. Dont merge yet. Will update as needed.

@ercius
Copy link
Collaborator Author

ercius commented Nov 7, 2023

This does not save a lot of time when loading data, but it saves ~2x the memory due to a long standing memory leak in h5py objects. Thus, for large data sets the option to load without the flyback can help with very large datasets and memory constraints.

Ready to be looked at @cjh1, @swelborn!

@ercius
Copy link
Collaborator Author

ercius commented Feb 16, 2024

Another idea is to use fancy indexing. I have not tested the speed yet:

# Remove the flyback data
    to_keep = np.ones((scan_shape[0] * scan_shape[1]), dtype=bool)
    to_keep[scan_shape[0]::scan_shape[1]] = False
    data = frames[to_keep]

@ercius
Copy link
Collaborator Author

ercius commented Feb 16, 2024

There is a minimal effect on the speed of loading with either method. The more recent method would allow any "contiguous load map" to be used rather than just deleting the last (flyback) column. It seems like a useful feature.

Copy link
Collaborator

@swelborn swelborn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not too familiar with sparse array class here, but my two cents in this review. perhaps @cjh1 can give a bit more feedback

scan_positions = scan_positions_group[()]
else:
# Generate the original scan indices from the scan_shape
orig_indices = np.ravel_multi_index([ii.ravel() for ii in np.indices(scan_shape)],scan_shape)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to use ii for indices? instead of i for ex.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never use single letter variables.

I once spent a full day debugging something where i was already used for sqrt(-1). Ever since, I always use ii, jj, kk, etc. for loops.

Also, when using the pdb debugger, c, q, n, etc. are reserved for interacting with the debugger (continue, quit, next, etc.). That messed me up a few times as well.

@@ -710,6 +710,13 @@ def compare_with_sparse(full, sparse):
assert np.array_equal(m_array[[False, True], 0][0], position_one)


def test_keep_flyback(sparse_array_small):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no test?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like I started the test but did not finish it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test added

@ercius
Copy link
Collaborator Author

ercius commented Mar 27, 2024

I think this is ready for review by someone else.

Copy link
Member

@cjh1 cjh1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cjh1 cjh1 merged commit 4ab0ea1 into OpenChemistry:master Mar 27, 2024
11 checks passed
@cjh1
Copy link
Member

cjh1 commented Mar 27, 2024

@ercius I have cut a new release which contains this change. I have updated the images using by jupyterlab.

@ercius ercius deleted the remove_flyback branch March 27, 2024 20:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants