Toast 3 Work in Progress #369
base: master
Conversation
I think the easiest way to provide feedback would be to make an export of the notebook to a Python script in a separate pull request, so we can do line-by-line feedback there. Then the pull request can be closed without merging.
Good idea @zonca, will do that soon.
Ok @zonca, I enabled the ReviewNB integration. You can go to https://app.reviewnb.com//pull/369/files/ and comment at the per-cell level of the intro.ipynb file. Since a lot has changed, there is a switch to "hide previous version". Let me know if that works, since I can't tell if this plugin is usable by everyone.
Note that the output of the notebook is stripped on GitHub, so refer to the PDF attached to this issue to look at that.
Updated notebook output, with config section.
Mention whether you can modify this in place or it is read-only. If it is read-only, how do I modify it?
Good point. Some options are fixed by the OS environment at the time python starts, but some can be changed after toast is imported. Will give more details.
Added text about setting log level manually or through environment. Same with threading.
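For concreteness, a minimal sketch of the two approaches described here, assuming toast's Environment singleton and the TOAST_LOGLEVEL variable (names assumed from toast's runtime environment handling; the notebook text is authoritative):

```python
import os

# Through the environment: assumed to be read once at import time.
os.environ["TOAST_LOGLEVEL"] = "DEBUG"

import toast
from toast.utils import Environment

# Manually, after import, via the (assumed) Environment singleton API:
env = Environment.get()
env.set_log_level("DEBUG")
```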
better specify that "traditional CPU systems" means a supercomputer, otherwise it seems it is also required on a laptop.
Well, if the user has an AMD Ryzen workstation with 16 cores (for example), then they probably want to use mpi4py if they are doing something more with toast than just running a notebook with a few samples. I will definitely clarify though. I have started an "intro_parallel.ipynb" where I am going to discuss using IPython.parallel with mpi4py. I'll reference that in the serial notebook.
Tried to clarify that toast parallelism is mainly through MPI, so that any system with more than a few cores will benefit from having mpi4py installed.
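As a hedged sketch of what that looks like in practice (toast.Comm wraps an MPI communicator and splits it into process groups; the argument names here are assumed):

```python
# Run with e.g.:  mpirun -np 16 python script.py
from mpi4py import MPI
import toast

# Split the world communicator into groups of 4 processes; each group
# is assigned its own subset of observations.
comm = toast.Comm(world=MPI.COMM_WORLD, groupsize=4)
```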
Good idea. astropy.units is a new addition to toast, currently only used in the new Operator traits. I need to systematically go through the codebase and add support.
I converted the fake focalplane simulation and plotting functions to use units. However, I'll wait on the rest of the instrument classes until we can revisit the overall plan for those.
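For reference, the basic astropy.units pattern this enables (a generic sketch, not the exact toast focalplane API):

```python
import astropy.units as u

# Quantities carry explicit units, removing ambiguity about whether a
# sample rate is in Hz or a NET is in uK * sqrt(s).
sample_rate = 100.0 * u.Hz
net = 50.0 * u.uK * u.s ** 0.5

print(sample_rate.to(u.kHz))      # 0.1 kHz
print(net.to(u.mK * u.s ** 0.5))  # 0.05 mK s(1/2)
```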
If detlabels is None, you could use the keys as labels, so we avoid building the trivial dict x:x. Please use keyword arguments for all inputs, so people don't have to look at the help of plot_focalplane. For the color, what about using endswith("A") instead of enumerating?
Ok, this is cleaned up.
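For instance, the color assignment can be derived from the detector names rather than enumerated (hypothetical names, just sketching the endswith suggestion above):

```python
# Hypothetical detector list; in the notebook these come from the focalplane.
detnames = ["D0A", "D0B", "D1A", "D1B"]

# Color by polarization leg using the name suffix instead of enumerating.
detcolors = {d: ("red" if d.endswith("A") else "blue") for d in detnames}

# detlabels can be omitted entirely if plot_focalplane defaults to the keys.
detlabels = {d: d for d in detnames}
```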
Either it is an attribute, so other_simsat.config, or it is a method and then it needs to have a verb in the name: other_simsat.get_config().
The methods are now get_config() and get_class_config().
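Presumed usage of the renamed accessors (a sketch; assuming they return the traitlets-based configuration discussed above):

```python
# Current trait values of this operator instance:
conf = other_simsat.get_config()

# Default trait values defined by the class itself:
class_conf = other_simsat.get_class_config()
```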
I was inspired by the traitlets methods traits() and class_traits(), but I can add a "get_" in there if it is more clear.
yes please
I had heard about it, but this is the first time I used it.
@tskisner, I think the Toast interface looks really good, good job! I have a bit more feedback in the notebook.
Thanks @zonca for the detailed review, just what I was looking for! One overall question for objects that act like a dictionary, but which have other state information. For example, [...] actually creates a [...]

And then have setitem check that the number of samples matches that in the observation. This did not seem as convenient to me, but I do hate typing :-) For the MPI shared memory class, I could replace the set method like this:

[...]

or

[...]

or something else?
Inside __setitem__:

```python
def __setitem__(self, key, value):
    # if key is undefined
    data = DetectorData(self.detectors, (self.n_sample,) + value.shape, dtype=value.dtype)
    # set this into the dict
```

For the set I don't understand, what do you need the [...] for?
I'll try to clarify... The [...]. For the MPIshared class, the [...].
I think I understand now: you want to allow [...]. I can implement that, but I am still not sure it is more convenient. On the other hand, no reason not to support multiple ways of assignment!
Ahhh, now I see: you can create the full-size DetectorData object first, and then the slicing notation can be applied when actually assigning from the RHS. Ok, I will try this out. I agree this would be a more convenient interface. I'll also try to modify the MPIshared package to make the set() method optional at the cost of a precalculation.
I have added setitem support to the upstream pshmem package: https://github.com/tskisner/pshmem/releases/tag/0.2.5. This new version is available on PyPI: https://pypi.org/project/pshmem/0.2.5/. So now I can work on using that syntax in toast.
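A sketch of the new syntax (assuming the 0.2.5 `__setitem__` forwards to the collective `set()` internally; pshmem also accepts a None communicator for serial use):

```python
import numpy as np
from pshmem import MPIShared

# Serial example; pass an MPI communicator instead of None under MPI.
shm = MPIShared((4, 10), np.float64, None)

# Old style: explicit collective set() with an offset.
shm.set(np.ones((1, 10)), offset=(2, 0), fromrank=0)

# New in 0.2.5: slicing assignment.
shm[3:4, 0:10] = np.arange(10, dtype=np.float64).reshape((1, 10))

print(shm.data)
```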
Ok, I think I have concluded that the [...]

```python
import numpy as np


class DetData:
    def __init__(self, ndet, shape, dtype):
        self.dtype = np.dtype(dtype)
        self.shape = [ndet]
        self.flatshape = ndet
        for s in shape:
            self.shape.append(s)
            self.flatshape *= s
        self.flatdata = np.zeros(self.flatshape, dtype=self.dtype)
        self.data = self.flatdata.reshape(self.shape)

    def __getitem__(self, key):
        print("DetData getitem {}".format(key))
        return self.data[key]

    def __setitem__(self, key, value):
        print("DetData setitem {}".format(key))
        self.data[key] = value

    def __repr__(self):
        return str(self.data)


class Mgr:
    def __init__(self, ndet):
        self.ndet = ndet
        self.d = dict()

    def create(self, name, shape, dtype):
        self.d[name] = DetData(self.ndet, shape, dtype)

    def __getitem__(self, key):
        print("Calling Mgr getitem")
        if key not in self.d:
            # Cannot guess what shape and dtype the user wants
            pass
        return self.d[key]

    def __setitem__(self, key, value):
        print("Calling Mgr setitem")
        self.d[key] = value


mgr = Mgr(2)

# This works fine, as expected:
mgr.create("A", (3, 4), np.int32)
mgr["A"][1, 0:2, 0:2] = 5
print("mgr['A'] = \n", mgr["A"])

# This works, but it is annoying, since the user has to know the name
# of the DetData class and also has to get information from the Mgr
# class:
mgr["B"] = DetData(mgr.ndet, (3, 4), np.int32)
mgr["B"][1, 0:2, 0:2] = 5
print("mgr['B'] = \n", mgr["B"])

# The code below is actually doing:
#
#   Mgr.__getitem__("C").__setitem__(tuple, 5)
#
# Which means that the DetData class would have to be instantiated in
# Mgr.__getitem__() where we don't know the correct shape of the buffer
# to create. Obviously this gives a key error:
mgr["C"][1, 0:2, 0:2] = 5
print("mgr['C'] = \n", mgr["C"])
```

Running the script prints the getitem / setitem calls for "A" and "B", and the final access to "C" raises a KeyError.
You first need to create the thing before slicing it:

```python
mgr = Mgr(2)
mgr["A"] = np.zeros((3, 4), dtype=np.int32)
mgr["A"][1, 0:2, 0:2] = 5
```

It doesn't work now, but it should be implementable.
Hi @zonca, thanks for your patience, and sorry if I am being dense :-) Below I updated the toy code to be closer to the real case. The central problem is that when assigning data (see the "C" case below), we have no information about what LHS detector / sample slice the RHS data should be assigned to, unless the RHS has the full detector and sample dimensions.

```python
import numpy as np


class DetData:
    def __init__(self, ndet, shape, dtype):
        self.dtype = np.dtype(dtype)
        self.shape = [ndet]
        self.flatshape = ndet
        for s in shape:
            self.shape.append(s)
            self.flatshape *= s
        self.flatdata = np.zeros(self.flatshape, dtype=self.dtype)
        self.data = self.flatdata.reshape(self.shape)

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def __repr__(self):
        return str(self.data)


class Mgr:
    def __init__(self, ndet, nsample):
        self.ndet = ndet
        self.nsample = nsample
        self.d = dict()

    def create(self, name, sample_shape, dtype):
        self.d[name] = DetData(self.ndet, (self.nsample,) + sample_shape, dtype)

    def __getitem__(self, key):
        return self.d[key]

    def __setitem__(self, key, value):
        if isinstance(value, DetData):
            self.d[key] = value
        else:
            # This is an array, verify that the number of dimensions match
            sample_shape = None
            if len(value.shape) < 2:
                raise RuntimeError("Assigned array does not have sufficient dimensions")
            elif len(value.shape) == 2:
                # We assume the user meant one scalar value per sample...
                sample_shape = (1,)
            else:
                # The first two dimensions are detector and sample. The rest of the
                # dimensions are the data shape for every sample and must be fully
                # specified when creating data like this.
                sample_shape = value.shape[2:]
            print(
                "Creating DetData with {} dets, {} samples, {} samp shape".format(
                    self.ndet, self.nsample, sample_shape
                )
            )
            self.d[key] = DetData(
                self.ndet, (self.nsample,) + sample_shape, value.dtype
            )
            # If the value has the full size of the DetData object, then we can do the
            # assignment.  If not, then we cannot guess what detector / sample slice
            # the user is trying to assign.
            if (value.shape[0] == self.ndet) and (value.shape[1] == self.nsample):
                # We can do it!
                self.d[key][:] = value


# 2 detectors and 5 samples
mgr = Mgr(2, 5)

# This works fine, as expected:
mgr.create("A", (3, 4), np.int32)
mgr["A"][1, 2:3, 0:2, 0:2] = 5
print("mgr['A'] = \n", mgr["A"])

# This works, but it is annoying, since the user has to know the name
# of the DetData class and also has to get information from the Mgr
# class:
mgr["B"] = DetData(mgr.ndet, (mgr.nsample, 3, 4), np.int32)
mgr["B"][1, 2:3, 0:2, 0:2] = 5
print("mgr['B'] = \n", mgr["B"])

# This creates a buffer with the full number of detectors and samples and uses the
# last dimensions of the RHS to determine the shape of the data per sample. However,
# we have no information about what LHS slice we are assigning the RHS data to, UNLESS
# the user gives RHS data with the full n_detector x n_sample data set:

# mgr["C"] is created but not assigned, since we don't know where to assign the data
# along the first 2 axes (detector and sample).
mgr["C"] = np.ones((1, 1, 3, 4), dtype=np.int32)
mgr["C"][1, 2:3, 0:2, 0:2] = 5
print("mgr['C'] = \n", mgr["C"])

# mgr["D"] is created AND assigned, since we specify data of the full size.
mgr["D"] = np.ones((mgr.ndet, mgr.nsample, 3, 4), dtype=np.int32)
mgr["D"][1, 2:3, 0:2, 0:2] = 5
print("mgr['D'] = \n", mgr["D"])
```

I think that the [...]. How about we support cases [...]?

Does that seem acceptable?
here:

```python
# mgr["C"] is created but not assigned, since we don't know where to assign the data
# along the first 2 axes (detector and sample).
mgr["C"] = np.ones((1, 1, 3, 4), dtype=np.int32)
mgr["C"][1, 2:3, 0:2, 0:2] = 5
print("mgr['C'] = \n", mgr["C"])
```

This case is not supported; the user needs to initialize the array in one of 2 ways, so the use case is:

```python
# provide just 1 timeline, it will be copied to all detectors, we should support both 3D and 4D
mgr["C"] = np.ones((mgr.n_samples, 3, 4), dtype=np.int32)
# or
mgr["C"] = np.ones((1, mgr.n_samples, 3, 4), dtype=np.int32)

mgr["C"][1, 2:3, 0:2, 0:2] = 5
print("mgr['C'] = \n", mgr["C"])
```

inside [...]
Ok, no problem, that sounds good. I will work on implementing this and addressing your other feedback.
* Various fixes and features for dealing with realistic data.
* Move HWPSS utility functions into a separate source file.
* Add a new CalibrateDetectors operator which takes a dictionary of factors to apply per observation.
* Change detector time constant deconvolution to use serial rather than batched FFTs by default. This reduces the memory footprint in the common case where most parallelism comes from MPI.
* Add option to AzimuthIntervals to also cut extraneous long intervals.
* Add a new operator to simultaneously remove a Maxipol-style HWPSS template and build relative calibration factors from the 2f magnitude.
* New operator for estimating HWP synchronous signal. This work introduces a new operator that can model HWPSS in various ways, and optionally estimate the relative gains between detectors based on the 2f harmonics. Features include:
  - Detection and flagging of samples with a stopped HWP.
  - Model coefficients can be estimated on the whole observation, on fixed-length chunks, or on pre-defined intervals. The chunk-wise model coefficients are then smoothly interpolated over the observation.
  - The model can optionally include a time drift separate from the chunking.
  - The 2f harmonic is used to estimate the relative gain between detectors, either as a fixed value per observation or as a continuous calibration timestream. This can be used to flag outlier detectors and generate calibration tables / timestreams for application.
  - Proper treatment of flagged samples in the normalization of model coefficients and covariance.
  - Extensive optional debug plots.
* Fix unit tests.
* Fix hwpss destriping template unit test.
* Add helper methods for determining if an observation is distributed purely by detector or sample. Use these everywhere in the code that currently does this manual check.
* Extend flagging of stopped HWP to also include acceleration / deceleration periods. Add a unit test for this functionality.
* Generate an error when all fit chunks fail. Handle the case where only one chunk fails.
* Fix typo.
* Remove support for python-3.8, which has reached end of life.
* Bump requirements to python-3.9.
Fix not using the `det_mask` input value when selecting detectors.
Change the default bit mask to `defaults.det_mask_invalid`. `defaults.det_mask_nonscience` excludes turnarounds, and it is valuable to deglitch or jump-correct the turnarounds as well.
scipy.signal.convolve will choose the fastest convolution method (direct or FFT) based on the scipy.signal.choose_conv_method result.
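For reference, the scipy selection heuristic can be inspected directly (standard scipy.signal API):

```python
import numpy as np
from scipy.signal import choose_conv_method, convolve

x = np.random.randn(100000)
kernel = np.random.randn(129)

# Reports which method ('direct' or 'fft') scipy expects to be fastest
# for these inputs.
print(choose_conv_method(x, kernel))

# method="auto" applies the same selection internally.
y = convolve(x, kernel, mode="same", method="auto")
```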
Add missing while loop that iteratively finds jumps.
Move `len(mytoi)` outside of the loop to avoid repeated calculation.
…efault bit mask (#794)

* Fix not using the given `det_mask` bit mask: the `det_mask` input value was not used when selecting detectors.
* Change the default bit mask to `defaults.det_mask_invalid`. `defaults.det_mask_nonscience` excludes turnarounds, and it is valuable to deglitch or jump-correct the turnarounds as well.
Jump finder fix
* Update bundled pybind11 and random123 to latest versions for better compatibility with clang++ on arm64.
* Add __init__.py files to data directories to silence warnings.
* Use importlib.resources.as_file() instead of pkg_resources for compatibility with python-3.12.
* Build wheels with numpy-2.0.x, which is backwards compatible at runtime with numpy-1.x and also compatible with numpy-2.1.x.
* When building suitesparse for wheels, remove the patch and use the cmake system to enable only cholmod and build everything (rather than use the archaic Makefiles directly).
* Lift runtime requirements on suitesparse and numpy since we are now compatible with the latest versions.
* Bump versions of vendored OpenBLAS and suitesparse.
* For wheels on macos, build our own openblas rather than use libscipy_openblas, which has strange symbol name mangling that does not seem to work with clang++.
* In the unit test workflow, add python-3.12 tests and also run tests on macos arm64.
* In the wheel test and deploy frameworks, add python-3.12 and macos arm64 to the build matrix.
* Fix HEALPix unit conversion at load time.
* Fix another place where unit conversion might have failed.
Add input option `njump_limit`. If there are more than `njump_limit` jumps in one detector's data, then flag the detector and its time stream as invalid.
ops.SimpleJumpCorrect: Flag detector invalid if too many jumps.
* Create an all-inclusive example unit test for ground-based data. This work aims to solve two problems:
  - The lack of an example unit test for ground telescope data which can be used as a starting point (by removing various things) for future unit tests.
  - Inconsistent techniques for "scanning from a map" across many unit tests.
  Basically, I have removed ad-hoc code for generating fake sky signal and scanning it into timestreams, and centralized that in several unit test helper functions.
* Fix modified unit tests.
* Fix monopole subtraction when plotting wcs maps.
* Remove stale, commented code.
bugfix error message in sim_ground_utils elevation modulation
* Improvements to AzimuthIntervals: attempt to better handle azimuth data that is flagged, and data where the telescope is not scanning.
* Improve robustness of the numerical derivative in the presence of noise and flags. Add a unit test.
* Tweak the polynomial order in turnaround finding.
* Fix a bug when skipping zero-length objects from numpy.split.
* Do not purge original data in demodulation unit tests, since we are making comparison plots of the original.
* Fix typo.
* Pull common gap filling code out of multiple operators and place it in utils.
* Small tweak to support saving / loading jump information.
* Fix a typo and use a better set of parameters for one case of gap filling.
* Add a small operator to wrap the gap filling utility. Change deglitch defaults to not wipe existing detector flags.
* Fix typo.
* Debug test failure.
* Add a unit test for gap filling. Also fix a flag interpretation issue.
* Add missing file.
* Tweak noise levels in azimuth data in unit tests.
* Add a new, general purpose convolution function. In several places throughout the code we perform Fourier domain convolution using a variety of techniques. This work aims to have a common function for some of these use cases which carefully handles the symmetric extension of data in the Fourier domain buffer and is unit tested. Changes include:
  - A new toast.fft.convolve function that can take precomputed kernels or use a callback function to generate the kernels. Unit tests verify the roundtrip and the expected sample phase shift introduced by a Butterworth filter.
  - Porting of the time constant deconvolution operator to this new function.
  - A small unrelated fix to toast timing plots.
  - Unrelated fixes to DetectorData indexing. In a couple of places we were triggering data copying by using numpy "advanced indexing".
* Remove numpy2 style output API.
* Initial commit.
* Fully functional, with additional catalog format checking.
* Unit test now produces correct polarization angles.
* Add SimCatalog to the ground simulation workflow.
* Fix cooler target.
* Fix typo.
* Remove debugging statement.
* Remove doubled copyright statement.
* The reference frame is Celestial, not Horizontal.
* Remove another stale debugging statement.
* Include python-3.13 in the test matrix. Bump the release version.
* conda-build is no longer needed.
* Check wheel tests.
* Use tomlkit instead of toml, to avoid bringing in a new, trivial dependency.
* Fix another place where toml was used instead of tomlkit.
* Arrays are not compatible with toml tables.
* Restore normal testing.
* Implement a new targeting mode (MaxDepth) and fix a bug in checking for Solar avoidance.
* Clean up script.
* Add a weighted horizontal patch target.
* Limit SSO avoidance checks to add_scan().
* Enable hit tracking for the max-depth targets.
* Fix missing variable.
Update plot focalplane function
Hi @zonca and @keskitalo, this PR is for your feedback on API changes that we discussed offline. In addition to looking at the source, I have been updating the tutorials/01_Introduction/intro.ipynb notebook as a "look and feel" example. I have attached a rendered version of the notebook: intro.pdf

Main features are:

* The Observation class as the new data model, with detdata, shared, intervals, and view attributes as the key places where the contents are influenced.
* An improved processing model, with changes to the Operator class and the new Pipeline operator. These classes are configured with traitlets (see the sketch below).
* New distributed map classes (PixelDistribution and PixelData) which split the calculation of the distribution from the actual data storage. These have the new Alltoallv communication pattern.

There are only 2-3 operators that have been ported to the new API as a demo. I'll continue on some config file work that needs to be updated since the switch to traitlets.
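As an illustration of the traitlets configuration pattern (a generic sketch using plain traitlets and a hypothetical operator class; the real toast Operator base adds its own machinery on top):

```python
from traitlets import Float, Int, Unicode
from traitlets.config import Configurable


class SimNoise(Configurable):
    """Hypothetical operator-like class configured through traits."""

    realization = Int(0, help="Noise realization index").tag(config=True)
    scale = Float(1.0, help="Scale factor applied to the noise").tag(config=True)
    det_data = Unicode("noise", help="Observation detdata key for output").tag(config=True)


op = SimNoise()
op.scale = 2.0

# Trait values and their help strings are introspectable, which is what
# makes automatic config file generation possible.
print(op.trait_values())
```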