Allow non-active scalars to be modified in python operators #2061
base: master
Conversation
Update: this problem is part of the master branch, and it is not new here. It is now documented as part of #2075.
Force-pushed from 6018bca to fe7fbfd
Force-pushed from ce045ee to 514cc36
This PR is now functional. It includes some larger changes to the pipeline in 514cc36, which were needed primarily to make the behavior of child-producing operators consistent (#2075); that consistency is needed because multi-scalars operators use child-producing operators. These changes are also a step towards saving intermediate data sources. The pipeline changes break the snapshot operator for now, but the addition of intermediate data sources should bring it back (the snapshot operator essentially creates an intermediate data source).
Force-pushed from 643374c to ea9ce44
Previously, the child dataset for the external pipeline was a deep copy of the parent dataset. The internal dataset did not do this: its child dataset was empty, sharing only the spacing and dimensions of the parent (except when the parent was a tilt series, in which case the spacing of the child's tilt axis was altered). Additionally, in the internal dataset, if the active_scalars were set on the empty child dataset, a new scalars array named 'Scalars' was created. This commit makes the external dataset match the internal behavior: the child dataset is an empty dataset with the same spacing as the parent (except for the tilt-series parent case), and setting the active scalars on an empty child dataset creates a new scalars array named 'Scalars'. Signed-off-by: Patrick Avery <[email protected]>
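The child-dataset behavior described in this commit can be sketched with a small mock. This is not the real tomviz API; the class and attribute names here are illustrative assumptions, and the mock only demonstrates the two rules above: an empty child inherits dimensions and spacing, and setting active scalars on an empty dataset creates an array named 'Scalars'.

```python
import numpy as np

class ChildDemoDataset:
    """Illustrative mock of the described child-dataset rules."""

    def __init__(self, dimensions, spacing):
        self.dimensions = dimensions
        self.spacing = spacing
        self.arrays = {}      # name -> numpy array
        self._active = None

    def create_child_dataset(self):
        # The child starts empty, inheriting only dimensions and spacing.
        return ChildDemoDataset(self.dimensions, list(self.spacing))

    @property
    def active_scalars(self):
        return self.arrays.get(self._active)

    @active_scalars.setter
    def active_scalars(self, array):
        # Setting active scalars on an empty dataset creates a new
        # scalars array named 'Scalars'.
        if not self.arrays:
            self._active = 'Scalars'
        self.arrays[self._active] = array

parent = ChildDemoDataset((4, 4, 4), [1.0, 1.0, 1.0])
child = parent.create_child_dataset()
child.active_scalars = np.zeros((4, 4, 4))
print(sorted(child.arrays))   # ['Scalars']
print(child.spacing)          # [1.0, 1.0, 1.0]
```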
Previously, child data sources from python operators could only be volumes. Upcoming changes will allow child data sources to be tilt series as well. Add a check so that a child data source is appropriately marked as a tilt series if it has tilt angles. Signed-off-by: Patrick Avery <[email protected]>
In the python operator "Dataset" classes, allow non-active scalars to be modified via "set_scalars()". If scalars are already present in the dataset and the newly set scalars have different dimensions than the existing ones, the existing scalars are discarded. Additionally, a child dataset may now be either a volume or another tilt series, so the user can indicate whether the tilt angles and spacing should be preserved. Signed-off-by: Patrick Avery <[email protected]>
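The "discard on dimension mismatch" rule described in this commit can be illustrated with a minimal mock. The class and method signature here are assumptions for demonstration only (the PR text names `dataset.set_scalars()` but does not show its arguments); the logic is just the stated rule that all scalars arrays must share the same dimensions.

```python
import numpy as np

class ScalarsDemoDataset:
    """Illustrative mock of the described set_scalars() rule."""

    def __init__(self):
        self.arrays = {}  # name -> numpy array

    def set_scalars(self, name, array):
        # All scalars must have the same dimensions, so discard any
        # existing arrays whose shape differs from the new one.
        self.arrays = {k: v for k, v in self.arrays.items()
                       if v.shape == array.shape}
        self.arrays[name] = array

d = ScalarsDemoDataset()
d.set_scalars('a', np.zeros((8, 8, 8)))
d.set_scalars('b', np.ones((8, 8, 8)))
print(sorted(d.arrays))                  # ['a', 'b']

# New scalars with different dimensions: the others are discarded.
d.set_scalars('c', np.zeros((4, 4, 4)))
print(sorted(d.arrays))                  # ['c']
```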
In the EMD format, the spacing of the tilt axis was previously derived from the distance between the first and second tilt angles. Other formats, though, like the GenericHDF5 format, do not give the tilt axis a spacing. For consistency, and so that the data does not change spacing when a GenericHDF5 file is loaded and used in an external pipeline, use a spacing of 1.0 both in the EMD format and in the pipeline executor. Signed-off-by: Patrick Avery <[email protected]>
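The change can be sketched as a before/after of how the tilt-axis spacing is derived. These are illustrative functions, not the actual EMD reader code, and the tilt series below is hypothetical.

```python
import numpy as np

def tilt_axis_spacing_old(angles):
    # Old behavior: spacing derived from the first two tilt angles.
    return float(angles[1] - angles[0])

def tilt_axis_spacing_new(angles):
    # New behavior: a fixed spacing of 1.0, for consistency with
    # formats (like GenericHDF5) that provide no tilt-axis spacing.
    return 1.0

angles = np.arange(-60.0, 61.0, 3.0)     # hypothetical tilt series
print(tilt_axis_spacing_old(angles))     # 3.0
print(tilt_axis_spacing_new(angles))     # 1.0
```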
For the internal pipeline, try to set the child data sources' active scalars to match that of the parent if possible. This prevents the active scalars changing due to operations that produce child data sources. Signed-off-by: Patrick Avery <[email protected]>
When live updates are performed on data that contains multiple scalars, the number of scalars can change in between updates. Add a safety check in the DataPropertiesPanel to account for this. This prevents a crash. Signed-off-by: Patrick Avery <[email protected]>
Signed-off-by: Patrick Avery <[email protected]>
Signed-off-by: Patrick Avery <[email protected]>
This is for a little more general use than the previous "Pipeline::moveModulesDown()" was. Signed-off-by: Patrick Avery <[email protected]>
Signed-off-by: Patrick Avery <[email protected]>
Previously, if other operators were performed after a child-producing operator (such as reconstruction), the behavior was somewhat unpredictable, depending on several factors, including whether a description.json file was present and whether the pipeline was internal or external. This commit fixes the behavior so that operators can be performed after child-producing operators, by making sure the output from a child-producing operator gets passed down as input to the next operator. This was needed in both the external and the internal pipeline. Additionally, modules now get passed down to the output of the next operator as well. This commit also allows child-producing operators to work in the internal pipeline without a description.json file present, which was not possible before (the child had to be named explicitly). The external pipeline should no longer emit that the pipeline is finished when a child data source is encountered, so that emission was removed. The snapshot operator currently has no useful behavior, but it should work well with intermediate data sources, since it essentially creates an intermediate data source. Signed-off-by: Patrick Avery <[email protected]>
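The chaining rule this commit describes can be sketched in a few lines. The names and structure here are hypothetical (this is not tomviz's pipeline code); it only demonstrates that a child-producing operator's output becomes the input to the next operator, while in-place operators keep working on the same data source.

```python
def run_pipeline(data, operators):
    """Run operators in order, chaining child data sources as inputs."""
    for op in operators:
        child = op(data)
        # Child-producing operators return a new data source; it becomes
        # the input to the next operator. In-place operators return None.
        if child is not None:
            data = child
    return data

def blur(data):
    # In-place operator: modifies its input, returns nothing.
    data['history'].append('blur')

def reconstruct(data):
    # Child-producing operator: returns a new data source.
    return {'history': data['history'] + ['reconstruct']}

out = run_pipeline({'history': []}, [blur, reconstruct, blur])
print(out['history'])   # ['blur', 'reconstruct', 'blur']
```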
ssize_t is producing an error on Windows. Signed-off-by: Patrick Avery <[email protected]>
This ensures the modules get cleaned up. Signed-off-by: Patrick Avery <[email protected]>
To provide an update on this branch: most things seem to be working very well. You can perform multi-scalars operations (which produce children) and non-child-producing operators, and both can be used as part of the pipeline, internal and external. You can add and remove operators as well; the pipeline appears to be flexible. There may be other bugs we haven't found yet, but the one big bug holding up this PR is that, if you have operators in your pipeline and you close the application, it crashes. Unfortunately, the backtrace from gdb varies from run to run, and it is not clear what the problem is. Most likely it is some kind of memory issue (deleting an object twice, accessing an array out of bounds, etc.). Once we figure out what the issue is, this will hopefully be ready to merge.
There is a new datasource every time now, so this is not necessary. Additionally, this was causing crashes when closing the application while operators were present, and occasionally when deleting datasources that have operators. Signed-off-by: Patrick Avery <[email protected]>
Force-pushed from ea9ce44 to b4616b7
The bug seems to be fixed, and this PR now seems fairly robust - I have a hard time making it crash. I did encounter one crash, though, that is not fixed: if you have two operators and a volume module on the output, and you delete the first operator, it crashes. This doesn't happen for me with the slice or contour modules, nor with only one operator present, nor when deleting the second operator. It seems to be a very specific crash. The backtrace shows it has something to do with the histogram:
Here are a few things to think about for the design of this branch when we return to it:
- If a child data source was not created, it is assumed that the operator's input …
- In the master branch, if a child data source was created, the input …
When we introduce intermediate/persistent data sources into the pipeline chain, however, the intermediate data sources will not be moved up/down when operators are removed/added; a new one will need to be created instead. It's something we'll need to consider, and we may need to move modules to/from the intermediate data sources when they are created/destroyed.
@psavery We should revisit this and get it merged.
This primarily adds a new function, `dataset.set_scalars()`, which allows the user to modify non-active scalars in the python operators. Additionally, the user can add new scalars with `dataset.set_scalars()`.

If there are other scalars present in the dataset when `dataset.set_scalars()` is called, and the dimensions of the other scalars do not match the dimensions of the new scalars, the other scalars are discarded. This is because all scalars must have the same dimensions.

Some other new parts from this PR:

- The child dataset is no longer a deep copy of the parent in `dataset.create_child_dataset()`, because that isn't consistent with the internal dataset.
- A new keyword argument for `dataset.create_child_dataset()`, to specify whether the child should be a volume (default is True) or not. Child datasets that aren't volumes will retain their tilt angles.

Some examples of operations involving multiscalars are attached here:
multiscalars_operators.tar.gz
They include:
Some caveats:

1. For the internal pipeline, a description.json file must be present that specifies what the result will be named in the results dict, or else it won't read the output. This is not a requirement for the external pipeline.
2. Live reconstruction (setting `self.progress.data` in the python operators) does not currently work for multiple scalars. The crash is fixed, but it appears that only the active scalars get updated in the render window during live updates. Need to fix that.
3. In the internal pipeline only, if a live update of the data is performed on data in an operator without a description.json file, tomviz crashes.

Update: Caveat 1 is part of the master branch (it is not new with this PR), and it is now documented in #2075.

Update: Caveat 3 is also part of the master branch (it is not new with this PR), but it is not documented anywhere. It is related, however, to an issue mentioned in #2075, where a child data source does not appear to get created properly if there is no description.json file for the internal pipeline.
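The PR description says a child dataset can now be created as a non-volume that retains the parent's tilt angles. A minimal mock of that behavior follows; the class name and the `volume` keyword argument are assumptions for illustration (the real keyword name is not shown in this PR text), not the actual tomviz API.

```python
import numpy as np

class TiltDemoDataset:
    """Illustrative mock of the volume/non-volume child option."""

    def __init__(self, tilt_angles=None):
        self.tilt_angles = tilt_angles

    def create_child_dataset(self, volume=True):
        if volume:
            # Volume child: no tilt angles.
            return TiltDemoDataset(tilt_angles=None)
        # Non-volume child: remains a tilt series, retaining the
        # parent's tilt angles.
        return TiltDemoDataset(tilt_angles=np.copy(self.tilt_angles))

parent = TiltDemoDataset(tilt_angles=np.arange(-60.0, 61.0, 3.0))
volume_child = parent.create_child_dataset()
tilt_child = parent.create_child_dataset(volume=False)
print(volume_child.tilt_angles)   # None
print(tilt_child.tilt_angles[:3]) # first few retained tilt angles
```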