Added new features to the ndcube.__add__ method #794

Open
wants to merge 66 commits into base: main

Conversation

@PCJY (Contributor) commented Dec 11, 2024

PR Description

This PR aims to fix issue #734 created by @DanRyanIrish.

Write down the scenarios (a list, to see what has been covered).

Testing scenarios without a mask (redrafting the tests):

  • One and only one of them has a unit. [Error]

  • Both NDCube and NDData have a unit.
    Both have uncertainty.
    NDCube has uncertainty.
    NDData has uncertainty.
    Neither has uncertainty.

  • Neither has a unit.
    Both have uncertainty.
    NDCube has uncertainty.
    NDData has uncertainty.
    Neither has uncertainty.

name them again:
test_cube_add_cube_unit_mask_nddata_unc_unit_mask

Handling Masks

Determining the data result of the operation

The operation_ignores_mask kwarg determines the resulting data value when adding objects while accounting for mask values. Below are the different scenarios for the addition of a single-pixel NDCube, named cube, with a data value of 1, and a single-pixel NDData object, named nddata, with a data value of 2:

My draft of what this means:
When operation_ignores_mask (OIM) is True, does it ignore the mask after the addition or during the addition? During the addition, because we want to see the arithmetic results here, not the mask. If it dealt with the mask after the addition, there would be no need to check the arithmetic values here.
So there is no need to do boolean operations on the masks here? This is why mask handling and the arithmetic operation are separate from each other.
Data and mask are two separate things: first the data, then the mask; the addition is done on both the data and the mask.

cube.data = 1, nddata.data = 2

cube.mask  nddata.mask  operation_ignores_mask  resulting data value
F          F            T                       3
F          F            F                       3
T          F            T                       3
T          F            F                       2
F          T            T                       3
F          T            F                       1
T          T            T                       3
T          T            F                       None

What the distinct cases are:
When OIM is T, the result is always 3, i.e. the actual result of adding the two data values.
When OIM is F, the result is always the addition result of the value(s) whose corresponding mask is F (when there is no such value, the result is None).
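A tiny worked check of the rules above for the single-pixel example, for illustration only (this is not how NDCube.add is implemented; the helper name expected_sum is hypothetical):

def expected_sum(cube_val, cube_mask, nddata_val, nddata_mask, operation_ignores_mask):
    # When the mask is ignored, always add the raw data values.
    if operation_ignores_mask:
        return cube_val + nddata_val
    # Otherwise, only unmasked values contribute to the sum.
    unmasked = [v for v, m in ((cube_val, cube_mask), (nddata_val, nddata_mask)) if not m]
    return sum(unmasked) if unmasked else None  # both masked: behaviour still to be decided

print(expected_sum(1, True, 2, False, operation_ignores_mask=False))  # 2, matching row T/F/F
print(expected_sum(1, True, 2, True, operation_ignores_mask=True))    # 3, matching row T/T/T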

Determining the new mask value produced by the operation

The handle_mask kwarg takes a function which determines the new mask value resulting from the addition. While this function can be anything that takes two boolean arrays and outputs a resulting boolean array of the same shape, the most commonly used are expected to be numpy.logical_and and numpy.logical_or. Since the user supplies the function, the resulting mask can be implemented something like:

new_mask = handle_mask(self.mask, value.mask) if handle_mask else None

My understanding:
handle_mask is a function; as long as it has a value, it is truthy and the operation is applied to the two masks. Otherwise, the new_mask value would be None, meaning either:
1) the user did not set anything for the handle_mask kwarg (should an error be raised here?), or
2) there is no need to set any value for the handle_mask kwarg.
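For illustration, a quick check of this behaviour with the two most commonly expected handle_mask functions (a minimal sketch; the mask shapes and values are made up):

import numpy as np

self_mask = np.array([True, False, False])
value_mask = np.array([True, True, False])

print(np.logical_or(self_mask, value_mask))   # [ True  True False]  masked if either input is masked
print(np.logical_and(self_mask, value_mask))  # [ True False False]  masked only if both inputs are masked

handle_mask = None
new_mask = handle_mask(self_mask, value_mask) if handle_mask else None
print(new_mask)  # None -> the resulting cube would carry no mask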

ndcube/ndcube.py Outdated
Comment on lines 930 to 931
# addition
new_data = self.data + value_data
Member

The addition should be done as part of the masked array addition. You've already done this below, you just need to extract the added data from the results as well as the mask.

ndcube/ndcube.py Outdated
Comment on lines 950 to 952
return self._new_instance(
    data=new_data, uncertainty=new_uncertainty, mask=new_mask
)
Member

Instead of having a separate return here for the NDData case, I think we should build a dictionary of kwargs that we can give to self._new_instance here. So, you can create an empty kwargs dictionary at the start of the method, and add the new data, uncertainty, etc. in the relevant places, e.g.

kwargs["uncertainty"] = new_uncertainty

Then the final line of the method would become

return self._new_instance(**kwargs)
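A minimal sketch of what this could look like (the surrounding method body and names are illustrative only, not the actual implementation):

def __add__(self, value):
    kwargs = {}
    # Each branch fills in only the keywords it produces, e.g.:
    #     kwargs["uncertainty"] = new_uncertainty
    #     kwargs["mask"] = new_mask
    kwargs["data"] = self.data + getattr(value, "data", value)
    # Single exit point at the end of the method.
    return self._new_instance(**kwargs)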

Member

Let me know if this doesn't make sense

@nabobalis nabobalis added this to the 2.4.0 milestone Dec 18, 2024
ndcube/ndcube.py Outdated
if self.uncertainty is not None and value.uncertainty is not None:
    new_uncertainty = self.uncertainty.propagate(
        np.add, value.uncertainty, correlation=0
        np.add, value.uncertainty, result_data = value.data, correlation=0
Member

The result_data needs to be the result of the operation. So, assuming you moved the addition of the data arrays using the masked array to before the uncertainty propagation, you could do:

Suggested change
np.add, value.uncertainty, result_data = value.data, correlation=0
np.add, value.uncertainty, result_data = kwargs["data"], correlation=0

ndcube/ndcube.py Outdated
Comment on lines 1061 to 1070
# combine mask
self_ma = np.ma.MaskedArray(self.data, mask=self.mask)
value_ma = np.ma.MaskedArray(value_data, mask=value.mask)

# addition
result_ma = self_ma + value_ma
new_mask = result_ma.mask

# extract new mask and new data
kwargs["mask"] = result_ma.mask
kwargs["data"] = result_ma.data
Member

As mentioned in above comment, I think it makes sense to do this before the uncertainty propagation so you can use the kwargs["data"] value in that propagation.

ndcube/ndcube.py Outdated
kwargs["data"] = result_ma.data

# return the new NDCube instance
return self._new_instance(**kwargs)
Member

Move this line to the end of the method and use the kwargs approach when handling the other cases, e.g. Quantity. So, for example, L1082 would become:

kwargs["data"] = self.data + value.to_value(cube_unit)

@DanRyanIrish (Member)

A changelog file needs to be added.

And your branch needs to be updated with the latest version of main.

@PCJY (Contributor Author) commented Jan 18, 2025

Hi @DanRyanIrish, as we have discussed in our project meetings, below are the issues we encountered that may need further discussion with others in the community:

The issue is mainly around how NumPy handles masks when adding two numpy.ma.MaskedArray objects.
We think the expected outcome of an addition should be: the sum of any values that are not masked by their individual masks.
E.g. [1] ([T]) + [2] ([F]) = [2].

However, from experimentation, it can be seen that NumPy returns:
[1] ([T]) + [2] ([F]) = [1].
(Screenshot of the output attached.)

I find this confusing because even if it does combine the masks and then apply the combined mask to the result, it should be:
[1] ([T]) + [2] ([F]) = [-].

Please correct me if there is anything wrong in my understanding.

@PCJY (Contributor Author) commented Jan 18, 2025

@DanRyanIrish, secondly, we also encountered some issues around the propagate method:
it ignores the masks of the objects that are passed in, and still takes into account the uncertainties of the masked elements when it should not do so.
Following your guidance and suggestions, this is currently being worked on by setting the entries of the uncertainty array that should be masked to 0 before passing it to the propagate method.
A clearer example of the issue is implemented in the code below, with a screenshot of its output attached.

from ndcube import NDCube
import numpy as np
from astropy.nddata import StdDevUncertainty
from astropy.wcs import WCS

data = np.array([[1, 2], [3, 4]])  
uncertainty = StdDevUncertainty(np.array([[0.1, 0.2], [0.3, 0.4]])) 
mask = np.array([[False, True], [False, False]])  
wcs1 = WCS(naxis=2) 
wcs1.wcs.ctype = ["HPLT-TAN", "HPLN-TAN"]

cube = NDCube(data, wcs=wcs1, uncertainty=uncertainty, mask=mask)
print(cube)

def add_operation(cube1, cube2):
    """
    Example function to add two data arrays with uncertainty propagation.
    """
    result_data = cube1.data + cube2.data 
    # Propagate the uncertainties using the NDCube objects
    propagated_uncertainty = cube1.uncertainty.propagate(
        np.add, cube2, result_data=result_data, correlation = 0
    )
    return result_data, propagated_uncertainty

# adding the cube to itself
result_data, propagated_uncertainty = add_operation(cube, cube)

print("Original Data:\n", cube.data)
print("Original Uncertainty:\n", cube.uncertainty.array)
print("Result Data (after addition):\n", result_data)
print("Propagated Uncertainty:\n", propagated_uncertainty.array)

(Screenshot of the output attached.)
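Continuing from the example above, a minimal sketch of the workaround mentioned (zeroing the uncertainty of masked elements before propagation); this is an assumed approach for discussion, not the final implementation:

# Zero out the uncertainty of masked elements so they do not contribute
# to the propagated uncertainty.
masked_unc = StdDevUncertainty(np.where(cube.mask, 0, cube.uncertainty.array))
masked_cube = NDCube(cube.data, wcs=wcs1, uncertainty=masked_unc, mask=cube.mask)

result_data = masked_cube.data + masked_cube.data
propagated = masked_cube.uncertainty.propagate(
    np.add, masked_cube, result_data=result_data, correlation=0
)
print(propagated.array)  # the masked entry now contributes 0 to the propagated uncertainty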

@DanRyanIrish (Member) commented Jan 20, 2025

Hi @PCJY. I think the first thing we need to do is decide what behaviours we want to implement in the case where at least one of the NDCube and NDData has a mask. I think we need to get some feedback from other users on this decision. I propose the following scheme (@Cadair, thoughts on this?):

Firstly, if an object has no mask, that is equivalent to all pixels being unmasked.
Secondly, for a given pixel in both objects:

  1. If both are unmasked, the resultant
    i. data value is the sum of both pixels
    ii. mask value is False
    iii. uncertainty value is the propagation of the two uncertainties. If one or other object doesn't have uncertainty, the uncertainty of that component is assumed to be 0.
  2. If it is masked in one object, but not the other, the resultant:
    i. data value is equal to the unmasked value
    ii. mask value is False
    iii. uncertainty value is the same as the unmasked pixel
  3. If both pixels are masked, this is where it gets ambiguous. I propose, in order to remain consistent with the above:
    i. The operation is not performed and the data, mask and uncertainty values remain the same as the left-hand operand, i.e. the NDCube.

Alternatives for parts of the scheme could include:

  2. If it is masked in one object, but not the other, the resultant:
    ii. mask value is True.

  3. If both pixels are masked:
    i. The operation IS performed as normal but the mask value is True.

Once we agree on a scheme, the way forward on your uncertainty questions will become clear.

@Cadair, what are your thoughts on this scheme? I also think we should bring this up at the sunpy weekly meeting to get other thoughts.

@DanRyanIrish (Member)

I find this confusing because even if it does combine the mask and then apply it on the result, it should be: [1] ([T]) + [2] ([F]) = [-].

This is where I also find numpy masked arrays counter-intuitive. However, the logic is as follows:

  • If one pixel is masked, retain the data of the left-hand operand and set the mask value to True.
    Notice that the order of the operands matters. Because you did [1] ([T]) + [2] ([F]), the result is [1] ([T]), which is displayed as [--]. I would expect that if you did the operation the other way around ([2] ([F]) + [1] ([T])), the result would be [2] ([T]).

Notice that this is not the same as the scheme I've proposed in my previous comment, in part because it's confusing, as you've found.
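A quick demonstration of the numpy behaviour described above (the result data at masked positions follows the left-hand operand, as noted):

import numpy as np

a = np.ma.MaskedArray([1], mask=[True])
b = np.ma.MaskedArray([2], mask=[False])

print(a + b)          # [--]      the element is masked in the result
print((a + b).data)   # [1]       the underlying data reverts to the left-hand operand
print((b + a).data)   # [2]       swapping the operands swaps the retained data
print((a + b).mask)   # [ True]   masks are combined with logical OR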

@DanRyanIrish (Member) commented Jan 20, 2025

@PCJY, until we agree on a way forward with the mask, you should proceed by implementing the case in which neither object has a mask. So no need for masked arrays.

@Cadair (Member) commented Jan 21, 2025

I propose the following scheme

I haven't thought too much about each of these individual cases, but the fact that there is a list is enough to make me think we probably need a way for the user to choose. This is obviously not possible with the __add__ operator, so we would need to have an NDCube.add method (and presumably subtract) which accepts kwargs.

Is this also not a problem for other operators as well?

@DanRyanIrish (Member)

I think this is a good idea. As well as add and subtract, I think we would also need multiply and divide methods.

As far as I can see, this ambiguity only arises when there are masks involved. So we could still implement the dunder methods as wrappers around the above methods, but have them raise an error/return NotImplemented if the non-NDCube operand has a mask, and require users to use the NDCube.add method instead.
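A rough sketch of what that could look like (illustrative only; the NDCube.add/subtract signatures and kwarg names are still to be decided):

def __add__(self, value):
    if getattr(value, "mask", None) is not None:
        # Mask handling is ambiguous; require the explicit NDCube.add method,
        # which accepts kwargs controlling the mask behaviour.
        return NotImplemented
    return self.add(value)

def __sub__(self, value):
    if getattr(value, "mask", None) is not None:
        return NotImplemented
    return self.subtract(value)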

@DanRyanIrish DanRyanIrish changed the title Added new features to the ndcube._add_ method Added new features to the ndcube.__add__ method Feb 25, 2025
@DanRyanIrish (Member)

Here is an example of including fixtures in parameterisations: https://github.com/sunpy/ndcube/blob/main/ndcube/tests/test_ndcube.py#L48
The trick is to add this indirect kwarg at the end of the parameterisation, giving the name of the input variable(s) that has to be retrieved from conftest.py

@DanRyanIrish (Member) left a comment

The test function names are now much clearer, and it's easier to see the cases that are tested.

A couple more comments on your test writing:

  1. At least one of your tests is somewhat circular, i.e. you use the result of the operation you're testing to form the expected values. This means that an error might make it into the expected values and the test will pass when it should fail. See inline comments for a better way to structure tests. These are for information only. There is an even better way to structure these tests. See item below.

  2. ndcube provides a test helper function that checks if two cubes are equal. This means your test only needs to construct the expected cube, and then pass it and the result of the addition to that helper function. See below for an example of how to do this for one test

def test_cube_add_cube_unit_unc_nddata_unit_unc(ndc, value):
    output_cube = ndc + value # perform the addition
    # Construct expected cube
    expected_unit = u.ct
    expected_data = ((ndc.data * ndc.unit) + (value.data * value.unit)).to_value(expected_unit)
    expected_uncertainty = ndc.uncertainty.propagate(
        operation=np.add,
        other_nddata=value,
        result_data=expected_data * expected_unit,
        correlation=0,
    )
    expected_cube = NDCube(expected_data, ndc.wcs, uncertainty=expected_uncertainty, unit=expected_unit)
    # Assert output cube is same as expected cube
    assert_cubes_equal(output_cube, expected_cube)

All your tests in this PR should be written like this to ensure the output cubes are fully tested. It should also make them easier to read.

Comment on lines 1177 to 1180
expected_uncertainty = ndc.uncertainty.propagate(
operation=np.add,
other_nddata=value,
result_data=new_cube.data*new_cube.unit,
Member

This test is a bit circular. You shouldn't use outputs of the test to form expected values. Otherwise the test may pass when it shouldn't.

Suggested change
expected_uncertainty = ndc.uncertainty.propagate(
operation=np.add,
other_nddata=value,
result_data=new_cube.data*new_cube.unit,
expected_unit = u.ct
expected_data = (ndc.data * ndc.unit).to_value(expected_unit) + (value.data * value.unit).to_value(expected_unit)
expected_uncertainty = ndc.uncertainty.propagate(
operation=np.add,
other_nddata=value,
result_data=expected_data*expected_unit,

@PCJY (Contributor Author) commented Feb 28, 2025

Hi @DanRyanIrish, thank you for the suggestions. I checked the code for the assert_cubes_equal helper. I am unsure whether it checks the values, type and units of the uncertainty attributes as well.
It looks like it only checks whether the shapes of the uncertainties are the same.

@PCJY (Contributor Author)

@DanRyanIrish Or maybe I can
either: use the assert_cubes_equal helper together with a few more lines that check the values, type and units of the uncertainty attributes,
or: add a few more lines to the assert_cubes_equal helper itself?

@DanRyanIrish (Member) commented Feb 28, 2025

Hi @PCJY. Well spotted. You're right. You should include a check_uncertainty_values=False kwarg to assert_cubes_equal and make it check those aspects of the uncertainty if set to True. So the code here would be replaced by something like:

if check_uncertainty_values:
    # Check output and expected uncertainty are of same type. Remember they could be None.
    # If the uncertainties are not None,...
    # Check units, shape, and values of the uncertainty.
elif test_input.uncertainty:
    assert test_input.uncertainty.array.shape == expected_cube.uncertainty.array.shape

Then you can set check_uncertainty_values to True when you call assert_cubes_equal in your tests, and that should do what you need it to do.
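For reference, one possible fleshed-out version of that pseudo-code (hypothetical; the real helper in ndcube's test helpers may differ in structure and naming):

import numpy as np

def assert_cubes_equal(test_input, expected_cube, check_uncertainty_values=False):
    # ... existing checks on data, wcs, mask, unit, etc. ...
    if check_uncertainty_values:
        # Output and expected uncertainty must be of the same type (both could be None).
        assert type(test_input.uncertainty) is type(expected_cube.uncertainty)
        if test_input.uncertainty is not None:
            # Check units, shape, and values of the uncertainty.
            assert test_input.uncertainty.unit == expected_cube.uncertainty.unit
            assert test_input.uncertainty.array.shape == expected_cube.uncertainty.array.shape
            assert np.allclose(test_input.uncertainty.array, expected_cube.uncertainty.array)
    elif test_input.uncertainty is not None:
        assert test_input.uncertainty.array.shape == expected_cube.uncertainty.array.shape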

@PCJY (Contributor Author) commented Feb 28, 2025

@DanRyanIrish I see what you mean, thank you for this suggestion! I will implement this.

Comment on lines 1183 to 1184
assert np.allclose(new_cube.data, ndc.data + value.data)
assert new_cube.unit == u.ct
Member

Using above suggested changes:

Suggested change
assert np.allclose(new_cube.data, ndc.data + value.data)
assert new_cube.unit == u.ct
assert np.allclose(new_cube.data, expected_data)
assert new_cube.unit == expected_unit

@DanRyanIrish (Member)

Hi @PCJY. The tests are looking good now. However, could you please rename them test_arithmetic_add_..., rather than test_cube_add..., as the tests actually use the arithmetic operator +, and only test the NDCube.add method indirectly.

else:
    assert (test_input.uncertainty is None and expected_cube.uncertainty is None)
    pass
Member

Now the way you have it, you could do this:

Suggested change
pass
assert type(test_input.uncertainty) is type(expected_cube.uncertainty)
assert np.allclose(test_input.uncertainty.array, expected_cube.uncertainty.array), \
    f"Expected uncertainty: {expected_cube.uncertainty}, but got: {test_input.uncertainty.array}"

and then remove lines 135-139. Up to you though.

@PCJY (Contributor Author)

@DanRyanIrish Thank you for your suggestion. I re-understood what the code aims to do, and changed it again.

@DanRyanIrish (Member)

Hi @PCJY. I've edited your mask discussion in the PR description. I've created a table for all the scenarios for determining the result of adding the masked data. Can you fill in the table with the result of each case? Either 1, 2, or 3? Once you've done that, we can look at the table and, like before, determine how to implement them, and if we have some effectively duplicated cases.

@PCJY (Contributor Author) commented Mar 4, 2025

Hi @PCJY. I've edited your mask discussion in the PR description. I've created a table for all the scenarios for determining the result of adding the masked data. Can you fill in the table with the result of each case? Either 1, 2, or 3? Once you've done that, we can look at the table and, like before, determine how to implement them, and if we have some effectively duplicated cases.

Hi @DanRyanIrish, thank you for providing the logic structures. I have filled in the results, please feel free to tell me whether I made any mistakes. (I showed my understanding process in the comments as well.)

@DanRyanIrish (Member) commented Mar 6, 2025

Hi @PCJY. Regarding your question in the PR description:

My understanding:
handle_mask is a function; as long as it has a value, it is truthy and the operation is applied to the two masks. Otherwise, the new_mask value would be None, meaning either:
1) the user did not set anything for the handle_mask kwarg (should an error be raised here?), or
2) there is no need to set any value for the handle_mask kwarg.

Yes, that's right. Regarding your two options, handle_mask will have a default value. (We can decide what that should be later.) But that means the user doesn't have to set it and there's no need to raise an error. The default value will trigger the default mask handling behaviour.

@DanRyanIrish (Member) commented Mar 6, 2025

@PCJY: Regarding your comments on the distinct cases in the PR description:

When OIM is T, the result is always 3, i.e. the actual result of the addition of the two data values.

Yes :)

When OIM is F, the result is always the addition result of any value with its corresponding mask being F

Yes!

(when there is not one, the result is None).

Yet to be decided

So I think a way to implement this would be:

self_data, value_data = self.data, value.data # May require a copy
self_mask, value_mask = self.mask, value.mask # May require handling/converting of cases when masks aren't boolean arrays but are None, True, or False.
if not operation_ignores_mask:
    no_op_value = 0 # Value to set masked values since we are doing addition. (Would need to be 1 if we were doing multiplication.)
    idx = np.logical_and(self_mask, np.logical_not(value_mask))
    self_data[idx] = no_op_value
    idx = np.logical_and(value_mask, np.logical_not(self_mask))
    value_data[idx] = no_op_value
    idx = np.logical_and(self_mask, value_mask)
    #self_data[idx], value_data[idx] = ?, ? # Handle case when both values are masked here. # We are yet to decide the best behaviour here.
else:
    pass # If operation ignores mask, nothing needs to be done. This line not needed in actual code.  Only here for clarity.
# Perform addition
new_data = self_data + value_data
# Calculate new mask.
new_mask = handle_mask(self_mask, value_mask) if handle_mask else None
