Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RuntimeWarning: divide by zero encountered in divide #1006

Open
NimaSarajpoor opened this issue Jul 13, 2024 · 13 comments
Open

RuntimeWarning: divide by zero encountered in divide #1006

NimaSarajpoor opened this issue Jul 13, 2024 · 13 comments

Comments

@NimaSarajpoor
Copy link
Collaborator

Running the following script

import numpy as np
import stumpy
import time

def test_extract_several_consensus():
    Ts = [np.random.rand(n) for n in [64, 128]]
    Ts_ref = [T.copy() for T in Ts]
    Ts_comp = [T.copy() for T in Ts]

    m = 20

    k = 2  # Get the first `k` consensus motifs
    for _ in range(k):
        print(f'===== {_} =====')
        radius, Ts_idx, subseq_idx = stumpy.ostinato(Ts_comp, m)
    
        consensus_motif = Ts_comp[Ts_idx][subseq_idx : subseq_idx + m].copy()
        for i in range(len(Ts_comp)):
            if i == Ts_idx:
                query_idx = subseq_idx
            else:
                query_idx = None

            idx = np.argmin(stumpy.core.mass(consensus_motif, Ts_comp[i], query_idx=query_idx))

            Ts_comp[i][idx : idx + m] = np.nan
            Ts_ref[i][idx : idx + m] = np.nan

            np.testing.assert_almost_equal(
                Ts_ref[i][np.isfinite(Ts_ref[i])], Ts_comp[i][np.isfinite(Ts_comp[i])]
            )


if __name__ == "__main__":
    test_extract_several_consensus()

Gives:

===== 0 =====
===== 1 =====
/Users/nimasarajpoor/miniconda3/envs/py310/lib/python3.10/site-packages/stumpy/core.py:2257: 
RuntimeWarning: divide by zero encountered in divide

I took a look at the code:

stumpy/stumpy/core.py

Lines 2248 to 2260 in e7bb2b8

T = _preprocess(T, copy)
check_window_size(m, max_size=T.shape[-1])
T_subseq_isfinite = rolling_isfinite(T, m)
T[~np.isfinite(T)] = np.nan
T_subseq_isconstant = process_isconstant(T, m, T_subseq_isconstant)
T[np.isnan(T)] = 0
M_T, Σ_T = compute_mean_std(T, m)
Σ_T[T_subseq_isconstant] = 1.0 # Avoid divide by zero in next inversion step
Σ_T_inverse = 1.0 / Σ_T
M_T_m_1, _ = compute_mean_std(T, m - 1)
return T, M_T, Σ_T_inverse, M_T_m_1, T_subseq_isfinite, T_subseq_isconstant

and tried to swap lines 2252 and 2253, but that did not work.


Also:
If I disable JIT, I get an extra warning, saying:

python3.10/site-packages/numpy/lib/nanfunctions.py:1879: RuntimeWarning: Degrees of freedom <= 0 for slice.
@seanlaw
Copy link
Contributor

seanlaw commented Jul 13, 2024

Hmmm, this seems strange. The error is coming from line 2257:

Σ_T_inverse = 1.0 / Σ_T 

It would be good to dig into what the inputs and outputs are. It suggests that perhaps:

T_subseq_isconstant = process_isconstant(T, m, T_subseq_isconstant) 

isn't capturing/processing the right constant subsequences?

@NimaSarajpoor
Copy link
Collaborator Author

Right! I will take a closer look (Also, I want to see what happens to T_subseq_isconstant when a subsequence is all np.nan)

@NimaSarajpoor
Copy link
Collaborator Author

@seanlaw

and tried to swap lines 2252 and 2253, but that did not work.

Okay. I think this works! (Probably I made a mistake before).

isn't capturing/processing the right constant subsequences?

I can see that this is happening when we have a subsequence with all non-finite values. For example, running the following script throws the divide by zero encountered in divide warning.

import numpy as np

from stumpy import core

# based on the function `stumpy.core.preprocess_diagonal`
def compute_std_inverse(T):  
    T[~np.isfinite(T)] = np.nan
    T_subseq_isconstant = core.process_isconstant(T, m, T_subseq_isconstant=None)
    
    T[np.isnan(T)] = 0
    _, Σ_T = core.compute_mean_std(T, m)
    Σ_T[T_subseq_isconstant] = 1.0
    Σ_T_inverse = 1.0 / Σ_T

    return Σ_T_inverse


if __name__ == "__main__":
    n, m = 10, 3
    T = np.random.rand(n)

    nan_start = n // 2
    T[nan_start : nan_start + m] = np.nan

    compute_std_inverse(T)

we are mixing two things in the line Σ_T[T_subseq_isconstant] = 1.0 . Σ_T is computed for a pre-processed time series T (where all non-finite values are replaced with 0). However, T_subseq_isconstant is NOT computed for that pre-processed time series T.

The reason that we did not get any failure in the unit testing of stumpy.stump is because 1.0 / Σ_T will be used just for computing the pearson correlation down the road (and not the intermediate variables like cov) and, also, we do the Pearson correlation only for subsequence idx when T_subseq_isfinite[idx] is True.


To avoid similar issue in the future, we may at least throw a warning when T in process_isconstant(T, m, ...) has one/more non-finite value(s). We may say something like this:

The input time series `T` contains non-finite values. At those indices, the value of returned array is not meaningful.

If we decide to go with this, then we may need to re-think the usage of stumpy.core.fix_isconstant_isfinite_conflicts

@seanlaw
Copy link
Contributor

seanlaw commented Jul 14, 2024

Okay. I think this works! (Probably I made a mistake before).

What is the logic behind swapping the two lines? There's a reason why it is in its current order and I think the current order matters! So, I am against swapping as it will change the overall/existing logic

we are mixing two things in the line Σ_T[T_subseq_isconstant] = 1.0 . Σ_T is computed for a pre-processed time series T (where all non-finite values are replaced with 0). However, T_subseq_isconstant is NOT computed for that pre-processed time series T.

The reason that we did not get any failure in the unit testing of stumpy.stump is because 1.0 / Σ_T will be used just for computing the pearson correlation down the road (and not the intermediate variables like cov) and, also, we do the Pearson correlation only for subsequence idx when T_subseq_isfinite[idx] is True.

Sorry, I'm still not understanding the problem and how this is caused by ostinato. Please allow me some time to review the code that you shared in your first comment. Perhaps, in the meantime, you can take me step-by-step what is happening in the ostinato code above and what eventually triggers the warning when we modify the original time series with np.nan ?

@NimaSarajpoor
Copy link
Collaborator Author

Part 1

What is the logic behind swapping the two lines?

(I noticed that in ostinato, the story is a bit different. I explained it in the Part 2 below.)
The logic is that T_subseq_isconstant should capture the constant subsequences AFTER setting non-finite values to 0.0.

This logic comes from the answer to this question: "what should one expect from Σ_T[T_subseq_isconstant] = 1.0"? The expectation is: if Σ_T[i] is 0.0, then T_subseq_isconstant[i] must be True, so that Σ_T[i] can be set to 1.0 . However, currently, this expectation is not being met for a non-finite subsequence whose finite values, if there is any, are 0.0.

Example:

T = np.array([np.nan, 0.0, np.nan])
m = 3
T_subseq_isconstant = core.process_isconstant(T, m, T_subseq_isconstant=None)
print(T_subseq_isconstant)

T[np.isnan(T)] = 0
_, Σ_T = core.compute_mean_std(T, m)
print(Σ_T)

Σ_T[T_subseq_isconstant] = 1.0
Σ_T_inverse = 1.0 / Σ_T
print('Σ_T_inverse: ', Σ_T_inverse)

Running code above gives:

T_subseq_isconstant:  [False]
Σ_T:  [0.]

RuntimeWarning: divide by zero encountered in divide Σ_T_inverse = 1.0 / Σ_T
Σ_T_inverse:  [inf]

Another example:

# running the following should give us that warning
n, m = 10, 3
T = np.random.rand(n)
T[:m] = np.nan
    
stumpy.stump(T, m)

Part 2

Perhaps, in the meantime, you can take me step-by-step what is happening in the ostinato code above and what eventually triggers the warning when we modify the original time series with np.nan ?

In ostinato, we can see the following flow (This is a bit different than Part 1 above!!):
(1) process each time series in Ts by applying the function core.preprocess (and we also compute Ts_subseq_isconstant)
(2) pass the processed data and pre-computed Ts_subseq_isconstant to _ostinato
(3) _ostinato calls stump (we pass the processed time series and the pre-computed Ts_subseq_isconstant to it)
(4) In stump, we call core.preprocess_diagonal and pass THAT already-processed data and the pre-computed Ts_subseq_isconstant to it.
(5) core.preprocess_diagonal(T, m, ...) is shown below:

stumpy/stumpy/core.py

Lines 2248 to 2260 in 0107d39

T = _preprocess(T, copy)
check_window_size(m, max_size=T.shape[-1])
T_subseq_isfinite = rolling_isfinite(T, m)
T[~np.isfinite(T)] = np.nan
T_subseq_isconstant = process_isconstant(T, m, T_subseq_isconstant)
T[np.isnan(T)] = 0
M_T, Σ_T = compute_mean_std(T, m)
Σ_T[T_subseq_isconstant] = 1.0 # Avoid divide by zero in next inversion step
Σ_T_inverse = 1.0 / Σ_T
M_T_m_1, _ = compute_mean_std(T, m - 1)
return T, M_T, Σ_T_inverse, M_T_m_1, T_subseq_isfinite, T_subseq_isconstant

(Side Note: In this case, T is already processed and has ONLY finite value. More on this on Part 3).

In line T_subseq_isconstant = process_isconstant(T, m, T_subseq_isconstant) we are passing THAT processed T and the pre-computed T_subseq_isconstant. process_isconstant returns the pre-computed T_subseq_isconstant. Although the processed data T has 0.0 instead of np.nan, the T_subseq_isconstant is False for any subsequence that was originally non-finite. Therefore, for subsequence idx, T_subseq_isconstant[idx] can be False while Σ_T[idx] is 0.0. So, Σ_T_inverse = 1.0 / Σ_T can throw a warning again here.

Example:

# running the following gives us the warning
n, m = 10, 3

TA = np.random.rand(n)
TB = np.random.rand(n)
TC = np.random.rand(n)
TA[:m] = np.nan

stumpy.ostinato([TA, TB, TC], m)

Here, I also see an extra warning:

tumpy/core.py:3467: UserWarning: A large number of values in `P` are smaller than 1e-06.
For a self-join, try setting `ignore_trivial=True`.

Part 3

As mentioned in Part 2, _ostinato passes preprocessed data to stump. Therefore, the computed matrix profile is not correct for a subsequence that was originally non-finite. However, I think, later in the code, the core._mass (see line 259 below) will take care of it.

stumpy/stumpy/ostinato.py

Lines 239 to 276 in 0107d39

mp = partial_mp_func(
Ts[j],
m,
Ts[h],
ignore_trivial=False,
T_A_subseq_isconstant=Ts_subseq_isconstant[j],
T_B_subseq_isconstant=Ts_subseq_isconstant[h],
)
si = np.argsort(mp[:, 0])
for q in si:
radius = mp[q, 0]
if radius >= bsf_radius:
break
for i in range(k):
if i != j and i != h:
QT = core.sliding_dot_product(Ts[j][q : q + m], Ts[i])
radius = np.max(
(
radius,
np.min(
core._mass(
Ts[j][q : q + m],
Ts[i],
QT,
M_Ts[j][q],
Σ_Ts[j][q],
M_Ts[i],
Σ_Ts[i],
Ts_subseq_isconstant[j][q],
Ts_subseq_isconstant[i],
)
),
)
)
if radius >= bsf_radius:
break
if radius < bsf_radius:
bsf_radius, bsf_Ts_idx, bsf_subseq_idx = radius, j, q

This is because the corresponding value of such subsequence in M_T above is np.inf which affects the distance here:

stumpy/stumpy/core.py

Lines 1093 to 1106 in 0107d39

if np.isinf(M_T) or np.isinf(μ_Q):
D_squared = np.inf
elif Q_subseq_isconstant and T_subseq_isconstant:
D_squared = 0
elif Q_subseq_isconstant or T_subseq_isconstant:
D_squared = m
else:
denom = (σ_Q * Σ_T) * m
denom = max(denom, config.STUMPY_DENOM_THRESHOLD)
ρ = (QT - (μ_Q * M_T) * m) / denom
ρ = min(ρ, 1.0)
D_squared = np.abs(2 * m * (1.0 - ρ))

So, we will eventually end up with correct value for bsf_radius.

Example: Running the following script

def func():
    Ts = [
        np.random.rand(100), 
        np.random.rand(100),
        np.random.rand(100)
    ]

    m = 3
    Ts[0][:m] = np.nan

    Ts_subseq_isconstant = None
    
    Ts_copy = [None] * len(Ts)
    M_Ts = [None] * len(Ts)
    Σ_Ts = [None] * len(Ts)
    if Ts_subseq_isconstant is None:
        Ts_subseq_isconstant = [None] * len(Ts)

    for i, T in enumerate(Ts):
        Ts_copy[i], M_Ts[i], Σ_Ts[i], Ts_subseq_isconstant[i] = core.preprocess(
            T, m, copy=True, T_subseq_isconstant=Ts_subseq_isconstant[i]
        )

    mp = stumpy.stump(
            Ts_copy[0],
            m,
            Ts_copy[1],
            ignore_trivial=False,
            T_A_subseq_isconstant=Ts_subseq_isconstant[0],
            T_B_subseq_isconstant=Ts_subseq_isconstant[1],
        )

    print('matrix profile value for the first subsequence in Ts[0]: ', mp[0, 0])

if __name__ == "__main__":
    func()

gives:

core.py:2257: RuntimeWarning: divide by zero encountered in divide
  Σ_T_inverse = 1.0 / Σ_T

matrix profile value for the first subsequence in Ts[0]:  0.0

However, the matrix profile value should have been np.inf.

@NimaSarajpoor
Copy link
Collaborator Author

NimaSarajpoor commented Jul 16, 2024

(I noticed that in ostinato, the story is a bit different. I explained it in the Part 2 below.)
The logic is that T_subseq_isconstant should capture the constant subsequences AFTER setting non-finite values to 0.0

I think this logic is aligned with computing M_T, Σ_T here:

stumpy/stumpy/core.py

Lines 2248 to 2260 in 0107d39

T = _preprocess(T, copy)
check_window_size(m, max_size=T.shape[-1])
T_subseq_isfinite = rolling_isfinite(T, m)
T[~np.isfinite(T)] = np.nan
T_subseq_isconstant = process_isconstant(T, m, T_subseq_isconstant)
T[np.isnan(T)] = 0
M_T, Σ_T = compute_mean_std(T, m)
Σ_T[T_subseq_isconstant] = 1.0 # Avoid divide by zero in next inversion step
Σ_T_inverse = 1.0 / Σ_T
M_T_m_1, _ = compute_mean_std(T, m - 1)
return T, M_T, Σ_T_inverse, M_T_m_1, T_subseq_isfinite, T_subseq_isconstant

as the line M_T, Σ_T = compute_mean_std(T, m) occurs after T[np.isnan(T)] = 0 .


In core.preprocess, we do all these computation (i.e. M_T, Σ_T, AND T_subseq_isconstant) before T[np.isnan(T)] = 0 (see below)

stumpy/stumpy/core.py

Lines 2135 to 2142 in 0107d39

T[np.isinf(T)] = np.nan
T_subseq_isconstant = process_isconstant(T, m, T_subseq_isconstant)
if M_T is None or Σ_T is None:
M_T, Σ_T = compute_mean_std(T, m)
T[np.isnan(T)] = 0
return T, M_T, Σ_T, T_subseq_isconstant

I feel it make sense as long as we do all these relevant computation next to each other (and provide clear description in the docstring / comment)

@seanlaw
Copy link
Contributor

seanlaw commented Jul 18, 2024

The logic is that T_subseq_isconstant should capture the constant subsequences AFTER setting non-finite values to 0.0.

The more I read this, the more I disagree. While it might seem "smart" to do it this way, it is polluting the meaning of T_subseq_isconstant. The vector T_subseq_isconstant should ONLY be capture whether or not each subsequence within the ORIGINAL time series, T, is constant. Therefore, it is 100% correct that [np.nan, 0.0, np.nan] is NOT a constant subsequence and we should insist upon this! Otherwise, it will lead to a lot of confusion in the future if this is not maintained and kept "pure"

In ostinato, we can see the following flow (This is a bit different than Part 1 above!!):
(1) process each time series in Ts by applying the function core.preprocess (and we also compute Ts_subseq_isconstant)
(2) pass the processed data and pre-computed Ts_subseq_isconstant to _ostinato
(3) _ostinato calls stump (we pass the processed time series and the pre-computed Ts_subseq_isconstant to it)
(4) In stump, we call core.preprocess_diagonal and pass THAT already-processed data and the pre-computed Ts_subseq_isconstant to it.
(5) core.preprocess_diagonal(T, m, ...) is shown below:

If the issue is that Ts_copy[i] is an already preprocessed time series then, perhaps, the solution would be to change ostinato to something like:

    Ts_copy = [None] * len(Ts)
    M_Ts = [None] * len(Ts)
    Σ_Ts = [None] * len(Ts)
    if Ts_subseq_isconstant is None:
        Ts_subseq_isconstant = [None] * len(Ts)

    for i, T in enumerate(Ts):
        _, M_Ts[i], Σ_Ts[i], Ts_subseq_isconstant[i] = core.preprocess(
            T, m, copy=True, T_subseq_isconstant=Ts_subseq_isconstant[i]
        )
        Ts_copy[i] = T.copy()

    bsf_radius, bsf_Ts_idx, bsf_subseq_idx = _ostinato(
        Ts_copy, m, M_Ts, Σ_Ts, Ts_subseq_isconstant
    )

This way Ts_copy[i] would NOT be a processed copy of Ts[i] and, instead, is a RAW copy of Ts[i]. Would that work? The ONLY reason that we are pre-computing M_Ts, Σ_Ts, Ts_subseq_isconstant is to avoid re-computing them later and, instead, we pass in those arrays. However, we can accept the small cost of making a copy of Ts[i].

Then, all of the M_Ts, Σ_Ts, Ts_subseq_isconstant would correspond correctly to the UNPROCESSED Ts_copy and we can add a comment to ostinato to remind the reader that we DON'T use a processed copy of Ts[i] because stump will process it again itself?

I haven't tested this but it captures the spirit of what I am hoping for

I think this logic is aligned with computing M_T, Σ_T here:

Yeah, I don't think I agree that "this is the logic". With the exception of converting non-finite numbers into np.nan, I am adamant that T_subseq_isconstant should reflect the truth/reality of the original raw time series!

@NimaSarajpoor
Copy link
Collaborator Author

NimaSarajpoor commented Jul 20, 2024

Therefore, it is 100% correct that [np.nan, 0.0, np.nan] is NOT a constant subsequence and we should insist upon this! Otherwise, it will lead to a lot of confusion in the future if this is not maintained and kept "pure"

So, in that case:

(i) In core.preprocess_diagonal: should we replace

Σ_T[T_subseq_isconstant] = 1.0 # Avoid divide by zero in next inversion step

with

mask =  np.logical_or(T_subseq_isconstant, Σ_T==0.0)
Σ_T[mask] = 1.0

to avoid getting warning that comes from the line Σ_T_inverse = 1.0 / Σ_T?

Note: This is to avoid getting warning in core.preprocess_diagonal and the functions that call it, likestumpy.stump. I will talk about stumpy.ostinato later below but this is just for core.preprocess_diagonal and stumpy.stump.

(ii) should we improve the docstring of core.preprocess_diagonal

stumpy/stumpy/core.py

Lines 2232 to 2246 in 3077d0d

M_T : numpy.ndarray
Rolling mean with a subsequence length of `m`
Σ_T_inverse : numpy.ndarray
Inverted rolling standard deviation
M_T_m_1 : numpy.ndarray
Rolling mean with a subsequence length of `m-1`
T_subseq_isfinite : numpy.ndarray
A boolean array that indicates whether a subsequence in `T` contains a
`np.nan`/`np.inf` value (False)
T_subseq_isconstant : numpy.ndarray
A boolean array that indicates whether a subsequence in `T` is constant (True)

by saying something like this:

  • M_T is the rolling mean of processed T, where non-finite values are replaced with 0.0.
  • T_subseq_isconstant[i] False wherever T_subseq_isfinite[i] is True.

? (or, is that too much?)


Regarding Part 2 and Part 3 of my previous comment

This way Ts_copy[i] would NOT be a processed copy of Ts[i] and, instead, is a RAW copy of Ts[i]. Would that work? The ONLY reason that we are pre-computing M_Ts, Σ_Ts, Ts_subseq_isconstant is to avoid re-computing them later and, instead, we pass in those arrays. However, we can accept the small cost of making a copy of Ts[i].

Then, all of the M_Ts, Σ_Ts, Ts_subseq_isconstant would correspond correctly to the UNPROCESSED Ts_copy and we can add a comment to ostinato to remind the reader that we DON'T use a processed copy of Ts[i] because stump will process it again itself?

Got the point.

Note1: If we only implement

mask =  np.logical_or(T_subseq_isconstant, Σ_T==0.0)
Σ_T[mask] = 1.0

(in core.preprocess_diagonal) then we do not get warning in stumpy.ostinato when we pass the PROCESSED T to stumpy.stump. However, the issue I pointed out in "Part 3" of my previous comment remains.

Note 2:
I made the following change in stumpy.ostinato

for i, T in enumerate(Ts):
        _, M_Ts[i], Σ_Ts[i], Ts_subseq_isconstant[i] = core.preprocess(
            T, m, copy=True, T_subseq_isconstant=Ts_subseq_isconstant[i]
        )
        Ts_copy[i] = T.copy()

The tests in test_ostinato.py are passing. Also, the AB-join matrix profile

stumpy/stumpy/ostinato.py

Lines 239 to 246 in 3077d0d

mp = partial_mp_func(
Ts[j],
m,
Ts[h],
ignore_trivial=False,
T_A_subseq_isconstant=Ts_subseq_isconstant[j],
T_B_subseq_isconstant=Ts_subseq_isconstant[h],
)

(that is computed in _ostinato) is now correct. So, the concern raised in the part 3 of my previous comment is addressed.

Note 2-WARNING

In the same function _ostinato, we now pass the UNPROCESSED T to core._mass.

core._mass(

This is a njit-decorated function with fastmath=True. The docstring says:

This private function assumes only finite numbers in your inputs and it is the
responsibility of the user to pre-process and post-process their results if the
original time series contains np.nan/np.inf values.

we may try to revise the fastmath flag (see #708).

@seanlaw
Copy link
Contributor

seanlaw commented Jul 20, 2024

mask = np.logical_or(T_subseq_isconstant, Σ_T==0.0)
Σ_T[mask] = 1.0

So, I like the spirit of this and I think it is headed in the right direction. However, it seems implicit/dangerous to set all Σ_T[Σ_T==0.0] == 1.0 since we don't necessarily know WHAT is causing Σ_T==0.0. Instead, I would rather set Σ_T to 1.0 for each specific condition that we know of. For example, maybe we should be adding something like:

Σ_T[T_subseq_isconstant] = 1.0
Σ_T[~T_subseq_isfinite] = 1.0
# Add other conditions

This way, it is explicitly clear WHEN Σ_T needs to be set to 1.0. Then, we should add unit tests to ALL of those scenarios

If we only implement ... (in core.preprocess_diagonal) then we do not get warning in stumpy.ostinato when we pass the PROCESSED T to stumpy.stump. However, the issue I pointed out in "Part 3" of my previous comment remains

Again, if we are explicit in our code about WHY we are setting something to 1.0 then I am okay. Logically, is there any reason beyond:

  1. A subsequence containing non-finite numbers
  2. A subsequence having constant values

for which Σ_T would be 0.0?

@seanlaw
Copy link
Contributor

seanlaw commented Jul 20, 2024

should we improve the docstring of core.preprocess_diagonal
by saying something like this:
M_T is the rolling mean of processed T, where non-finite values are replaced with 0.0.

I think it's better to say ". Each subsequence in T that contains one or more non-finite values will have its corresponding M_T value set to 0.0.

T_subseq_isconstant[i] False wherever T_subseq_isfinite[i] is True.

Umm, I think you mean "T_subseq_isconstant[i] is automatically set to False whenever T_subseq_isfinite[i] is False."??

@seanlaw
Copy link
Contributor

seanlaw commented Jul 20, 2024

So, the concern raised in the part 3 of my previous comment is addressed.

Cool! I'd like to see the final code for this.

In the same function _ostinato, we now pass the UNPROCESSED T to core._mass.

I don't like this. I think we should agree that the private functions should already be preprocessed. We may need to re-think things more in order to ensure consistency

@seanlaw
Copy link
Contributor

seanlaw commented Jul 20, 2024

The tests in test_ostinato.py are passing.

Have you added the test above to the unit tests?

@NimaSarajpoor
Copy link
Collaborator Author

@seanlaw

it is explicitly clear WHEN Σ_T needs to be set to 1.0.

Logically, is there any reason beyond:

  1. A subsequence containing non-finite numbers
  2. A subsequence having constant values
    for which Σ_T would be 0.0?

There is no more reason other than those two. I agree. Being explicit helps us to know for what subsequences exactly we are setting the standard deviation to 1.0.

Umm, I think you mean "T_subseq_isconstant[i] is automatically set to False whenever T_subseq_isfinite[i] is False."??

Right! My bad 😅

In the same function _ostinato, we now pass the UNPROCESSED T to core._mass.

I don't like this.

Right. If we decide to pass UNPROCESSED data to _ostinato, it will then be passed to core._mass as well. Need to think more on it.

The tests in test_ostinato.py are passing.

Have you added the test above to the unit tests?

So, what I meant was that changing the code does not result in failing current tests. Will add new tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants