The current _calc_binary_ig( ) evaluates split points between data points with the same feature values but different labels, which might not be suitable for datasets that contain a lot of such data points.
I don't think this really constitutes a bug; it's true to the algorithm.
I guess for the above you are recommending ignoring splits such as
[(2,-1), (2,-1)], [(2,1),(3,1),(3,1)]
so we would then evaluate the default split
[ ] [(2,-1), (2,-1),(2,1),(3,1),(3,1)]
then skip
[(2,-1)] [(2,-1),(2,1),(3,1),(3,1)] (split == 0, I think, by the logic)
and
[(2,-1),(2,-1)] ,[(2,1),(3,1),(3,1)] (split == 1)
then continue with
[(2,-1), (2,-1),(2,1)] [(3,1),(3,1)] (split == 2)
I can enforce this with:
# evaluate each split point
for split in range(len(orderline)):
    next_class = orderline[split][1]  # +1 if this class, -1 if other
    # Check here that the distance is different to the next one
    if split == 0 and orderline[split][0] == orderline[split + 1][0]:
        continue
    elif orderline[split][0] == orderline[split - 1][0]:
        continue
I need to double-check the logic, which is a bit confusing around the first item, but this gives me an IG of 0.770950, not 0.42.
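For comparison, here is a minimal standalone sketch of the "only split between distinct values" idea. It is not the aeon implementation: `binary_entropy` and `calc_binary_ig_distinct` are hypothetical names, and it assumes the orderline is sorted by distance and that IG is the parent entropy minus the size-weighted child entropies. The class counts are updated before the skip check, so they stay in sync with the split index.

```python
import math


def binary_entropy(c1, c2):
    """Entropy (in bits) of a two-class count pair."""
    total = c1 + c2
    ent = 0.0
    for c in (c1, c2):
        if c > 0:
            p = c / total
            ent -= p * math.log2(p)
    return ent


def calc_binary_ig_distinct(orderline, c1, c2):
    """Best information gain over splits that fall between distinct values.

    Hypothetical helper, not the aeon implementation. `orderline` is a list of
    (distance, label) pairs sorted by distance; label > 0 means "this class".
    """
    initial_ent = binary_entropy(c1, c2)
    total = c1 + c2
    best_ig = 0.0
    c1_count = 0
    c2_count = 0
    for split in range(len(orderline)):
        # Always update the running counts, even if this split is skipped.
        if orderline[split][1] > 0:
            c1_count += 1
        else:
            c2_count += 1
        # Skip thresholds that would separate two equal distances.
        is_last = split == len(orderline) - 1
        if not is_last and orderline[split][0] == orderline[split + 1][0]:
            continue
        left_prop = (split + 1) / total
        ig = (
            initial_ent
            - left_prop * binary_entropy(c1_count, c2_count)
            - (1 - left_prop) * binary_entropy(c1 - c1_count, c2 - c2_count)
        )
        best_ig = max(best_ig, ig)
    return best_ig


orderline = [(2, -1), (2, -1), (2, 1), (3, 1), (3, 1)]
print(round(calc_binary_ig_distinct(orderline, 3, 2), 2))  # 0.42
```

Under these assumptions the only informative threshold for the example lies between the values 2 and 3, which yields the 0.42 expected in the report.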
TonyBagnall changed the title from "[BUG] ShapeletTransform: binary ig calculation problem" to "[ENH] ShapeletTransform: binary ig calculation problem" on May 20, 2024
Describe the bug
The current _calc_binary_ig( ) evaluates split points between data points with the same feature values but different labels, which might not be suitable for datasets that contain a lot of such data points.
Steps/Code to reproduce the bug
from aeon.transformations.collection.shapelet_based._shapelet_transform import _calc_binary_ig
orderline = [(2,-1),(2,-1),(2,1),(3,1),(3,1)]
c1, c2 = 3, 2
_calc_binary_ig(orderline,c1,c2)
Expected results
0.42
Actual results
0.97
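Both figures can be reproduced by hand, assuming information gain is the parent entropy minus the size-weighted child entropies with base-2 logarithms. The symbols H and IG are notation introduced here for illustration.

```latex
\begin{align*}
H_{\text{parent}} &= -\tfrac{3}{5}\log_2\tfrac{3}{5} - \tfrac{2}{5}\log_2\tfrac{2}{5} \approx 0.971 \\
\text{Split after two items (inside the tie at 2):}\quad
IG &= 0.971 - \tfrac{2}{5}\cdot 0 - \tfrac{3}{5}\cdot 0 \approx 0.97 \\
H_{\text{left}} &= -\tfrac{1}{3}\log_2\tfrac{1}{3} - \tfrac{2}{3}\log_2\tfrac{2}{3} \approx 0.918 \\
\text{Split between the values 2 and 3:}\quad
IG &= 0.971 - \tfrac{3}{5}\cdot 0.918 - \tfrac{2}{5}\cdot 0 \approx 0.42
\end{align*}
```

The 0.97 comes from a threshold that separates the two (2,-1) points from (2,1), which no single threshold on the feature value can actually do.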
Versions
System:
python: 3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64 bit (AMD64)]
executable: c:\xxx\python.exe
machine: Windows-10-10.0.19041-SP0
Python dependencies:
pip: 22.3.1
setuptools: 57.4.0
scikit-learn: 1.4.0
aeon: 0.7.1
statsmodels: None
numpy: 1.24.0
scipy: 1.10.1
pandas: 2.0.3
matplotlib: 3.5.0
joblib: 1.3.2
numba: 0.58.1
pmdarima: None
tsfresh: None