There are rare cases in the dataset where the data_min and data_max columns in the catalog don't match the min/max measured from the actual (decoded) images.
For example, event R19011212048075 for img_type='ir069'. This entry in the CATALOG.csv is
The minimum value in this case is -23540.1 degrees C, which is a strange value. And if we look at the actual minimum of the image stored in SEVIR, we see a raw value of -18312, which decodes to -183.12. That differs from the value reported in the catalog.
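A minimal sketch of how such a mismatch could be detected, assuming the stored `int16` values decode by multiplying by 0.01 (inferred from -18312 decoding to -183.12 above). The helper names, the tolerance, the toy array, and the `data_max` value of -25.0 are hypothetical and only illustrate the comparison; reading the image from the `.h5` file is left out.

```python
import numpy as np

# Assumption: raw int16 values decode to degrees C via a 0.01 scale factor,
# inferred from the example in this issue (-18312 -> -183.12).
SCALE = 0.01


def decoded_min_max(raw, scale=SCALE):
    """Return (min, max) of the decoded image as Python floats."""
    decoded = raw.astype(np.float64) * scale
    return float(decoded.min()), float(decoded.max())


def matches_catalog(raw, data_min, data_max, scale=SCALE, tol=0.01):
    """Check whether the catalog min/max agree with the decoded image."""
    lo, hi = decoded_min_max(raw, scale)
    return abs(lo - data_min) <= tol and abs(hi - data_max) <= tol


# Toy image standing in for one decoded frame; only -18312 comes from the issue.
raw = np.array([[-18312, -4500], [-3000, -2500]], dtype=np.int16)

print(decoded_min_max(raw))
# The catalog minimum quoted in the issue does not match the stored image.
print(matches_catalog(raw, data_min=-23540.1, data_max=-25.0))
```

Running a check like this over the catalog would flag the events whose `data_min` or `data_max` should not be trusted.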
Explanation
Looking at the data, this happens when there are a few bad pixels in the image, typically in very high and thick clouds:
Data is converted to int16 before being written to .h5; however, the min/max values entered in the CATALOG are recorded before this cast is done. In cases of bad pixels, these values get very large in magnitude (as happened in this case), and the true minimum of the data causes an int16 overflow when scaled. So the pixel value stored for these bad pixels in SEVIR is garbage (as is the value stored in the CATALOG).
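The overflow can be illustrated with a small sketch. It assumes a scale factor of 100 when encoding (the inverse of the 0.01 decoding implied above) and simulates two's-complement wraparound by casting through `int64`. The actual code that wrote SEVIR is not shown in this issue, so this does not attempt to reproduce the exact stored value of -18312; it only shows that an out-of-range value turns into an unrelated number.

```python
import numpy as np

# Assumption: values are encoded as round(value * 100) before the int16 cast.
ENCODE_SCALE = 100

INT16_MIN = np.iinfo(np.int16).min  # -32768
INT16_MAX = np.iinfo(np.int16).max  # 32767

bad_value = -23540.1  # catalog minimum quoted in this issue, in degrees C

# Scale and round in a wide integer type first, so this step is well defined.
scaled = np.rint(np.array([bad_value]) * ENCODE_SCALE).astype(np.int64)
print(scaled[0])  # far outside the int16 range

# Integer-to-integer narrowing in NumPy keeps only the low 16 bits (wraparound).
wrapped = scaled.astype(np.int16)
print(wrapped[0])

# Decoding the wrapped value gives a number unrelated to the original.
print(wrapped[0] * 0.01)
```

Only scaled values between `INT16_MIN` and `INT16_MAX` (roughly -327.68 to 327.67 degrees C under this assumed scale) survive the cast intact.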
Unfortunately, this cannot be fixed easily without recreating the whole dataset. A good practice is to clip pixels during preprocessing to a physically reasonable range, computed by filtering out outliers like this one.
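One way to sketch this preprocessing step is shown below, assuming the 0.01 decoding scale implied earlier and using percentiles to estimate a range that ignores outliers. The function names, the percentile thresholds (0.1 and 99.9), and the synthetic image are all hypothetical; a fixed range chosen from domain knowledge would work equally well.

```python
import numpy as np


def decode(raw, scale=0.01):
    """Decode raw int16 pixels to physical units (assumed scale factor)."""
    return raw.astype(np.float64) * scale


def robust_range(values, lower_pct=0.1, upper_pct=99.9):
    """Estimate a plausible value range while ignoring extreme outliers."""
    lo, hi = np.percentile(values, [lower_pct, upper_pct])
    return float(lo), float(hi)


def clip_to_range(values, lo, hi):
    """Clamp every pixel into [lo, hi]."""
    return np.clip(values, lo, hi)


# Synthetic frame: plausible values between -70 and -20 degrees C,
# plus two bad pixels like the ones described in this issue.
rng = np.random.default_rng(0)
raw = rng.integers(-7000, -2000, size=(192, 192)).astype(np.int16)
raw[0, 0] = -18312  # decodes to -183.12
raw[0, 1] = 5286    # decodes to 52.86

decoded = decode(raw)
lo, hi = robust_range(decoded)
cleaned = clip_to_range(decoded, lo, hi)

print(decoded.min(), decoded.max())  # still contains the bad pixels
print(cleaned.min(), cleaned.max())  # bounded by the robust range
```

In practice the range would be estimated once per image type over many events rather than per frame, so that every frame is clipped consistently.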