The data in "TrainingData" link is inconsistent with genTrainingData.py #12

mengyuest · 2024-11-21T18:54:44Z

Hi, I am checking the score lookup table. According to genTrainingData.py, X is a grid of inputs with shape (32x41x41 = 53792, 3), normalized so that every column has zero mean and unit range. However, when I download the data from the "Training data" link in README.md, the first 32 rows look different:

I ran

import numpy as np
import h5py
f = h5py.File('HCAS_rect_TrainingData_v6_pra0_tau00.h5', 'r')
X_train = np.array(f['X'])
print(X_train[:32])

And it prints

[[ 2.48775044e-03 -1.25392984e-17 -5.00000000e-01]
 [ 2.26453615e-03 -1.25666343e-17 -5.00000000e-01]
 [ 2.04132186e-03 -1.25939702e-17 -5.00000000e-01]
 [ 1.81810758e-03 -1.26213060e-17 -5.00000000e-01]
 [ 1.59489329e-03 -1.26486419e-17 -5.00000000e-01]
 [ 1.14846472e-03 -1.27033136e-17 -5.00000000e-01]
 [ 7.02036150e-04 -1.27579854e-17 -5.00000000e-01]
 [-1.90820993e-04 -1.28673288e-17 -5.00000000e-01]
 [-1.08367814e-03 -1.29766723e-17 -5.00000000e-01]
 [-1.97653528e-03 -1.30860157e-17 -5.00000000e-01]
 [-2.06582099e-03 -1.30969501e-17 -5.00000000e-01]
 [-4.20867814e-03 -1.33593744e-17 -5.00000000e-01]
 [-6.44082099e-03 -1.36327331e-17 -5.00000000e-01]
 [-1.09051067e-02 -1.41794504e-17 -5.00000000e-01]
 [-1.53693924e-02 -1.47261677e-17 -5.00000000e-01]
 [-2.42979639e-02 -1.58196023e-17 -5.00000000e-01]
 [-3.32265353e-02 -1.69130370e-17 -5.00000000e-01]
 [-4.21551067e-02 -1.80064716e-17 -5.00000000e-01]
 [-6.00122496e-02 -2.01933409e-17 -5.00000000e-01]
 [-7.78693924e-02 -2.23802102e-17 -5.00000000e-01]
 [-9.57265353e-02 -2.45670795e-17 -5.00000000e-01]
 [-1.13583678e-01 -2.67539488e-17 -5.00000000e-01]
 [-1.31440821e-01 -2.89408181e-17 -5.00000000e-01]
 [-1.49297964e-01 -3.11276873e-17 -5.00000000e-01]
 [-1.67155107e-01 -3.33145566e-17 -5.00000000e-01]
 [-1.85012250e-01 -3.55014259e-17 -5.00000000e-01]
 [-2.20726535e-01 -3.98751645e-17 -5.00000000e-01]
 [-2.65369392e-01 -4.53423377e-17 -5.00000000e-01]
 [-3.10012250e-01 -5.08095109e-17 -5.00000000e-01]
 [-3.54655107e-01 -5.62766841e-17 -5.00000000e-01]
 [-4.26083678e-01 -6.50241612e-17 -5.00000000e-01]
 [-4.97512250e-01 -7.37716384e-17 -5.00000000e-01]]

Notice that in these rows the first column does not span a unit range, and the second column holds values close to zero (on the order of 1e-17), which is odd. If I instead run the procedure from genTrainingData.py to generate X:

import numpy as np
# Grid axes: 32 range values, 41 theta values and 41 psi values
ranges = np.array([0.0,25.0,50.0,75.0,100.0,150.0,200.0,300.0,400.0,500.0,510.0,750.0,1000.0,1500.0,2000.0,3000.0,4000.0,5000.0,7000.0,9000.0,11000.0,13000.0,15000.0,17000.0,19000.0,21000.0,25000.0,30000.0,35000.0,40000.0,48000.0,56000.0])
thetas = np.linspace(-np.pi,np.pi,41)
psis  = np.linspace(-np.pi,np.pi,41)
# Full grid of (range, theta, psi) rows; range varies fastest, psi slowest
X = np.array([[r,t,p] for p in psis for t in thetas for r in ranges])
# Per-column statistics used for normalization
means = np.mean(X, axis=0)
rnges = np.max(X, axis=0) - np.min(X, axis=0)
min_inputs = np.min(X, axis=0)
max_inputs = np.max(X, axis=0)
# Avoid division by zero for constant columns
rnges = np.where(rnges==0.0, 1.0, rnges)
# Normalize each column to zero mean and unit range
X  = (X - means) / rnges
print(X[:32])

I get

[[-0.20399554 -0.5        -0.5       ]
 [-0.20354911 -0.5        -0.5       ]
 [-0.20310268 -0.5        -0.5       ]
 [-0.20265625 -0.5        -0.5       ]
 [-0.20220982 -0.5        -0.5       ]
 [-0.20131696 -0.5        -0.5       ]
 [-0.20042411 -0.5        -0.5       ]
 [-0.19863839 -0.5        -0.5       ]
 [-0.19685268 -0.5        -0.5       ]
 [-0.19506696 -0.5        -0.5       ]
 [-0.19488839 -0.5        -0.5       ]
 [-0.19060268 -0.5        -0.5       ]
 [-0.18613839 -0.5        -0.5       ]
 [-0.17720982 -0.5        -0.5       ]
 [-0.16828125 -0.5        -0.5       ]
 [-0.15042411 -0.5        -0.5       ]
 [-0.13256696 -0.5        -0.5       ]
 [-0.11470982 -0.5        -0.5       ]
 [-0.07899554 -0.5        -0.5       ]
 [-0.04328125 -0.5        -0.5       ]
 [-0.00756696 -0.5        -0.5       ]
 [ 0.02814732 -0.5        -0.5       ]
 [ 0.06386161 -0.5        -0.5       ]
 [ 0.09957589 -0.5        -0.5       ]
 [ 0.13529018 -0.5        -0.5       ]
 [ 0.17100446 -0.5        -0.5       ]
 [ 0.24243304 -0.5        -0.5       ]
 [ 0.33171875 -0.5        -0.5       ]
 [ 0.42100446 -0.5        -0.5       ]
 [ 0.51029018 -0.5        -0.5       ]
 [ 0.65314732 -0.5        -0.5       ]
 [ 0.79600446 -0.5        -0.5       ]]
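The first 32 rows cover only a single sweep over the range axis, so the "zero mean, unit range" expectation is easier to judge on a whole array. Below is a minimal sketch of such a check; `column_summary` and the small grid `G` are hypothetical names and values introduced only for illustration, and the grid is normalized the same way as in the snippet above.

```python
import numpy as np

def column_summary(X):
    """Return the per-column mean and range (max - min) of a 2-D array."""
    X = np.asarray(X, dtype=float)
    return X.mean(axis=0), X.max(axis=0) - X.min(axis=0)

# Small synthetic grid (hypothetical values), built like the real one:
# the first axis varies fastest, the second slowest.
a = np.array([0.0, 1.0, 4.0])
b = np.linspace(-1.0, 1.0, 5)
G = np.array([[u, v] for v in b for u in a])

# Same normalization as above: subtract the mean, divide by the range.
G = (G - G.mean(axis=0)) / (G.max(axis=0) - G.min(axis=0))

means, rnges = column_summary(G)
print("means :", means)   # expected to be close to zero
print("ranges:", rnges)   # expected to be close to one
```

Calling `column_summary` on both the downloaded `X_train` and the regenerated `X` would show whether each full array satisfies the expectation, independent of which rows happen to be printed first.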

Comparing the two printed outputs, only the last column matches; the first two columns differ by far more than numerical error. Could you please explain where this difference comes from, and whether we can safely replace the X in the downloaded data with the one generated here? Thanks for your response.
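One hypothesis that seems consistent with the numbers printed above, but is not confirmed anywhere in this thread, is that the downloaded file stores the first two inputs in rectangular coordinates (the filename contains "rect"), i.e. x = r cos(theta) and y = r sin(theta) instead of (r, theta). The sketch below rebuilds the grid under that assumption, using the same row ordering and normalization as the snippet above. The name `X_rect` and the coordinate conversion are assumptions made for illustration, not something taken from genTrainingData.py.

```python
import numpy as np

ranges = np.array([0.0, 25.0, 50.0, 75.0, 100.0, 150.0, 200.0, 300.0,
                   400.0, 500.0, 510.0, 750.0, 1000.0, 1500.0, 2000.0,
                   3000.0, 4000.0, 5000.0, 7000.0, 9000.0, 11000.0,
                   13000.0, 15000.0, 17000.0, 19000.0, 21000.0, 25000.0,
                   30000.0, 35000.0, 40000.0, 48000.0, 56000.0])
thetas = np.linspace(-np.pi, np.pi, 41)
psis = np.linspace(-np.pi, np.pi, 41)

# ASSUMPTION: the stored inputs are (x, y, psi) with
# x = r*cos(theta) and y = r*sin(theta), in the same row order as above.
X_rect = np.array([[r * np.cos(t), r * np.sin(t), p]
                   for p in psis for t in thetas for r in ranges])

# Same normalization as above: zero mean and unit range per column.
means = X_rect.mean(axis=0)
rnges = X_rect.max(axis=0) - X_rect.min(axis=0)
rnges = np.where(rnges == 0.0, 1.0, rnges)
X_rect = (X_rect - means) / rnges

print(X_rect[:32])
```

Under this assumption the first column of the first 32 rows should run from roughly 2.4878e-03 down to roughly -4.9751e-01, with the second column at floating-point noise level, which resembles the downloaded values. Running `np.allclose(X_rect, X_train)` on the full arrays would test the hypothesis directly. If it holds, the file and the polar grid describe the same states in different coordinates, so swapping in the polar X would change the inputs that the stored targets are paired with.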
