The data in "TrainingData" link is inconsistent with genTrainingData.py #12

Open
mengyuest opened this issue Nov 21, 2024 · 0 comments

Hi, I am checking the score lookup table. Based on genTrainingData.py, X is grid data in which every column should have zero mean and unit range, with shape (32x41x41 = 53792, 3). But the data I downloaded from the "Training data" link in README.md looks different.

To print its first 32 rows, I ran:

import numpy as np
import h5py
f = h5py.File('HCAS_rect_TrainingData_v6_pra0_tau00.h5', 'r')
X_train = np.array(f['X'])
print(X_train[:32])

It prints:

[[ 2.48775044e-03 -1.25392984e-17 -5.00000000e-01]
 [ 2.26453615e-03 -1.25666343e-17 -5.00000000e-01]
 [ 2.04132186e-03 -1.25939702e-17 -5.00000000e-01]
 [ 1.81810758e-03 -1.26213060e-17 -5.00000000e-01]
 [ 1.59489329e-03 -1.26486419e-17 -5.00000000e-01]
 [ 1.14846472e-03 -1.27033136e-17 -5.00000000e-01]
 [ 7.02036150e-04 -1.27579854e-17 -5.00000000e-01]
 [-1.90820993e-04 -1.28673288e-17 -5.00000000e-01]
 [-1.08367814e-03 -1.29766723e-17 -5.00000000e-01]
 [-1.97653528e-03 -1.30860157e-17 -5.00000000e-01]
 [-2.06582099e-03 -1.30969501e-17 -5.00000000e-01]
 [-4.20867814e-03 -1.33593744e-17 -5.00000000e-01]
 [-6.44082099e-03 -1.36327331e-17 -5.00000000e-01]
 [-1.09051067e-02 -1.41794504e-17 -5.00000000e-01]
 [-1.53693924e-02 -1.47261677e-17 -5.00000000e-01]
 [-2.42979639e-02 -1.58196023e-17 -5.00000000e-01]
 [-3.32265353e-02 -1.69130370e-17 -5.00000000e-01]
 [-4.21551067e-02 -1.80064716e-17 -5.00000000e-01]
 [-6.00122496e-02 -2.01933409e-17 -5.00000000e-01]
 [-7.78693924e-02 -2.23802102e-17 -5.00000000e-01]
 [-9.57265353e-02 -2.45670795e-17 -5.00000000e-01]
 [-1.13583678e-01 -2.67539488e-17 -5.00000000e-01]
 [-1.31440821e-01 -2.89408181e-17 -5.00000000e-01]
 [-1.49297964e-01 -3.11276873e-17 -5.00000000e-01]
 [-1.67155107e-01 -3.33145566e-17 -5.00000000e-01]
 [-1.85012250e-01 -3.55014259e-17 -5.00000000e-01]
 [-2.20726535e-01 -3.98751645e-17 -5.00000000e-01]
 [-2.65369392e-01 -4.53423377e-17 -5.00000000e-01]
 [-3.10012250e-01 -5.08095109e-17 -5.00000000e-01]
 [-3.54655107e-01 -5.62766841e-17 -5.00000000e-01]
 [-4.26083678e-01 -6.50241612e-17 -5.00000000e-01]
 [-4.97512250e-01 -7.37716384e-17 -5.00000000e-01]]

Notice that the first column does not have unit range, and the second column starts with values close to zero, which is odd. If I instead run the normalization steps from genTrainingData.py to generate X:

import numpy as np
ranges = np.array([0.0,25.0,50.0,75.0,100.0,150.0,200.0,300.0,400.0,500.0,510.0,750.0,1000.0,1500.0,2000.0,3000.0,4000.0,5000.0,7000.0,9000.0,11000.0,13000.0,15000.0,17000.0,19000.0,21000.0,25000.0,30000.0,35000.0,40000.0,48000.0,56000.0])
thetas = np.linspace(-np.pi,np.pi,41)
psis  = np.linspace(-np.pi,np.pi,41)
X = np.array([[r,t,p] for p in psis for t in thetas for r in ranges])
means = np.mean(X, axis=0)
rnges = np.max(X, axis=0) - np.min(X, axis=0)
min_inputs = np.min(X, axis=0)
max_inputs = np.max(X, axis=0)
rnges = np.where(rnges==0.0, 1.0, rnges)
X  = (X - means) / rnges
print(X[:32])

I get:

[[-0.20399554 -0.5        -0.5       ]
 [-0.20354911 -0.5        -0.5       ]
 [-0.20310268 -0.5        -0.5       ]
 [-0.20265625 -0.5        -0.5       ]
 [-0.20220982 -0.5        -0.5       ]
 [-0.20131696 -0.5        -0.5       ]
 [-0.20042411 -0.5        -0.5       ]
 [-0.19863839 -0.5        -0.5       ]
 [-0.19685268 -0.5        -0.5       ]
 [-0.19506696 -0.5        -0.5       ]
 [-0.19488839 -0.5        -0.5       ]
 [-0.19060268 -0.5        -0.5       ]
 [-0.18613839 -0.5        -0.5       ]
 [-0.17720982 -0.5        -0.5       ]
 [-0.16828125 -0.5        -0.5       ]
 [-0.15042411 -0.5        -0.5       ]
 [-0.13256696 -0.5        -0.5       ]
 [-0.11470982 -0.5        -0.5       ]
 [-0.07899554 -0.5        -0.5       ]
 [-0.04328125 -0.5        -0.5       ]
 [-0.00756696 -0.5        -0.5       ]
 [ 0.02814732 -0.5        -0.5       ]
 [ 0.06386161 -0.5        -0.5       ]
 [ 0.09957589 -0.5        -0.5       ]
 [ 0.13529018 -0.5        -0.5       ]
 [ 0.17100446 -0.5        -0.5       ]
 [ 0.24243304 -0.5        -0.5       ]
 [ 0.33171875 -0.5        -0.5       ]
 [ 0.42100446 -0.5        -0.5       ]
 [ 0.51029018 -0.5        -0.5       ]
 [ 0.65314732 -0.5        -0.5       ]
 [ 0.79600446 -0.5        -0.5       ]
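
The first 32 rows only cover a single (theta, psi) pair, so a whole-array check may be more informative. The sketch below rebuilds the grid from the snippet in this issue and summarizes each column. `column_summary` is a hypothetical helper introduced here for illustration, not something from the repository.

```python
import numpy as np

def column_summary(X):
    """Return shape plus per-column mean, min, max and range of a 2-D array."""
    X = np.asarray(X, dtype=float)
    mins, maxs = X.min(axis=0), X.max(axis=0)
    return {"shape": X.shape, "mean": X.mean(axis=0),
            "min": mins, "max": maxs, "range": maxs - mins}

# Same grid and normalization as the snippet in this issue
ranges = np.array([0.0, 25.0, 50.0, 75.0, 100.0, 150.0, 200.0, 300.0, 400.0,
                   500.0, 510.0, 750.0, 1000.0, 1500.0, 2000.0, 3000.0,
                   4000.0, 5000.0, 7000.0, 9000.0, 11000.0, 13000.0, 15000.0,
                   17000.0, 19000.0, 21000.0, 25000.0, 30000.0, 35000.0,
                   40000.0, 48000.0, 56000.0])
thetas = np.linspace(-np.pi, np.pi, 41)
psis = np.linspace(-np.pi, np.pi, 41)
X = np.array([[r, t, p] for p in psis for t in thetas for r in ranges])
spans = X.max(axis=0) - X.min(axis=0)
spans = np.where(spans == 0.0, 1.0, spans)  # guard against constant columns
X_norm = (X - X.mean(axis=0)) / spans

summary = column_summary(X_norm)
for key, value in summary.items():
    print(key, value)
```

Calling `column_summary(np.array(f['X']))` on the downloaded file would give the matching numbers for a side-by-side comparison.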

You can see that only the last column matches between the two outputs; the other two columns are inconsistent even allowing for numerical error. Could you please explain why this happens, and whether we can safely replace the X in the data with the one I generated here? Thanks for your response.
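
One possibility that may be worth checking, going only by the `rect` in the file name, is that the stored inputs are rectangular coordinates (x = r cos(theta), y = r sin(theta), psi) rather than (r, theta, psi). This is an assumption, not something confirmed by README.md or genTrainingData.py. A minimal sketch of the check, comparing against the first and last of the 32 downloaded rows quoted in this issue:

```python
import numpy as np

ranges = np.array([0.0, 25.0, 50.0, 75.0, 100.0, 150.0, 200.0, 300.0, 400.0,
                   500.0, 510.0, 750.0, 1000.0, 1500.0, 2000.0, 3000.0,
                   4000.0, 5000.0, 7000.0, 9000.0, 11000.0, 13000.0, 15000.0,
                   17000.0, 19000.0, 21000.0, 25000.0, 30000.0, 35000.0,
                   40000.0, 48000.0, 56000.0])
thetas = np.linspace(-np.pi, np.pi, 41)
psis = np.linspace(-np.pi, np.pi, 41)

# Polar grid in the same row order as the snippet in this issue
polar = np.array([[r, t, p] for p in psis for t in thetas for r in ranges])
r, t, p = polar.T

# ASSUMPTION: the file stores (x, y, psi) with x = r*cos(theta), y = r*sin(theta)
rect = np.column_stack([r * np.cos(t), r * np.sin(t), p])

# Same zero-mean, unit-range normalization, applied to the converted grid
spans = rect.max(axis=0) - rect.min(axis=0)
spans = np.where(spans == 0.0, 1.0, spans)
X_rect = (rect - rect.mean(axis=0)) / spans

# Rows 0 and 31 of the downloaded data, copied from the output in this issue
downloaded_first = np.array([2.48775044e-03, -1.25392984e-17, -5.00000000e-01])
downloaded_last = np.array([-4.97512250e-01, -7.37716384e-17, -5.00000000e-01])

print(X_rect[0], np.allclose(X_rect[0], downloaded_first, atol=1e-8))
print(X_rect[31], np.allclose(X_rect[31], downloaded_last, atol=1e-8))
```

If both comparisons agree, that would suggest the downloaded X uses a different input parameterization from the (r, theta, psi) grid, in which case swapping in the X generated here would change what the columns mean. The maintainers would still need to confirm this.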
