Hi all,
Thanks for the great work! I have a few questions:
1. ResNet18 Model Output Interpretation:
Normalization: Are the 12 output values from the ResNet18 model (the output of predict3_npz) normalized? If so, could you provide details?
Coordinate System and Camera Model: The paper mentions the use of a pinhole camera model. However, I have observed negative depth values in the output. Could you clarify which coordinate system is used and how the camera model is defined? Specifically, how should the translation vector be interpreted, and what does a negative depth signify in this context?
3D to 2D Projection: I want to project the 3D coordinates from the ResNet18 model (the first 3 values of the output) onto the 2D image plane to visualize the hand's location. Could you provide guidance on the correct way to perform this projection?
2. Ground Truth Data Format:
In the real_eval_data/regular/gt_events/...txt files, each entry comprises 15 values across 150 data points. My understanding is that the ground truth should consist of 12 values. Could you clarify what these 15 values represent?
Thanks for the help!
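To make the 3D to 2D projection question concrete, here is a minimal sketch of a standard pinhole projection. It is not the repository's method: the function name `project_point`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the convention that the camera looks along +Z (X right, Y down) are all assumptions for illustration, and the sample point is a placeholder rather than real model output.

```python
from typing import Optional, Tuple


def project_point(
    xyz: Tuple[float, float, float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Optional[Tuple[float, float]]:
    """Project a camera-frame 3D point to pixel coordinates (pinhole model).

    Assumes the camera looks along +Z with X right and Y down; this convention
    and the intrinsics are assumptions, not taken from the repository.
    Returns None when the point is not in front of the camera (Z <= 0).
    """
    x, y, z = xyz
    if z <= 0.0:
        # Under the assumed convention, non-positive depth cannot be projected.
        return None
    u = fx * x / z + cx  # horizontal pixel coordinate
    v = fy * y / z + cy  # vertical pixel coordinate
    return (u, v)


# Hypothetical intrinsics for a 640x480 image; replace with calibrated values.
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

hand_xyz = (0.1, -0.05, 0.5)  # placeholder for the first 3 output values
print(project_point(hand_xyz, FX, FY, CX, CY))

# A negative depth is rejected under the +Z-forward assumption.
print(project_point((0.1, -0.05, -0.5), FX, FY, CX, CY))
```

Under this assumed convention, a negative Z means the point lies behind the camera, so the sketch returns `None`. If the model instead uses a -Z-forward convention or emits normalized values, the signs and scaling would need to change, which is what the questions above ask the authors to confirm.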