Question about evaluation on 3DPW #56

Open
mimiliaogo opened this issue Nov 25, 2022 · 7 comments

Comments

@mimiliaogo

Hi,
I noticed that in your code, you evaluate the final 3D vertex results (MPVPE, MPJPE) after adding the predicted camera parameters. Of course, the GT camera parameters are also added to the GT SMPL vertices before evaluation. However, in other papers (like METRO), the 3D vertices are evaluated without camera predictions, which means the accuracy of the camera predictions does not affect the final results.
Do these differences in the evaluation process make your results incomparable with those of other methods?
Thanks.

@hongsukchoi
Owner

hongsukchoi commented Nov 25, 2022 via email

@mimiliaogo
Author

In the following line, the ground-truth 3DPW vertices are generated with the camera translation:

smpl_mesh_coord, smpl_joint_coord = self.mesh_model.layer[gender](smpl_pose, smpl_shape, smpl_trans)
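
Whether a translation such as `smpl_trans` changes the reported error depends on how the metric aligns prediction and ground truth. The following is a minimal sketch, assuming the common convention of subtracting a root joint from both skeletons before computing MPJPE. It is not taken from this repository; the function names `mpjpe` and `translate`, the root index, and the toy joints are all hypothetical.

```python
import math

def mpjpe(pred, gt, root_idx=0):
    """Mean per-joint position error after root alignment (assumed convention)."""
    pred_root, gt_root = pred[root_idx], gt[root_idx]
    errors = []
    for p, g in zip(pred, gt):
        # Express each joint relative to its own skeleton's root joint.
        p_rel = [a - b for a, b in zip(p, pred_root)]
        g_rel = [a - b for a, b in zip(g, gt_root)]
        errors.append(math.dist(p_rel, g_rel))
    return sum(errors) / len(errors)

def translate(joints, t):
    """Add the same 3D offset t to every joint."""
    return [[a + b for a, b in zip(j, t)] for j in joints]

# Toy skeletons with three joints each (metres); joint 0 is the root.
gt = [[0.0, 0.0, 0.0], [0.1, 0.5, 0.0], [-0.1, 0.5, 0.0]]
pred = [[0.0, 0.0, 0.0], [0.1, 0.6, 0.0], [-0.1, 0.5, 0.1]]

err_plain = mpjpe(pred, gt)
# Shift prediction and ground truth by different global translations.
err_shifted = mpjpe(translate(pred, [0.3, -0.2, 4.0]),
                    translate(gt, [1.0, 0.0, 5.0]))

# Root alignment cancels both translations, so the two errors agree.
print(err_plain, err_shifted)
```

Under this assumption, the two errors match up to floating-point rounding, so a global translation added to either mesh would not change the score. If a metric skips the root alignment, the translation does matter.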

@hongsukchoi
Owner

hongsukchoi commented Nov 26, 2022 via email

@mimiliaogo
Author

@hongsukchoi
Sorry, I found that the predicted camera parameter I mentioned is in another repo of yours:
https://github.com/hongsukchoi/3DCrowdNet_RELEASE/blob/6e773064c8d6950b382f66d76b615aada4f2594b/main/model.py#L65
Also, what do you mean by this sentence: "Other methods assuming a single image input will do the same?"
Doesn't everybody use the same 3DPW dataset?
Thank you so much!

@hongsukchoi
Owner

hongsukchoi commented Nov 27, 2022 via email

@mimiliaogo
Author

Hi,
In my understanding, there are world coordinates (3D), camera coordinates (3D), and pixel coordinates (2D).
The model's output is already in camera coordinates, so the predicted camera is only used to project the human mesh from 3D camera coordinates to 2D pixel coordinates.
Therefore, I think the predicted 3D mesh has nothing to do with the camera parameters. Camera parameters are used for the 3D->2D projection.
This is also mentioned by the FastMETRO author in postech-ami/FastMETRO#3 (comment)
BTW, could you explain the SMPL coordinate system a little? I'm not sure my understanding of it is the same as yours.

Thank you so much.
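
The 3D->2D step described above can be sketched with a pinhole camera model. This is only an illustration under assumed intrinsics: the function name `project`, the focal length, and the principal point are hypothetical, and the model linked earlier in the thread may parameterize its camera differently.

```python
def project(points_cam, focal, princpt):
    """Project 3D points in camera coordinates to 2D pixel coordinates."""
    fx, fy = focal
    cx, cy = princpt
    # Perspective division by depth z, then scale and shift into pixels.
    return [(fx * x / z + cx, fy * y / z + cy) for x, y, z in points_cam]

# Two toy points in camera coordinates (metres), 5 m in front of the camera.
points_cam = [(0.0, 0.0, 5.0), (0.5, -0.25, 5.0)]

# Hypothetical intrinsics: focal length and principal point in pixels.
pixels = project(points_cam, focal=(1000.0, 1000.0), princpt=(320.0, 240.0))

print(pixels)  # [(320.0, 240.0), (420.0, 190.0)]
```

The projection reads the 3D points but does not modify them, which is the sense in which the camera parameters here only affect the 2D result.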

@hongsukchoi
Owner

To conclude, you can compare Pose2Mesh with METRO or any other method. Parsing the GT 3D meshes
just transforms the world-coordinate parameters into the camera coordinate system.

The SMPL coordinate system is the world coordinate system, where the template SMPL mesh lies.
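
As a rough illustration of the world-to-camera transformation mentioned above, here is a minimal sketch that applies a rigid transform (a rotation `R` and a translation `t`) to mesh vertices. The function name `world_to_camera` and the extrinsics are hypothetical toy values, not the dataset's actual parameters or this repository's parsing code.

```python
import math

def world_to_camera(points_world, R, t):
    """Apply X_cam = R @ X_world + t to each 3D point."""
    return [
        tuple(sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3))
        for p in points_world
    ]

# Hypothetical extrinsics: 90 degree rotation about the z axis, 5 m shift in z.
R = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 5.0]

# Two toy vertices in world coordinates, 1 m apart.
verts_world = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
verts_cam = world_to_camera(verts_world, R, t)

# A rigid transform preserves distances between vertices.
dist_world = math.dist(verts_world[0], verts_world[1])
dist_cam = math.dist(verts_cam[0], verts_cam[1])
print(verts_cam, dist_world, dist_cam)
```

Because the transform is rigid, the shape of the mesh is unchanged; only its position and orientation relative to the camera differ.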
