Applying to different dataset #6
Comments
Hi, the depth ground truth looks reasonable.
Hi, 2D detection is doing OK, not great, but I have many hard labels (occluded in the image view). What I can't seem to get good results on is orientation: there is a really strong bias towards one particular orientation and I'm not sure why, since my dataset is varied. I wrote some visualization functions, and when the data is loaded in mono_dataset, both KITTI and my custom dataset render "proper" labels. I then started looking around and noticed the alpha2theta_3d and convertAlpha2Rot functions, but I couldn't really understand the physical meaning of the offset.

Results (screenshots): using "Alpha" (pedestrian GT is shown but the model is only trained on cars); using "Alpha" without the "offset"; reducing the image size by half, (1280, 1920, 3) -> (640, 960, 3), and increasing batch_size x4 for more stable training.

I might try increasing the regression weight for the angle next, but with this kind of result I feel I am missing something fundamental. Any guidance would be much appreciated. Finally, here's my config, very close to yours:
Cheers
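For readers hitting the same question: in the usual KITTI convention, the "offset" in alpha2theta_3d / convertAlpha2Rot is the viewing-ray angle between the camera's z-axis and the ray to the object, which is what separates the global yaw rot_y from the observation angle alpha. A minimal sketch of the conversion using the 3D centre (MonoDTR's own helpers may instead derive the ray angle from the image column and focal length, so treat this as illustrative, not the repository's exact code):

```python
import numpy as np

def rot_y_to_alpha(rot_y, x, z):
    """Convert global yaw (rot_y) to observation angle (alpha).

    (x, z) is the object centre in camera coordinates; arctan2(x, z)
    is the viewing-ray angle, i.e. the "offset" between the two angles.
    """
    alpha = rot_y - np.arctan2(x, z)
    return (alpha + np.pi) % (2 * np.pi) - np.pi  # wrap to (-pi, pi]

def alpha_to_rot_y(alpha, x, z):
    """Inverse: recover rot_y from alpha and the 3D centre."""
    rot_y = alpha + np.arctan2(x, z)
    return (rot_y + np.pi) % (2 * np.pi) - np.pi
```

If the offset is dropped, two cars with the same appearance on opposite sides of the image get the same predicted angle, which would show up as exactly the kind of single-orientation bias described above.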
Hi, can you try to plot the orientation loss? You can also try visualizing the data based on rot_y and alpha separately.
In this experiment I convert rot_y (from ground truth) to alpha, as you do with KITTI, and I added alpha_loss to the loss dictionary for visualization. It seems to be learning, but the result is still not great: all the boxes end up oriented the same way. Here are the alpha loss, the depth loss, and the result (screenshots):
Our dataset is hard, so of course I don't expect KITTI-level performance, but clearly something is going wrong here. It is also not feasible to use a KITTI pre-trained model since the image size is so different.
I don't see any visualization functions in your repository. Basically, I used the default MonoDTR loading functions and then wrote my own visualization functions.
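For reference, a minimal sketch of the kind of label check described above: build the 3D box from a KITTI-style (h, w, l, x, y, z, rot_y) label and project its corners with the P2 matrix. The function names are illustrative, not taken from this repository:

```python
import numpy as np

def box3d_corners(h, w, l, x, y, z, ry):
    """Eight corners of a KITTI-style 3D box (bottom-centre origin) in camera coords."""
    xs = [ l/2,  l/2, -l/2, -l/2,  l/2,  l/2, -l/2, -l/2]
    ys = [ 0.0,  0.0,  0.0,  0.0,  -h,   -h,   -h,   -h ]   # camera y points down
    zs = [ w/2, -w/2, -w/2,  w/2,  w/2, -w/2, -w/2,  w/2]
    R = np.array([[ np.cos(ry), 0, np.sin(ry)],
                  [ 0,          1, 0         ],
                  [-np.sin(ry), 0, np.cos(ry)]])
    corners = R @ np.vstack([xs, ys, zs])                    # (3, 8)
    return corners + np.array([[x], [y], [z]])

def project_to_image(corners, P2):
    """Project (3, 8) camera-frame corners with the 3x4 P2 matrix -> (8, 2) pixel coords."""
    pts = P2 @ np.vstack([corners, np.ones((1, corners.shape[1]))])
    return (pts[:2] / pts[2]).T
```

Drawing boxes once from rot_y this way and once from alpha (converted back with the centre) makes it easy to see whether the labels or the alpha/rot_y conversion is the problem.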
Hi @jacoblambert,
I tried to run inference on your image with my custom code, but the 2D and 3D boxes are not correct. Did you fix this problem? If so, could you recommend some tips to fix it? Thanks
I could not fix this problem. The only issue I can think of is that there is some problem with my label files or calib matrices, but I do not know where.
Hi,
Do you have any advice with regard to training the MonoDTR algorithm on another dataset?
Basically I have a dataset in KITTI format: images, annotation, pointcloud, calibration between the camera and the LiDAR. I load the KITTI dataset and my custom dataset in the same way, no problem.
Major difference: my images are full HD (1920 x 1080). I created a custom config, with major differences as follows:
I can then run the data preparation scripts:
And the generated depth images seem reasonable:
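For completeness, depth ground truth for a KITTI-format dataset is usually generated by projecting the LiDAR points through the camera-LiDAR calibration. A minimal sketch under KITTI calibration conventions (not the repository's exact preparation script):

```python
import numpy as np

def lidar_to_depth_map(points, Tr_velo_to_cam, R0_rect, P2, h, w):
    """Project LiDAR points into the image to build a sparse depth map.

    points: (N, 3) xyz in the LiDAR frame; Tr_velo_to_cam is 3x4,
    R0_rect is 3x3, P2 is 3x4, following the KITTI calib file layout.
    """
    pts = np.hstack([points, np.ones((len(points), 1))])         # (N, 4) homogeneous
    cam = R0_rect @ (Tr_velo_to_cam @ pts.T)                     # (3, N) camera coords
    cam = cam[:, cam[2] > 0]                                     # keep points in front of camera
    img = P2 @ np.vstack([cam, np.ones((1, cam.shape[1]))])      # (3, M) pixel homogeneous
    u = (img[0] / img[2]).astype(int)
    v = (img[1] / img[2]).astype(int)
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    u, v, z = u[valid], v[valid], cam[2, valid]
    order = np.argsort(-z)                                       # write far points first so near ones win
    depth = np.zeros((h, w), dtype=np.float32)
    depth[v[order], u[order]] = z[order]                         # depth in metres
    return depth
```

If the calibration matrices are off, the projected depth tends to look visibly misaligned with image edges, so a reasonable-looking depth map is a good sign the calib files are being read correctly.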
The training code runs, but the loss does not go down, and when validation comes, NMS fails; the reason seems to be far too many detections to convert to a tensor:
I'm not sure where to go from here, so I wanted to ask if you have any intuition about what I could debug; maybe something is hard-coded for KITTI. Or is there something I should change in the model to better handle HD images, since the KITTI images have a very different aspect ratio?
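A common workaround for the NMS blow-up described above, independent of MonoDTR: filter by confidence and cap the number of candidates before calling NMS, so an early, poorly calibrated model cannot flood it with boxes. A minimal sketch with torchvision; the thresholds are illustrative:

```python
import torch
from torchvision.ops import nms

def filter_then_nms(boxes, scores, score_thresh=0.3, iou_thresh=0.5, pre_nms_top_k=1000):
    """Drop low-score boxes and cap the candidate count before NMS.

    boxes: (N, 4) tensor in xyxy format, scores: (N,) tensor.
    """
    keep = scores > score_thresh
    boxes, scores = boxes[keep], scores[keep]
    if scores.numel() > pre_nms_top_k:
        scores, idx = scores.topk(pre_nms_top_k)   # keep only the top-k highest scores
        boxes = boxes[idx]
    keep = nms(boxes, scores, iou_thresh)
    return boxes[keep], scores[keep]
```

This does not fix the underlying training problem, but it keeps validation from crashing while the loss is still high.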
Cheers,
Jacob