Training the depth task on A2D2 #16
I think it is because in the basic_files.json, my positions look like this:
Why are the middle values 0 and 0? And how can I modify that? Do I have to write a script to modify the basic_files.json, or can it be done from your scripts? Thank you!
OK, here is an update. I have managed to change the positions in the basic_files.json with this script:
However, this is the error I get now, and I am really confused. What does it mean, and why do I get it? Before I made this change to the .json file, the warp_images function worked perfectly well. I know that because the previous error was 2 lines below this one, at the reprojection loss calculation.
Thanks again!
Hello, I think you identified the first error correctly. The numbers in the
Did you compare the dataloader output (the input to the network and the loss computation) with the one supplied by the KITTI loader? Do they match? This would be my strategy to verify that at least the data loading works correctly.
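The comparison suggested above could be sketched roughly as follows. This is only an illustration, not code from the repository: the helper names `describe_sample` and `compare_samples` are hypothetical, the sample keys `"color"` and `"K"` are made up for the example, and samples are assumed to be dictionaries of array-like values (CPU tensors or NumPy arrays).

```python
import numpy as np


def describe_sample(sample):
    """Map each key of a sample dict to (shape, dtype, min, max)."""
    summary = {}
    for key, value in sample.items():
        arr = np.asarray(value)
        lo = float(arr.min()) if arr.size else None
        hi = float(arr.max()) if arr.size else None
        summary[key] = (arr.shape, str(arr.dtype), lo, hi)
    return summary


def compare_samples(reference, candidate):
    """List differences in keys, shapes and dtypes between two samples."""
    ref = describe_sample(reference)
    cand = describe_sample(candidate)
    problems = []
    for key in sorted(set(ref) | set(cand)):
        if key not in cand:
            problems.append(f"{key}: missing in candidate")
        elif key not in ref:
            problems.append(f"{key}: not present in reference")
        else:
            if ref[key][0] != cand[key][0]:
                problems.append(f"{key}: shape {ref[key][0]} vs {cand[key][0]}")
            if ref[key][1] != cand[key][1]:
                problems.append(f"{key}: dtype {ref[key][1]} vs {cand[key][1]}")
    return problems


# Hypothetical stand-ins for one sample from each loader.
kitti_like = {
    "color": np.zeros((3, 192, 640), dtype=np.float32),
    "K": np.eye(4, dtype=np.float32),
}
a2d2_like = {
    "color": np.zeros((3, 192, 640), dtype=np.float32),
    "K": np.eye(3, dtype=np.float32),
}

for line in compare_samples(kitti_like, a2d2_like):
    print(line)
```

The value ranges returned by `describe_sample` are not compared automatically, but printing them for both loaders can reveal, for example, images in 0..255 where 0..1 is expected.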
Hello! Thank you for your response. I managed to start the training process. It turns out that the last error was related to the dimensions of the K matrix: mine was 3x3, while the one provided for the KITTI dataset was 4x4. I solved this by adding [0, 0, 0, 1] both as a column and as a row. I cannot get the validation dataloader to work, though. For the training I just commented out that line, but I need it for the evaluation script. This is the error I get:
And this is the dataloader:
Could the error be related to the way the gt files are stored? For the A2D2 dataset I created the depth maps from the lidar points and stored them as .png files with this function: cv2.imwrite(path, undist_image_front_center), where undist_image_front_center is of type np.uint8. Thank you!
Hi,
Later on, when you convert the depth gt maps to the final format, you should take care that the correct
Hope this helps!
Hello! Thanks for the detailed explanation. It indeed helped me. However, I did not recreate the gt images; instead, I added this line of code above the line that raised the exception:
There seems to be a problem with the depth training, though. These are the images after the evaluation:
It seems that all the pixels have the same value. I saved the images (the rectified ones used for training) with the uint8 data type, so both my gt and training images are uint8. Could this be the problem? Or is it because I set the stereo_T parameter to 0? And why do I need this parameter exactly? I only use images that come from a front camera, but I cannot set it to null, because that would trigger an assertion error: it has to have a value. Thanks again!
From the information you supplied, I would say that the general format in which the images are stored should not influence the depth training, as long as they are loaded and stored correctly. Also, the stereo_T parameter is not used during the training process on sequences, so it should not matter which value you set. A value of 0 is, however, not meaningful, as it would correspond to two cameras being at the exact same position.
Hello, As far as I can tell, the images are stored and loaded correctly, so I really do not know what the problem might be. A good idea would be to try another dataset and see if the results are the same. I noticed something about the K matrix, though. For the KITTI training, it has these values: [0.58, 0, 0.5, 0], [0, 1.92, 0.5, 0], [0, 0, 1, 0], [0, 0, 0, 1]
However, I used this one for A2D2: [[ 825.10199941612268, 0.0, 959.46189953054625],
A long shot, but could this be the problem? And if yes, what K matrix should I choose for the A2D2 dataset? Thanks!
Yes, you are right, this matrix seems to be in the wrong format for the code. The values in the first row need to be divided by the width of the image (in pixels) and the values in the second row need to be divided by the height of the image. In this sense, the K matrix is stored in normalized form: f_x relative to the image width and f_y relative to the image height. The same holds for the principal point of the image. When being loaded, the K matrix is scaled by the target resolution of the image in the Resize() transform.
Also, the matrix should be stored as a 4x4 matrix. Is this already the case for A2D2 in your code?
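The conversion described in the two comments above (divide the first row by the image width, the second row by the image height, and store the result as 4x4) might look like the minimal sketch below. The function name `normalize_intrinsics` is hypothetical and not part of the repository. Only the first row of the A2D2 matrix is quoted in this thread; the values 824 and 642 for the second row are the rounded numbers that appear later in the discussion, so treat them as approximate.

```python
import numpy as np


def normalize_intrinsics(K, width, height):
    """Embed a 3x3 pixel-unit intrinsics matrix in a 4x4 identity and
    divide row 0 by the image width and row 1 by the image height."""
    K = np.asarray(K, dtype=np.float64)
    if K.shape != (3, 3):
        raise ValueError(f"expected a 3x3 matrix, got {K.shape}")
    K4 = np.eye(4)
    K4[:3, :3] = K
    K4[0, :] /= width   # f_x and c_x relative to the image width
    K4[1, :] /= height  # f_y and c_y relative to the image height
    return K4


# First row as quoted in the thread; second row uses rounded values (approximate).
K_a2d2 = [
    [825.10199941612268, 0.0, 959.46189953054625],
    [0.0, 824.0, 642.0],
    [0.0, 0.0, 1.0],
]

K_norm = normalize_intrinsics(K_a2d2, width=1920, height=1208)
print(K_norm)
```

With this layout the last row and column are [0, 0, 0, 1], which matches the padding the KITTI matrix in this thread uses.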
Yes, mine was also 4x4 because I padded it with 0, 0, 0, 1, both on the last row and the last column. I gather from your comments in the code that the K matrix is actually the extrinsics matrix. I have now restarted the training using this matrix, which I found on the A2D2 website: Homogeneous transformation matrix from global coordinates to view point coordinates ("extrinsic" matrix) [[ 9.96714314e-01 8.09967396e-02 -3.24531964e-04 -1.75209613e+00]
However, given that the images are 1920 × 1208, if I were to divide the intrinsics matrix by the dimensions you mentioned, the numbers would not match.
No, I think this is a misunderstanding. The matrix is still the intrinsics matrix, and the matrix you originally had was nearly correct. The matrix you are looking for would rather look like this: [[825/1920, 0, 959/1920, 0], [0, 824/1208, 642/1208, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
Here, the camera intrinsics are simply divided by the resolution of the image.
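To see why the normalized form is convenient: a matrix stored this way can be brought back to pixel units for any image size by multiplying the first row by the width and the second row by the height. The sketch below only illustrates that arithmetic; it is not the repository's Resize() code, the function name `scale_intrinsics` is hypothetical, and the half-resolution target of 960 x 604 is an arbitrary example.

```python
import numpy as np


def scale_intrinsics(K_norm, target_width, target_height):
    """Convert a normalized 4x4 intrinsics matrix to pixel units
    for an image of the given size."""
    K = np.array(K_norm, dtype=np.float64, copy=True)
    K[0, :] *= target_width
    K[1, :] *= target_height
    return K


# Normalized matrix as written in the comment above.
K_norm = np.array([
    [825 / 1920, 0, 959 / 1920, 0],
    [0, 824 / 1208, 642 / 1208, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
])

K_full = scale_intrinsics(K_norm, 1920, 1208)  # original A2D2 resolution
K_half = scale_intrinsics(K_norm, 960, 604)    # arbitrary example target

print(K_full[0, 0], K_full[1, 1])
print(K_half[0, 0], K_half[1, 1])
```

Scaling to the original 1920 x 1208 resolution recovers the pixel-unit values, and scaling to half that size halves the focal lengths and the principal point, while `K_norm` itself is left unchanged.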
Sorry, this is clearly an error in the comments on my side. It should be Intrinsic camera matrix :)
Hello,
I am trying to train the depth task on A2D2 as well. I created all the necessary data and the .json files, and I created the train and validation dataloaders. However, I get this error:
I believe it has something to do with the frames for the depth, because I have set the 'video_mode' to video in the depth dataloader for training. However, if I set the 'video_mode' to mono, I get this error:
It is because the frame_ids is an empty tuple. What can I do to fix this?
Thank you very much!