Add Copernicus-FM #2646
Conversation
```
Args:
    x: Input mini-batch.
    meta_info: Longitudes, latitudes, times, and areas of each patch.
        Use NaN for unknown metadata.
```
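The NaN convention described in the docstring can be sketched as follows. This is a minimal illustration assuming a `(B, 4)` `meta_info` layout of `[longitude, latitude, time, area]`; the exact shapes and value ranges are assumptions, not the merged API:

```python
# Hedged sketch: building meta_info with NaN sentinels for unknown metadata.
# The (B, 4) [lon, lat, time, area] layout is assumed from the docstring above.
import torch

x = torch.randn(2, 4, 224, 224)  # hypothetical 4-band mini-batch

meta_info = torch.tensor([
    [13.4, 52.5, 0.5, 264.0],  # sample 0: all metadata known
    [float("nan")] * 4,        # sample 1: all metadata unknown
])
```

Note that, as discussed below, metadata is all-or-nothing per field for the whole batch: a single NaN causes that metadata to be treated as unknown.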
This is an unintuitive UI. I would rather have separate values for each which are either Tensor or None. It's also a shame that we can't mix this in a single mini-batch, if a single value is NaN that metadata is ignored.
We can make it possible to mix known and unknown metadata within a batch in principle, but that needs looping over the batch dim to assign known/unknown, which would probably change a lot of code.
This is ready from my side, but I'll give others a couple days to review. I'm particularly concerned about whether the documentation is sufficient for people to figure out how to use the model. There are ways we could make this more user-friendly, but I don't want to diverge too much from the original source code.
Just read through this and found myself wanting an example of how to use it (same thing I hit previously when trying to use Scale-MAE), even though the args are documented. Maybe it'd be nice to put an example in the docstring? (This also applies to other pre-trained models actually.)
Agreed, we actually got a similar request for DOFA: zhu-xlab/DOFA#14. These newer models (Copernicus-FM, Panopticon) add a lot more (optional) metadata, so they are even more confusing to use. Not sure if this should be API documentation or tutorials or what. I probably don't have a ton of time to work on this personally but @wangyi111 might.
Should be easy for me to add the docstring. Regarding a tutorial, is there a place for demonstrating a pretrained model? I only see https://torchgeo.readthedocs.io/en/stable/tutorials/pretrained_weights.html
Yep, that's the right location. We could either expand that tutorial to cover additional models, or add a second tutorial specifically for using FMs.
I vote we merge this as is and @wangyi111 can open a separate PR to expand our tutorials on Scale-MAE, DOFA, Copernicus-FM, etc. Any objections?
Oh, I just added a docstring to the CopernicusFM class.
How would you feel about renaming a few things for consistency:
Could also split meta_info into 4 separate variables for ease of use. Don't want to diverge too much from the original implementation, but also want to make it user friendly and intuitive.
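The split proposed above could look something like the sketch below: four optional per-sample tensors that get packed into the NaN-sentinel layout internally. The helper name and signature are hypothetical, not part of the merged API:

```python
# Hedged sketch of the "4 separate variables" idea: accept optional tensors
# and pack them into a (B, 4) meta_info tensor, using NaN for unknown fields.
# pack_meta_info is a hypothetical helper, not torchgeo's actual API.
from typing import Optional

import torch


def pack_meta_info(
    lons: Optional[torch.Tensor],
    lats: Optional[torch.Tensor],
    times: Optional[torch.Tensor],
    areas: Optional[torch.Tensor],
    batch_size: int,
) -> torch.Tensor:
    """Pack optional per-sample metadata into a (B, 4) tensor, NaN when unknown."""
    nan = torch.full((batch_size,), float("nan"))
    cols = [t if t is not None else nan for t in (lons, lats, times, areas)]
    return torch.stack(cols, dim=1)


# Usage: only longitude is known, the other three fields become NaN.
meta = pack_meta_info(torch.tensor([13.4]), None, None, None, batch_size=1)
```

This keeps the packed representation the model expects while letting users pass `None` instead of hand-building NaN tensors.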
Good for me. The only tricky one is wv_plane, which is not only the dim of wavelength but also bandwidth and language embed. Maybe something like hyper_dim? I kind of wanted to call it meta_dim, but metadata also means another thing in this model.
Maybe
These can still lead to input image features, maybe
Finished renaming. Remaining ideas to improve usability:
Don't want to spend too much time on this because we still need to finish Copernicus-Bench and Copernicus-Pretrain, but once it's merged, it becomes harder to change without breaking backwards compatibility.
Co-authored-by: Yi Wang <[email protected]>
Add Copernicus-FM, an extension of the DOFA foundation model, capable of processing any spectral or non-spectral sensor modality using extended dynamic hypernetworks and flexible metadata encoding.
Key features:
References