Efficient Vision Transformers for Segmentation
Vision transformers can provide demonstrable improvements over CNN-based models; however, the resulting models are often still complex, and existing vision-transformer applications to neuroimaging typically rely on a U-Net-style architecture. Our previous work with MeshNet suggests that vision transformers can be made more efficient, for example by using dilations or other tricks MeshNet employs.
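
To make the idea concrete, here is a minimal, hypothetical PyTorch sketch of one possible direction, not a prescribed architecture: a MeshNet-style stack of dilated 3D convolutions feeding a small transformer encoder that attends over voxel tokens. The class names (`DilatedStem`, `DilatedViTSeg`), channel counts, and dilation rates are all illustrative assumptions.

```python
# Hypothetical sketch: MeshNet-style dilated 3D conv stem + small transformer
# encoder for voxel-wise segmentation. Sizes and names are assumptions.
import torch
import torch.nn as nn


class DilatedStem(nn.Module):
    """MeshNet-style 3D conv stack with growing dilation rates.

    Dilation enlarges the receptive field without pooling, so the volume
    keeps full resolution -- the trick that keeps MeshNet small.
    """

    def __init__(self, in_ch=1, ch=32, dilations=(1, 2, 4, 8)):
        super().__init__()
        layers = []
        for d in dilations:
            # padding == dilation keeps the spatial size unchanged for k=3
            layers += [
                nn.Conv3d(in_ch, ch, kernel_size=3, padding=d, dilation=d),
                nn.BatchNorm3d(ch),
                nn.ReLU(inplace=True),
            ]
            in_ch = ch
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


class DilatedViTSeg(nn.Module):
    """Dilated stem -> transformer over voxel tokens -> per-voxel classifier."""

    def __init__(self, n_classes=3, ch=32, depth=2, heads=4):
        super().__init__()
        self.stem = DilatedStem(ch=ch)
        enc_layer = nn.TransformerEncoderLayer(
            d_model=ch, nhead=heads, dim_feedforward=2 * ch, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=depth)
        self.head = nn.Conv3d(ch, n_classes, kernel_size=1)

    def forward(self, x):
        f = self.stem(x)                       # (B, C, D, H, W)
        b, c, d, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)  # (B, D*H*W, C): one token per voxel
        tokens = self.encoder(tokens)
        f = tokens.transpose(1, 2).reshape(b, c, d, h, w)
        return self.head(f)                    # (B, n_classes, D, H, W) logits


if __name__ == "__main__":
    model = DilatedViTSeg()
    vol = torch.randn(1, 1, 16, 16, 16)  # tiny synthetic T1 patch
    print(model(vol).shape)              # torch.Size([1, 3, 16, 16, 16])
```

Note that one token per voxel makes self-attention quadratic in volume size, so at realistic resolutions the tokens would need to be patched, windowed, or downsampled; the sketch only illustrates how a dilated, pooling-free stem can replace the usual U-Net encoder.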
- NeuroNeural
- ML Theory/Data Science
- Graduate student or exceptional undergraduate. Hands-on experience applying transformers and CNNs is highly recommended.
- Brad Baker ([email protected])
- Sergey Plis ([email protected])
- Catalyst: https://github.com/catalyst-team/catalyst (a minimal training sketch appears at the end of this page)
- Catalyst Neuro: https://github.com/catalyst-team/neuro
- Dosovitskiy, Alexey, et al. "An image is worth 16x16 words: Transformers for image recognition at scale." arXiv preprint arXiv:2010.11929 (2020).
- Hatamizadeh, Ali, et al. "Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images." arXiv preprint arXiv:2201.01266 (2022).
- Wang, Dayang, Zhan Wu, and Hengyong Yu. "TED-net: Convolution-free T2T Vision Transformer-based Encoder-decoder Dilation network for Low-dose CT Denoising." International Workshop on Machine Learning in Medical Imaging. Springer, Cham, 2021.
- Wang, Dayang, et al. "CTformer: Convolution-free Token2Token Dilated Vision Transformer for Low-dose CT Denoising." arXiv preprint arXiv:2202.13517 (2022).
- Semester or longer
- A set of experiments, with plots, demonstrating the effectiveness of the model on the HCP dataset and others
- A 2-4 page report summarizing the primary methodology and results
- A submission to a machine learning conference (e.g., MLSP)
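
Since Catalyst is listed as a resource above, here is a minimal training sketch assuming Catalyst's `SupervisedRunner` API (as in the library's README examples); the one-layer model and synthetic tensors are placeholders for a real segmentation network and an HCP data loader.

```python
# Minimal Catalyst training loop sketch. The toy model and random tensors are
# placeholders; swap in the real segmentation network and an HCP loader.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from catalyst import dl

model = nn.Conv3d(1, 3, kernel_size=1)    # stand-in segmentation model
x = torch.randn(8, 1, 16, 16, 16)         # synthetic volumes
y = torch.randint(0, 3, (8, 16, 16, 16))  # synthetic voxel labels
loaders = {
    "train": DataLoader(TensorDataset(x, y), batch_size=2),
    "valid": DataLoader(TensorDataset(x, y), batch_size=2),
}

runner = dl.SupervisedRunner()
runner.train(
    model=model,
    criterion=nn.CrossEntropyLoss(),      # per-voxel cross-entropy
    optimizer=torch.optim.Adam(model.parameters(), lr=1e-3),
    loaders=loaders,
    num_epochs=1,
    logdir="./logs",
)
```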