diff --git a/2024/2024_11_04_Unlocking_the_Power_of_Spatial_and_Temporal_Information/README.md b/2024/2024_11_04_Unlocking_the_Power_of_Spatial_and_Temporal_Information/README.md
new file mode 100644
index 0000000..6b82d47
--- /dev/null
+++ b/2024/2024_11_04_Unlocking_the_Power_of_Spatial_and_Temporal_Information/README.md
@@ -0,0 +1,11 @@
+# Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training
+
+## Abstract
+
+We will discuss pre-training multimodal vision-language models for applications in computer-aided radiology. The multimodal models we will examine are trained jointly on raw medical images and corresponding free-text radiology reports. Radiology reports, generated abundantly within typical clinical workflows, serve as a valuable source of medical image annotations but have yet to be fully leveraged in modeling efforts.
+
+I will present a [recent ICML 2024 conference paper](https://icml.cc/virtual/2024/poster/34857) that addresses this gap. I will begin with examples to illustrate the rationale for developing multimodal models in radiology and provide an overview of recent work and public datasets that form the basis of this research. Then, I will detail the paper’s main contributions: (1) extending the multimodal framework to account for multiple representations of anatomy in chest radiographs, and (2) advancing temporal modeling of longitudinal data.
+
+## Source paper
+
+[Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training](https://arxiv.org/abs/2405.19654)
\ No newline at end of file
diff --git a/2024/2024_11_04_Unlocking_the_Power_of_Spatial_and_Temporal_Information/Unlocking_the_Power_of_Spatial_and_Temporal_Information.pdf b/2024/2024_11_04_Unlocking_the_Power_of_Spatial_and_Temporal_Information/Unlocking_the_Power_of_Spatial_and_Temporal_Information.pdf
new file mode 100644
index 0000000..59c96a5
Binary files /dev/null and b/2024/2024_11_04_Unlocking_the_Power_of_Spatial_and_Temporal_Information/Unlocking_the_Power_of_Spatial_and_Temporal_Information.pdf differ
diff --git a/README.md b/README.md
index 5c83e78..5b0038c 100644
--- a/README.md
+++ b/README.md
@@ -14,7 +14,7 @@ Join us at https://meet.drwhy.ai.
 * 14.10 - Do Not Explain Vision Models without Context - Paulina Tomaszewska
 * 21.10 - [Positional Label for Self-Supervised Vision Transformer](https://github.com/MI2DataLab/MI2DataLab_Seminarium/tree/master/2024/2024_10_21_Positional_Label_for_Self-Supervised_Vision_Transformer) - Filip Kołodziejczyk
 * 28.10 - [Adversarial examples vs. context consistency defense for object detection](https://github.com/MI2DataLab/MI2DataLab_Seminarium/tree/master/2024/2024_10_28_Adversarial_attacks_against_object_detection.md) - Hubert Baniecki
-* 04.11 - Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training - Bartosz Kochański
+* 04.11 - [Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training](https://github.com/MI2DataLab/MI2DataLab_Seminarium/tree/master/2024/2024_11_04_Unlocking_the_Power_of_Spatial_and_Temporal_Information) - Bartosz Kochański
 * 18.11 - User study: Visual Counterfactual Explanations for Improved Model Understanding - Bartek Sobieski
 * 25.11 - Vision Transformers provably learn spatial structure - Vladimir Zaigrajew
 * 02.12 - Null-text Inversion for Editing Real Images using Guided Diffusion Models - Dawid Płudowski