diff --git a/promptdistill2024/index.html b/promptdistill2024/index.html
index 83cfe0f..10bb9ef 100644
--- a/promptdistill2024/index.html
+++ b/promptdistill2024/index.html
@@ -1,397 +1,43 @@
+    Transfer Relationships via Prompt for Medical Image Classification
+    Under Review
+    Gayoon Choi, Yumin Kim, Seong Jae Hwang
+    Yonsei University
+
+    Visualizing Prompt Distillation and Baselines
+    We introduce Prompt Distillation, which reveals relationships in knowledge through visual prompts for transfer learning.
+
+    Abstract
+
+    While Vision Transformers have driven remarkable advances in computer vision, they require vast amounts of training data and many training iterations. Transfer learning is widely used to overcome these challenges by reusing knowledge from pre-trained networks. However, sharing entire network weights for transfer learning is difficult in the medical field due to data privacy concerns. To address these concerns, we introduce an innovative transfer strategy called Prompt Distillation, which shares prompts instead of network weights. It compresses the knowledge of a pre-trained network into prompts by effectively leveraging the attention mechanism. In experiments, it outperforms training from scratch and achieves performance comparable to full-weight transfer learning while sharing up to 90 times fewer parameters. Moreover, it can transfer knowledge between already-trained networks by merely inserting prompts. We validate Prompt Distillation on medical image classification across three domains, chest X-ray, pathology, and retinography, which differ in their degree of distribution shift.
+
+    Method
+
+    Pipeline
+
+    Main figure
+    The pipeline of Prompt Distillation-based transfer learning. After pre-training, prompts are inserted into the pre-trained source network and trained for a few epochs (Step 2). These learned prompts are then shared with target networks in place of network weights for transfer learning (Step 3). "Train" and "Frozen" indicate whether a module is back-propagated, i.e., whether its gradients are computed and its parameters are updated.
+
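+    To make the three steps concrete, the sketch below shows one way the pipeline could look in PyTorch. It is an illustrative simplification rather than the paper's implementation: the tiny encoder, the prompt length, the single input-level insertion point, and a data loader yielding (patch tokens, labels) are all assumptions.
+
+import torch
+import torch.nn as nn
+
+class PromptedEncoder(nn.Module):
+    """A small Transformer encoder whose input sequence is prepended with learnable prompt tokens."""
+    def __init__(self, dim=192, depth=4, heads=3, n_prompts=8, n_classes=2):
+        super().__init__()
+        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
+        self.encoder = nn.TransformerEncoder(layer, depth)
+        self.head = nn.Linear(dim, n_classes)
+        # Learnable prompts: the only parameters updated (and later shared) during prompt distillation.
+        self.prompts = nn.Parameter(torch.zeros(1, n_prompts, dim))
+        nn.init.trunc_normal_(self.prompts, std=0.02)
+
+    def forward(self, tokens):                             # tokens: (B, N, dim) patch embeddings
+        p = self.prompts.expand(tokens.size(0), -1, -1)
+        x = self.encoder(torch.cat([p, tokens], dim=1))    # attention mixes prompts with patch tokens
+        return self.head(x[:, p.size(1):].mean(dim=1))     # classify from the patch tokens
+
+def distill_prompts(source, loader, epochs=3, lr=1e-3):
+    """Step 2: freeze the pre-trained source weights and back-propagate only into the prompts."""
+    for param in source.parameters():
+        param.requires_grad_(False)
+    source.prompts.requires_grad_(True)
+    optimizer = torch.optim.AdamW([source.prompts], lr=lr)
+    criterion = nn.CrossEntropyLoss()
+    for _ in range(epochs):
+        for tokens, labels in loader:
+            optimizer.zero_grad()
+            criterion(source(tokens), labels).backward()
+            optimizer.step()
+    return source.prompts.detach().clone()                 # this small tensor is what gets shared
+
+# Step 3: the target site receives only the distilled prompts, never the source weights, e.g.
+# target = PromptedEncoder(); target.prompts.data.copy_(shared_prompts); then train the target as usual.
+
+    Under these assumptions, only the prompt tensor leaves the source site, which is where the large reduction in shared parameters relative to full weights comes from.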
+
+    Prompt Distillation
+
+    Main figure
+    The framework of prompt distillation. Red tokens represent prompts, which are injected into the Transformer encoders. During prompt distillation, the Transformer remains frozen (i.e., not back-propagated) and only the prompts are trained (i.e., back-propagated). Prompts from the previous layer are removed as new prompts are inserted into the next layer.
+
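+    The layer-wise behaviour described above can be pictured with the following hedged sketch (again, not the released code): a fresh prompt tensor is prepended before each frozen encoder layer, and the transformed prompts are discarded afterwards so that only patch tokens flow between layers. The layer count and tensor shapes are assumptions.
+
+import torch
+import torch.nn as nn
+
+def prompted_forward(layers, layer_prompts, tokens):
+    """layers: frozen encoder layers (batch_first=True); layer_prompts: one (1, P, dim) tensor per layer;
+    tokens: (B, N, dim) patch embeddings."""
+    for layer, prompts in zip(layers, layer_prompts):
+        p = prompts.expand(tokens.size(0), -1, -1)
+        x = layer(torch.cat([p, tokens], dim=1))   # this layer attends over its own prompts + patches
+        tokens = x[:, p.size(1):]                  # discard the used prompts; only patch tokens continue
+    return tokens
+
+# Example with made-up sizes: 4 frozen layers, 8 prompts of width 192 per layer.
+layers = nn.ModuleList(
+    nn.TransformerEncoderLayer(192, 3, 768, batch_first=True) for _ in range(4)
+).requires_grad_(False)
+layer_prompts = [nn.Parameter(torch.zeros(1, 8, 192)) for _ in range(4)]
+out = prompted_forward(layers, layer_prompts, torch.randn(2, 196, 192))   # -> (2, 196, 192)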
+
+    Quantitative Results
+
+    Transfer Learning via Prompt Distillation
+
+    transfer learning table
+    Quantitative results of prompt distillation compared to training from scratch and full-weight transfer learning on three domains. Prompt distillation improves performance beyond scratch training and comes close to full-weight transfer learning. Interestingly, on ColonPath, where the domain shift is large, transferring relationships solely through prompt distillation still enhances the performance of the target network.
+
+    Enhancing Already-trained Networks
+
+    knowledge enhancement table
+
+    Improvements in the performance of already-trained networks are observed through the synergistic adaptation between the existing network weights and distilled prompts.
+
+    Knowledge Compression Strategies
+
+    knowledge compression table
+
+    Sup.: Supervised Learning, O.R.: Ordered Representation Learning, K.D.: Knowledge Distillation.
+
+    Comparison of distinct knowledge compression strategies. Supervised learning is the most effective and efficient overall: it outperforms the other methods while using a straightforward objective, requiring no structural modifications, and remaining easily applicable to any network.
+
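+    As a rough illustration of why the supervised objective is considered straightforward, the sketch below contrasts it with a typical knowledge-distillation-style objective. Both are generic formulations assumed for illustration, not the exact losses used in the paper.
+
+import torch
+import torch.nn.functional as F
+
+def supervised_prompt_loss(student_logits, labels):
+    # "Sup." style: plain cross-entropy on labels; needs no extra networks or structural changes.
+    return F.cross_entropy(student_logits, labels)
+
+def kd_prompt_loss(student_logits, teacher_logits, T=2.0):
+    # "K.D." style: match the prompted network's softened logits to a teacher's soft predictions;
+    # requires an extra teacher forward pass and a temperature to tune.
+    return F.kl_div(F.log_softmax(student_logits / T, dim=-1),
+                    F.softmax(teacher_logits / T, dim=-1),
+                    reduction="batchmean") * T * T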
+
+    Quantitative Ablation
+
+    The Number of Prompt Embeddings
+
+    transfer learning table
+    The effect of the number of prompt embeddings on transfer learning performance. Too few prompts are insufficient to compress the knowledge effectively, while too many disrupt the attention mechanism.
+
+    BibTeX
+
+                @article{promptdistill2024,
+                author    = {Choi, Gayoon and Kim, Yumin and Hwang, Seong Jae},
+                title     = {Prompt Distillation for Weight-free Transfer Learning},
+                month     = {July},
+                year      = {2024},
+                }
+
-    Prompt Distillation for Weight-free Transfer Learning
-    Under Review
-    Gayoon Choi, Yumin Kim, Seong Jae Hwang
-    Yonsei University
-
-    Visualizing Prompt Distillation and Baselines
-
-    We introduce Prompt Distillation,
-    reveal relationships in knowledge with visual prompts for a transfer learning.
-
-    Abstract
-
-    While Vision Transformers have facilitated remarkable advancements in computer vision, they require vast training data and iterations. Transfer learning is widely used to overcome these challenges, utilizing knowledge from pre-trained networks. However, sharing entire network weights for transfer learning is difficult in the medical field due to data privacy concerns. To address these concerns, we introduce an innovative transfer strategy called Prompt Distillation which shares prompts instead of network weights. It compresses knowledge in pre-trained networks to prompts by effectively leveraging the attention mechanism. In experiments, it outperformed training from scratch and achieved comparable performance to full-weight transfer learning, while reducing the parameter scale by up to 90 times compared to full-weight. Moreover, it demonstrates the ability to transfer knowledge between already-trained networks by merely inserting prompts. It is validated through medical image classification across three domains, chest X-ray, pathology, and retinography, distinct in degrees of distribution shifts.
-
-    Method
-
-    Pipeline
-
-    Main figure
-    The pipeline of Prompt Distillation based transfer learning. After pre-training, prompts are inserted inside the pre-trained source network and trained for a few epochs (Step 2). Then, these learned prompts are shared in place of network weights to target networks for transfer learning (Step 3). ``Train" and ``Frozen" refer to whether backpropagation is performed, which involves calculating the gradients and updating the parameters, or not.
-
-    Knowledge Compression Strategies
-
-    Main figure
-    The framework of prompt distillation. Red tokens represent prompts, which are injected into Transformer encoders. During prompt distillation, Transformer remains frozen (i.e. not back-propagated), and only prompts are trained (i.e. back-propagated). Prompts from the previous layer are removed as new prompts are inserted into the next layer.
-
-    Quantitative Results
-
-    Transfer Learning via Prompt Distillation
-
-    transfer learning table
-    Quantitative results of prompt distillation compared to the scratch learning and full-weights transfer learning on three domains.
-
-    Knowledge Enhancement
-
-    knowledge enhancement table
-    Analyze the enhancement ability to already-trained networks.
-
-    Knowledge Compression
-
-    knowledge compression table
-    Comparing distinct knowledge compression strategies.
-
-    Quantitative Ablation
-
-    The Number of Prompt and Distillation Epochs
-
-    transfer learning table
-    Effects of the number of prompt embeddings and distillation epochs.
-
-    BibTeX
-
-@article{promptdistill2024,
-    author    = {Choi, Gayoon and Kim, Yumin and Hwang, Seong Jae},
-    title     = {Prompt Distillation for Weight-free Transfer Learning},
-    month     = {July},
-    year      = {2024},
-}