From edfccd98a461ea77209aeb7213365756d87db117 Mon Sep 17 00:00:00 2001
From: seilk
Date: Sat, 8 Jun 2024 09:58:04 +0900
Subject: [PATCH] update research topics - seil

---
 publication/2024-arxiv-wolf/index.html  | 1581 ++++-----------------
 publication/2024-miccai-cxrl/index.html | 1120 ++---------------
 2 files changed, 337 insertions(+), 2364 deletions(-)

diff --git a/publication/2024-arxiv-wolf/index.html b/publication/2024-arxiv-wolf/index.html
index 4dcef2b..9fdcd9b 100644
--- a/publication/2024-arxiv-wolf/index.html
+++ b/publication/2024-arxiv-wolf/index.html
@@ -1,1318 +1,267 @@
- WoLF: Large Language Model Framework for CXR Understanding | MICV
-
- - - - - - - - - - - - - -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-

WoLF: Large Language Model Framework for CXR Understanding

- - - - - - - - - - - - - - - - -
- - -
-
- - -
-
- - - -
- - -

Abstract

-

Semantic segmentation has innately relied on extensive pixel-level - labeled annotated data, leading to the emergence of unsupervised methodologies. Among - them, leveraging self-supervised Vision Transformers for unsupervised semantic - segmentation (USS) has been making steady progress with expressive deep features. Yet, for - semantically segmenting images with complex objects, a predominant challenge remains: the - lack of explicit object-level semantic encoding in patch-level features. This technical - limitation often leads to inadequate segmentation of complex objects with diverse - structures. To address this gap, we present a novel approach, EAGLE, which emphasizes - object-centric representation learning for unsupervised semantic segmentation. - Specifically, we introduce EiCue, a spectral technique providing semantic and structural - cues through an eigenbasis derived from the semantic similarity matrix of deep image - features and color affinity from an image. Further, by incorporating our object-centric - contrastive loss with EiCue, we guide our model to learn object-level representations with - intra- and inter-image object-feature consistency, thereby enhancing semantic accuracy. - Extensive experiments on COCO-Stuff, Cityscapes, and Potsdam-3 datasets demonstrate the - state-of-the-art USS results of EAGLE with accurate and consistent semantic segmentation - across complex scenes.

- - - - -
-
-
-
-
Type
- -
-
-
-
-
- - - -
-
-
-
-
Publication
-
arxiv
-
-
-
-
-
- - -
- -
-
-
-
\ No newline at end of file
+
+ WoLF: Large Language Model Framework for CXR Understanding | MICV
+
+
+
+

WoLF: Large Language Model Framework for CXR Understanding

+ + +
+
+
+ +
+
+
+

Abstract

+

 Significant methodological strides have been made toward Chest X-ray (CXR) understanding via modern
 vision-language models (VLMs), demonstrating impressive Visual Question Answering (VQA) and CXR
 report generation abilities. However, existing CXR understanding frameworks still have several
 procedural caveats. (1) Previous methods rely solely on CXR reports, which are insufficient for
 comprehensive VQA, especially when additional health-related data such as medication history and
 prior diagnoses are needed. (2) Previous methods use raw CXR reports, which are often arbitrarily
 structured. Although modern language models can parse various text formats, restructuring reports
 into clearly organized, anatomy-based sections could enhance their usefulness. (3) Current
 evaluation methods for CXR-VQA primarily emphasize linguistic correctness and cannot offer nuanced
 assessments of the generated answers. In this work, to address these caveats, we introduce WoLF, a
 Wide-scope Large Language Model Framework for CXR understanding. To resolve (1), we capture
 multi-faceted records of patients, which are used for accurate diagnoses in real-world clinical
 scenarios; specifically, we adopt Electronic Health Records (EHR) to generate instruction-following
 data suited for CXR understanding. Regarding (2), we enhance report generation by decoupling the
 knowledge in CXR reports along anatomical structure, even within the attention step, via masked
 attention. To address (3), we introduce an AI-evaluation protocol optimized for assessing the
 capabilities of LLMs. Through extensive experimental validation, WoLF demonstrates superior
 performance over other models on MIMIC-CXR, both in AI-based VQA evaluation (up to +9.47%p mean
 score) and in report-generation metrics (+7.3%p BLEU-1).
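The anatomy-wise masked attention mentioned in (2) can be illustrated with a small sketch. This is our own toy construction, not WoLF's released code: the section ids, their granularity, and the mask-then-softmax formulation are all assumptions.

```python
import numpy as np

def anatomy_attention_mask(section_ids):
    """True where attention is allowed: each report token may only attend
    to tokens of the same anatomical section (granularity is assumed)."""
    ids = np.asarray(section_ids)
    return ids[:, None] == ids[None, :]

def masked_softmax(scores, mask):
    """Softmax over attention logits with disallowed positions set to -inf,
    so cross-section attention weights come out exactly zero."""
    scores = np.where(mask, scores, -np.inf)
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy report: tokens 0-1 describe the lungs (id 0), tokens 2-3 the heart (id 1).
sections = [0, 0, 1, 1]
mask = anatomy_attention_mask(sections)
attn = masked_softmax(np.random.default_rng(0).normal(size=(4, 4)), mask)
# Each row still sums to 1, but e.g. attn[0, 2] (lungs -> heart) is zero.
```

The same boolean mask could equally be passed to a transformer library's attention-mask argument; the point is only that decoupling happens inside the attention step, not by splitting the report text beforehand.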

+
+
+
+
+
Type
+ +
+
+
+
+
+
+
+
+
+
Publication
+
arxiv
+
+
+
+
+
+
+
+
+
+
+
diff --git a/publication/2024-miccai-cxrl/index.html b/publication/2024-miccai-cxrl/index.html
index 262bcfc..df45a72 100644
--- a/publication/2024-miccai-cxrl/index.html
+++ b/publication/2024-miccai-cxrl/index.html
@@ -1,332 +1,43 @@
-
@@ -336,616 +47,123 @@
-
+ Advancing Text-Driven Chest X-Ray Generation with Policy-Based Reinforcement Learning | MICV
+ class="page-wrapper " data-wc-page-id="e53fe15e44f0a6bcaa17d1aa78d58172">
-
- - - - - - - - - - - - - -
- - - - - - - - - - - - - - - - - - - - + +
+
+

Advancing Text-Driven Chest X-Ray Generation with Policy-Based Reinforcement Learning

- - - - - - - - - - - - - - - +
- - -
- -

Abstract

-

.

- - - - +

+ Recent advances in text-conditioned diffusion models for image generation have begun paving the way
 for new opportunities in the modern medical domain, in particular, generating Chest X-rays (CXRs)
 from diagnostic reports. Nonetheless, to further drive diffusion models to generate CXRs that
 faithfully reflect the complexity and diversity of real data, it has become evident that a
 nontrivial learning approach is needed. In light of this, we propose CXRL, a framework motivated by
 the potential of reinforcement learning (RL). Specifically, we integrate a policy-gradient RL
 approach with multiple well-designed, CXR-domain-specific reward models. This approach guides the
 diffusion denoising trajectory, achieving precise CXR posture and pathological details. Here,
 considering the complex medical-image environment, we present “RL with Comparative Feedback” (RLCF)
 as the reward mechanism: a human-like comparative evaluation that is known to be more effective and
 reliable in complex scenarios than direct evaluation. Our CXRL framework jointly optimizes
 learnable adaptive condition embeddings (ACE) and the image generator, enabling the model to
 produce more accurate CXRs of higher perceptual quality. Extensive evaluation on the MIMIC-CXR-JPG
 dataset demonstrates the effectiveness of our RL-based tuning approach. Consequently, CXRL
 generates pathologically realistic CXRs, establishing a new standard for generating CXRs with high
 fidelity to real-world clinical scenarios.
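The policy-gradient-with-comparative-feedback idea can be sketched on a toy scalar “generator”. Everything here is illustrative: the one-parameter Gaussian policy, the score function, and the winner-directed update are our assumptions, not the CXRL implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical reward model: samples closer to the "real data" value 3.0 score higher.
def score(x):
    return -abs(x - 3.0)

# Toy policy: a Gaussian "generator" whose single learnable parameter is its mean.
theta, lr = 0.0, 0.1

for _ in range(3000):
    a, b = rng.normal(theta, 1.0, size=2)   # two candidate "generations"
    # Comparative feedback (RLCF-like): judge the candidates against each
    # other rather than scoring each in isolation, then reinforce the winner.
    winner = a if score(a) > score(b) else b
    theta += lr * (winner - theta)          # move the policy toward the winner

# theta drifts into the high-reward region around 3.0
```

The pairwise comparison is the point of the sketch: the update never sees the raw score, only which candidate the reward model preferred, mirroring the claim that comparative evaluation is more reliable than direct scoring in complex environments.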

- - -
@@ -957,246 +175,52 @@

Abstract

- -
-
-
-