diff --git a/index.html b/index.html
index 373119fe3..fd6656e3d 100644
--- a/index.html
+++ b/index.html
@@ -3,24 +3,11 @@
+        content="MedSyn: Text-guided Anatomy-aware Synthesis of High-Fidelity 3D CT Images">
-  Nerfies: Deformable Neural Radiance Fields
+  MedSyn: Text-guided Anatomy-aware Synthesis of High-Fidelity 3D CT Images
@@ -42,85 +29,55 @@
-            Nerfies: Deformable Neural Radiance Fields
+            MedSyn: Text-guided Anatomy-aware Synthesis of High-Fidelity 3D CT Images
-            1University of Washington,
-            2Google Research
+            1Boston University,
+            2Stanford University,
+            3Carnegie Mellon University,
+            4University of Pittsburgh Medical Center,
+            5University of Pittsburgh
@@ -175,81 +115,6 @@

Nerfies: Deformable Neural Radiance Fields

-        Nerfies turns selfie videos from your phone into
-        free-viewpoint
-        portraits.
@@ -258,198 +123,28 @@

Abstract

-          We present the first method capable of photorealistically reconstructing a non-rigidly
-          deforming scene using photos/videos captured casually from mobile phones.
-          Our approach augments neural radiance fields (NeRF) by optimizing an additional
-          continuous volumetric deformation field that warps each observed point into a
-          canonical 5D NeRF.
-          We observe that these NeRF-like deformation fields are prone to local minima, and
-          propose a coarse-to-fine optimization method for coordinate-based models that allows for
-          more robust optimization.
-          By adapting principles from geometry processing and physical simulation to NeRF-like
-          models, we propose an elastic regularization of the deformation field that further
-          improves robustness.
-          We show that Nerfies can turn casually captured selfie photos/videos into deformable
-          NeRF models that allow for photorealistic renderings of the subject from arbitrary
-          viewpoints, which we dub "nerfies". We evaluate our method by collecting data using a
-          rig with two mobile phones that take time-synchronized photos, yielding train/validation
-          images of the same pose at different viewpoints. We show that our method faithfully
-          reconstructs non-rigidly deforming scenes and reproduces unseen views with high
-          fidelity.
+          This paper introduces a method for producing high-quality 3D lung CT images guided
+          by textual information. While diffusion-based generative models are increasingly
+          used in medical imaging, current state-of-the-art approaches are limited to
+          low-resolution outputs and underutilize the abundant information in radiology
+          reports. Radiology reports can enhance the generation process by providing
+          additional guidance and fine-grained control over image synthesis. Nevertheless,
+          scaling text-guided generation to high-resolution 3D images poses significant
+          challenges in memory and in preserving anatomical detail. To address the memory
+          issue, we introduce a hierarchical scheme based on a modified UNet architecture:
+          we first synthesize low-resolution images conditioned on the text, which then
+          serve as a foundation for subsequent generators that produce the complete
+          volumetric data. To ensure the anatomical plausibility of the generated samples,
+          we provide further guidance by generating vascular, airway, and lobular
+          segmentation masks in conjunction with the CT images; the model can therefore
+          condition synthesis on both textual input and segmentation masks. Algorithmic
+          comparisons and blind evaluations by 10 board-certified radiologists indicate that
+          our approach outperforms the most advanced GAN- and diffusion-based models,
+          especially in accurately retaining crucial anatomical features such as fissure
+          lines and airways. This study focuses on two main objectives: (1) developing a
+          method for generating images from textual prompts and anatomical components, and
+          (2) generating new images conditioned on anatomical elements. These advances in
+          image generation can be applied to numerous downstream tasks.
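The two-stage design described in the abstract can be made concrete with a short sketch. The PyTorch snippet below is an illustrative assumption, not the MedSyn implementation: module names, tensor shapes, and the plain feedforward decoders are hypothetical, and the diffusion sampling loop the actual model uses is omitted. It shows only the conditioning flow: a low-resolution volume generated from a report embedding, then a second generator that upsamples it while jointly predicting vascular/airway/lobular segmentation channels.

# Illustrative sketch only; not the MedSyn code. Shapes and module names are
# hypothetical, and the diffusion process is replaced by feedforward decoders.
import torch
import torch.nn as nn

class LowResGenerator(nn.Module):
    """Stage 1 (hypothetical): coarse 3D volume conditioned on a report embedding."""
    def __init__(self, text_dim=256, ch=32):
        super().__init__()
        self.ch = ch
        self.fc = nn.Linear(text_dim, ch * 8 * 8 * 8)
        self.up = nn.Sequential(
            nn.ConvTranspose3d(ch, ch, 4, stride=2, padding=1),  # 8^3 -> 16^3
            nn.ReLU(),
            nn.ConvTranspose3d(ch, 1, 4, stride=2, padding=1),   # 16^3 -> 32^3
        )

    def forward(self, text_emb):
        x = self.fc(text_emb).view(-1, self.ch, 8, 8, 8)
        return self.up(x)  # (B, 1, 32, 32, 32) coarse CT

class UpsampleGenerator(nn.Module):
    """Stage 2 (hypothetical): refine the coarse volume to full resolution and
    jointly predict segmentation channels (vessels, airways, lobes)."""
    def __init__(self, ch=16, n_masks=3):
        super().__init__()
        self.up = nn.Sequential(
            nn.ConvTranspose3d(1, ch, 4, stride=2, padding=1),   # 32^3 -> 64^3
            nn.ReLU(),
            nn.Conv3d(ch, 1 + n_masks, 3, padding=1),            # CT + mask channels
        )

    def forward(self, coarse):
        out = self.up(coarse)
        return out[:, :1], out[:, 1:]  # full-res CT, segmentation logits

# Usage: text_emb would come from a report encoder (e.g., a clinical language model).
text_emb = torch.randn(2, 256)             # two hypothetical report embeddings
coarse = LowResGenerator()(text_emb)       # (2, 1, 32, 32, 32)
ct, masks = UpsampleGenerator()(coarse)    # (2, 1, 64, 64, 64) and (2, 3, 64, 64, 64)
print(ct.shape, masks.shape)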


Video


Visual Effects

-          Using nerfies you can create fun visual effects. This Dolly zoom effect
-          would be impossible without nerfies since it would require going through a wall.

Matting

-          As a byproduct of our method, we can also solve the matting problem by ignoring
-          samples that fall outside of a bounding box during rendering.

Animation


Interpolating states

-          We can also animate the scene by interpolating the deformation latent codes of two input
-          frames. Use the slider here to linearly interpolate between the left frame and the right
-          frame.
-          Interpolate start reference image.

Start Frame

-          Loading...
-          Interpolation end reference image.

End Frame


Re-rendering the input video

-          Using Nerfies, you can re-render a video from a novel
-          viewpoint such as a stabilized camera by playing back the training deformations.

Related Links

-        There's a lot of excellent work that was introduced around the same time as ours.
-        Progressive Encoding for Neural Optimization introduces an idea similar to our windowed
-        position encoding for coarse-to-fine optimization.
-        D-NeRF and NR-NeRF both use deformation fields to model non-rigid scenes.
-        Some works model videos with a NeRF by directly modulating the density, such as
-        Video-NeRF, NSFF, and DyNeRF.
-        There are probably many more by the time you are reading this. Check out Frank Dellaert's
-        survey on recent NeRF papers, and Yen-Chen Lin's curated list of NeRF papers.

BibTeX

-@article{park2021nerfies,
-  author    = {Park, Keunhong and Sinha, Utkarsh and Barron, Jonathan T. and Bouaziz, Sofien and Goldman, Dan B and Seitz, Steven M. and Martin-Brualla, Ricardo},
-  title     = {Nerfies: Deformable Neural Radiance Fields},
-  journal   = {ICCV},
-  year      = {2021},
-}
+@ARTICLE{10566053,
+  author={Xu, Yanwu and Sun, Li and Peng, Wei and Jia, Shuyue and Morrison, Katelyn and Perer, Adam and Zandifar, Afrooz and Visweswaran, Shyam and Eslami, Motahhare and Batmanghelich, Kayhan},
+  journal={IEEE Transactions on Medical Imaging}, 
+  title={MedSyn: Text-guided Anatomy-aware Synthesis of High-Fidelity 3D CT Images}, 
+  year={2024},
+  doi={10.1109/TMI.2024.3415032}}
@@ -458,10 +153,10 @@

BibTeX

+ href="https://arxiv.org/pdf/2310.03559"> - +