
Med-R1-Alpha

Unleashing Reasoning in Medical Large Language Models

Overview

Building on DeepSeek-R1's advances in enhancing reasoning through reinforcement learning (RL), recent open-source projects have explored RL's potential for training LLMs. However, these efforts have remained confined to narrow domains and smaller-scale models, and have yet to extend to complex, large-scale applications.

Med-R1 is designed to unleash the reasoning capability of medical LLMs through RL training, relying only on low-cost, small-scale multiple-choice QA data rather than on constructing large-scale Chain-of-Thought (CoT) datasets or distilling from proprietary models such as GPT. Unlike conventional approaches that depend on costly supervised fine-tuning with extensive CoT annotations, Med-R1 demonstrates that a well-trained, instruction-tuned LLM can develop strong reasoning skills from minimal training samples.
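
Since the technical report and training code are not yet released, the exact reward design is unspecified; the sketch below is only illustrative. It shows, in the spirit of DeepSeek-R1-Zero's rule-based rewards, how a multiple-choice QA rollout can be scored for RL with nothing but the gold option letter, i.e., no CoT labels and no learned reward model. The <think>/<answer> tag format, the option-letter extraction, and the 0.1 format bonus are all assumptions for illustration, not the actual Med-R1 recipe.

import re

def mcq_reward(completion: str, gold_choice: str) -> float:
    # Rule-based reward for one multiple-choice QA rollout: no learned
    # reward model and no CoT supervision, only the gold option letter.
    # Tag names and weights below are illustrative assumptions.
    reward = 0.0
    # Format bonus: reasoning inside <think> tags, final pick inside
    # <answer> tags, and nothing else in the completion.
    if re.fullmatch(r"\s*<think>.*?</think>\s*<answer>.*?</answer>\s*",
                    completion, re.DOTALL):
        reward += 0.1
    # Accuracy reward: extract the chosen option letter and compare it
    # with the gold answer.
    m = re.search(r"<answer>\s*\(?([A-E])\b", completion)
    if m and m.group(1) == gold_choice.strip().upper():
        reward += 1.0
    return reward

# Example rollout: a correct answer in the expected format scores 1.1.
out = "<think>Beta-blockers reduce post-MI mortality.</think><answer>B</answer>"
print(mcq_reward(out, "B"))

Under an RL algorithm such as PPO or GRPO, this scalar is the only learning signal the policy receives, which is why a small multiple-choice QA set can suffice.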

In our early experiments, Med-R1-Alpha, trained with RL on only a small set of multiple-choice QA data, has already outperformed Huatuo-O1, which was trained via supervised fine-tuning on large-scale CoT data distilled from GPT-O1. This result highlights that reinforcement learning can effectively incentivize and refine reasoning capabilities without expensive CoT supervision. Like DeepSeek-R1-Zero, Med-R1-Alpha shows that RL-trained models can achieve superior reasoning ability with far fewer training samples and without complex, manually annotated CoT data.

Beyond textual reasoning, we have also unleashed this capability in medical Vision-Language Models (VLMs). Our work, MedVLM-R1, demonstrates how reinforcement learning can significantly enhance multi-modal reasoning within medical AI. Med-R1 represents a step towards building cost-efficient, highly capable medical LLMs that leverage RL to develop strong reasoning abilities without dependence on large-scale, closed-source CoT data.

Preview Results

[Preview results figure]

TODO

  • Release preview results
  • Release technical report
  • Release model weights
  • Release evaluation results
  • Release training dataset

Citation

@misc{med-r1-alpha,
  title = {Med-R1-Alpha: Unleashing Reasoning in Medical Large Language Models},
  author = {Med-R1 Team},
  howpublished = {\url{https://github.com/cheliu-computation/Med-R1-Alpha}},
  year = {2025}
}
