- 👋 Hi, I’m Runze Liu, a second-year master's student at Tsinghua University.
- 👀 I’m interested in Large Language Models (LLMs), Reinforcement Learning (RL) and Reinforcement Learning from Human Feedback (RLHF).
- Tsinghua University
- Qingdao
- Time zone: UTC +08:00
- https://ryanliu112.github.io
- https://scholar.google.com/citations?user=LiIfGakAAAAJ
Pinned
- compute-optimal-tts (Public): Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling".
- Awesome-Process-Reward-Models (Public): A comprehensive collection of process reward models.
- ChangWinde/RAT (Public): [AAAI 2025 Oral] Official code for "RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors". Python 10