vwxyzjn

Pinned

  1. huggingface/trl (Public)

    Train transformer language models with reinforcement learning.

    Python, 10.1k stars, 1.3k forks

  2. lm-human-preference-details (Public)

    RLHF implementation details of OAI's 2019 codebase

    Python, 154 stars, 8 forks

  3. cleanrl (Public)

    High-quality single-file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

    Python, 5.7k stars, 647 forks

  4. ppo-implementation-details (Public)

    The source code for the blog post "The 37 Implementation Details of Proximal Policy Optimization"

    Python, 648 stars, 99 forks

  5. cleanba (Public)

    CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL

    Python, 105 stars, 11 forks

  6. portwarden (Public)

    Create Encrypted Backups of Your Bitwarden Vault with Attachments

    Go, 590 stars, 33 forks