Awesome-RLHF-Vision

Introduction

Welcome to the Awesome-RLHF-Vision repository! This is a curated collection of research papers on Reinforcement Learning from Human Feedback (RLHF) applied to vision models. Our goal is a comprehensive, continuously updated resource that tracks the latest advances in this field.


⭐ If you find this repository useful or interesting, please consider giving it a star to show your support! ⭐

Table of Contents

  1. Introduction
  2. Research Papers
  3. Contributing
  4. License
  5. Acknowledgements
  6. Contact

Research Papers

Here you can find a list of research papers categorized by topics related to RLHF for vision models.

format:
- [title](paper link) [links]
  - author1, author2, and author3...
  - publisher
  - keyword
  - summary
  - code
  - experiment environments and datasets
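
For contributors who prefer to generate entries rather than type them by hand, the template above can be sketched as a small Python helper. This is only an illustration: `PaperEntry`, `join_authors`, and `render_entry` are hypothetical names that are not part of this repository, and the example values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PaperEntry:
    """One list entry; the fields mirror the template above (hypothetical type)."""
    title: str
    paper_link: str
    authors: List[str]
    publisher: str
    keyword: str
    summary: str
    code: str
    environments: str  # experiment environments and datasets
    links: List[str] = field(default_factory=list)  # optional extra [links]


def join_authors(authors: List[str]) -> str:
    """Join names as 'author1, author2, and author3'."""
    if len(authors) <= 2:
        return " and ".join(authors)
    return ", ".join(authors[:-1]) + ", and " + authors[-1]


def render_entry(entry: PaperEntry) -> str:
    """Render an entry as a markdown bullet followed by six sub-bullets."""
    head = f"- [{entry.title}]({entry.paper_link})"
    if entry.links:
        head += " " + " ".join(entry.links)
    details = [
        join_authors(entry.authors),
        entry.publisher,
        entry.keyword,
        entry.summary,
        entry.code,
        entry.environments,
    ]
    return "\n".join([head] + [f"  - {d}" for d in details])


if __name__ == "__main__":
    # Placeholder values only; replace them with the real paper details.
    example = PaperEntry(
        title="Paper title",
        paper_link="https://example.org/paper",
        authors=["author1", "author2", "author3"],
        publisher="publisher",
        keyword="keyword",
        summary="summary",
        code="code",
        environments="experiment environments and datasets",
    )
    print(render_entry(example))
```

The printed text can be pasted under the matching year heading in a pull request.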

Papers

2024

2023

Datasets

  • RLHF-V-Dataset

    • OpenBMB
    • Keyword: Human Preference, Rankings, Trustworthiness
    • Task: Diverse multimodal understanding tasks
  • RLAIF-V-Dataset

    • OpenBMB
    • Keyword: AI Preference, Rankings, Trustworthiness
    • Task: Diverse multimodal understanding tasks
  • Pick-a-Pic-v1

    • Yuval Kirstain, Adam Polyak, Uriel Singer, Shahbuland Matiana, Joe Penna, and Omer Levy
    • Keyword: Human Preference, Rankings
    • Task: Image Generation
  • Pick-a-Pic-v2

    • Yuval Kirstain, Adam Polyak, Uriel Singer, Shahbuland Matiana, Joe Penna, and Omer Levy
    • Keyword: Human Preference, Rankings
    • Task: Image Generation
  • ImageRewardDB

    • THUDM
    • Keyword: Human Preference, Rankings
    • Task: Image Generation
  • Simulacra Aesthetic Captions

    • John David Pressman, Katherine Crowson, and Simulacra Captions Contributors
    • Keyword: Human Preference, Ratings
    • Task: Image Generation
  • HPS

    • Xiaoshi Wu, Keqiang Sun, Feng Zhu, Rui Zhao, and Hongsheng Li
    • Keyword: Human Preference, Ratings
    • Task: Image Generation
  • RichHF-18k

    • Google Research
    • Keyword: Human Preference, Ratings, Human-labeled Heatmaps (e.g., artifact regions of distorted pixels) and Misalignment Tokens in Prompts
    • Task: Image Generation

Note: This list is updated continually. Check back regularly for the most recent papers and datasets.

Contributing

We welcome contributions from the community! If you have a paper or resource you'd like to add, feel free to submit a pull request or open an issue. Please follow our contribution guidelines for details.


License

This repository is licensed under the MIT License. See the LICENSE file for more information.


Acknowledgements

We would like to thank the contributors and researchers whose efforts have made this compilation possible.

Contact

For any questions or suggestions, feel free to open an issue or contact us directly ([email protected]). We appreciate your feedback!


Thank you for visiting Awesome-RLHF-Vision! Happy reading and researching!