X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages (2023.05.07)
Feilong Chen, Minglun Han, Haozhi Zhao, Qingyang Zhang, Jing Shi, etc . - 【arXiv.org】
Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision (2023.05.04)
Zhiqing Sun, Yikang Shen, Qinhong Zhou, Hongxin Zhang, Zhenfang Chen, etc . - 【arXiv.org】
AutoML-GPT: Automatic Machine Learning with GPT (2023.05.04)
Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou . - 【arXiv.org】
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes (2023.05.03)
Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, etc . - 【arXiv.org】
Unlimiformer: Long-Range Transformers with Unlimited Length Input (2023.05.02)
Amanda Bertsch, Uri Alon, Graham Neubig, Matthew R. Gormley . - 【arXiv.org】
Transfer Visual Prompt Generator across LLMs (2023.05.02)
Ao Zhang, Hao Fei, Yuan Yao, Wei Ji, Li Li, etc . - 【arXiv.org】
Nikhil Mehta, Milagro Teruel, Patricio Figueroa Sanz, Xinwei Deng, A. Awadallah, etc
Segment Anything Model for Medical Image Analysis: an Experimental Study (2023.04.20)
Maciej A. Mazurowski, Haoyu Dong, Han Gu, Jichen Yang, N. Konz, etc
Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models (2023.04.19)
Pan Lu, Baolin Peng, Hao Cheng, Michel Galley, Kai-Wei Chang, etc
Accuracy of Segment-Anything Model (SAM) in medical image segmentation tasks (2023.04.18)
Sheng He, Rina Bao, Jingpeng Li, P. Grant, Yangming Ou
Chuanfei Hu, Xinde Li
F. Putz, Johanna Grigo, T. Weissmann, P. Schubert, D. Hoefler, etc
Deep learning universal crater detection using Segment Anything Model (SAM) (2023.04.16)
I. Giannakis, A. Bhardwaj, L. Sam, G. Leontidis . - 【arXiv.org】
Segment Anything Model (SAM) for Digital Pathology: Assess Zero-shot Segmentation on Whole Slide Imaging (2023.04.09)
Ruining Deng, C. Cui, Quan Liu, Tianyuan Yao, L. W. Remedios, etc . - 【arXiv.org】
TagGPT: Large Language Models are Zero-shot Multimodal Taggers (2023.04.06)
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling (2023.04.03)
Stella Biderman, Hailey Schoelkopf, Quentin Anthony, Herbie Bradley, Kyle O'Brien, etc
BloombergGPT: A Large Language Model for Finance (2023.03.30)
Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, Mark Dredze, etc
Scaling Expert Language Models with Unsupervised Domain Discovery (2023.03.24)
Suchin Gururangan, Margaret Li, Mike Lewis, Weijia Shi, Tim Althoff, etc
Sparks of Artificial General Intelligence: Early experiments with GPT-4 (2023.03.22)
Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, etc
CoLT5: Faster Long-Range Transformers with Conditional Computation (2023.03.17)
J. Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, etc . - 【ArXiv】
Meet in the Middle: A New Pre-training Paradigm (2023.03.13)
A. Nguyen, Nikos Karampatziakis, Weizhu Chen . - 【ArXiv】
High-throughput Generative Inference of Large Language Models with a Single GPU (2023.03.13)
Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, etc . - 【ArXiv】
Stabilizing Transformer Training by Preventing Attention Entropy Collapse (2023.03.11)
Shuangfei Zhai, T. Likhomanenko, Etai Littwin, Dan Busbridge, Jason Ramapuram, etc . - 【ArXiv】
An Overview on Language Models: Recent Developments and Outlook (2023.03.10)
Chen Wei, Yun Cheng Wang, Bin Wang, C.-C. Jay Kuo . - 【ArXiv】
Foundation Models for Decision Making: Problems, Methods, and Opportunities (2023.03.07)
Sherry Yang, Ofir Nachum, Yilun Du, Jason Wei, P. Abbeel, etc . - 【ArXiv】
How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding (2023.03.07)
Yuchen Li, Yuanzhi Li, Andrej Risteski . - 【ArXiv】
LLaMA: Open and Efficient Foundation Language Models (2023.02.27)
Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, etc . - 【ArXiv】
Self-Instruct: Aligning Language Models with Self-Generated Instructions (2022.12.20)
Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, etc . - 【ArXiv】
Solving Math Word Problems via Cooperative Reasoning induced Language Models (2022.10.28)
Xinyu Zhu, Junjie Wang, Lin Zhang, Yuxiang Zhang, Ruyi Gan, etc . - 【ArXiv】
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (2022.04.12)
Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, etc . - 【ArXiv】
PaLM: Scaling Language Modeling with Pathways (2022.04.05)
Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, etc . - 【ArXiv】
Training language models to follow instructions with human feedback (2022.03.04)
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, etc . - 【ArXiv】
LoRA: Low-Rank Adaptation of Large Language Models (2021.06.17)
Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, etc . - 【International Conference on Learning Representations】
Language Models are Unsupervised Multitask Learners (2019)
Alec Radford, Jeff Wu, Rewon Child, D. Luan, Dario Amodei, etc