This repository's goal is to compile all past presentations of the Huggingface reading group

isamu-isozaki/huggingface-reading-group


Huggingface Reading Group

Welcome to the Huggingface Reading Group! The goal of this group is to have a weekly presentation on research papers/groups of papers. The goal of this repository is to compile all the past presentation write-ups and recordings.

Brief History

This group was started by Huggingface community member James Kelly on 09/26/2023. In the beginning, we "presented" via summaries of papers in Discord threads, but on 1/12/2024 we started doing presentations in Discord calls, thanks to Phil Butler. The presentations are generally targeted at a general audience and focus on generative models, but no research papers are off limits.

0: Ambiguity-Aware In-Context Learning with Large Language Models (Presented on 9/27/2023)

Presenter: James Kelly

Paper: Ambiguity-Aware In-Context Learning with Large Language Models

Discord Thread

1: Controlling Neural Networks with Rule Representations (Presented on 10/05/2023)

Presenter: James Kelly

Paper: Controlling Neural Networks with Rule Representations (NeurIPS, 2021)

Code

Discord Thread

2: Understanding InstaFlow/Rectified Flow (Presented on 10/11/2023)

Presenter: Isamu Isozaki

Paper: InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation

Write up

Discord Thread

3: Mysteries of Text Embeddings (Presented on 10/19/2023)

Presenter: Isamu Isozaki

Papers: Text Embeddings Reveal (Almost) As Much As Text + NEFTune: Noisy Embeddings Improve Instruction Finetuning

Discord Thread

4: Training Image Derivatives: Increased Accuracy and Universal Robustness (Presented on 11/08/2023)

Presenter: Vsevolod I. Avrutskiy. Author of the paper

Paper: Training Image Derivatives: Increased Accuracy and Universal Robustness

Discord Thread

5: Understanding Zephyr (Presented on 11/16/2023)

Presenter: Isamu Isozaki

Paper: Zephyr: Direct Distillation of LM Alignment

Write up

Discord Thread

6: Literature Review on RAG (Retrieval Augmented Generation) for Custom Domains (Presented on 11/29/2023)

Presenter: Isamu Isozaki

Papers: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks + Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering + RA-DIT: Retrieval-Augmented Dual Instruction Tuning

Write up

Discord Thread

7: Understanding MagVIT2: Language Model Beats Diffusion: Tokenizer is key to visual generation (Presented on 12/13/2023)

Presenter: Isamu Isozaki

Paper: Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation

Write up

Discord Thread

8: Understanding Common Diffusion Noise Schedules and Sample Steps are Flawed (Presented on 12/21/2023)

Presenter: Isamu Isozaki

Paper: Common Diffusion Noise Schedules and Sample Steps are Flawed

Write up

Discord Thread

9: The Tyranny of Possibilities in the Design of Task-Oriented LLM Systems: A Scoping Survey (Presented on 1/5/2024)

Presenter: Dhruv Dhamani. Author of the paper

Paper: The Tyranny of Possibilities in the Design of Task-Oriented LLM Systems: A Scoping Survey

Discord Thread

10: Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation (Presented on 1/12/2024)

Presenter: Phil Butler

Paper: Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation

Write up

Unfortunately, there is no recording, but a coauthor attended.

11: Literature Review on AI in Law (Presented on 2/2/2024)

Presenter: Isamu Isozaki

Papers: On the acceptability of arguments and its fundamental role in non-monotonic reasoning, logic programming, and n-person games + An Answer Set Programming Approach to Argumentative Reasoning in the ASPIC+ Framework + HYPO’s legacy: introduction to the virtual special issue + Induction of Defeasible Logic Theories in the Legal Domain + Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset + Large Language Models in Law: A Survey + The Smart Court - A New Pathway to Justice in China?

Write up

Recording

Slides

12: A forthcoming decoder-only foundation model for time-series forecasting & further research (Presented on 2/9/2024)

Presenter: Tonic

Paper: A decoder-only foundation model for time-series forecasting

Recording

Slides

13: Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Presenter: Eric Auld

Paper: Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Recording

14: Neural Circuit Diagrams: Robust Diagrams for the Communication, Implementation, and Analysis of Deep Learning Architectures

Presenter: Vincent Abbott. Author of the paper

Paper: Neural Circuit Diagrams: Robust Diagrams for the Communication, Implementation, and Analysis of Deep Learning Architectures

Recording

15: SOTA on Model Merging

Presenter: Prateek Yadav. Author of TIES-Merging and ComPEFT

Papers: TIES-Merging: Resolving Interference When Merging Models + Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch + ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization + Learning to Route Among Specialized Experts for Zero-Shot Generalization

Recording

16: Gemini 1.5 Pro: Unlock reasoning and knowledge from entire books and movies in a single prompt

Presenter: Shashank Shekhar

Papers: Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context + Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference + Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity + Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts

Recording

Slides

17: HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction

Presenter: Harvie Zhang. Author of the paper

Paper: HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction

Recording

18: ProteinBERT: A universal deep-learning model of protein sequence and function

Presenter: Dan Ofer. Author of the papers

Papers: ProteinBERT: A universal deep-learning model of protein sequence and function + Detecting anomalous proteins using deep representations + Protein Language Models Expose Viral Mimicry and Immune Escape

Recording

Slides

19: Just Say the Name: Online Continual Learning with Category Names Only via Data Generation

I was absent from this meeting, so if anyone knows the details, please let me know or open a PR to fill in this part!

Paper: Just Say the Name: Online Continual Learning with Category Names Only via Data Generation

20: Graph Machine Learning in the Era of Large Language Models (LLMs)

Presenter: Isamu Isozaki

Papers: Graph Machine Learning in the Era of Large Language Models (LLMs) + Large Language Models on Graphs: A Comprehensive Survey + House-GAN: Relational Generative Adversarial Networks for Graph-constrained House Layout Generation

Recording

Write up

Slides

21: Story Generation with AI

Presenter: Isamu Isozaki

Papers: GROVE: A Retrieval-augmented Complex Story Generation Framework with A Forest of Evidence + Creating Suspenseful Stories: Iterative Planning with Large Language Models + Improving Pacing in Long-Form Story Planning + Large Language Models Fall Short: Understanding Complex Relationships in Detective Narratives + Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers + DOC: Improving Long Story Coherence With Detailed Outline Control + End-to-end Story Plot Generator + Weaver: Foundation Models for Creative Writing

Recording

Write up

Slides

22: AlphaFold 3

Presenter: starrynightdev

Papers: Accurate structure prediction of biomolecular interactions with AlphaFold 3 + Highly accurate protein structure prediction with AlphaFold

Recording

Write ups: Huggingface blog + Github blog

Slides

23: AI for Physics. Hamiltonian Neural Networks/Lagrangian Neural Networks

Presenter: PS_Venom

Papers: Hamiltonian Neural Networks + Lagrangian Neural Networks

Recording

Slides

24: Understanding Current State of Reasoning with LLMs

Presenter: Isamu Isozaki

Papers: Natural Language Reasoning, A Survey + Emergent Abilities of Large Language Models + Chain-of-Thought Prompting Elicits Reasoning in Large Language Models + Finetuned Language Models Are Zero-Shot Learners + Show Your Work: Scratchpads for Intermediate Computation with Language Models + Language Models (Mostly) Know What They Know + Tree of Thoughts: Deliberate Problem Solving with Large Language Models + Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models + Boosting Logical Reasoning in Large Language Models through a New Framework: The Graph of Thought + Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters + Large Language Models Can Be Easily Distracted by Irrelevant Context + Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting + Large Language Models Cannot Self-Correct Reasoning Yet + The Impact of Reasoning Step Length on Large Language Models + Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning + Efficient Tool Use with Chain-of-Abstraction Reasoning + Self-playing Adversarial Language Game Enhances LLM Reasoning

Recording

Slides

Write up

25: Multimodal Structured Generation & CVPR’s 2nd MMFM Challenge

Presenter: Franz Louis Cesista. Author of paper

Paper: Multimodal Structured Generation: CVPR's 2nd MMFM Challenge Technical Report

Recording

Slides

26: SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound

Presenter: Rishit Dagli. First author of paper

Paper: SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound

Recording

27: Understanding Penetration Testing with LLMs

Presenter: Isamu Isozaki, Manil Shrestha

Papers: PentestGPT: An LLM-empowered Automatic Penetration Testing Tool + LLM Agents can Autonomously Hack Websites + LLM Agents can Autonomously Exploit One-day Vulnerabilities + Teams of LLM Agents can Exploit Zero-Day Vulnerabilities + LLMs as Hackers: Autonomous Linux Privilege Escalation Attacks + AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks

Recording

Slides

Write up
