Kindly request the inclusion of a line of papers on harmful fine-tuning for LLMs #33

huangtiansheng · 2024-10-06T20:05:04Z

Thank you for the wonderful paper collection. We have a line of research on harmful fine-tuning for LLMs. Could you please include this line of work in the repo?

Title	Link	Code	Venue	Classification	Model	Comment
Vaccine: Perturbation-aware Alignment for Large Language Models against Harmful Fine-tuning	arxiv	github	NeurIPS'24	Defense	LLM	Harmful fine-tuning
Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning	arxiv	github	NeurIPS'24	Defense	LLM	Harmful fine-tuning
Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation	arxiv	github	arXiv	Defense	LLM	Harmful fine-tuning
Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning	arxiv	To-be-released	arXiv	Defense	LLM	Harmful fine-tuning
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey	arxiv	awesome project	arXiv	Survey & Other awesome project	LLM	Harmful fine-tuning

Thank you in advance!

Best,
Tiansheng Huang
