LLM Jailbreaking Techniques.md

---
created: 2024-11-27T08:53
modified: 2024-11-27 08:59
tags:
  - llm
  - large-language-model
  - nlp
  - natural-language-processing
  - security
  - cybersecurity
  - jail-break
  - red-team
  - red-teaming
  - vulnerability
type: map-of-content
status: ongoing
---

Main body of note goes here

| Technique | Description | Link(s) |
| --- | --- | --- |
| Bijection Learning | | https://arxiv.org/abs/2410.01294 |
| Multi-turn jailbreaks via Monte Carlo Tree Search | | |
| Transferring attacks from ACG | | https://blog.haizelabs.com/posts/acg/ |
| Evolutionary Algorithms | | https://github.com/haizelabs/dspy-redteam |
| BEAm search-based AdverSarial aTtack (BEAST) | | https://arxiv.org/abs/2402.15570 |

References

Related

  • Links to other notes which are directly related go here