created: 2024-11-27T08:53
modified: 2024-11-27T08:59
tags: llm, large-language-model, nlp, natural-language-processing, security, cybersecurity, jail-break, red-team, red-teaming, vulnerability
type: map-of-content
status: ongoing


Technique	Description	Link(s)
Bijection Learning		https://arxiv.org/abs/2410.01294
Multi-turn jailbreaks via Monte Carlo Tree Search		
Transferring attacks from ACG		https://blog.haizelabs.com/posts/acg/
Evolutionary Algorithms		https://github.com/haizelabs/dspy-redteam
BEAm search-based AdverSarial aTtack (BEAST)		https://arxiv.org/abs/2402.15570

