This repo collects Cass's tools for investigating AI safety.
- [link] Example of getting an Anthropic model to "send" a user a "malicious" email
- Mitigating Risks in AI Tool Use: A Case Study on RAG and Customer Support Agents
Dependencies are managed with Nix flakes, with Poetry handling the Python packages.