This repository contains a structured and detailed guide on AI Red Teaming, exploring its history, risks, techniques, and architectural principles. The book provides both theoretical and practical insights into the red teaming of AI systems, covering foundational knowledge and hands-on demonstrations.
AI Red Teaming is a structured process for evaluating the security, robustness, and ethical alignment of artificial intelligence systems. This book aims to help practitioners, researchers, and enthusiasts understand and apply AI red teaming strategies effectively.
Through this repository, you will:
- Learn the history and evolution of AI risks and failures.
- Understand the taxonomy of AI attacks and adversarial techniques.
- Dive into the inner workings of large language models (LLMs), including transformers, tokenization, and hyperparameters.
- Gain hands-on experience with tools and techniques such as adversarial testing and jailbreaking.
The content is divided into chapters across four main sections: AI Red Teaming, LLM Architecture, Prompt Injections, and LLM Training. Below is the structure with direct links to the files.
- 1.1 History of AI Risks
- 1.2 AI Risks
- 1.3 What is AI Red Teaming?
- 1.4 AI Attacks Taxonomy (Part 1)
- 1.5 AI Attacks Taxonomy (Part 2)
- 1.6 Jailbreaking Demo
- 2.1 History of LLMs
- 2.2 Tokenization
- 2.3 Self-Attention
- 2.4 Transformer Architecture
- 2.5 Hyperparameters
- 2.6 LLM Comparisons
- 3.1 Intro to Prompt Injections
- 3.1.1 Real-life Prompt Injection Examples
- 3.2 Anatomy of a Prompt
- 3.3 Craft a Prompt Injection Prompt
- 3.4 Basics of Prompt Injection Techniques
- 3.5 Intermediate Prompt Injection Techniques
- 3.7 Try Prompt Injections on Medusa Pokebot
- 3.8 Advanced Prompt Injections
- 4.1 Inference Advanced Parameters
- 4.2 Foundation Model Training
- 4.3 Fine-Tuning LLMs
- 4.4 Retrieval-Augmented Generation (RAG)
To begin exploring the content:
- Clone this repository:
git clone https://github.com/your-repository-name.git
- Navigate through the chapters using the links provided in the Structure section.
Contributions are highly encouraged! Here's how you can get involved:
- Found a bug, typo, or inconsistency?
You can contribute by creating a ticket or an issue.
To do so:
- Go to the Issues tab in this repository.
- Click on New Issue.
- Fill in the issue template with relevant details.
- Submit your ticket to notify maintainers of the problem or suggestion.
This is an excellent way to contribute even if you're not comfortable with coding or writing directly in the repository.
To contribute changes directly, use the fork and pull request workflow:
- Click the Fork button at the top right of this repository to create your own copy.
- Clone your forked repository to your local machine:
git clone https://github.com/your-username/ai-red-teaming.git
- Create a new branch to work on a specific feature or fix:
git checkout -b feature-or-fix-name
- Add your contributions or edits to the appropriate file(s).
- Ensure that your changes are accurate and do not introduce issues.
- Commit your changes with a descriptive message:
git add .
git commit -m "Add detailed description of your changes"
- Push your changes to your forked repository:
git push origin feature-or-fix-name
- Go to the original repository and create a Pull Request (PR) from your forked branch. Describe your changes and why they should be merged.
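The branch and commit steps above can be wrapped in a small shell function. This is only a sketch: `start_contribution` is a hypothetical helper that is not part of this repository, and it assumes you run it from inside your cloned fork after making your edits (uncommitted changes carry over to the new branch). It does not push; it prints the push command so you can review the commit first.

```shell
#!/bin/sh
# Sketch only: start_contribution is a hypothetical helper, not shipped with this repository.
# Assumes the current directory is your cloned fork and your edits are already in the working tree.
start_contribution() {
  branch="$1"    # branch name, e.g. fix-typo-chapter-2
  message="$2"   # descriptive commit message

  if [ -z "$branch" ] || [ -z "$message" ]; then
    echo "usage: start_contribution <branch-name> <commit-message>" >&2
    return 1
  fi

  # Create and switch to the working branch; this fails if the branch already exists.
  git checkout -b "$branch" || return 1

  # Stage all changes, then commit. git commit fails if there is nothing to commit.
  git add . || return 1
  git commit -m "$message" || return 1

  # Leave the push to the contributor so the commit can be reviewed first.
  echo "Next: git push origin $branch"
}
```

For example, `start_contribution fix-typo-chapter-2 "Fix typo in tokenization chapter"` creates the branch, commits your staged edits, and reminds you of the push command. The branch name and message here are illustrative.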
This project is licensed under the MIT License. See the LICENSE file for details.