Name	Name	Last commit message	Last commit date
parent directory ..
.gitignore	.gitignore
README.md	README.md
agent-security-evaluation-tutorial.ipynb	agent-security-evaluation-tutorial.ipynb
example_prompts.csv	example_prompts.csv
model_testing_tools.py	model_testing_tools.py
prompt_manipulation_tools.py	prompt_manipulation_tools.py
requirements.txt	requirements.txt
system_prompt.txt	system_prompt.txt

Name

Last commit message

Last commit date

README.md

agent-security-evaluation-tutorial.ipynb

example_prompts.csv

model_testing_tools.py

prompt_manipulation_tools.py

requirements.txt

system_prompt.txt

Agent Security Evaluation Tutorial

Overview

This tutorial teaches prompt injection attacks and defenses through hands-on security testing of AI systems. Instead of just reading about vulnerabilities, you'll actually exploit them and then build protections against them.

What's included

Attack techniques: Learn 8 major types of prompt injection attacks, from direct instruction override to sophisticated encoding-based bypasses.

Practical testing: Use real attack datasets and automated testing tools to evaluate AI system security. Includes 91 documented attack examples from security research.

Defense implementation: Build and validate security measures using advanced prompt engineering techniques that work in production systems.

Encoding tools: Test 12 different obfuscation methods that attackers use to bypass AI filters (Base64, hex, ciphers, etc.).

What you'll learn

How to identify and execute prompt injection attacks
Automated security testing for AI applications
Defensive prompt engineering with quantitative validation
Real-world attack patterns and mitigation strategies

Prerequisites

Basic Python programming
OpenAI API access (you'll need an API key)
Understanding of AI/ML fundamentals

Files included

agent-security-evaluation-tutorial.ipynb - Main tutorial notebook
model_testing_tools.py - Automated testing framework
prompt_manipulation_tools.py - Encoding/obfuscation utilities
system_prompt.txt - Example defensive prompt
example_prompts.csv - Dataset of 91 real attack examples

Setup

Install required packages: pip install openai python-dotenv pandas
Create a .env file with your OpenAI API key: OPENAI_API_KEY=your_key_here
Run the notebook cells in order

The tutorial takes about 30-45 minutes to complete and includes both automated testing and manual experimentation.

Warning

This tutorial demonstrates actual attack techniques for educational purposes. Use these methods only on systems you own or have explicit permission to test.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Agent Security Evaluation Tutorial

Overview

What's included

What you'll learn

Prerequisites

Files included

Setup

Warning

FilesExpand file tree

agent-security-apex

Directory actions

More options

Directory actions

More options

Latest commit

History

agent-security-apex

Folders and files

parent directory

README.md

Agent Security Evaluation Tutorial

Overview

What's included

What you'll learn

Prerequisites

Files included

Setup

Warning