Releases: Azure/PyRIT
v0.9.0
What's Changed
Targets
HTTPTarget
Improvements that properly parse the HTTP version, automatically calculate the content-length, and make headers case insensitive.- FIX: Fixed IndexError with
RealtimeTarget
to handle responses properly
Datasets
- Social Engineering (Persuasion and Deception) Scenarios: See
datasets/orchestrators/red_teaming/persuasion_deception
anddatasets/orchestrators/role_play/persuasion_script.yaml
- Multilingual Vulnerability dataset from "A Framework to Assess Multilingual Vulnerabilities of LLMs"
Converters
- Enhancements to the
AsciiSmugglerConverter
by adding support for two methods for encoding hidden data (embedding directly in a Unicode character (default: 😊) and appending hidden data to visible text). ZalgoConverter
: Adds Unicode characters to text to make it appear "glitchy"ToxicSentenceGeneratorConverter
: Generate toxic sentence starters based on seed prompts- FIX: Remove JSON Instructions for
TranslationConverter
to address intermittent failures due to JSON parsing issues and non-consistent responses from endpoints.
Orchestrators
- [BREAKING] Rename
MultiTurnAttackResult
toOrchestratorResult
as part of a bigger refactor to tack objectives and results. - FIX: Keep Conversation ID in PromptSendingOrchestrator if it is provided
- FIX: Remove Harm-Specific Prevention from
CrescendoOrchestrator
Scorers
- Generic Scorer with Flexible Inputs:
SelfAskGeneralScorer
inpyrit/score/general_scorer.py
. It can be configured to use different scoring types (e.g. True/False, float) and can format the prompt using a system prompt and a format string. - Criteria-Based Scorer (used with
SelfAskScaleScorer
): Provides evaluation criteria that is specific to a given objective. CompositeScorer
: Combines multiple True/False Results into a single True/False Result
Dependencies
- Moves
jupyter
andipykernel
from required into an optional [dev] dependency. If you need to use Jupyter notebooks with PyRIT, you'll need to install using methods outlined here. - Moves
azure-cognitiveservices-speech
from required into an optional [speech] dependency.
Other
- Added custom file name support to allows for saving data (image, audio, video, etc.) to storage under a custom name.
- Custom Retry Decorator:
pyrit_custom_result_retry
to retry a function if a certain condition is true. This augments existing retry decorators which retry functions based on exception criteria. - Optimizations and various bug fixes to
.devcontainer
Full list of changes
- [FEAT] New Generic Scorer with Flexible Inputs by @jbolor21 in #816
- MAINT post-v0.8.2.dev0 release updates by @romanlutz in #861
- DOC: add LM Studio support note to the user guide by @paulinek13 in #863
- MAINT: Make integration tests run outside of repository and various fixes by @jsong468 in #862
- FEAT: Add Custom File Name Support to Data Serializer by @nina-msft in #868
- FEAT: Add Custom Retry Decorator: pyrit_custom_result_retry by @nina-msft in #869
- FEAT: optimized .devcontainer by @bashirpartovi in #871
- DOC: Fix Up Multi Turn Target Docs & OpenAI Dalle/TTS Target Docstring by @nina-msft in #870
- DOC: improve accessibility of the contributor guide flowchart by @paulinek13 in #866
- FIX: fixed the extension directory for vscode by @bashirpartovi in #872
- FIX jupyter set as dev dependency by @afogel in #857
- MAINT enhanced initialization and caching for devcontainer by @bashirpartovi in #873
- FIX: fixed indexing and conda cache for devcontainer by @bashirpartovi in #876
- FIX: Resolve mypy pre-commit error in chat_message_normalizer_tokenizer by @nina-msft in #875
- MAINT: HTTPTarget Improvements by @rlundeen2 in #879
- FEAT: Smuggling arbitrary data through an emoji by @KutalVolkan in #842
- DOC fix markdown link by @dennis-rall in #880
- FEAT Persuasion and Deception Scenarios by @whackswell in #878
- FIX: Update
re.split
calls to usemaxsplit
keyword argument by @emmanuel-ferdman in #885 - BREAKING FEAT: orchestrator result by @rlundeen2 in #886
- FEAT: Added Multilingual Vulnerability Dataset by @devesh-2002 in #834
- FIX keep conversation ID in PromptSendingOrchestrator if it's passed in by @romanlutz in #889
- FEAT Adding into Criteria based scoring by @eugeniavkim in #874
- FIX fixed msodbcsql dep for devcontainer by @bashirpartovi in #895
- MAINT: Remove Azure Speech SDK as Required Dependency by @nina-msft in #896
- FIX pip upgrade issue on windows by @bashirpartovi in #901
- FEAT: Zalgo Converter by @elisetreit in #883
- FEAT: Composite Scorer by @rlundeen2 in #898
- FIX: XPIA Notebook Env Variable Fix by @jbolor21 in #899
- FIX: bug where scorer_type is not set in AzureContentFilterScorer by @rlundeen2 in #902
- MAINT: Generic Scorer Notebook Reorganizing by @jbolor21 in #904
- MAINT Refactor question answer orchestrator as prompt orchestrator by @AdrGav941 in #894
- FEAT: Toxic Sentence Generator by @0xm00n in #893
- FIX Removed JSON instructions for Translation Converter by @bashirpartovi in #910
- FIX Removing harm specific prevention for Crescendo Orchestrator @eugeniavkim in #911
- FIX IndexError with RealtimeTarget @bashirpartovi in #914
- DOC Updates to '11. Releasing PyRIT' documentation @nina-msft
New Contributors
- @afogel made their first contribution in #857
- @dennis-rall made their first contribution in #880
- @whackswell made their first contribution in #878
- @emmanuel-ferdman made their first contribution in #885
- @devesh-2002 made their first contribution in #834
- @elisetreit made their first contribution in #883
- @0xm00n made their first contribution in #893
Full Changelog: v0.8.1...v0.9.0
v0.8.1
What's Changed
- We have a new cookbook on Precomputing turns for orchestrators
OpenAIChatTarget
s now have an argumentis_json_supported
to allow specifying if theresponse_format
request header should be set. This is supported by OpenAI, but not by several other providers that otherwise follow the OpenAI API.- There is now a Docker image for PyRIT users! Check out the steps outlined in the docker/README to try it out and feel free to provide feedback in GitHub issues or on Discord.
- The Tom-and-Jerry jailbreak template was added!
- When using AAD/Entra auth with
OpenAITarget
, the target auto-refreshes the auth token periodically now. This addresses a bug where the token would get stale after a period of time. - We also addressed bugs that resulted in exceptions from triggered content filters and empty exception which should lead to a smoother experience.
Full list of changes
- MAINT post-v0.8.0 release update by @romanlutz in #837
- MAINT: Making JSON support configurable with OpenAIChatTargets by @rlundeen2 in #833
- FEAT: Add Dockerized PyRIT with Jupyter Notebook Support by @ErdemOzgen in #784
- FEAT: add Tom-and-Jerry jailbreak by @hagsmand in #838
- DOC: Adding cookbook around prepending turns by @rlundeen2 in #840
- FIX: Small fix in cookbook by @jsong468 in #849
- FIX catch content_filter with 200s instead of 500s by @romanlutz in #850
- FIX: Amended dockerfile and requirements.txt to unblock ADO pipelines by @jsong468 in #853
- FIX add zero width and insert punctuation converters to init.py file by @AnnaRevutsky in #848
- FIX: AAD Auth refresh bug with OpenAITargets by @rlundeen2 in #855
- FIX handle empty exception message in validation by @romanlutz in #859
New Contributors
- @ErdemOzgen made their first contribution in #784
- @AnnaRevutsky made their first contribution in #848
Full Changelog: v0.8.0...v0.8.1
v0.8.0
What's Changed
Targets:
- HTTPTarget now supports rate limiting
- Some users encountered errors in Azure OpenAI when hitting content filter errors using error code 500. PyRIT now catches content filter responses with both error codes 400 (as before) and 500 (new) and returns a clean response record.
Datasets:
fetch_babelscape_alert_dataset
had a bug causing it to be limited to a single category even when users specified both. This is now fixed!- added
fetch_red_team_social_bias_dataset
- added
fetch_darkbench_dataset
- added
fetch_mlcommons_ailuminate_demo_dataset
Converters:
- added
UnicodeReplacementConverter
- added
sneaky_bits
option toAsciiSmugglerConverter
in theencoding_mode
argument. Theunicode_tags
argument is now removed and replaced by more options inencoding_mode
(i.e.,unicode_tags
,unicode_tags_control
, andsneaky_bits
).
Scanner: A basic version was introduced in v0.7.0 that supported only sending single-turn prompts. v0.8.0 expands on this with support for most multi-turn orchestrators (incl. adversarial chat targets and scorers) and memory. This feature is still considered experimental and may change considerably in the following versions.
Other:
- support for Python 3.13 in addition to 3.10-3.12.
- For single-piece responses, we now have a convenient
get_value()
method. - PyRIT used to print warnings that torch isn't installed (unless the corresponding extra was installed). This was caused by
transformers
and is now turned off as it doesn't serve any purpose. - In previous versions, PyRIT started supporting
.env.local
as an override to the.env
file for endpoint secrets. However, when using this outside of the normal repository structure (e.g., when running PyRIT without cloning this repo) the code failed to discover.env.local
in the current working directory. This is now fixed.
Full list of changes
- [DevContainer] Provide a uniform development environment by @bashirpartovi in #787
- FEAT: Add Rate Limit Support for HTTP Target by @nina-msft in #786
- DOC Updating contribution docs by @bashirpartovi in #788
- MAINT support python 3.13 by @AdrGav941 in #779
- FIX: fixed dev container permission issue by @bashirpartovi in #789
- FEAT: simplify extraction of converted values from responses by @paulinek13 in #783
- MAINT: improve organization of dataset fetch functions (refactoring) by @paulinek13 in #785
- FEAT: Added cross-platform compatibility and needed language support for toml and docker by @bashirpartovi in #797
- MAINT: Update release version to 0.7.1.dev0 by @jsong468 in #800
- FIX: prevent data overwrite in
fetch_babelscape_alert_dataset
by @paulinek13 in #799 - DOC contributor guide flowchart, small text updates, and add Roakey to README by @romanlutz in #798
- DOC: clarify OpenAITarget targets httpx_client_kwargs timeout settings by @clod81 in #801
- FIX: Add exception on response parsing when call to Openrouter.ai by @hagsmand in #796
- FIX make sure conversation IDs are not sent out as UUIDs to the database by @ayeganov in #723
- FEAT support adversarial_chat and scoring in scanner to enable automated multi-turn-orchestrators by @romanlutz in #706
- FIX move misplaced test file to tests/unit/converter by @romanlutz in #794
- FEAT: Added Red Team Social Bias dataset by @MoolmanM in #714
- DOC improve API reference for auth, cli, common, chat_message_normalizer by @romanlutz in #793
- FEAT: UnicodeReplacementConverter by @nina-msft in #803
- FIX: Updating pre-commit to fix build issues by @rlundeen2 in #810
- MAINT: Making test_connect more resilient by @rlundeen2 in #806
- [FIX] fix bad domain by @mgstate in #815
- [FIX] Integration test fixes: add hugging face token in notebook and fix test_fetch_datasets by @jsong468 in #819
- FEAT: Added memory config to scanner by @bashirpartovi in #808
- FEAT: add DarkBench dataset by @paulinek13 in #821
- MAINT: improving build/test time by @bashirpartovi in #820
- FIX handle Azure OpenAI content_filter errors with HTTP status code 500 by @romanlutz in #825
- FIX turn off transformers warning by @romanlutz in #829
- TEST: Adding integration test for content filters by @rlundeen2 in #830
- MAINT: Separating integration test local .env by @rlundeen2 in #817
- FEAT: add MLCommons AILuminate v1.0 DEMO Prompt Set by @paulinek13 in #828
- FIX find .env.local in current working directory by @romanlutz in #832
- BREAKING FEAT: Sneaky Bits - Advanced Data Smuggling Techniques by @KutalVolkan in #827
- FEAT add ps-fuzz prompts by @ryanjieh in #823
New Contributors
- @bashirpartovi made their first contribution in #787
- @clod81 made their first contribution in #801
- @hagsmand made their first contribution in #796
- @MoolmanM made their first contribution in #714
- @mgstate made their first contribution in #815
- @ryanjieh made their first contribution in #823
Full Changelog: v0.7.0...v0.8.0
v0.7.0
What's Changed
Targets:
- [BREAKING] OpenAIChatTarget has become more generalized to more broadly support OpenAI-compatible models. See the blog describing the changes here!
- If
api_version
is set to None when instantiatingOpenAITarget
objects, it will not be added as a query parameter to requests. - Added Google Gemini example environment variables to .env_example and added integration tests for Gemini/OpenAIChatTargets
Converters:
- [New] AddImageVideoConverter: PyRIT's first video converter! it allows users to add an image to a video in at a specified position. More video converters to come!
- [New] InsertPunctuationConverter: Inserts various punctuation into a prompt to test model robustness to perturbations.
Orchestrators:
- [New] ManyShotJailbreakOrchestrator: Prepend a faux dialogue between a human and an AI assistant within a single prompt for the target.
- [New] [BREAKING] ContextComplianceOrchestrator: Update the context to prime an
objective_chat_target
to answer. The context is set using instructions defined incontext_description_instructions_path
, along with anadversarial_chat
to generate the first turns to send. - [BREAKING] RolePlayOrchestrator improvements: Refactored for greater code re-use
- FlipAttackOrchestrator improvement: Allow for additional converters applied after the flip attack
Memory:
- Multimodal Seed Prompts Encoding Metadata: Adding non-text seed prompts to the database will automatically have metadata populated, including
format
(png, wav, etc.) and things likebitrate
andduration
for audio and video seed prompts. SeedPrompt
Duplicates: Duplicate seed prompts within the same dataset (identicaldataset_name
) will no longer be uploaded to memory.- Using Configured Paths for Multimodal Seed Prompts: Multimodal
SeedPrompt
file paths within .yaml files no longer use relative paths that break based on where the .yaml files are accessed. Instead, configured paths (located inpaths.py
) are used. - [BREAKING] Removed calls to disposing memory engines in Orchestrator and Prompt Target objects and replaces it with the
atexit
andweakref
methods of cleanup in the Memory interface to ensure cleanup on process exit. Orchestrators and targets no longer support the context manager protocol. - Added get_values() method to the
SeedPromptDataset
class to simplify prompt values extraction from datasets. Optional filtering to retrieve the first and/or last N values has also been implemented.
Scorers:
- [New] HumanInTheLoopScorerGradio: Create scores from manual human input by running the Gradio interface in a separate process and adds the scores to the database. For now, the possible scores that users can give are "safe" and "unsafe."
Datasets:
- [New] Added new fetch function for Aya Red-Teaming Dataset
- [New] Added Pliny's prompts from the l1b3rt4s repo as templates
- [New] Added the Babelscape ALERT dataset
- Added support for filtering based on harm categories for PKU-SafeRLHF and AdvBench datasets
Misc:
- Other changes include various maintenance improvements and bug fixes, addition of integration tests, website enhancements, dependency updates, and doc improvements.
Full list of changes
- FIX unblock test pipelines by skipping certain tests on Ubuntu and adding Windows additionally by @romanlutz in #727
- MAINT: Update release version to 0.6.1.dev0 by @nina-msft in #731
- MAINT: Upgrading DuckDB by @jbolor21 in #712
- [FEAT][MAINT][4019] Make multi-modal easier to configure in seedprompt files by @shivenchawla in #696
- FEAT: set favicon for the website by @paulinek13 in #717
- FEAT: simplify extracting prompt values by @paulinek13 in #718
- FEAT: add a fetch function for Aya Red-teaming Dataset by @paulinek13 in #713
- MAINT update Roakey image to have transparent background by @romanlutz in #735
- FEAT Moonshot Attack Module: Insert Punctuation Attack by @u7780339 in #475
- FEAT: include scored_prompt_id in orchestrator_identifier of the system prompt by @NicolePell in #725
- FEAT: Create many shot jailbreak orchestrator by @AdrGav941 in #709
- MAINT pre-commit hook to remove notebook header from notebooks by @jbolor21 in #737
- FEAT Add Encoding Data to Multimodal Seed Prompts by @jsong468 in #740
- FEAT added Pliny's prompts from the l1b3rt4s repo as templates by @joaodunas in #710
- FEAT Adding babelscape dataset by @Jarro01X in #738
- FIX: Upgrading Packages by @rlundeen2 in #741
- FIX: Increasing pipeline timout by @rlundeen2 in #743
- FEAT PyRIT to not upload duplicate seed-prompts by @shivenchawla in #742
- MAINT: Azure SQL Integration Test Misc. Updates by @nina-msft in #745
- FIX Small bug fixes (renaming file, editing MANIFEST) by @jsong468 in #746
- [BREAKING] FEAT: OpenAI Generalization Improvements by @rlundeen2 in #747
- FEAT: Add
example_count
field to ManyShotJailbreakOrchestrator by @nina-msft in #748 - DOC: Blog: A More Generalized OpenAIChatTarget by @rlundeen2 in #751
- DOC: Updating git docs by @rlundeen2 in #753
- FIX: Fixing integration tests broken with OpenAIChatTarget Update by @rlundeen2 in #755
- FEAT Video Converter: Adding Images to Videos by @jbolor21 in #702
- FIX: Adding back static js by @rlundeen2 in #761
- [BREAKING] FEAT: RolePlayOrchestrator Improvements by @rlundeen2 in #758
- [BREAKING] FIX: Dispose Memory in Memory vs Class Objects by @nina-msft in #752
- MAINT clean up dependencies by @romanlutz in #757
- FEAT Adding converter support to many shot jailbreak orchestrator by @AdrGav941 in #760
- FIX: Default API Version for TTS Target by @jbolor21 in #749
- [BREAKING] FEAT: Adding Context Compliance Orchestrator by @rlundeen2 in #763
- DOC: Add Instructions for Tagging Breaking Changes in PR Template by @nina-msft in #765
- FEAT: support filtering based on harm categories for PKU-SafeRLHF dataset by @paulinek13 in #756
- DOC Update CCA Documentation for Clarity by @eugeniavkim in #773
- DOC: Update OpenAI Environment Variable Names in Documentation by @nina-msft in #776
- FEAT: add harm categories to AdvBench Dataset by @paulinek13 in #732
- FIX: Allow api_version to be set to None when instantiating OpenAITarget objects by @LeoVrana in #764
- MAINT standardize Hugging Face token environment variable, add integration tests for Google Gemini and Open AI by @romanlutz in #778
- FEAT: Gradio HiTL Scorer by @mart123p in #722
- DOC: clarify OpenAIChatTarget usage with Ollama by @jsdlm in #777
- FIX: small edits to make integration tests pass by @jsong468 in #780
- MAINT add notice generation to component governance by @romanlutz in #781
- MAINT update NOTICE file by @romanlutz in #782
New Contributors
- @u7780339 made their first contribution in #475
- @NicolePell made their first contribution in #725
- @joaodunas made their first contribution in #710
- @Jarro01X made their first contribution in #738
- @LeoVrana made their first contribution in #764
Full Changelog: releases/v0.6.0...releases/v0.7.0
v0.6.0
What's Changed
- Cookbooks are live, and replace our How To Guide! Cookbooks try to tackle a problem and use the components that work best, instead of our typical documentation which illustrates that many pieces of PyRITs are swappable.
Cookbooks:
Targets:
- OllamaChatTarget: Implement ability to forward custom parameters directly to the HTTP client
- HuggingFaceChatTarget: Adds optional keywords
device_map
,torch_dtype
andattn_implementation
- [New] PlaywrightTarget: Interact with web applications using Playwright. This is particularly useful for testing interactions with web interfaces like chatbots.
- [New] RealtimeTarget: Send and receive audio with the Realtime API.
- [New] GroqChatTarget: Interact with Groq's OpenAI-compatible API.
Converters:
- [New] ANSI Escape Code Converter:
AnsiAttackConverter
- [New] BinaryConverter: Convert input text into binary with configurable bits per character
- PDFConverter: Updates to support templated and non-templated PDF generation & enabling text injection into existing PDFs
- [New] TextToHexConverter: Convert text to hexadecimal encoded utf-8 string
- Add easier querying for converter-supported input/output types
Orchestrators:
- RedTeamingOrchestrator & CrescendoOrchestrator now support prepended conversations. You can set a system prompt on the objective target using this feature, or provide conversation history as context to continue execution from a specific point.
- ScoringOrchestrator: Add ability to score responses using filters.
- PromptSendingOrchestrator: Set Skip Criteria to specify which prompts to skip being sent to the target with this orchestrator.
- [New] RolePlayingOrchestrator: Single-turn orchestrator which prepends some prompts which describe fictional scenarios to attempt and elicit harmful responses
- XPIAOrchestrator: Fix to BlobNotFound exception
Memory: - [BREAKING] All notebooks must explicitly initialize Central Memory through a new
initialize_pyrit()
function: #616. This puts ownership into the hands of the user to set where your prompts will be stored. Read more here: Memory - Ability to add memory labels on a per-prompt level, specifically useful in Multimodal scenarios
- Conversation Scores now available when exporting Prompt Data
- Filter Data by various queries (e.g. prompt ID, orchestrator ID, labels, etc) using
get_prompt_request_pieces()
- Consolidated method to Export Conversations using Filters:
export_conversations()
- SeedPrompts: Support for Multimodal Seed Prompts
- [BREAKING]
NormalizerRequestPieces
replaced withSeedPrompts
: #648
Scorers:
- Add tasks by default to scorers to improve scorer accuracy
Misc:
- Other changes include various maintenance improvements and bug fixes, addition of integration tests, new blog posts, and doc improvements.
Full list of changes
- MAINT Update release version to 0.5.3.dev0 by @rdheekonda in #592
- DOC: Multi-turn docs and blog post by @rlundeen2 in #593
- DOC: Fixing title by @rlundeen2 in #594
- MAINT: Update Memory Doc and Other Small Fixes by @jsong468 in #587
- FEAT Passing HTTP client kwargs from OllamaChatTarget by @rlundeen2 in #596
- MAINT: Refactoring Single-Turn by @rlundeen2 in #598
- DOC: Clarifying OpenAI docs by @rlundeen2 in #600
- FEAT - Adding optional kwargs to huggingface chat target by @perezbecker in #602
- FEAT: Ansi Escape Code Converter by @KutalVolkan in #597
- MAINT Update gcg_attack.py by @Tiger-Du in #606
- MAINT empty integration tests pipeline by @romanlutz in #603
- MAINT update integration-tests trigger to work with PRs by @romanlutz in #610
- FEAT: Playwright target by @AlexRRR in #583
- MAINT Add support for Local Multimodal Input Prompts When Using AzureSQLMemory by @rdheekonda in #613
- MAINT: Add Integration Test Directory + Refusal Scorer Eval Integration Test by @jsong468 in #605
- FEAT: Add Prepending Conversation Support to RedTeamingOrchestrator and CrescendoOrchestrator by @nina-msft in #578
- FIX: Adding SHA256 hashes to responses by @rlundeen2 in #615
- FEAT: binary converter by @AlexRRR in #611
- FIX: Update pyproject.toml for new versions for httpx, respx and openai by @jsong468 in #623
- FEAT Adding labels for individual prompts by @jbolor21 in #624
- FEAT Add Scores to Data Export with PromptRequestPiece data by @eugeniavkim in #617
- FEAT: Prompt Memory Consolidation and Filters by @rlundeen2 in #625
- FEAT: PDF Converter Updates by @KutalVolkan in #622
- FIX: small edits to populate_prompt_piece_scores by @jsong468 in #626
- DOC: Updating contributor docs by @rlundeen2 in #627
- FEAT Consolidate Export Conversations into one method by @eugeniavkim in #628
- FEAT: Adding tasks to scorers by @rlundeen2 in #629
- FIX: sort_request_pieces bug by @rlundeen2 in #631
- FEAT: Allowing header SeedPrompt configuration by @rlundeen2 in #630
- FEAT: Add Support for Multimodal Seed Prompts and Update Data Type Serializer by @rdheekonda in #632
- FEAT: Explicitly Initialize Central Memory + Remove Defaults by @nina-msft in #616
- FIX Refactor to join queries for entries and scores by @eugeniavkim in #635
- MAINT: Cleanup Import Naming for initialize_pyrit func by @nina-msft in #636
- FEAT: Score Responses by Filters in ScoringOrchestrator by @nina-msft in #639
- MAINT infrastructure for integration tests by @romanlutz in #612
- MAINT: Add JSON Mode for Supported Targets and Scorers by @rdheekonda in #640
- DOC: Zero Day Quest blog post by @rlundeen2 in #643
- MAINT: Add Import Sorting (isort) Pre-Commit Hook by @nina-msft in #644
- FIX: Rerun Output for Audio Converter Notebook by @nina-msft in #645
- MAINT: Add Import Sorting for Docs and Jupyter Notebooks (isort/nbqa-isort) by @nina-msft in #646
- TEST: Converter Notebook Integration Tests by @nina-msft in #647
- FEAT: Replacing NormalizerRequestPieces with SeedPrompts by @rlundeen2 in #648
- MAINT: Remove Azure SQL Example from Audio Converters Notebook by @nina-msft in #649
- FIX: adding hashes to retrieved PromptRequestPiece by @rlundeen2 in #652
- DOC: Clarifying PromptTargets from PromptChatTargets by @rlundeen2 in #658
- DOC update
pyrit.common
API reference by @paulinek13 in #657 - FEAT - Realtime Target by @jbolor21 in #638
- MAINT: Updating get_seed_prompt_groups to include individual seed_prompts by @rlundeen2 in #651
- DOC: Deleting extra doc by @rlundeen2 in #663
- FIX: Fixing circular import by @rlundeen2 in #665
- DOC Cleaning up Datasets and adding documentation for datasets and seed prompts by @eugeniavkim in #660
- DOC Adding NCC HTTPTarget Blog post by @jbolor21 in #664
- TEST Integration Tests for Target Notebooks by @jbolor21 in #667
- FEAT: Enhance PDFConverter to support text injection into existing PDFs by @KutalVolkan in #641
- FIX Target Integration test rename by @jbolor21 in #675
- FEAT: Adding Skip Criteria and Sending Prompts Cookbook by @rlundeen2 in #668
- FIX: http target bug by @ayeganov in #674
- FEAT add value hash columns and calc hash when committing seed prompt to memory by @jorisdg in #659
- TEST: Integration Tests for Python Notebooks (Auxiliary Attacks, Datasets, Memory) by @nina-msft in #670
- FIX: PDF Converter and Cookbook integration test by @rlundeen2 in #680
- FEAT: adding hex code converter (#666) by @millashin in #681
- FIX: Converter PDF Integration Build Pipeline by @rlundeen2 in #683
- TEST In...
v0.5.2
What's Changed
- Pinned the httpx version to 0.27.2 and refactored the codebase to ensure compatibility.
- Fixed AzureSQLMemory authentication issues by adding token refresh, pool recycling, and pre-ping mechanisms.
- Redesigned PAIR attack technique to function as a specialized instance of TAP orchestrator, streamlining architecture.
- Added support for local Hugging Face model checkpoints.
Full list of changes
- [DOC] Updating README by @rlundeen2 in #579
- Fix Azure SQL Authentication Errors: Add Token Refresh, Pool Recycling, and Pre-Ping by @rdheekonda in #576
- FEAT: add support for local model checkpoints and trust_remote_code in HuggingFaceChatTarget by @KutalVolkan in #574
- FEAT: Refactor PAIR to be a special instance of TAP by @rlundeen2 in #580
- FIX: httpx proxy arg fix, pinned httpx version by @jsong468 in #589
- FIX: Not raising exceptions on None responses by @rlundeen2 in #590
- Fix Test Prompt Response Error Values by @rdheekonda in #591
Full Changelog: v0.5.0...v0.5.2
v0.5.0
What's Changed
-
PyRIT now has a website
-
We've been working on standardizing orchestrators in terms of naming and functionality:
- The endpoint (of type
PromptTarget
) that PyRIT attacks will be referred to asobjective_target
. - The endpoint (of type
PromptChatTarget
) that helps us craft attacks will be referred to asadversarial_chat
. - Beyond that, we've settled on a common interface for multi-turn orchestrators with a shared result object.
- Instead of an
attack_strategy
arg we require a file path calledadversarial_chat_system_prompt_path
to make the connection to theadversarial_chat
target clearer. Some orchestrators have a default for this, of course. - The initial prompt to the
adversarial_chat
is now calledadversarial_chat_seed_prompt
to also help with clarity and connection toadversarial_chat
- Sometimes we use multiple scorers. For that reason,
objective_scorer
will be the scorer that decides if the objective has been achieved. Other scorers have similarly specific names, e.g.,on_topic_scorer
in theCrescendoOrchestrator
- The new standard name for all orchestrators to execute an attack is
run_attack_async
.
The standardization is not fully completed yet but will continue in future releases. So far,
CrescendoOrchestrator
,TreeOfAttacksWithPruningOrchestrator
, andRedTeamingOrchestrator
have been adjusted. - The endpoint (of type
-
Support for a centralized database using Azure SQL as an optional alternative to a local DuckDB database.
-
Introduced (multi-modal)
SeedPrompt
s andSeedPromptDataset
s as a starting point for red teaming ops with integration to our databases. -
New orchestrators and auxiliary attacks:
FuzzerOrchestrator
with 5 template converters- GCG support via Azure ML pipelines to optimize adversarial suffixes
- FlipAttackOrchestrator
-
New targets:
- HuggingFaceChatTarget
- HTTPTarget
- Open AI and Azure Open AI targets were refactored to simplify the logic. They now share a common interface
OpenAITarget
and you can decide between Azure vs. Open AI usingis_azure_target=True
orFalse
.
-
New datasets:
- HarmBench
- PKU-SafeRLHF
- wmdp-bio, wmdp-chem, and wmdp-cyber (now fetchable from the original data source)
- AdvBench
- Decoding Trust Stereotypes
- LLM-LAT/harmful-dataset
- tdc23 red teaming dataset
- TrustAIRLab/forbidden_question_set
- LibrAI 'Do Not Answer' Dataset
-
New converters:
- QRCodeConverter
- AzureSpeechAudioToTextConverter
- URLConverter
- HumanInTheLoopConverter
- ColloquialWordswapConverter
- UnicodeConfusableConverter (updated with new functionality)
- CharSwapGenerator
- MaliciousQuestionGeneratorConverter
- AsciiSmugglerConverter
- MathPromptConverter
- AudioFrequencyConverter
- ZeroWidthConverter
- DiacriticConverter
-
New scorers:
- SelfAskRefusalScorer
- HumanInTheLoopScorer
- InsecureCodeScorer
-
We generally use a
.env
file to configure details of endpoints that PyRIT needs to execute. A new.env.local
override file allow for further customization. -
Finally, PyRIT now comes with several extras that you can install using
pip install pyrit[<extra>]
dev
includes developer dependencies that you shouldn't need unless you plan on contributing to the project.torch
includes just pytorch which is needed for some targets (e.g. Hugging Face) or auxiliary attacks (e.g., GCG) but not core functionality. This allows you to choose whether you want to install it.gcg
includes extra dependencies that are only needed for running GCG. Since this requires dedicated compute (ideally with GPU) you can choose whether it is required for you.all
includes all of the above.
Full list of changes
- MAINT Update release version to 0.4.1.dev0 by @rdheekonda in #342
- [FEAT] QRCodeConverter by @jsong468 in #339
- [MAINT] Delete output_filename arg in image/text and text/image converters by @jsong468 in #344
- MAINT Update Release Instructions by @rdheekonda in #345
- FEAT: Add Likert scoring definition and prompt templates for persuasion and deception by @saphirqi7 in #307
- [FEAT] Add "task" to the scoring memory entry by @jsong468 in #349
- FEAT: Add fetch function for datasets from HarmBench #270 by @KutalVolkan in #341
- FEAT Add SQL Entra Auth for Azure SQL Server by @elgertam in #330
- [MAINT] Fix typos in OllamaChatTarget by @riedgar-ms in #357
- [FEAT] Azure Speech Audio to Text Converter by @jsong468 in #352
- FEAT: Add Rate Limit (RPM) Threshold Parameter to Prompt Targets by @nina-msft in #331
- FIX: correct type of the top_p argument in various PromptTarget classes by @s-zanella in #366
- FEAT Add ability to fetch PKU-SafeRLHF Data by @enrajka in #374
- FEAT: Refusal Scorer by @rlundeen2 in #371
- FEAT Add ability to fetch wmdp-bio, wmdp-chem, and wmdp-cyber datasets by @mshirsekar1 in #380
- TEST skip failing auth test after the new azure.identity version was released by @romanlutz in #387
- FEAT Added AdvBench dataset by @enrajka in #383
- FEAT: Fuzzer orchestrator by @gseetha04 in #360
- FIX Crescendo Bug and Improve Scorer Metaprompt Handling by @rdheekonda in #389
- FEAT: Add Centralized DB Support Using Azure by @rdheekonda in #379
- FIX: Updating memory and fixing bugs by @rlundeen2 in #394
- FEAT: Handling duplicate memory for PromptRequestPiece/Score entries by @jsong468 in #369
- [FEAT] Decoding Trust Stereotypes Dataset by @jsong468 in #385
- FEAT Centralized DB Support for Azure Speech Converters by @rdheekonda in #402
- FEAT add additional template converters for fuzzer orchestrator (crossover, similar, rephrase) by @roeybc in #378
- DOC: Update Custom Targets Demo Docs by @nina-msft in #404
- FEAT New URL Converter by @jbolor21 in #399
- [FEAT] HumanInTheLoop Converter by @jsong468 in #401
- DOC: Updating RTO example to use gpt4o for scoring by @rlundeen2 in #408
- MAINT: Crescendo and Score Refactor by @rlundeen2 in #405
- FEAT: Colloquial Wordswap Attack by @eugeniavkim in #406
- FEAT emoji jailbreak by @romanlutz in #314
- MAINT: Add Refusal docs and Filter logic by @rlundeen2 in #431
- DOC: Moving rate limiting to target by @rlundeen2 in #433
- FEAT: optimized huggingface model support by @KutalVolkan in #354
- DOC Enhance Azure SQL Database Setup and Permissions Documentation by @rdheekonda in #434
- FIX Azure SQL DB Permissions by @rdheekonda in #440
- FIX: Handle JSON markdown format exceptions by @meisman-ms in #435
- FEAT: Add ability to send prepend to the conversation in PromptSendingOrchestrator by @rlundeen2 in #441
- FEAT: Homoglyph Attack by @KutalVolkan in #407
- FEAT: Charswap Attack by @KutalVolkan in #403
- Add Python option for generate docs scripts by @sf-msft in #375
- FEAT: Violent Durian Attack Strategy by @KutalVolkan in #398
- FEAT GCG algorithm and AML pipeline by @blakebullwinkel in #381
- MAINT: Adding original values as score metadata for Azure Safety and Likert Scorers by @rlundeen2 in #445
- [DOC] Note on notebooks by @riedgar-ms in #460
- FIX: Fixing pre-commit check_links by @rlundeen2 in #462
- FEAT: Adding Flip Attack by @rlundeen2 in #456
- [FIX] Allow AAD Auth for AzureContentFilterScorer by @riedgar-ms in #455
- FEAT: Adding New Generic HTTP Target by @jbolor21 in #446
- MAINT: Rounds in CrescendoOrchestrator are now "Turns" by @jsong468 in #470
- DOC Add doc changes for database setup by @eugeniavkim in #476
- FEAT: OpenAI Target Refactor by @rlundeen2 in #466
- DOC: Edit Image Text Converter Docs by @jbolor21 in #477
- FEAT: Malicious Question Generator by @KutalVolkan in #397
- FIX: Changed AzureSpeechTextToAudioConverter input_type to text and added converter input_supported tests by @jsong468 in #472
- FEAT added ascii smuggler converter by @gio-msft in #479
- DOC Fix Invalid MD File Referenced in Deploy HF Model to Azure ML Module by @rdheekonda in https://...
v0.4.0
What's Changed
- New Advanced Attack Techniques: Expanded orchestrators with advanced attack techniques, including PAIR, tree of attacks, and crescendo strategies.
- New Targets: Crucible target, Prompt Shield Target, Azure OpenAI GPT-4o target
- New Converters: Added Tense, Emoji, image to text, and Character Space converters.
- New Scorers: Scale Scorer, Prompt Shield, and True/False Inverter Scorer
- Automatic Scoring & Memory Labels: Introduced automatic scoring in the PromptSendingOrchestrator. Added support for scoring with user-provided memory labels.
- Delegation SAS Authentication: Supported delegation SAS authentication for secure interactions with Azure Blob Storage targets.
- Improved Resiliency: Enhanced the resiliency of targets, converters, and orchestrators with robust error handling mechanisms.
- Bug Fixes & Performance: Various bug fixes, added support for Python 3.12, speedup unit tests
- Fetch functionality: Introduced functionality to fetch adversarial datasets, such as SecLists, XStest etc.,
- Updated Demo Codes: Replaced demo code examples with the GPT-4o target.
Full List of Changes
- FIX: Fixing policheck bug by @rlundeen2 in #261
- release v0.3.0 by @jbolor21 in #265
- DOC: Adding Guidance on Incorporating Research by @rlundeen2 in #268
- FEAT: Adding Tense Converter by @rlundeen2 in #273
- [FEAT] Add Scoring to PromptSendingOrchestrator by @nina-msft in #262
- FIX Fixed mypy Type Failures by @elgertam in #269
- FEAT: Adding Crucible Target by @rlundeen2 in #277
- FIX ValueError with Azure TTS Target in Single Turn Conversation Using PromptSendingOrchestrator by @nina-msft in #278
- FEAT: Converter Tokens by @rlundeen2 in #279
- [FIX] Add flake8-copyright check to pre-commit hooks by @nina-msft in #281
- FIX Exclude Morse Converter from Flake8 Precommit by @nina-msft in #284
- [DRAFT] [FIX] Replace Orchestrator ID with UUID by @nina-msft in #285
- DOC update citation for past tense paper by @romanlutz in #288
- FEAT Add scale scorer by @romanlutz in #274
- FEAT Add Delegation SAS-Based Auth, Update Storage Plugins, and Async Blob Download by @rdheekonda in #286
- FEAT add (back) Gandalf scorer by @romanlutz in #287
- MAINT clean up copyright by @romanlutz in #297
- FEAT: Add Error Handling to AML Chat Target by @nina-msft in #299
- FIX: bug with multi-modal image responses by @rlundeen2 in #301
- MAINT: Improving some LLM Converters by @rlundeen2 in #300
- [FIX][Issue #302] update language version enforcement to fix black-pre-commit installation incompatibility by @shivenchawla in #303
- FEAT return ID in conversation duplication code by @romanlutz in #296
- [FEAT] Implement PAIR by @dlmgary in #255
- FEAT add float scale threshold scorer by @romanlutz in #294
- FEAT: Add GPT4-o chat target by @shivenchawla in #293
- FEAT: Adding Emoji Converter by @rlundeen2 in #306
- DOC: Doc Reorg by @rlundeen2 in #304
- MAINT: Removing asyncio sleep by @rlundeen2 in #309
- MAINT add support for Python 3.12 and fix tests that started breaking by @romanlutz in #305
- FEAT Add print_conversation method to prompt sending orchestrator by @romanlutz in #312
- FEAT Add many-shot jailbreaking feature implementation by @KutalVolkan in #254
- FEAT: Add tree of attacks with pruning by @salmazainana in #210
- FEAT Add Space Converter by @rdheekonda in #316
- FEAT Add Flexible Memory Labels and Scoring to Orchestrators by @rdheekonda in #315
- FEAT: Crescendo Orchestrator by @SafwanA02 in #275
- Feat: Adding multi-turn promptSendingOrchestrator by @rlundeen2 in #317
- DOC Fix README.md link by @romanlutz in #319
- MAINT: Fixing data serializer ability to properly raise errors by @rlundeen2 in #318
- FEAT: Add fetch function for SecLists AI LLM Bias Testing datasets (#267) by @KutalVolkan in #280
- FEAT: Adding true_false inverter scorer by @rlundeen2 in #321
- FIX: fixing check links by @rlundeen2 in #323
- FEAT: Add Exception Handling to Azure TTS Target by @nina-msft in #322
- DOC - replacing gpt4 with gpt4o in example notebooks by @jsong468 in #313
- [MAINT] Changing Examples from stop signs by @jbolor21 in #325
- FEAT Prompt Shield by @ValbuenaVC in #271
- FEAT: add xstest dataset by @KutalVolkan in #320
- [FEAT] Created add_image_text_converter and unit tests by @jsong468 in #328
- DOC: Adding Notebook to document re-sending previous prompts by @rlundeen2 in #332
- MAINT: speeding up crescendo tests by @rlundeen2 in #333
- FIX Move pillow from dev to core dependency by @rdheekonda in #334
- FIX add sample image classifier file by @jbolor21 in #336
- FEAT: Add deterministic flag and custom substitutions to LeetspeakConverter by @KutalVolkan in #329
- MAINT Remove Duplicate Module by @rdheekonda in #337
- MAINT Restructure pyrit.models module and prune by @romanlutz in #338
- [MAINT] Speeding up unit tests by @jbolor21 in #335
- FIX Crescendo backtrack with same orchestrator ID and handling responses with markdown syntax by @romanlutz in #340
New Contributors
- @shivenchawla made their first contribution in #303
- @KutalVolkan made their first contribution in #254
- @salmazainana made their first contribution in #210
- @jsong468 made their first contribution in #313
- @ValbuenaVC made their first contribution in #271
Full Changelog: v0.3.0...v0.4.0
v0.3.0
What's Changed
- New and improved scorers! Many new scorers have been added, and scorers can now be swapped out and made generic.
- Many new attack techniques and variations have been introduced. These include skeleton key, most of GPTFuzz, adding text to images, repeated token attack, cipherchat, shorten/expand, tone, CodeChameleon, and more. A total of 13 new converters have been added!
- Framework improvements:
- Ability to duplicate conversations for reuse (this makes implementation easier for attacks like PAIR/TAP/crescendo).
- Converters can be added to LLM responses.
- All framework calls are now async and parallelizable.
- Error handling and intelligent automatic retries in targets (e.g., for network errors) and converters/scorers (e.g., for JSON deserialization).
Full list of Changes
- FEAT: Refactoring and Standardizing Scores and Scorers by @rlundeen2 in #190
- FIX: Making RESULTS_PATH be simple in pip packages by @rlundeen2 in #191
- FIX: Minor Self-Ask Scorer Improvements by @rlundeen2 in #194
- FEAT: Adding Scores to the Database by @rlundeen2 in #195
- MAINT use context manager in XPIA notebook by @romanlutz in #198
- FEAT: Update score_async to add score to database by @rlundeen2 in #200
- FEAT support duplicating memory when cloning orchestrators by @romanlutz in #177
- MAINT: Likert Scoring Tweaks to Reduce False Positives by @rlundeen2 in #201
- FEAT add CSV support by @romanlutz in #197
- FEAT: Adding Human in the Loop Scorer by @rlundeen2 in #202
- FEAT: Azure content filter scorer by @cseifert1 in #206
- FEAT Adding Image Converter: add text on image by @jbolor21 in #205
- FEAT: Score Prompts Orchestrator by @rlundeen2 in #208
- MAINT: Deprecated send_prompt methods by @mart123p in #204
- FEAT Add image generation example with red teaming orchestrator and unify existing orchestrator definitions by @romanlutz in #189
- FEAT: self ask conversation objective and verifier scorer for crescendo by @cseifert1 in #209
- FEAT: Centralize Exception Handling and Implement in GPTv Target by @rdheekonda in #207
- MAINT Making Prompt Converters Async by @jbolor21 in #211
- Update .env_example Typo "Azure Open AI"→"Azure OpenAI" by @hyoshioka0128 in #214
- MAINT: Small scoring updates by @rlundeen2 in #215
- MAINT: Adding pretty print functionality and small RTO updates by @rlundeen2 in #217
- DOC: Re-organizing documentation by @rlundeen2 in #219
- FEAT: Add Suffix Converter by @NaijingGuo in #212
- MAINT: Updating GPT-V to use new exception guide by @rlundeen2 in #220
- FEAT: Add nesting and prepend/append jailbreaks from papers by @jl8771 in #216
- MAINT Adding Error Handling to OpenAIChatInterface by @jbolor21 in #218
- MAINT Add Exception Handling to DALLE Target by @rdheekonda in #221
- FEAT: Add repeated token attack converter by @jl8771 in #224
- MAINT Resolve Install Issues and Add Multiline Text Wrapping in AddTextImageConverter by @rdheekonda in #230
- MAINT: PromptRequestPiece SHA setting update by @rlundeen2 in #231
- FEAT: Implements Crescendo-style attack based on system prompt. by @dlmgary in #237
- MAINT add notebook version disclaimer by @romanlutz in #234
- FEAT: Adding Converters to Output by @rlundeen2 in #236
- DOC: Reorganizing MemoryDocs by @rlundeen2 in #239
- Added complex code jailbreak template by @petebryan in #238
- FEAT: Add prompt converters for atbash, caesar, morse and cipherchat from paper by @jl8771 in #223
- MAINT add test instructions to release guide by @romanlutz in #232
- FIX: Fixing doc links by @rlundeen2 in #245
- FEAT: Adding Master Key Jailbreak by @SafwanA02 in #248
- MAINT Adding Error Handling Code for converters by @jbolor21 in #247
- FIX: Fixing score conversation history by @rlundeen2 in #251
- FEAT: Add shorten/expand converters by @jl8771 in #246
- FEAT: Add CodeChameleon converter by @jl8771 in #240
- FEAT: Adding Noise and Tone Converters by @rlundeen2 in #252
- FEAT: Add persuasion converter with 5 persuasion techniques by @jl8771 in #253
- FEAT Implementation of SQL Server connectivity by @elgertam in #227
- MAINT Error Handling for Scorers by @jbolor21 in #256
- FIX: Skeleton Key Orchestrator by @SafwanA02 in #260
- MAINT upgrading AOAI version by @jbolor21 in #264
New Contributors
- @mart123p made their first contribution in #204
- @hyoshioka0128 made their first contribution in #214
- @jl8771 made their first contribution in #216
- @SafwanA02 made their first contribution in #248
- @elgertam made their first contribution in #227
Full Changelog: v0.2.1...v0.3.0
v0.2.1
What's Changed
- added user authentication support for AOAI Chat Targets
- request validation in targets
- support for exporting conversations from the memory
Full list of changes
- Updating Release to 0.2.1.dev0 by @rlundeen2 in #181
- FEAT Add User AuthN Support to AOAI Chat Targets by @nina-msft in #182
- MAINT Add Request Validation for All Prompt Targets by @rdheekonda in #184
- FEAT Export Conversation by Orchestrator ID by @nina-msft in #183
Full Changelog: v0.2.0...v0.2.1