
Stop AI agents breaking on bad data - Open source data quality validation framework for reliable agent workflows


adri-standard/adri

ADRI – Agent Data Readiness Index

Protect AI workflows from bad data with one line of code.

ADRI is a small Python library that enforces data quality before data reaches an AI agent step. It turns data assumptions into executable data contracts, and applies them automatically at runtime.

No platform. No services. Runs locally in your project.

from adri import adri_protected

@adri_protected(contract="customer_data", data_param="data")
def process_customers(data):
    # Your agent logic here
    results = {"status": "complete"}  # placeholder result
    return results

What it is

ADRI provides:

  • A decorator to guard a function or agent step
  • A CLI for setup and inspection
  • A reusable library of contract templates

Install & set up

pip install adri
adri setup

What happens when you run it

First successful run

  • ADRI inspects the input data
  • Creates a data contract (stored as YAML)
  • Saves local artifacts for debugging/inspection
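The first-run steps above can be pictured with a toy sketch of profiling sample records into a contract-like structure. This is not ADRI's implementation or its YAML schema: `infer_contract` and the dictionary layout are hypothetical, and plain dictionaries stand in for a DataFrame.

```python
from typing import Any


def infer_contract(records: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Profile sample records into a simple per-field contract (hypothetical layout)."""
    fields: dict[str, dict[str, Any]] = {}
    all_names = {name for record in records for name in record}
    for name in sorted(all_names):
        values = [record.get(name) for record in records]
        present = [v for v in values if v is not None]
        fields[name] = {
            # Type names seen among the non-null sample values.
            "types": sorted({type(v).__name__ for v in present}),
            # A field is nullable if any sample record lacks a value for it.
            "nullable": len(present) < len(values),
        }
    return fields


sample = [
    {"id": 1, "email": "user1@example.com", "signup_date": "2024-01-01"},
    {"id": 2, "email": "user2@example.com", "signup_date": "2024-01-02"},
]
contract = infer_contract(sample)
print(contract["id"])
```

A real contract would likely carry more than types and nullability; the point is only that the first batch of good data becomes the baseline for later checks.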

Subsequent runs

  • Incoming data is checked against the contract
  • ADRI calculates quality scores across 5 dimensions
  • Based on your settings, it either:
    • allows execution, or
    • blocks execution (raises)
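The score-then-decide flow above can be illustrated with a minimal sketch. It computes one example score, completeness, and compares it with a threshold. The dimension name, the `completeness_score` and `gate` helpers, and the threshold of 80 are assumptions made for illustration, not ADRI's actual scoring.

```python
from typing import Any


def completeness_score(records: list[dict[str, Any]], required: list[str]) -> float:
    """Return the percentage of required cells that are present and non-null."""
    total = len(records) * len(required)
    if total == 0:
        return 100.0
    filled = sum(
        1 for record in records for name in required if record.get(name) is not None
    )
    return 100.0 * filled / total


def gate(score: float, min_score: float = 80.0) -> bool:
    """Allow execution only when the score meets the (assumed) threshold."""
    return score >= min_score


good = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": "b@example.com"}]
bad = [{"id": 1, "email": None}, {"id": None}]

for label, batch in (("good", good), ("bad", bad)):
    score = completeness_score(batch, ["id", "email"])
    print(label, score, "allow" if gate(score) else "block")
```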

How ADRI works (high level)

[Figure: ADRI flow diagram]

In plain English: ADRI sits between your code and its data and checks quality before letting the data through. Good data passes; bad data is blocked.


Use it in code

from adri import adri_protected
import pandas as pd

@adri_protected(contract="customer_data", data_param="customer_data")
def analyze_customers(customer_data):
    """Your AI agent logic."""
    print(f"Analyzing {len(customer_data)} customers")
    return {"status": "complete"}

# First run with good data
customers = pd.DataFrame({
    "id": [1, 2, 3],
    "email": ["user1@example.com", "user2@example.com", "user3@example.com"],
    "signup_date": ["2024-01-01", "2024-01-02", "2024-01-03"]
})

analyze_customers(customers)  # ✅ Runs, auto-generates contract

What happened:

  1. Function executed successfully
  2. ADRI analyzed the data structure
  3. Generated a YAML contract under your project
  4. Future runs validate against that contract

Future runs with bad data:

bad_customers = pd.DataFrame({
    "id": [1, 2, None],  # Missing ID
    "email": ["user1@example.com", "invalid-email", "user3@example.com"],  # Bad email
    # Missing signup_date column
})

analyze_customers(bad_customers)  # ❌ Raises exception with quality report
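Because a blocked run surfaces as an exception, the caller usually wants to decide what happens next. The sketch below shows one way to wrap a protected step so that a failure is logged and a fallback is returned. `run_guarded` and `always_blocks` are hypothetical helpers, and the handler catches `Exception` broadly because the concrete exception type ADRI raises is not stated here.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("agent")


def run_guarded(step: Callable[[Any], Any], data: Any, fallback: Any = None) -> Any:
    """Run a protected step; on any failure, log it and return a fallback."""
    try:
        return step(data)
    except Exception as exc:  # broad on purpose: the exact exception type is not assumed
        logger.warning("step %s blocked or failed: %s", step.__name__, exc)
        return fallback


def always_blocks(data: Any) -> Any:
    """Stand-in for a protected function that rejects its input."""
    raise ValueError("data quality below threshold")


result = run_guarded(always_blocks, {"id": None}, fallback={"status": "skipped"})
print(result)
```

In a real project, `always_blocks` would be replaced by a decorated function such as `analyze_customers`, and the broad `except` could be narrowed once the library's exception class is known.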


Protection modes

from adri import adri_protected

# Raise mode (default) - blocks bad data by raising an exception
@adri_protected(contract="data", data_param="data", on_failure="raise")
def strict_step(data): ...

# Warn mode - logs a warning but continues execution
@adri_protected(contract="data", data_param="data", on_failure="warn")
def lenient_step(data): ...

# Continue mode - continues silently, without raising or logging
@adri_protected(contract="data", data_param="data", on_failure="continue")
def silent_step(data): ...
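To show how the three modes differ in behavior, here is a toy decorator that applies the same raise, warn, and continue semantics to a caller-supplied check. It is a sketch of the general pattern only: `guard`, `non_empty`, and the use of `ValueError` are assumptions and do not reflect how `adri_protected` is implemented.

```python
import functools
import logging
from typing import Any, Callable

logger = logging.getLogger("guard")


def guard(check: Callable[[Any], bool], on_failure: str = "raise"):
    """Toy decorator: run `check` on the first argument, then apply the failure mode."""
    if on_failure not in {"raise", "warn", "continue"}:
        raise ValueError(f"unknown on_failure mode: {on_failure!r}")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(data, *args, **kwargs):
            if not check(data):
                if on_failure == "raise":
                    raise ValueError(f"{func.__name__}: data failed validation")
                if on_failure == "warn":
                    logger.warning("%s: data failed validation", func.__name__)
                # "continue" falls through without raising or logging.
            return func(data, *args, **kwargs)

        return wrapper

    return decorator


def non_empty(data) -> bool:
    """Example check: the input must contain at least one item."""
    return len(data) > 0


@guard(non_empty, on_failure="warn")
def count_rows(data):
    return len(data)


@guard(non_empty, on_failure="raise")
def strict_count(data):
    return len(data)


print(count_rows([]))  # logs a warning, then runs the function anyway
```

Raise mode is the safest default for agent steps, since the wrapped function never sees data that failed the check. Warn and continue modes trade that guarantee for availability.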

Contract templates (start fast)

ADRI includes reusable contract templates for common domains and AI workflows.

Business domains

AI frameworks

Generic templates

Contributing

Use cases

ADRI works with any data format. Sample data files are included for common scenarios:

API Data Validation

Protect your API integrations with structural validation.

Multi-Agent Workflows

Validate context passed between agents in CrewAI, AutoGen, etc.

RAG Pipelines

Ensure documents have correct structure before indexing.

License

Apache 2.0. See LICENSE.


Built with ❤️ by Thomas Russell at Verodat.

One line of code. Local enforcement. Reliable agents.
