Rewrite azure-ai-evaluation skill to use azure-ai-projects SDK (V2 API) #105

Merged
thegovind merged 3 commits into microsoft:main from imatiach-msft:rewrite-azure-ai-evaluation-to-v2 on Feb 6, 2026

Conversation

imatiach-msft (Contributor) commented Feb 6, 2026

Summary

This PR deprecates the azure-ai-evaluation-py skill and merges its functionality into azure-ai-projects-py using the current V2 API.

Background

Per guidance from April:

"azure-ai-evaluation is v1 API (not compatible with nextgen and eventually will be deprecated)" "don't use or reference azure-ai-evaluation anywhere"

Changes

Deleted

  • .github/skills/azure-ai-evaluation-py/ - Entire folder deleted (deprecated SDK)

Enhanced azure-ai-projects-py skill

  • references/evaluation.md - Completely rewritten with comprehensive evaluation patterns
  • references/built-in-evaluators.md - New file with complete evaluator reference (Quality, Safety, Agent, NLP)
  • references/custom-evaluators.md - New file with code-based and prompt-based evaluator patterns
  • scripts/run_batch_evaluation.py - New CLI tool for batch evaluations
  • SKILL.md - Updated reference files list, renamed product to "Microsoft Foundry"

Key API Pattern (V2)

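   import os

   from azure.ai.projects import AIProjectClient
   from azure.identity import DefaultAzureCredential

   # Assumption: the project endpoint comes from an environment variable
   # (the variable name is illustrative, not mandated by the SDK).
   endpoint = os.environ["AZURE_AI_PROJECT_ENDPOINT"]
   credential = DefaultAzureCredential()

   with AIProjectClient(endpoint=endpoint, credential=credential) as project_client:
       openai_client = project_client.get_openai_client()

       # Define the evaluation; built-in evaluators use the "builtin." prefix.
       eval_object = openai_client.evals.create(
           name="My Evaluation",
           testing_criteria=[{
               "type": "azure_ai_evaluator",
               "evaluator_name": "builtin.coherence",
               "data_mapping": {"query": "{{item.query}}", "response": "{{item.response}}"}
           }]
       )
       # data_source (the dataset to evaluate) is built elsewhere; omitted here for brevity.
       run = openai_client.evals.runs.create(eval_id=eval_object.id, data_source=data_source)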
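
The pattern above does not show how `data_source` is built. A minimal sketch of one possible shape follows, assuming the inline JSONL form of the OpenAI Evals API (`"type": "jsonl"` with a `file_content` source). The helper name `inline_data_source` and the sample row are hypothetical, and the exact shape the Foundry service accepts should be checked against the SDK samples.

```python
from typing import Any


def inline_data_source(rows: list[dict[str, Any]]) -> dict[str, Any]:
    """Wrap plain rows as an inline JSONL data source (assumed shape).

    Each row is nested under "item" so that templates such as
    {{item.query}} in data_mapping can resolve against it.
    """
    return {
        "type": "jsonl",
        "source": {
            "type": "file_content",
            "content": [{"item": row} for row in rows],
        },
    }


# Hypothetical sample row whose keys match the data_mapping fields.
data_source = inline_data_source([
    {"query": "What is the capital of France?", "response": "Paris."},
])
```

Keeping the row keys identical to the field names used in `data_mapping` is what lets `{{item.query}}` and `{{item.response}}` resolve.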

References

BREAKING CHANGE: Replaces deprecated azure-ai-evaluation SDK with azure-ai-projects SDK

Key changes:
- Replace all azure-ai-evaluation imports with azure-ai-projects SDK
- Use AIProjectClient + openai_client.evals.* API instead of evaluate()
- Built-in evaluators now use 'builtin.' prefix (e.g., builtin.coherence)
- Data mapping syntax changed from ${data.field} to {{item.field}}
- Custom evaluators use CodeBasedEvaluatorDefinition/PromptBasedEvaluatorDefinition
- Agent evaluations use {{sample.output_text}} for response mapping
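
To illustrate the naming and mapping conventions listed above, the following sketch builds `testing_criteria` entries as plain dictionaries. The helper `make_criterion` and its `agent` flag are hypothetical and not part of either SDK; only the `azure_ai_evaluator` type, the `builtin.` prefix, the `{{item.field}}` syntax, and the `{{sample.output_text}}` response mapping come from this PR. Real entries may need further keys.

```python
def make_criterion(evaluator: str, fields: list[str], agent: bool = False) -> dict:
    """Build one testing_criteria entry for a built-in evaluator (hypothetical helper)."""
    # Dataset columns are referenced with the {{item.<field>}} template syntax.
    mapping = {field: "{{item." + field + "}}" for field in fields}
    if agent:
        # For agent evaluations, the response is mapped to the generated sample output.
        mapping["response"] = "{{sample.output_text}}"
    return {
        "type": "azure_ai_evaluator",
        "evaluator_name": f"builtin.{evaluator}",
        "data_mapping": mapping,
    }


# Dataset-only evaluation: both fields come from the item.
coherence = make_criterion("coherence", ["query", "response"])
# Agent evaluation: the query comes from the item, the response from the sample.
agent_coherence = make_criterion("coherence", ["query"], agent=True)
```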

Updated files:
- SKILL.md: Complete rewrite with new SDK patterns
- references/built-in-evaluators.md: New evaluator discovery and usage
- references/custom-evaluators.md: Code-based and prompt-based patterns
- references/acceptance-criteria.md: Updated acceptance tests
- scripts/run_batch_evaluation.py: CLI tool using new SDK

References:
- Azure AI Projects samples: sdk/ai/azure-ai-projects/samples/evaluations/
- Guidance: 'Don't use or reference azure-ai-evaluation anywhere'
…zure-ai-evaluation-py

Changes:
- Delete azure-ai-evaluation-py folder entirely (deprecated SDK)
- Enhance azure-ai-projects-py with comprehensive evaluation content:
  - Replace references/evaluation.md with comprehensive version
  - Add references/built-in-evaluators.md (complete evaluator reference)
  - Add references/custom-evaluators.md (code/prompt-based patterns)
  - Add scripts/run_batch_evaluation.py (CLI tool)
  - Update SKILL.md reference files list
- Rename 'Azure AI Foundry' to 'Microsoft Foundry' in docs

The azure-ai-evaluation SDK is deprecated per guidance:
'Don't use or reference azure-ai-evaluation anywhere'
thegovind (Collaborator) left a comment

LGTM

thegovind merged commit 36b1daa into microsoft:main on Feb 6, 2026
2 checks passed