fix(week6): Add comprehensive moderation filtering to resolve fine-tuning eval errors #723

bilallamal07 · 2025-10-14T14:45:35Z

Summary

Fixes a critical issue where OpenAI fine-tuning jobs fail during the post-training safety evaluation (refusals_v3) because product descriptions in the training data contain sensitive content.

Problem

When running the Week 6 Day 5 fine-tuning exercise, users encounter this error:

Error while running moderation eval refusals_v3 for snapshot ft:gpt-4o-mini-2024-07-18:personal:pricer:CQxxxx
Error while running eval for category hate/threatening

Root Cause: The Amazon product dataset contains items with sensitive keywords (weapon, knife, tactical, combat, etc.) that trigger OpenAI's post-training safety checks.

Solution

1. Updated Notebook (`day5.ipynb`)

Added check_moderation() function that:

- Implements two-stage filtering (keyword pre-filter + OpenAI Moderation API)
- Provides detailed reporting of flagged items
- Returns clean items ready for fine-tuning
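
For readers who have not opened the notebook, here is a minimal sketch of how a two-stage filter of this kind could look. It is not the PR's actual code: the keyword list is a small illustrative subset, and the `text_of` accessor and the injectable `moderate` callable are assumptions added so the example can run offline. The only real API call is `client.moderations.create` from the OpenAI Python SDK.

```python
# Hypothetical subset; the PR's script is described as using 25+ keywords.
SENSITIVE_KEYWORDS = ("weapon", "knife", "tactical", "combat")


def openai_flagged(text):
    """Stage 2: ask the OpenAI Moderation API whether the text is flagged."""
    from openai import OpenAI  # imported lazily so stage 1 works without the SDK

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.moderations.create(model="omni-moderation-latest", input=text)
    return response.results[0].flagged


def check_moderation(items, text_of=str, moderate=openai_flagged):
    """Split items into (clean, flagged); flagged entries carry a reason string."""
    clean, flagged = [], []
    for item in items:
        text = text_of(item)
        lowered = text.lower()
        # Stage 1: cheap substring pre-filter, no API call needed.
        hit = next((kw for kw in SENSITIVE_KEYWORDS if kw in lowered), None)
        if hit is not None:
            flagged.append((item, f"keyword: {hit}"))
        # Stage 2: only items that pass the pre-filter reach the API.
        elif moderate(text):
            flagged.append((item, "moderation API"))
        else:
            clean.append(item)
    return clean, flagged
```

Passing a stub such as `moderate=lambda text: False` exercises the keyword stage without network access. Plain substring matching is crude and will also reject harmless items (a kitchen knife set, for example), so the keyword list is a trade-off between dataset size and the risk of a failed job.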

2. Standalone Scripts

- fix_moderation.py: Batch filtering script with 25+ sensitive keywords
- test_moderation.py: JSONL verification utility
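
As an illustration of what a JSONL verification pass might do, the sketch below scans a fine-tuning file for leftover keywords. It assumes the file uses OpenAI's chat format (one JSON object per line with a `messages` list). The function name `scan_jsonl` and the keyword subset are hypothetical and not taken from `test_moderation.py`.

```python
import json
from pathlib import Path

SENSITIVE_KEYWORDS = ("weapon", "knife", "tactical", "combat")  # illustrative subset


def scan_jsonl(path, keywords=SENSITIVE_KEYWORDS):
    """Return (line_number, keyword) pairs for examples that still contain a keyword."""
    hits = []
    with Path(path).open(encoding="utf-8") as fh:
        for line_number, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)  # raises on malformed JSON, which is itself a useful check
            # Join the content of every message in the example into one searchable string.
            text = " ".join(
                str(message.get("content", "")) for message in record.get("messages", [])
            ).lower()
            for keyword in keywords:
                if keyword in text:
                    hits.append((line_number, keyword))
    return hits
```

An empty result means no example in the file matched the keyword list. It does not guarantee that the post-training evaluation will pass.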

3. Documentation

- DAY5_MODERATION_FIX_README.md: PR-focused documentation
- MODERATION_FIX_README.md: Technical deep-dive

Results

Before Fix

Training: 200 examples
Validation: 50 examples
Status: ❌ Failed during post-training moderation

After Fix

Training: 190 examples (10 filtered)
Validation: 48 examples (2 filtered)
Status: ✅ Successfully completed and deployed

Testing

Verified with successful fine-tuning job:

Job ID: ftjob-moQGns3ajsS5UWIxxxxx
Model: ft:gpt-4o-mini-2024-07-18:personal:pricer:CQUNxxxx
Confirmed via OpenAI completion email
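
Besides the completion email, a job's outcome can also be checked programmatically. The sketch below is one possible way to do that, using the OpenAI SDK's `fine_tuning.jobs.retrieve` and `fine_tuning.jobs.list_events` calls. The helper name `summarize_job` is hypothetical, and the job ID in the usage comment is a placeholder to replace with your own.

```python
def summarize_job(client, job_id, event_limit=10):
    """Collect a fine-tuning job's status, resulting model name, and recent event messages."""
    job = client.fine_tuning.jobs.retrieve(job_id)
    events = client.fine_tuning.jobs.list_events(
        fine_tuning_job_id=job_id, limit=event_limit
    )
    return {
        "status": job.status,  # e.g. "succeeded" or "failed"
        "fine_tuned_model": job.fine_tuned_model,  # None until the job succeeds
        "messages": [event.message for event in events.data],
    }


# Usage (requires the openai package and OPENAI_API_KEY):
# from openai import OpenAI
# print(summarize_job(OpenAI(), "ftjob-..."))  # placeholder job ID
```

A moderation eval failure like the one quoted in the Problem section would be expected to show up in the event messages, which makes this a quicker check than waiting for the email.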

Benefits

- Prevents users from wasting time/money on failed fine-tuning jobs
- Demonstrates best practices for OpenAI safety evaluations
- Provides reusable tools for content filtering
- No breaking changes to original notebook structure

Files Changed

✏️ week6/day5.ipynb - Added moderation function
✨ week6/fix_moderation.py - New filtering script
✨ week6/test_moderation.py - New verification utility
📚 week6/DAY5_MODERATION_FIX_README.md - New documentation
📚 week6/MODERATION_FIX_README.md - New technical docs

Compatibility

✅ Python 3.8+
✅ OpenAI Python SDK v1.0+
✅ No breaking changes
✅ Works with W&B integration

Community Contribution - Week 6 Day 5 Moderation Fix by @bilallamal07

…ning

Fixes critical issue where fine-tuning jobs fail during post-training safety evaluations due to sensitive content in product descriptions.

Changes:
- Add check_moderation() function to day5.ipynb for content filtering
- Implement two-stage filtering (keyword pre-filter + OpenAI Moderation API)
- Create standalone fix_moderation.py script for batch processing
- Add test_moderation.py utility for JSONL verification
- Include comprehensive documentation in DAY5_MODERATION_FIX_README.md and MODERATION_FIX_README.md

Results:
- Training examples: 190 (10 filtered from 200)
- Validation examples: 48 (2 filtered from 50)
- Fine-tuning jobs now pass refusals_v3 safety evaluation
- Successfully deployed model: ft:gpt-4o-mini-2024-07-18:personal:pricer

This contribution prevents users from encountering moderation failures and provides reusable tools for content filtering in fine-tuning workflows.

ed-donner · 2025-10-14T15:14:18Z

Oh gosh - would you be OK to move this to the community-contributions folder? I'm grateful to have this change, and I will make this update to the main repo at some point, but in the meantime it's best not to affect the main repo where possible.

bilallamal07 · 2025-10-15T08:06:00Z

Hi Ed,
Thanks for pointing this out. Submitting this PR outside the community contribution process was unintentional. I’ll make sure all future updates follow the community contribution guidelines.

I appreciate your support and guidance!
Best regards,

bilallamal07 changed the title ~~fix(week6): Add comprehensive moderation filtering for OpenAI fine-tuning~~ fix(week6): Add comprehensive moderation filtering to resolve fine-tuning eval errors Oct 14, 2025

Move bilallamal07-week6-moderation-fix to community-contributions folder

be48308
