watchmen: cyrillic homoglyph rejection (closes adversarial-corpus gap) #329
Merged
PR #323 adversarial corpus documented two gaps in actor validation: cyrillic_C_homoglyph and cyrillic_a_homoglyph. NFKC + casefold does not fold Cyrillic→Latin because the scripts are semantically distinct, so attacks like 'Сlaude' (U+0421) bypassed the INTERNAL_ACTORS check.

Fix: hand-picked _CYRILLIC_CONFUSABLES map applied after casefold. Tight scope by design: covers the documented bypass cases, does not pull in the full Unicode confusables.txt.

Same adversarial corpus now shows 12/12 caught (was 10/12). Real before/after evidence on the same instrument that flagged the gap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The original test (PR #323 adversarial corpus) was named 'test_cyrillic_homoglyphs_currently_pass' and explicitly documented the gap with a docstring noting 'When confusables-detection ships, this test should flip to assertRaises(ValueError) and that flip is the proof the gap closed.'

The flip didn't land in the cyrillic-homoglyph fix because the test lived in test_ablation_runner.py rather than in test_unicode_actor_bypass.py, where the new positive-case tests went. Caught when CI surfaced 4 failing test runs across 3.10/3.11/3.12/3.12-sklearn after the rebase.

Behavior preserved: 8 tests in TestWatchmenAdversarial all pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
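The flipped test might look roughly like the sketch below. It is illustrative only: the stand-in `validate_actor`, the two-entry translation table, and the new test name are assumptions, not the code in test_ablation_runner.py.

```python
import unicodedata
import unittest

# Stand-in translation table (assumed entries): Cyrillic es and a to Latin c and a.
_CONFUSABLES = str.maketrans({"\u0441": "c", "\u0430": "a"})


def validate_actor(actor: str) -> str:
    """Hypothetical validator: reject names that canonicalize to 'claude'."""
    canonical = unicodedata.normalize("NFKC", actor).casefold().translate(_CONFUSABLES)
    if canonical == "claude":
        raise ValueError(f"reserved actor name: {actor!r}")
    return actor


class TestWatchmenAdversarial(unittest.TestCase):
    # Before the fix this test asserted that the spoofed names were accepted.
    # After the fix it asserts rejection; the flip is the proof the gap closed.
    def test_cyrillic_homoglyphs_rejected(self):
        for spoofed in ("\u0421laude", "Cl\u0430ude"):
            with self.subTest(spoofed=spoofed):
                with self.assertRaises(ValueError):
                    validate_actor(spoofed)
```

Saved as a module, this runs under `python -m unittest`.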
PR #323 adversarial corpus documented two gaps: cyrillic_C_homoglyph and cyrillic_a_homoglyph. NFKC + casefold does not fold Cyrillic→Latin because the scripts are semantically distinct.
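The normalization gap can be reproduced with the standard library alone. This is a minimal sketch: the helper name `normalize` is an assumption for illustration, not a function from the PR.

```python
import unicodedata


def normalize(name: str) -> str:
    """NFKC-normalize, then casefold (the pre-fix canonicalization)."""
    return unicodedata.normalize("NFKC", name).casefold()


latin = "Claude"
spoofed = "\u0421laude"  # first letter is CYRILLIC CAPITAL LETTER ES (U+0421)

folded = normalize(spoofed)

# casefold lowercases within the Cyrillic script: U+0421 becomes U+0441.
assert folded[0] == "\u0441"

# The result still differs from the Latin name, so a set lookup misses it.
assert folded != normalize(latin)

# Compatibility characters, by contrast, do fold: FULLWIDTH LATIN CAPITAL
# LETTER C (U+FF23) normalizes to plain "c".
assert normalize("\uFF23laude") == "claude"
```

NFKC covers compatibility variants of the same character, while cross-script lookalikes need an explicit confusables mapping.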
Fix: `_CYRILLIC_CONFUSABLES` map applied after casefold in `_validate_actor`. Hand-picked subset, not full `confusables.txt`; keeps the false-positive surface narrow.

Before/after on the same adversarial instrument: 10/12 caught before the fix, 12/12 after.
Real evidence loop closed: corpus flagged the gap → fix lands → same corpus shows full coverage. Aletheia round-2 anti-tautology discipline satisfied (the measurement instrument and the validator are independent code paths).
Tests: 7 new parametrized homoglyph cases + 2 discipline-pinning tests in `tests/test_unicode_actor_bypass.py`. All 32 tests in the file pass.