
watchmen: cyrillic homoglyph rejection (closes adversarial-corpus gap)#329

Merged
AetherLogosPrime-Architect merged 2 commits into main from cyrillic-homoglyph-fix on May 8, 2026


@AetherLogosPrime-Architect
Owner

The adversarial corpus from PR #323 documented two gaps: cyrillic_C_homoglyph and cyrillic_a_homoglyph. NFKC + casefold does not fold Cyrillic→Latin because Unicode treats the two scripts as semantically distinct, so no compatibility or case mapping links their lookalike letters.
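
The gap can be reproduced with the standard library alone. The snippet below is a minimal sketch, not the project's code: `normalize` is a hypothetical helper standing in for the NFKC + casefold step, and the fullwidth case is included only for contrast.

```python
import unicodedata


def normalize(name: str) -> str:
    """Hypothetical stand-in for the pre-fix pipeline: NFKC, then casefold."""
    return unicodedata.normalize("NFKC", name).casefold()


latin = normalize("Claude")            # first letter is U+0043 LATIN CAPITAL LETTER C
spoofed = normalize("\u0421laude")     # first letter is U+0421 CYRILLIC CAPITAL LETTER ES
fullwidth = normalize("\uff23laude")   # first letter is U+FF23 FULLWIDTH LATIN CAPITAL LETTER C

# NFKC folds the fullwidth compatibility form to plain Latin...
print(fullwidth == latin)   # True
# ...but the Cyrillic letter only lowercases to U+0441; it never becomes Latin "c".
print(spoofed == latin)     # False
```

The two strings render almost identically, yet a set-membership check against Latin names sees them as different keys.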

Fix: _CYRILLIC_CONFUSABLES map applied after casefold in _validate_actor. Hand-picked subset, not full confusables.txt — keeps false-positive surface narrow.
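
A rough sketch of the approach, under stated assumptions: the PR does not list the entries of `_CYRILLIC_CONFUSABLES`, so the map contents here are illustrative, and `CONFUSABLES_SKETCH`, `INTERNAL_ACTORS_SKETCH`, `fold_actor`, and `validate_actor_sketch` are hypothetical names. It also assumes the validator raises `ValueError` for any name that folds to a reserved internal actor.

```python
import unicodedata

# Illustrative subset only (assumption): lowercase Cyrillic letters that
# look like Latin ones. The real map's entries are not shown in this PR.
CONFUSABLES_SKETCH = str.maketrans({
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
})

# Placeholder set (assumption); the real INTERNAL_ACTORS is not shown here.
INTERNAL_ACTORS_SKETCH = frozenset({"claude"})


def fold_actor(actor: str) -> str:
    """NFKC, then casefold, then map the hand-picked Cyrillic lookalikes."""
    folded = unicodedata.normalize("NFKC", actor).casefold()
    return folded.translate(CONFUSABLES_SKETCH)


def validate_actor_sketch(actor: str) -> str:
    """Reject names that fold to a reserved internal actor (assumed behavior)."""
    if fold_actor(actor) in INTERNAL_ACTORS_SKETCH:
        raise ValueError(f"actor {actor!r} resembles a reserved internal name")
    return actor
```

Applying the map after casefold means only lowercase Cyrillic letters need entries: casefold already turns U+0421 into U+0441, which the map then sends to Latin `c`.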

Before/after on the same adversarial instrument:

  • Before: 10/12 caught (cyrillic_C and cyrillic_a were the open gaps)
  • After: 12/12 caught

Real evidence loop closed: corpus flagged the gap → fix lands → same corpus shows full coverage. Aletheia round-2 anti-tautology discipline satisfied (the measurement instrument and the validator are independent code paths).

Tests: 7 new parametrized homoglyph cases + 2 discipline-pinning tests in tests/test_unicode_actor_bypass.py. All 32 tests in that file pass.

DivineOS Agent and others added 2 commits May 8, 2026 14:50
PR #323 adversarial corpus documented two gaps in actor validation:
cyrillic_C_homoglyph and cyrillic_a_homoglyph. NFKC + casefold does
not fold Cyrillic→Latin because the scripts are semantically distinct,
so attacks like 'Сlaude' (U+0421) bypassed the INTERNAL_ACTORS check.

Fix: hand-picked _CYRILLIC_CONFUSABLES map applied after casefold.
Tight scope by design — covers documented bypass cases, does not
pull in full Unicode confusables.txt.

Same adversarial corpus now shows 12/12 caught (was 10/12). Real
before/after evidence on the same instrument that flagged the gap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The original test (PR #323 adversarial corpus) was named
'test_cyrillic_homoglyphs_currently_pass' and explicitly documented the
gap with a docstring noting 'When confusables-detection ships, this test
should flip to assertRaises(ValueError) and that flip is the proof the
gap closed.'
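
For illustration, the flipped test might look roughly like the sketch below. Everything in it is an assumption: `validate_actor_sketch` is a stand-in validator with a two-entry confusables map, and the test name `test_cyrillic_homoglyphs_rejected` is hypothetical rather than the name used in test_ablation_runner.py.

```python
import unicodedata
import unittest

# Hypothetical two-entry map covering only the two documented bypass cases.
_CONFUSABLES = str.maketrans({"\u0430": "a", "\u0441": "c"})


def validate_actor_sketch(actor: str) -> str:
    """Stand-in validator: rejects names that fold to 'claude'."""
    folded = unicodedata.normalize("NFKC", actor).casefold()
    if folded.translate(_CONFUSABLES) == "claude":
        raise ValueError("actor resembles a reserved internal name")
    return actor


class TestCyrillicHomoglyphs(unittest.TestCase):
    def test_cyrillic_homoglyphs_rejected(self):
        # Previously these inputs were asserted to pass; after the fix
        # each one must raise ValueError.
        for spoofed in ("\u0421laude", "cl\u0430ude"):
            with self.subTest(spoofed=spoofed):
                with self.assertRaises(ValueError):
                    validate_actor_sketch(spoofed)
```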

The flip was missed in the cyrillic-homoglyph fix because the test
lives in test_ablation_runner.py rather than in
test_unicode_actor_bypass.py, where the new positive-case tests went.
CI caught it after the rebase: 4 test runs failed (Python 3.10, 3.11,
3.12, and 3.12-sklearn).

Behavior preserved: 8 tests in TestWatchmenAdversarial all pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@AetherLogosPrime-Architect AetherLogosPrime-Architect merged commit 1bf97ac into main May 8, 2026
6 checks passed
@AetherLogosPrime-Architect AetherLogosPrime-Architect deleted the cyrillic-homoglyph-fix branch May 13, 2026 17:45
