Merged
9 changes: 8 additions & 1 deletion divine/nostr-kafka-bridge/main.py
@@ -30,7 +30,8 @@
# Divine clients use different reason vocabularies. Normalize to canonical
# values that SML rules can match consistently.
#
-# Canonical values: csam, nudity, spam, impersonation, illegal, harassment, other
+# Canonical values: csam, nudity, violence, ai_generated, spam, impersonation,
+# illegal, harassment, other
# Mobile maps csam -> 'illegal' and sexual content -> 'nudity' per NIP-56.
# Web passes raw reasons (csam, harassment, sexual-content, etc.).
_REASON_ALIASES = {
@@ -54,6 +55,12 @@
# Other
'false-information': 'other',
'NS-other': 'other',
+# MOD namespace labels from moderation-service kind 1984 reports.
+# These are the raw l-tag values: NS (Not Safe), VI (Violence), AI (AI-generated).
+# The bridge receives them lowercased after strip().lower() in _normalize_report_reason.
+'ns': 'nudity',
+'vi': 'violence',
+'ai': 'ai_generated',

'ai': 'ai_generated' is a very short alias key. Any client that happens to send a freeform reason of literally ai will now be rewritten. That's almost certainly fine given the deliberate vocab of the clients we know about, but it's a lossy mapping that's easy to overlook. A one-line comment clarifying that this mapping is MOD-namespace-specific (and why there's no risk of collision with other clients) would make the assumption explicit.

}
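
To make the collision concern concrete, here is a minimal Python sketch of the normalization path. Only the three MOD entries, the `false-information` alias, the canonical value list, and the `strip().lower()` step come from the diff. The function name `normalize_reason_sketch`, the `CANONICAL_REASONS` set, and the fallback to `'other'` for unknown input are assumptions for illustration and may not match the real `_normalize_report_reason`.

```python
# Canonical values as listed in the updated comment in main.py.
CANONICAL_REASONS = frozenset({
    'csam', 'nudity', 'violence', 'ai_generated', 'spam',
    'impersonation', 'illegal', 'harassment', 'other',
})

# A small subset of the alias table; keys are already lowercase.
REASON_ALIASES_SKETCH = {
    'false-information': 'other',
    # MOD namespace labels (NS / VI / AI), lowercased before lookup.
    'ns': 'nudity',
    'vi': 'violence',
    'ai': 'ai_generated',
}


def normalize_reason_sketch(raw: str) -> str:
    """Map a raw report reason to a canonical value (hypothetical helper)."""
    key = raw.strip().lower()
    if key in CANONICAL_REASONS:
        return key
    # Assumption: unknown reasons collapse to 'other'.
    return REASON_ALIASES_SKETCH.get(key, 'other')


if __name__ == '__main__':
    for raw in ('NS', ' vi ', 'AI', 'ai', 'csam', 'something-else'):
        print(repr(raw), '->', normalize_reason_sketch(raw))
```

Because the lookup is keyed only on the lowercased string, a freeform `ai` sent by any client is treated exactly like the MOD label `AI`, which is the lossy behavior the comment above asks the author to document.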


30 changes: 25 additions & 5 deletions divine/plugins/src/services/zendesk_sink.py
@@ -105,11 +105,31 @@ def _create_ticket(self, verdict: str, result: ExecutionResult) -> None:
logger.exception(f'Failed to create Zendesk ticket for verdict={verdict}')

def _log_resolution(self, verdict: str, result: ExecutionResult) -> None:
-"""Log resolution verdicts. Resolving existing tickets requires
-searching by event ID, which needs the Zendesk search API and
-a tag/field convention for linking tickets to Nostr events.
-Not implemented yet -- would need to match the relay-manager
-pattern (zendesk_tickets D1 table maps event_id to ticket_id).
+"""Log resolution verdicts. Ticket resolution is not yet implemented.

The expanded architecture notes are useful, but this change is unrelated to the PR title (MOD namespace normalization / rule matching). Consider splitting it into a separate docs-only commit or PR, or at minimum flagging in the PR description that the ZendeskSink change is bundled in. Not a blocker: the content is accurate and self-contained.


+Architecture for when this is needed:
+----------------------------------------
+Closing tickets requires mapping Nostr event IDs to Zendesk ticket IDs.
+The relay-manager solves this with a `zendesk_tickets` D1 table
+(event_id -> ticket_id). Osprey runs in GKE, not CF Workers, so the
+equivalent is a Postgres table in the osprey DB:
+
+CREATE TABLE zendesk_tickets (
+event_id TEXT PRIMARY KEY,
+ticket_id INTEGER NOT NULL,
+created_at TIMESTAMPTZ DEFAULT now()
+);
+
+Implementation steps:
+1. In `_create_ticket`: after a successful API call, INSERT into
+zendesk_tickets (event_id from result, ticket_id from response).
+2. In `_log_resolution`: SELECT ticket_id WHERE event_id = <event_id>,
+then PATCH /api/v2/tickets/{ticket_id}.json with status='solved'.
+3. Inject the DB connection via __init__ (same pattern as
+PostgresLabelsService in labels_service.py).
+
+For now, log and continue. Tickets accumulate but cause no operational
+harm -- moderators can close them manually.
"""
action_name = result.action.action_name if result.action else 'unknown'
logger.info(f'Resolution verdict: {verdict} action={action_name} (ticket resolution not yet implemented)')
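
The three implementation steps in the docstring could be sketched as follows. This is an illustration under stated assumptions, not the planned implementation. It uses the standard-library `sqlite3` module as a stand-in for the Postgres table so that it runs without a server, and `TicketIndex` and `build_solve_request` are hypothetical names. The sketch only builds the update request and leaves the HTTP method to the caller: the docstring says PATCH, while Zendesk's ticket update endpoint is, as far as I know, documented as PUT, so that is worth checking before implementing.

```python
import sqlite3
from typing import Dict, Optional, Tuple

# SQLite stand-in for the Postgres DDL in the docstring (types adapted).
SCHEMA = """
CREATE TABLE IF NOT EXISTS zendesk_tickets (
    event_id   TEXT PRIMARY KEY,
    ticket_id  INTEGER NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
)
"""


class TicketIndex:
    """Maps Nostr event IDs to Zendesk ticket IDs (hypothetical helper)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        # Step 3: the connection is injected rather than created here.
        self._conn = conn
        self._conn.execute(SCHEMA)

    def record(self, event_id: str, ticket_id: int) -> None:
        # Step 1: call after the ticket was created successfully.
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                'INSERT OR REPLACE INTO zendesk_tickets (event_id, ticket_id) '
                'VALUES (?, ?)',
                (event_id, ticket_id),
            )

    def lookup(self, event_id: str) -> Optional[int]:
        # Step 2, first half: find the ticket for a resolved event.
        row = self._conn.execute(
            'SELECT ticket_id FROM zendesk_tickets WHERE event_id = ?',
            (event_id,),
        ).fetchone()
        return row[0] if row else None


def build_solve_request(ticket_id: int) -> Tuple[str, Dict[str, Dict[str, str]]]:
    """Step 2, second half: path and JSON body that mark a ticket solved."""
    return f'/api/v2/tickets/{ticket_id}.json', {'ticket': {'status': 'solved'}}


if __name__ == '__main__':
    index = TicketIndex(sqlite3.connect(':memory:'))
    index.record('example-event-id', 42)
    ticket = index.lookup('example-event-id')
    if ticket is not None:
        print(build_solve_request(ticket))
```

Keeping the lookup separate from the HTTP call means `_log_resolution` can keep logging and returning when `lookup` yields `None`, which matches the current "log and continue" behavior for tickets created before the table existed.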
3 changes: 2 additions & 1 deletion divine/rules/rules/reports/auto_hide.sml
@@ -2,7 +2,8 @@
# Automatically acts on reports from trusted reporters for CSAM and NSFW content.
#
# Report reasons are normalized by the bridge. Canonical values:
-# csam, nudity, spam, impersonation, illegal, harassment, other
+# csam, nudity, violence, ai_generated, spam, impersonation, illegal,
+# harassment, other
#
# Mobile sends 'illegal' for CSAM (NIP-56 mapping), which the bridge
# can't distinguish from violence/copyright 'illegal'. We match both
43 changes: 27 additions & 16 deletions divine/rules/rules/reports/moderation_service.sml
@@ -1,20 +1,31 @@
-# Divine Moderation Service Auto-Ban (kind 1984 reports)
+# Divine Moderation Service Signal (kind 1984 reports)
#
# Handles kind 1984 events published by moderation-service for automated
-# classifications AND human moderator overrides. Both use NOSTR_PRIVATE_KEY
-# and the MOD namespace with labels NS/VI/AI.
+# AI classifications. Uses NOSTR_PRIVATE_KEY with MOD namespace labels NS/VI/AI.
+# The bridge normalizes these to 'nudity', 'violence', 'ai_generated'.
#
-# This is one of two paths for moderation-service output into Osprey:
-# - Kind 1984 (this file): automated AI flags + human override reports
-# - Kind 1985 (content/label_routing.sml): human-verified label events
+# This rule is a SIGNAL only -- it flags content for human review but does
+# not enforce bans directly. Two reasons:
#
-# The kind 1984 reports use the MOD namespace. Content JSON includes
-# scores, type, and source ('ai' or 'human-moderator').
+# 1. moderation-service kind 1984 events use ['p', sha256] (video hash, not
+# a real pubkey) and have no 'e' tag, so ReportedEventId is empty and
+# ReportedPubkey is a sha256. BanNostrEvent with those identifiers would
+# fail or produce incorrect bans.
#
-# NOTE: ReportReason values below still don't match the actual MOD
-# labels (NS, VI, AI). These need alignment with the kind 1984 tag
-# structure. The rule currently won't match because it checks for
-# 'ai_generated' etc. but the reports use 'NS', 'VI', 'AI'.
+# 2. Enforcement with real Nostr identifiers is handled by ai_classification.sml
+# (which operates on actual video events and calls the moderation API directly)
+# and label_routing.sml (which fires on kind 1985 human-verified decisions).
+#
+# This path is one of two for moderation-service output into Osprey:
+# - Kind 1984 (this file): automated AI signal, routes to human review
+# - Kind 1985 (content/label_routing.sml): human-verified decisions, enforces
+#
+# Naming note: the rule is still called `ModerationServiceBan` to match the
+# existing `ModerationServiceBan` column in the `osprey.osprey_events`
+# ClickHouse schema. Rule names become ClickHouse columns, so renaming the
+# rule without a coordinated ALTER TABLE breaks every output sink flush.

The schema-compat reason for keeping the ModerationServiceBan name is good, but the follow-up (coordinated ALTER TABLE + rename) is deferred with no tracking link. Please reference an issue number or TODO owner so this doesn't become a forgotten semantic mismatch. Rule name → column name coupling is the kind of thing a future author will trip over when they try to rename it cleanly.

+# The semantics have changed (signal-only, no ban) but the name stays until
+# we land a paired iac-coreconfig column rename.

Import(
rules=[
@@ -27,15 +38,15 @@ ModerationServiceBan = Rule(
when_all=[
Kind == 1984,
HasLabel(entity=Pubkey, label='moderation_service'),
-ReportReason in ['ai_generated', 'deepfake', 'self_harm', 'offensive'],
+ReportReason in ['nudity', 'violence', 'ai_generated'],
],
-description='Divine moderation service flagged content for permanent ban',
+description='Divine moderation service flagged content for human review (signal only, name retained for ClickHouse schema compatibility)',
)

WhenRules(
rules_any=[ModerationServiceBan],
then=[
-BanNostrEvent(event_id=ReportedEventId, pubkey=ReportedPubkey, reason='Content flagged by moderation service'),
-DeclareVerdict(verdict='auto_ban'),
+LabelAdd(entity=EventId, label='ai_classified'),

Worth calling out in the rule comment: LabelAdd(entity=EventId, label='ai_classified') attaches the label to the 1984 report event itself (moderation-service's own event), not to the video it's reporting — because the content event ID isn't recoverable from this tag structure. That's consistent with the 'signal only' framing, but readers might assume EventId refers to the content being flagged. One clarifying line would prevent confusion.

+DeclareVerdict(verdict='flag_for_review'),
],
)
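
For readers less familiar with SML, the rule and its actions after this change behave roughly like the Python model below. The `Report` and `Outcome` types and the `evaluate` function are hypothetical and exist only for this sketch; the match conditions, the label, and the verdict are copied from the diff. It assumes, per the review comment above, that `EventId` is the ID of the kind 1984 report event itself rather than the flagged video.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Canonical reasons the rule matches after bridge normalization.
SIGNAL_REASONS = frozenset({'nudity', 'violence', 'ai_generated'})


@dataclass(frozen=True)
class Report:
    """Hypothetical, simplified view of an incoming Nostr event."""
    kind: int
    event_id: str                     # ID of the report event itself
    reporter_labels: FrozenSet[str]   # labels on the reporting pubkey
    report_reason: str                # already normalized by the bridge


@dataclass(frozen=True)
class Outcome:
    labelled_event_id: str
    label: str
    verdict: str


def evaluate(report: Report) -> Optional[Outcome]:
    """Mirror the when_all conditions and the then-actions of the rule."""
    matches = (
        report.kind == 1984
        and 'moderation_service' in report.reporter_labels
        and report.report_reason in SIGNAL_REASONS
    )
    if not matches:
        return None
    # Signal only: label the report event and flag for review; no ban.
    return Outcome(
        labelled_event_id=report.event_id,
        label='ai_classified',
        verdict='flag_for_review',
    )


if __name__ == '__main__':
    example = Report(1984, 'report-event-id',
                     frozenset({'moderation_service'}), 'violence')
    print(evaluate(example))
```

The model makes two points from the discussion visible: the old reasons (`deepfake`, `self_harm`, `offensive`) no longer match, and the label lands on the report event's own ID because the reported content's event ID is not available from the tag structure.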