feat(content_safety): add support to auto select multilingual refusal bot messages #1530
base: develop
Conversation
```python
DEFAULT_REFUSAL_MESSAGES: Dict[str, str] = {
    "en": "I'm sorry, I can't respond to that.",
    "es": "Lo siento, no puedo responder a eso.",
    "zh": "抱歉,我无法回应。",
    "de": "Es tut mir leid, darauf kann ich nicht antworten.",
    "fr": "Je suis désolé, je ne peux pas répondre à cela.",
    "hi": "मुझे खेद है, मैं इसका जवाब नहीं दे सकता।",
    "ja": "申し訳ありませんが、それには回答できません。",
    "ar": "عذراً، لا أستطيع الرد على ذلك.",
    "th": "ขออภัย ฉันไม่สามารถตอบได้",
}
```
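For illustration, the lookup this table supports might look like the following (a minimal sketch assuming the fallback-to-English behavior described later in this PR; `get_refusal_message` is a hypothetical name, not the actual implementation):

```python
from typing import Dict, Optional

def get_refusal_message(
    detected_lang: Optional[str],
    messages: Dict[str, str] = DEFAULT_REFUSAL_MESSAGES,
) -> str:
    """Return the refusal message for the detected language.

    Unsupported or undetected languages fall back to the English default.
    """
    if detected_lang is None or detected_lang not in messages:
        detected_lang = "en"
    return messages[detected_lang]
```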
If we later had other multilingual rails, would we be repeating this mechanism in each rail? Or just the set of supported languages per rail? I don't think we need to do it now (since we don't have other multilingual rails to test it), but we should be aware of what refactoring would be needed to move the below language detection to a shared level.
Of course, we can relax this constraint later and allow users more flexibility. Once we need to support other models or other types of rails (beyond content safety) that require multilingual responses, we can:
- Move the `detect_language` action from `library/content_safety/actions.py` to a shared location (`nemoguardrails/actions/`), making it available to all rails (rough sketch below)
- Introduce a Colang-level abstraction like `bot refuse to respond $multilang=true`; this could be done easily for Colang 2.0, but I think it is better if we don't add new Colang features for now
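A rough sketch of that shared action (hypothetical; this refactor is not part of the PR, and it assumes the standard `@action` decorator from `nemoguardrails.actions`):

```python
# nemoguardrails/actions/detect_language.py (hypothetical shared location)
from typing import Optional

from nemoguardrails.actions import action


@action(name="detect_language")
async def detect_language(text: str) -> Optional[str]:
    """Detect the language of `text`, returning an ISO 639 code or None."""
    try:
        # Optional dependency, installed via the extras in pyproject.toml.
        from fast_langdetect import detect
    except ImportError:
        return None  # callers fall back to "en"
    result = detect(text, k=1)
    return result[0]["lang"] if result else None
```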
I agree, for now, keeping it scoped to content safety keeps the implementation focused.
```python
try:
    from fast_langdetect import detect

    result = detect(text, k=1)
```
Does fast-langdetect ever return a full locale with dialect, like en-US versus en? I don't see it in the docs, but I do see some upper/lowercase inconsistency.
Fair point, thanks for raising it. I just took a closer look at the fast-langdetect source code and the fastText model behavior:
- The fast-langdetect README mentions BCP-47 tags like `"zh-cn"` and `"pt-br"`
- But the fastText `lid.176.bin` model uses simple ISO 639 codes: `zh`, `pt`, `en`, etc.
- The fast-langdetect source simply strips the `__label__` prefix from the fastText output; no regional mapping is applied

Validated with an actual test:

```python
>>> detect("抱歉,我无法处理该请求", k=2)
[{'lang': 'zh', 'score': 0.80}, {'lang': 'ta', 'score': 0.08}]
```

It returns `"zh"`, not `"zh-cn"`, so no regional variant handling is needed.
This looks really good @Pouyanpi! I have a few comments:
Not needed in this PR, but I'm thinking of RAG prompts, where LLM instructions, the user query, and relevant context chunks are all in one flattened prompt. These prompts can be pretty long (up to 7k tokens in some cases). I would be interested in a follow-on where we sample part of a prompt before running classification on the sample (e.g. the first 200 chars; a sketch follows below). This would be an optional config field, giving customers a knob to trade off accuracy against latency for language detection.
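A minimal sketch of that knob (hypothetical; `sample_chars` and the function name are illustrative and not part of this PR, and it assumes the list-of-dicts return shape shown in the test output above):

```python
from fast_langdetect import detect

def detect_language_sampled(text: str, sample_chars: int = 200):
    """Classify only the first `sample_chars` characters, trading accuracy for latency."""
    result = detect(text[:sample_chars], k=1)
    # e.g. [{'lang': 'en', 'score': 0.98}] -> "en"
    return result[0]["lang"] if result else None
```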
I've included them in the temp/lang-detect-benchmark branch to make review easier. If you'd find it easier, I can move them here.
Yes
Yes, I would like to avoid adding Colang-level features as much as possible.
Done! Updated the description.
fast-langdetect already does this truncation by default, but we can indeed give users that flexibility:
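Something along these lines (a sketch; `LangDetector` comes from the commit message below, while `LangDetectConfig` and the `max_input_length`/`cache_dir` parameters are my assumption about the fast-langdetect API based on this thread):

```python
from fast_langdetect import LangDetectConfig, LangDetector

# Raise the default 80-char truncation and pin the model cache location.
config = LangDetectConfig(
    max_input_length=200,
    cache_dir="/tmp/fasttext-models",
)
detector = LangDetector(config)
result = detector.detect("Lo siento, no puedo responder a eso.")
```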
Why wouldn't we merge them into develop? It's best practice in ML to make any results reproducible, for which we need the input datasets and scripts. The datasets are public and linked above. I'd imagine we'll have to re-run evals for new languages as they're added to the content-safety and other models. So we'll run this script periodically.
Was that measured at a concurrency of 1? Having a 100% overhead for each language inference is a lot higher than I'd expect. We don't need to fix it in this PR.
+1
Could you check? I didn't see any length description.
Could you add optional Pydantic fields for any of these values that it makes sense to expose to users? Looking at the config, I think
…age support
Detect user input language and return refusal messages in the same language when content safety rails block unsafe content. Supports 9 languages: English, Spanish, Chinese, German, French, Hindi, Japanese, Arabic, and Thai.
Add configurable parameters for language detection:
- max_text_length: control maximum input text length for detection
- normalize_text: toggle text normalization before detection
- cache_dir: specify custom cache directory for detection models

Updated MultilingualConfig with new optional fields and modified _detect_language to use LangDetector with custom configuration instead of the simple detect function.
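For reviewers, a sketch of what the extended model might look like (field names taken from this commit message and the sequence diagram below; the actual definition lives in `nemoguardrails/rails/llm/config.py`):

```python
from typing import Dict, Optional

from pydantic import BaseModel

class MultilingualConfig(BaseModel):
    # Sketch only; see config.py in this PR for the real model.
    enabled: bool = False
    custom_messages: Optional[Dict[str, str]] = None  # per-language overrides
    max_text_length: Optional[int] = None  # cap input length before detection
    normalize_text: Optional[bool] = None  # toggle normalization before detection
    cache_dir: Optional[str] = None  # custom cache dir for detection models
```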
Force-pushed from 1a6ea08 to 4339ac2
Thanks @tgasser-nv! Yes, you're right; here is how it looks (I hadn't saved the change):
I've extended the configuration, but I'm not sure we actually need to (I think it's better to revert that commit). Do we actually need these config options exposed?
Maybe the cleanest solution is: don't expose these options at all and keep `_detect_language(text)` simple. If a power user really needs custom settings, they can:
Regarding the evaluation scripts and datasets: let's address those in follow-up PRs to keep this one focused, once we establish a clear pattern for where they should live (e.g., scripts/benchmarks/, eval/, etc.) and how they should be maintained.
Greptile Overview
Greptile Summary
This PR adds automatic language detection to content safety refusal messages, returning responses in the user's detected language across 9 supported languages (English, Spanish, Chinese, German, French, Hindi, Japanese, Arabic, Thai).
Key Changes:
Implementation Quality:
| Filename | Score | Overview |
|---|---|---|
| nemoguardrails/library/content_safety/actions.py | 5/5 | Added language detection action with proper error handling, fallbacks, and integration with config |
| nemoguardrails/library/content_safety/flows.co | 5/5 | Integrated multilingual refusal messages into content safety flows for both input and output checks |
| nemoguardrails/rails/llm/config.py | 5/5 | Added well-structured configuration models for multilingual content safety with clear documentation |
| pyproject.toml | 5/5 | Added optional fast-langdetect dependency with proper extras configuration |
| tests/test_content_safety_actions.py | 5/5 | Comprehensive test coverage for language detection, refusal messages, and edge cases |
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant ContentSafetyFlow
    participant ContentSafetyCheck
    participant Config
    participant DetectLanguageAction
    participant LangDetector
    participant Bot
    User->>ContentSafetyFlow: Input message
    ContentSafetyFlow->>ContentSafetyCheck: Check safety (input/output)
    ContentSafetyCheck-->>ContentSafetyFlow: {allowed: false, policy_violations: [...]}
    alt multilingual enabled
        ContentSafetyFlow->>Config: Check multilingual.enabled
        Config-->>ContentSafetyFlow: enabled=true
        ContentSafetyFlow->>DetectLanguageAction: detect_language(user_message)
        DetectLanguageAction->>Config: Get multilingual config
        Config-->>DetectLanguageAction: custom_messages, max_text_length, normalize_text, cache_dir
        DetectLanguageAction->>LangDetector: detect(text)
        LangDetector-->>DetectLanguageAction: detected_lang (or None)
        DetectLanguageAction->>DetectLanguageAction: Fallback to 'en' if None or unsupported
        DetectLanguageAction->>DetectLanguageAction: Get refusal message (custom or default)
        DetectLanguageAction-->>ContentSafetyFlow: {language: lang, refusal_message: message}
        ContentSafetyFlow->>Bot: Send refusal_message
    else multilingual disabled
        ContentSafetyFlow->>Bot: Send default "refuse to respond"
    end
    Bot-->>User: Refusal message (in detected language)
```
9 files reviewed, no comments

Description
Detect user input language and return refusal messages in the same language when content safety rails block unsafe content. Supports 9 languages: English, Spanish, Chinese, German, French, Hindi, Japanese, Arabic, and Thai.
Language Detection Benchmark Results
Datasets Used
Chinese samples in Nemotron are all REDACTED; Chinese coverage validated via papluca dataset.
Prompt Length Analysis (characters)
Note: fast-langdetect truncates input at 80 characters by default (`max_input_length=80`), so longer prompts are effectively evaluated on their first 80 chars.

Overall Accuracy comparison
Latency comparison (μs)
Per-Language Accuracy (fast-langdetect)
Per-Language Accuracy (lingua)
Why fast-langdetect?
https://github.com/LlmKira/fast-langdetect
Error analysis
Most errors occur with:
The action correctly falls back to English (en) for unsupported detected languages.
Benchmark Scripts
Check out the temp/lang-detect-benchmark branch.
Located in eval/language_detection/:
Make sure you have `datasets` and `pandas` installed:
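For example, assuming a standard pip setup: `pip install datasets pandas`.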