Conversation

@Pouyanpi (Collaborator) commented Dec 2, 2025

Description

Detect user input language and return refusal messages in the same language when content safety rails block unsafe content. Supports 9 languages: English, Spanish, Chinese, German, French, Hindi, Japanese, Arabic, and Thai.

Language Detection Benchmark Results

Datasets Used

| Dataset | Description | Samples | Languages |
|---|---|---|---|
| papluca | language-identification | 40,500 | 9 (all supported) |
| nemotron | NVIDIA Nemotron-Safety-Guard-Dataset-v3 | 336,283 | 8 (missing zh*) |

*Chinese samples in Nemotron are all REDACTED; Chinese coverage is validated via the papluca dataset.

Prompt Length Analysis (characters)

| Dataset | Min | Max | Mean | P25 | P50 | P75 | P90 | P95 | P99 |
|---|---|---|---|---|---|---|---|---|---|
| papluca | 2 | 3,657 | 129.8 | 50 | 97 | 162 | 258 | 351 | 627 |
| nemotron | 1 | 20,750 | 303.5 | 51 | 111 | 331 | 625 | 1,072 | 3,004 |

Note: fast-langdetect truncates input at 80 characters by default (max_input_length=80), so longer prompts are effectively evaluated on only their first 80 characters.
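
A quick illustration of the truncation effect (the sample text is made up; detect(text, k=1) follows the usage shown later in this PR):

from fast_langdetect import detect

long_prompt = "Bonjour, pouvez-vous résumer le document suivant ? " + "Lorem ipsum dolor sit amet. " * 200
# With the default max_input_length=80, these two calls are effectively
# equivalent: detection only ever sees the first 80 characters.
print(detect(long_prompt, k=1))
print(detect(long_prompt[:80], k=1))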


Overall Accuracy comparison

| Dataset | Samples | fast-langdetect | lingua | detect_language action |
|---|---|---|---|---|
| papluca | 40,500 | 99.71% | 99.79% | 99.71% |
| nemotron | 336,283 | 99.35% | 99.46% | 99.42% |

Latency comparison (μs)

| Dataset | fast-langdetect Avg | fast-langdetect P95 | lingua Avg | lingua P95 | Action Avg | Action P95 |
|---|---|---|---|---|---|---|
| papluca | 12.12 | 15.54 | 116.21 | 205.29 | 25.77 | 28.75 |
| nemotron | 11.53 | 15.50 | 162.59 | 377.92 | 26.25 | 28.71 |

Per-Language Accuracy (fast-langdetect)

| Language | papluca | nemotron |
|---|---|---|
| ar (Arabic) | 98.87% | 99.63% |
| de (German) | 99.93% | 99.39% |
| en (English) | 100.00% | 99.03% |
| es (Spanish) | 100.00% | 99.04% |
| fr (French) | 99.98% | 99.25% |
| hi (Hindi) | 98.76% | 99.60% |
| ja (Japanese) | 100.00% | 99.61% |
| th (Thai) | 99.93% | 99.29% |
| zh (Chinese) | 99.93% | N/A |

Per-Language Accuracy (lingua)

| Language | papluca | nemotron |
|---|---|---|
| ar (Arabic) | 99.84% | 99.75% |
| de (German) | 100.00% | 99.55% |
| en (English) | 99.93% | 99.00% |
| es (Spanish) | 99.98% | 99.43% |
| fr (French) | 99.82% | 99.35% |
| hi (Hindi) | 98.80% | 99.81% |
| ja (Japanese) | 100.00% | 99.69% |
| th (Thai) | 99.78% | 99.12% |
| zh (Chinese) | 99.93% | N/A |

Why fast-langdetect?

https://github.com/LlmKira/fast-langdetect

  1. Permissive licensing: MIT license (code) and Creative Commons Attribution-ShareAlike 3.0 License (model).
  2. Comparable accuracy: within ~0.1% of lingua on both datasets (99.35% vs 99.46% on 336k samples).
  3. 10-14x faster: average latency of ~12μs vs ~140μs.
  4. Simpler integration: a single lightweight dependency.
  5. No cold-start issues: unlike lingua, which requires building its detector models up front.
  6. Lower risk of future dependency problems.

Error analysis

Most errors occur with:

  • short text (single words): insufficient context for detection
  • mixed-language content: text containing English within a non-English context
  • similar-language confusion: Spanish vs Galician, Hindi vs Marathi, Arabic vs Persian

The action correctly falls back to English (en) for unsupported detected languages.
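
A minimal sketch of that fallback (illustrative names, not the PR's actual code):

SUPPORTED_LANGUAGES = {"en", "es", "zh", "de", "fr", "hi", "ja", "ar", "th"}

def resolve_language(detected):
    # Undetected or unsupported languages fall back to English.
    return detected if detected in SUPPORTED_LANGUAGES else "en"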


Benchmark Scripts

Check out the temp/lang-detect-benchmark branch.

The scripts are located in eval/language_detection/. Make sure datasets and pandas are installed:

poetry run pip install pandas datasets
# run all benchmarks
poetry run python eval/language_detection/run_benchmarks.py

# Or run individually
poetry run python eval/language_detection/benchmark.py --dataset papluca --mode action --report eval/language_detection/reports/

codecov bot commented Dec 2, 2025

Codecov Report

❌ Patch coverage is 94.73684% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| nemoguardrails/library/content_safety/actions.py | 93.75% | 3 Missing ⚠️ |


Comment on lines +227 to +238
DEFAULT_REFUSAL_MESSAGES: Dict[str, str] = {
"en": "I'm sorry, I can't respond to that.",
"es": "Lo siento, no puedo responder a eso.",
"zh": "抱歉,我无法回应。",
"de": "Es tut mir leid, darauf kann ich nicht antworten.",
"fr": "Je suis désolé, je ne peux pas répondre à cela.",
"hi": "मुझे खेद है, मैं इसका जवाब नहीं दे सकता।",
"ja": "申し訳ありませんが、それには回答できません。",
"ar": "عذراً، لا أستطيع الرد على ذلك.",
"th": "ขออภัย ฉันไม่สามารถตอบได้",
}
Collaborator commented:

If we later had other multilingual rails, would we be repeating this mechanism in each rail? Or just the set of supported languages per rail? I don't think we need to do it now (since we don't have other multilingual rails to test it), but we should be aware of what refactoring would be needed to move the below language detection to a shared level.

Collaborator Author (@Pouyanpi) commented:

Of course, we can relax this constraint later and allow users more flexibility. Once we need to support other models or other types of rails (beyond content safety) that require multilingual responses, we can:

  1. Move the detect_language action from library/content_safety/actions.py to a shared location (nemoguardrails/actions/), making it available to all rails
  2. Introduce a Colang-level abstraction like bot refuse to respond $multilang=true; this could be done easily for Colang 2.0, but I think it is better not to add new Colang features for now

I agree; for now, keeping it scoped to content safety keeps the implementation focused.

try:
from fast_langdetect import detect

result = detect(text, k=1)
Collaborator commented:

Does fast-langdetect ever return a full locale with dialect, like en-US versus en? I don't see it in the docs, but I do see some upper/lowercase inconsistency.

Collaborator Author (@Pouyanpi) commented:

Fair point, thanks for raising it. I just took a closer look at the fast-langdetect source code and the fastText model behavior.

  • fast-langdetect README mentions BCP-47 tags like "zh-cn", "pt-br"
  • but the fastText lid.176.bin model uses simple ISO 639 codes: zh, pt, en, etc.
  • the fast-langdetect source simply strips the __label__ prefix from the fastText output; no regional mapping is applied

Validated with an actual test:

>>> detect("抱歉,我无法处理该请求", k=2)
[{'lang': 'zh', 'score': 0.80}, {'lang': 'ta', 'score': 0.08}]

returns "zh", NOT "zh-cn".

So no regional variant handling needed.

@tgasser-nv (Collaborator) commented Dec 3, 2025

This looks really good @Pouyanpi ! I have a few comments:

  • Could you commit the evaluation scripts as well in the final PR (for reproducibility?)
  • What does the "Action" column in the latency report refer to? Is this the latency end-to-end when fast-langdetect embedded in a Guardrails action? It approximately doubles the mean and p95.
    * Is it possible to customize refusal texts in Colang-only, or does it need a Python change? Just saw this is in the RailsConfig, that's perfect.
  • Could you calculate percentiles of prompt-length (ideally in tokens but characters is fine too) for each of the datasets?

Not needed in this PR, but I'm thinking of RAG prompts where LLM instructions, the user query, and relevant context chunks are all in a flattened prompt. These prompts can be pretty long (up to 7k tokens in some cases). I would be interested in a follow-on where we sample part of a prompt before running classification on the sample (e.g. 200 chars). This would be an optional config field. Customers would then have a knob to trade off accuracy vs latency for language detection.
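
A rough sketch of that follow-on knob (detect_sampled and sample_chars are hypothetical names; not part of this PR):

from fast_langdetect import detect

def detect_sampled(text, sample_chars=200):
    # Classify only a prefix of long flattened RAG prompts,
    # trading a little accuracy for lower latency.
    return detect(text[:sample_chars], k=1)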

@Pouyanpi (Collaborator, Author) commented Dec 10, 2025

  • Could you commit the evaluation scripts as well in the final PR (for reproducibility?)

I've included them in the temp/lang-detect-benchmark branch to make review easier. If you find it easier, I can include them in this PR.
But we don't intend to merge those into develop, right?

  • What does the "Action" column in the latency report refer to? Is this the latency end-to-end when fast-langdetect embedded in a Guardrails action? It approximately doubles the mean and p95.

Yes

* Is it possible to customize refusal texts in Colang-only, or does it need a Python change? Just saw this is in the RailsConfig, that's perfect.

Yes. I would like to avoid adding Colang-level features as much as possible.

  • Could you calculate percentiles of prompt-length (ideally in tokens but characters is fine too) for each of the datasets?

Done! Updated the description.

Not needed in this PR, but I'm thinking of RAG prompts where LLM instructions, the user query, and relevant context chunks are all in a flattened prompt. These prompts can be pretty long (up to 7k tokens in some cases). I would be interested in a follow-on where we sample part of a prompt before running classification on the sample (e.g. 200 chars). This would be an optional config field. Customers would then have a knob to trade off accuracy vs latency for language detection.

fast-langdetect already does this truncation by default, but we can indeed give users that flexibility:

  • max_input_length=80 characters (configurable)
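
A sketch of overriding that default (this assumes the LangDetector/LangDetectConfig interface referenced in the commit message further down; verify the exact keyword names against the fast-langdetect docs):

from fast_langdetect import LangDetectConfig, LangDetector

# Evaluate up to 200 characters instead of the default 80.
# NOTE: max_input_length is assumed from the discussion above.
detector = LangDetector(LangDetectConfig(max_input_length=200))
result = detector.detect("Bonjour, pouvez-vous résumer le document suivant ?")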

@tgasser-nv (Collaborator) commented:

  • Could you commit the evaluation scripts as well in the final PR (for reproducibility?)

I've included them in the temp/lang-detect-benchmark branch to make review easier. If you find it easier, I can include them in this PR. But we don't intend to merge those into develop, right?

Why wouldn't we merge them into develop? It's best practice in ML to make any results reproducible, for which we need the input datasets and scripts. The datasets are public and linked above. I'd imagine we'll have to re-run evals for new languages as they're added to the content-safety and other models. So we'll run this script periodically.

  • What does the "Action" column in the latency report refer to? Is this the latency end-to-end when fast-langdetect embedded in a Guardrails action? It approximately doubles the mean and p95.

Yes

Was that measured at a concurrency of 1? Having a 100% overhead for each language inference is a lot higher than I'd expect. We don't need to fix it in this PR.

* Is it possible to customize refusal texts in Colang-only, or does it need a Python change? Just saw this is in the RailsConfig, that's perfect.

Yes. I would like to avoid adding Colang-level features as much as possible.

+1

  • Could you calculate percentiles of prompt-length (ideally in tokens but characters is fine too) for each of the datasets?

Done! Updated the description.

Could you check? I didn't see any length description.

Not needed in this PR, but I'm thinking of RAG prompts where LLM instructions, the user query, and relevant context chunks are all in a flattened prompt. These prompts can be pretty long (up to 7k tokens in some cases). I would be interested in a follow-on where we sample part of a prompt before running classification on the sample (e.g. 200 chars). This would be an optional config field. Customers would then have a knob to trade off accuracy vs latency for language detection.

fast-langdetect already does this truncation by default, but we can indeed give users that flexibility:

  • max_input_length=80 characters (configurable)

Could you add optional Pydantic fields for any of these values that make sense to expose to users? Looking at the config, I think normalize_input, max_input_length, and model are all fields users might care about.

…age support

Detect user input language and return refusal messages in the same
language when content safety rails block unsafe content. Supports 9
languages: English, Spanish, Chinese, German, French, Hindi, Japanese,
Arabic, and Thai.
Add configurable parameters for language detection:
- max_text_length: Control maximum input text length for detection
- normalize_text: Toggle text normalization before detection
- cache_dir: Specify custom cache directory for detection models

Updated MultilingualConfig with new optional fields and modified
_detect_language to use LangDetector with custom configuration
instead of the simple detect function.
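
The configuration shape described above might look roughly like this (a sketch; field names follow the commit message, while defaults and types are assumptions rather than the PR's actual code):

from typing import Dict, Optional
from pydantic import BaseModel, Field

class MultilingualConfig(BaseModel):
    # Opt-in switch for multilingual refusal messages.
    enabled: bool = False
    # Optional per-language overrides of the default refusal messages.
    custom_messages: Dict[str, str] = Field(default_factory=dict)
    # None means "defer to the fast-langdetect defaults".
    max_text_length: Optional[int] = None
    normalize_text: Optional[bool] = None
    cache_dir: Optional[str] = None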
@Pouyanpi Pouyanpi force-pushed the feat/multilang-content-safety branch from 1a6ea08 to 4339ac2 Compare December 12, 2025 12:09
@Pouyanpi (Collaborator, Author) commented:

Thanks @tgasser-nv! Yes, you're right; here is how it looks (I hadn't saved the change):

[screenshot]

I've extended the configuration, but I'm not sure we actually need it (I think it's better to revert that commit).
Most users will just set multilingual.enabled: true and use the library defaults. The config options (max_text_length, normalize_text, cache_dir) are edge cases.

Do we actually need these config options exposed?

  • cache_dir -> users can set FTLANG_CACHE env var
  • normalize_text -> library default (true) is almost always correct
  • max_text_length -> library default (80) is optimized for accuracy

Maybe the cleanest solution is: don't expose these options at all and keep _detect_language(text) simple.

If a power user really needs custom settings, they can:

  1. Use FTLANG_CACHE env var for cache location
  2. Or we add config options later when there's a real need
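
For example (the path is illustrative; FTLANG_CACHE is the env var mentioned above):

import os

# Point fast-langdetect's model cache at a custom directory
# before the first detection call.
os.environ["FTLANG_CACHE"] = "/opt/nemo/ftlang-cache"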

The model field looks library-specific, and we are better off not including it in the config. cache_dir might be interesting, but we can document FTLANG_CACHE instead. Do you think we should keep the new config options or revert?

Regarding the evaluation scripts and datasets: let's address that in a follow-up PR to keep this one focused.
I think it's better to keep this type of analysis work in dedicated branches for now.

We can follow up in separate PRs once we establish a clear pattern for where these should live (e.g., scripts/benchmarks/, eval/, etc.) and how they should be maintained.

@Pouyanpi Pouyanpi marked this pull request as ready for review December 12, 2025 12:18
@Pouyanpi Pouyanpi marked this pull request as draft December 12, 2025 12:18
greptile-apps bot (Contributor) commented Dec 12, 2025

Greptile Overview

Greptile Summary

This PR adds automatic language detection to content safety refusal messages, returning responses in the user's detected language across 9 supported languages (English, Spanish, Chinese, German, French, Hindi, Japanese, Arabic, Thai).

Key Changes:

  • Added detect_language action using fast-langdetect library for language detection with 99%+ accuracy
  • Integrated language detection into content safety flows (both v1 and v2) with opt-in configuration
  • Created configurable MultilingualConfig with customizable refusal messages, text normalization, and caching options
  • Added comprehensive test coverage including edge cases, error handling, and configuration validation
  • Provided example configuration demonstrating the feature

Implementation Quality:

  • Proper error handling with graceful fallbacks (ImportError, detection failures default to English)
  • Optional dependency properly configured in pyproject.toml with multilingual extras
  • Well-documented configuration fields with clear defaults
  • Consistent implementation across both flow versions (flows.co and flows.v1.co)
  • No breaking changes - feature is opt-in via configuration

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The implementation demonstrates excellent software engineering practices with comprehensive test coverage (99%+ accuracy benchmarking), proper error handling with fallbacks, graceful degradation when dependencies are missing, and clear configuration design. The feature is opt-in with no breaking changes, and the library choice (fast-langdetect) is well-justified with benchmarking data showing 10-14x performance improvement over alternatives while maintaining comparable accuracy.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| nemoguardrails/library/content_safety/actions.py | 5/5 | Added language detection action with proper error handling, fallbacks, and integration with config |
| nemoguardrails/library/content_safety/flows.co | 5/5 | Integrated multilingual refusal messages into content safety flows for both input and output checks |
| nemoguardrails/rails/llm/config.py | 5/5 | Added well-structured configuration models for multilingual content safety with clear documentation |
| pyproject.toml | 5/5 | Added optional fast-langdetect dependency with proper extras configuration |
| tests/test_content_safety_actions.py | 5/5 | Comprehensive test coverage for language detection, refusal messages, and edge cases |

Sequence Diagram

sequenceDiagram
    participant User
    participant ContentSafetyFlow
    participant ContentSafetyCheck
    participant Config
    participant DetectLanguageAction
    participant LangDetector
    participant Bot

    User->>ContentSafetyFlow: Input message
    ContentSafetyFlow->>ContentSafetyCheck: Check safety (input/output)
    ContentSafetyCheck-->>ContentSafetyFlow: {allowed: false, policy_violations: [...]}
    
    alt multilingual enabled
        ContentSafetyFlow->>Config: Check multilingual.enabled
        Config-->>ContentSafetyFlow: enabled=true
        ContentSafetyFlow->>DetectLanguageAction: detect_language(user_message)
        DetectLanguageAction->>Config: Get multilingual config
        Config-->>DetectLanguageAction: custom_messages, max_text_length, normalize_text, cache_dir
        DetectLanguageAction->>LangDetector: detect(text)
        LangDetector-->>DetectLanguageAction: detected_lang (or None)
        DetectLanguageAction->>DetectLanguageAction: Fallback to 'en' if None or unsupported
        DetectLanguageAction->>DetectLanguageAction: Get refusal message (custom or default)
        DetectLanguageAction-->>ContentSafetyFlow: {language: lang, refusal_message: message}
        ContentSafetyFlow->>Bot: Send refusal_message
    else multilingual disabled
        ContentSafetyFlow->>Bot: Send default "refuse to respond"
    end
    
    Bot-->>User: Refusal message (in detected language)

greptile-apps bot left a comment: 9 files reviewed, no comments

@Pouyanpi Pouyanpi marked this pull request as ready for review December 12, 2025 12:23
@Pouyanpi Pouyanpi self-assigned this Dec 12, 2025
@Pouyanpi Pouyanpi added this to the 0.20.0 milestone Dec 12, 2025
@Pouyanpi Pouyanpi added the enhancement New feature or request label Dec 12, 2025
