@codeflash-ai codeflash-ai bot commented Nov 13, 2025

📄 1,026% (10.26x) speedup for is_model_allowed_by_pattern in litellm/proxy/auth/auth_checks.py

⏱️ Runtime: 21.6 milliseconds → 1.92 milliseconds (best of 5 runs)

📝 Explanation and details

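The optimization adds compiled regex caching to eliminate redundant regex compilation overhead. Instead of calling re.match() with a string pattern (which goes back through the re module's own compile path on every call: a lookup in its bounded internal cache, and a full recompilation whenever the pattern is no longer cached), the optimized version: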

  1. Pre-compiles patterns using re.compile() and stores them in a module-level cache _pattern_cache
  2. Reuses compiled patterns for repeated wildcard patterns, avoiding expensive regex compilation

Key Performance Impact:

  • The line profiler shows regex compilation (re.compile) now only happens 545 times instead of 4,086 times (87% reduction)
  • Each match becomes significantly faster when it reuses a pre-compiled pattern object rather than passing the pattern string to re.match()
  • Overall runtime improved from 21.6ms to 1.92ms (1,026% speedup)

Why This Works:
Regex compilation is computationally expensive: the pattern string has to be parsed and translated into the internal program that the regex engine executes. When the same wildcard patterns are used repeatedly (common in auth scenarios), caching the compiled regex objects eliminates this repeated overhead.
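To get a feel for the cost difference on a given machine, a small timing sketch along these lines can help. It is not the benchmark used for this PR: the pattern, the model name, and the helper names are made up for illustration, and `re.purge()` is called only to force a real compilation on each iteration.

```python
import re
import timeit

PATTERN = r"^openai/.*-32k$"  # illustrative regex, not taken from litellm
MODEL = "openai/gpt-4-32k"


def match_with_fresh_compile() -> bool:
    # Clear the re module's internal cache so compile() does real work.
    re.purge()
    return re.compile(PATTERN).match(MODEL) is not None


# Compiled once, up front, and reused for every call below.
COMPILED = re.compile(PATTERN)


def match_with_precompiled() -> bool:
    return COMPILED.match(MODEL) is not None


fresh_seconds = timeit.timeit(match_with_fresh_compile, number=2000)
cached_seconds = timeit.timeit(match_with_precompiled, number=2000)
print(f"fresh compile: {fresh_seconds:.4f}s, precompiled: {cached_seconds:.4f}s")
```

The absolute numbers will vary by interpreter version and hardware, so treat the output as a relative comparison only.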

Real-World Benefits:
Based on the function references, is_model_allowed_by_pattern is called within loops in _model_matches_any_wildcard_pattern_in_list(), making it a hot path for model authorization checks. The test results show particularly dramatic improvements (3000%+ speedups) for complex patterns with multiple wildcards or special characters, which are common in model naming schemes such as "bedrock/*" and "openai/gpt-*".

Test Case Performance:

  • Simple patterns (single wildcard): 100-200% speedup
  • Complex patterns with multiple wildcards or special chars: 3000-15000% speedup
  • Large-scale tests with repeated pattern usage: 5000%+ speedup

The optimization is most effective when the same wildcard patterns are used multiple times across authorization checks, which is the typical usage pattern in proxy authentication systems.
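One way to check how much pattern reuse a workload actually has is to count cache hits. The sketch below swaps in `functools.lru_cache` for the PR's `_pattern_cache` dict, purely because it exposes hit and miss counters; `compile_wildcard`, `any_pattern_allows`, and the wildcard translation are hypothetical names and assumptions, not litellm code.

```python
import re
from functools import lru_cache
from typing import List, Pattern


@lru_cache(maxsize=None)
def compile_wildcard(pattern: str) -> Pattern[str]:
    # Assumed translation: '*' matches any run of characters, the rest is literal.
    return re.compile(".*".join(re.escape(part) for part in pattern.split("*")))


def any_pattern_allows(model: str, patterns: List[str]) -> bool:
    # Mirrors the "loop over wildcard patterns" shape described above.
    return any(
        "*" in p and compile_wildcard(p).fullmatch(model) is not None
        for p in patterns
    )


patterns = ["bedrock/*", "openai/gpt-*"]
models = [f"openai/gpt-{i}" for i in range(100)]

allowed = [any_pattern_allows(m, patterns) for m in models]
info = compile_wildcard.cache_info()
print(f"compilations: {info.misses}, cache hits: {info.hits}")
```

With two patterns checked against 100 models, each pattern is compiled once and every later check is served from the cache, which is the reuse scenario the description refers to.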

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 4094 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import re

# imports
import pytest  # used for our unit tests
from litellm.proxy.auth.auth_checks import is_model_allowed_by_pattern

# unit tests

# --- Basic Test Cases ---

def test_exact_match_with_wildcard():
    # Wildcard at the end matches any suffix
    codeflash_output = is_model_allowed_by_pattern("bedrock/anthropic.claude-3-5-sonnet-20240620", "bedrock/*") # 3.74μs -> 1.93μs (93.9% faster)
    # Wildcard at start matches any prefix
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4", "*gpt-4") # 47.0μs -> 1.11μs (4142% faster)
    # Wildcard only pattern matches everything
    codeflash_output = is_model_allowed_by_pattern("any/model", "*") # 2.27μs -> 720ns (215% faster)
    # Wildcard in the middle
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4-32k", "openai/*-32k") # 37.3μs -> 997ns (3642% faster)
    # Single wildcard between two path segments
    codeflash_output = is_model_allowed_by_pattern("foo/bar/baz", "foo/*/baz") # 1.96μs -> 819ns (139% faster)

def test_no_wildcard_exact_match():
    # No wildcard, should always return False (per implementation)
    codeflash_output = not is_model_allowed_by_pattern("bedrock/anthropic.claude-3-5-sonnet-20240620", "bedrock/anthropic.claude-3-5-sonnet-20240620") # 484ns -> 552ns (12.3% slower)
    codeflash_output = not is_model_allowed_by_pattern("openai/gpt-4", "openai/gpt-4") # 327ns -> 320ns (2.19% faster)
    codeflash_output = not is_model_allowed_by_pattern("foo/bar", "foo/bar") # 235ns -> 226ns (3.98% faster)
    codeflash_output = not is_model_allowed_by_pattern("", "") # 209ns -> 213ns (1.88% slower)

def test_basic_non_matches():
    # Wildcard doesn't match
    codeflash_output = not is_model_allowed_by_pattern("openai/gpt-4", "bedrock/*") # 3.15μs -> 1.58μs (99.2% faster)
    codeflash_output = not is_model_allowed_by_pattern("bedrock/anthropic.claude-3-5-sonnet-20240620", "openai/*") # 1.55μs -> 734ns (112% faster)
    codeflash_output = not is_model_allowed_by_pattern("foo/bar/baz", "foo/*/qux") # 1.56μs -> 910ns (71.2% faster)
    codeflash_output = not is_model_allowed_by_pattern("foo/bar/baz", "bar/*") # 44.6μs -> 629ns (6983% faster)

# --- Edge Test Cases ---

def test_empty_strings():
    # Empty model against a wildcard-only pattern
    codeflash_output = is_model_allowed_by_pattern("", "*") # 3.16μs -> 1.62μs (95.0% faster)
    # Empty pattern without wildcard never matches
    codeflash_output = not is_model_allowed_by_pattern("", "") # 321ns -> 296ns (8.45% faster)
    # Model is empty, pattern is not
    codeflash_output = not is_model_allowed_by_pattern("", "foo/*") # 1.79μs -> 1.00μs (78.2% faster)
    # Pattern is empty, model is not
    codeflash_output = not is_model_allowed_by_pattern("foo", "") # 172ns -> 160ns (7.50% faster)

def test_only_wildcard_pattern():
    # Pattern is just "*", should match any model string
    codeflash_output = is_model_allowed_by_pattern("anything", "*") # 2.92μs -> 1.71μs (70.3% faster)
    codeflash_output = is_model_allowed_by_pattern("123", "*") # 1.33μs -> 772ns (71.9% faster)
    codeflash_output = is_model_allowed_by_pattern("", "*") # 930ns -> 561ns (65.8% faster)

def test_special_characters_in_model_and_pattern():
    # Model and pattern with regex special chars
    codeflash_output = is_model_allowed_by_pattern("foo.bar", "foo.*") # 3.22μs -> 1.94μs (66.3% faster)
    codeflash_output = is_model_allowed_by_pattern("foo+bar", "foo*bar") # 1.99μs -> 1.10μs (80.2% faster)
    codeflash_output = not is_model_allowed_by_pattern("foo/bar", "foo.bar*") # 52.3μs -> 939ns (5465% faster)
    # Pattern with multiple wildcards and special chars
    codeflash_output = is_model_allowed_by_pattern("foo.bar.baz", "foo*bar*baz") # 42.1μs -> 960ns (4290% faster)
    # Model with numbers and underscores
    codeflash_output = is_model_allowed_by_pattern("foo_123_bar", "foo*_bar") # 31.2μs -> 624ns (4892% faster)
    # Model containing bracket characters, pattern with wildcards only
    codeflash_output = not is_model_allowed_by_pattern("foo[bar]", "foo*bar*") # 33.6μs -> 715ns (4599% faster)

def test_wildcard_at_various_positions():
    # Wildcard at start
    codeflash_output = is_model_allowed_by_pattern("gpt-4", "*gpt-4") # 3.36μs -> 1.90μs (77.2% faster)
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4", "*gpt-4") # 1.66μs -> 845ns (96.4% faster)
    # Wildcard at end
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4", "openai/*") # 1.59μs -> 881ns (80.6% faster)
    # Wildcard in the middle
    codeflash_output = is_model_allowed_by_pattern("foo123bar", "foo*bar") # 1.35μs -> 877ns (53.6% faster)
    # Multiple wildcards
    codeflash_output = is_model_allowed_by_pattern("foo123bar456baz", "foo*bar*baz") # 1.35μs -> 806ns (68.0% faster)
    # Non-matching
    codeflash_output = not is_model_allowed_by_pattern("foo123baz", "foo*bar*baz") # 1.09μs -> 700ns (55.6% faster)

def test_pattern_with_multiple_stars():
    # Multiple '*' wildcards
    codeflash_output = is_model_allowed_by_pattern("foo123bar456baz", "foo*bar*baz") # 3.28μs -> 1.83μs (79.4% faster)
    codeflash_output = is_model_allowed_by_pattern("fooXYZbarABCbaz", "foo*bar*baz") # 1.53μs -> 930ns (64.8% faster)
    codeflash_output = not is_model_allowed_by_pattern("fooXYZbazABCbar", "foo*bar*baz") # 1.28μs -> 804ns (58.8% faster)

def test_pattern_is_only_star():
    # Only star should match anything, including empty string
    codeflash_output = is_model_allowed_by_pattern("", "*") # 3.04μs -> 1.52μs (101% faster)
    codeflash_output = is_model_allowed_by_pattern("abc", "*") # 1.45μs -> 722ns (100% faster)
    codeflash_output = is_model_allowed_by_pattern("123", "*") # 1.05μs -> 603ns (73.5% faster)

def test_model_and_pattern_are_identical_but_pattern_has_star():
    # Pattern with star should match model if star is replaced by empty string
    codeflash_output = is_model_allowed_by_pattern("foo", "f*o") # 49.3μs -> 1.61μs (2970% faster)
    codeflash_output = is_model_allowed_by_pattern("foo", "*foo*") # 37.7μs -> 944ns (3893% faster)
    codeflash_output = is_model_allowed_by_pattern("foo", "foo*") # 2.24μs -> 822ns (172% faster)
    codeflash_output = is_model_allowed_by_pattern("foo", "*foo") # 29.2μs -> 829ns (3426% faster)
    # Should not match if pattern structure doesn't fit
    codeflash_output = not is_model_allowed_by_pattern("foo", "f*oo*bar") # 34.9μs -> 619ns (5538% faster)

# --- Large Scale Test Cases ---

def test_large_number_of_models_and_patterns():
    # Test matching a large list of models against a pattern with a wildcard
    pattern = "prefix/*/suffix"
    models = [f"prefix/{i}/suffix" for i in range(1000)]
    for m in models:
        codeflash_output = is_model_allowed_by_pattern(m, pattern) # 946μs -> 466μs (103% faster)
    # Test that models not matching the pattern fail
    for i in range(1000):
        codeflash_output = not is_model_allowed_by_pattern(f"prefix/{i}/not_suffix", pattern) # 846μs -> 425μs (99.1% faster)


def test_performance_many_patterns():
    # Test performance with many different patterns
    models = [f"model_{i}" for i in range(500)]
    patterns = [f"model_{i}*" for i in range(500)]
    for model, pattern in zip(models, patterns):
        codeflash_output = is_model_allowed_by_pattern(model, pattern) # 14.6ms -> 268μs (5317% faster)
    # Negative case: model does not match pattern
    for i in range(500):
        codeflash_output = not is_model_allowed_by_pattern(f"foo_{i}", patterns[i]) # 417μs -> 194μs (115% faster)

def test_large_wildcard_middle():
    # Wildcard in the middle of a long string
    model = "prefix_" + "x" * 400 + "_suffix"
    pattern = "prefix_*_suffix"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 53.9μs -> 3.24μs (1563% faster)
    # Non-matching case
    model = "prefix_" + "x" * 400 + "_not_suffix"
    codeflash_output = not is_model_allowed_by_pattern(model, pattern) # 2.92μs -> 1.25μs (134% faster)

# --- Mutation-sensitive cases ---

def test_mutation_sensitive_cases():
    # If implementation changes, these should fail
    # Wildcard should match zero or more characters
    codeflash_output = is_model_allowed_by_pattern("foo", "f*o") # 3.60μs -> 1.96μs (83.8% faster)
    codeflash_output = is_model_allowed_by_pattern("fo", "f*o") # 1.33μs -> 796ns (67.2% faster)
    codeflash_output = is_model_allowed_by_pattern("fbaro", "f*o") # 1.10μs -> 622ns (77.3% faster)
    codeflash_output = not is_model_allowed_by_pattern("fbar", "f*o") # 1.03μs -> 676ns (53.1% faster)
    # Wildcard should not match across slashes if not present
    codeflash_output = is_model_allowed_by_pattern("foo/bar", "foo/*") # 42.9μs -> 1.21μs (3444% faster)
    codeflash_output = not is_model_allowed_by_pattern("foo/bar/baz", "foo/*") # 1.83μs -> 689ns (165% faster)
    # Wildcard does not match empty string if pattern is not just "*"
    codeflash_output = not is_model_allowed_by_pattern("", "foo*") # 29.7μs -> 648ns (4484% faster)
    codeflash_output = not is_model_allowed_by_pattern("", "foo/*") # 1.28μs -> 461ns (179% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import re

# imports
import pytest  # used for our unit tests
from litellm.proxy.auth.auth_checks import is_model_allowed_by_pattern

# unit tests

# ------------------------------
# 1. Basic Test Cases
# ------------------------------

def test_exact_match_with_wildcard():
    # Pattern with wildcard at end
    codeflash_output = is_model_allowed_by_pattern("bedrock/anthropic.claude-3-5-sonnet-20240620", "bedrock/*") # 53.5μs -> 2.06μs (2494% faster)
    # Pattern with wildcard at start
    codeflash_output = is_model_allowed_by_pattern("anthropic.claude-3-5-sonnet-20240620", "*sonnet-20240620") # 41.4μs -> 1.24μs (3251% faster)
    # Pattern with wildcard in middle
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4-0125", "openai/*-0125") # 35.0μs -> 1.13μs (2992% faster)
    # Pattern with only wildcard
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4-0125", "*") # 23.8μs -> 780ns (2956% faster)

def test_non_matching_patterns():
    # No wildcard, should always return False
    codeflash_output = is_model_allowed_by_pattern("bedrock/anthropic.claude-3-5-sonnet-20240620", "bedrock/anthropic") # 527ns -> 595ns (11.4% slower)
    # Wildcard in pattern, but doesn't match model
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4-0125", "bedrock/*") # 2.75μs -> 1.31μs (110% faster)
    # Wildcard at end, but prefix doesn't match
    codeflash_output = is_model_allowed_by_pattern("bedrock/anthropic.claude", "openai/*") # 44.0μs -> 836ns (5165% faster)

def test_multiple_wildcards():
    # Multiple wildcards in pattern
    codeflash_output = is_model_allowed_by_pattern("foo/bar/baz", "foo/*/baz") # 47.2μs -> 2.01μs (2242% faster)
    codeflash_output = is_model_allowed_by_pattern("foo/bar/baz/qux", "foo/*/baz/*") # 40.4μs -> 1.30μs (2996% faster)
    # Wildcards don't match
    codeflash_output = is_model_allowed_by_pattern("foo/bar/baz", "foo/*/qux") # 30.2μs -> 719ns (4097% faster)

def test_empty_model_and_pattern():
    # Empty model and/or pattern
    codeflash_output = is_model_allowed_by_pattern("", "*") # 3.33μs -> 1.48μs (125% faster)
    codeflash_output = is_model_allowed_by_pattern("", "") # 226ns -> 304ns (25.7% slower)
    codeflash_output = is_model_allowed_by_pattern("foo", "") # 165ns -> 167ns (1.20% slower)
    codeflash_output = is_model_allowed_by_pattern("", "foo*") # 1.71μs -> 836ns (104% faster)

# ------------------------------
# 2. Edge Test Cases
# ------------------------------

def test_pattern_with_special_regex_chars():
    # Pattern with regex special characters that should be treated literally except '*'
    codeflash_output = is_model_allowed_by_pattern("foo.bar", "foo.*") # 45.5μs -> 2.01μs (2161% faster)
    codeflash_output = is_model_allowed_by_pattern("foo[bar]", "foo[*]") # 37.9μs -> 964ns (3831% faster)
    codeflash_output = is_model_allowed_by_pattern("foo(bar)", "foo(*)") # 42.7μs -> 1.01μs (4134% faster)
    # Should not match if pattern doesn't align
    codeflash_output = is_model_allowed_by_pattern("foo.bar", "foo?*") # 32.2μs -> 916ns (3410% faster)

def test_model_with_special_chars():
    # Model with special regex characters, pattern with wildcard
    codeflash_output = is_model_allowed_by_pattern("foo$bar^baz", "foo*$baz") # 48.9μs -> 1.88μs (2495% faster)
    codeflash_output = is_model_allowed_by_pattern("foo$bar^baz", "*^baz") # 32.8μs -> 919ns (3467% faster)
    codeflash_output = is_model_allowed_by_pattern("foo$bar^baz", "foo$*baz") # 30.2μs -> 737ns (3995% faster)

def test_pattern_with_only_wildcards():
    # Only wildcards in pattern
    codeflash_output = is_model_allowed_by_pattern("anything", "*") # 3.29μs -> 1.50μs (120% faster)
    codeflash_output = is_model_allowed_by_pattern("anything", "**") # 40.3μs -> 877ns (4498% faster)
    codeflash_output = is_model_allowed_by_pattern("anything", "***") # 33.5μs -> 818ns (4000% faster)
    # Model is empty, pattern is all wildcards
    codeflash_output = is_model_allowed_by_pattern("", "*") # 1.69μs -> 459ns (268% faster)

def test_pattern_with_adjacent_wildcards():
    # Adjacent wildcards should behave as a single large wildcard
    codeflash_output = is_model_allowed_by_pattern("foo/bar/baz", "foo/**/baz") # 51.7μs -> 2.37μs (2086% faster)
    codeflash_output = is_model_allowed_by_pattern("foo/baz", "foo/**/baz") # 2.44μs -> 668ns (265% faster)
    codeflash_output = is_model_allowed_by_pattern("foo/baz", "foo/*/baz") # 1.25μs -> 596ns (110% faster)

def test_unicode_and_case_sensitivity():
    # Unicode in model and pattern
    codeflash_output = is_model_allowed_by_pattern("foo/ßar", "foo/*") # 3.48μs -> 1.74μs (100% faster)
    codeflash_output = is_model_allowed_by_pattern("foo/ßar", "foo/ß*") # 35.9μs -> 1.09μs (3196% faster)
    # Case sensitivity
    codeflash_output = is_model_allowed_by_pattern("Foo/Bar", "foo/*") # 1.87μs -> 603ns (210% faster)

def test_pattern_with_escaped_star():
    # Star in model, literal in pattern
    codeflash_output = is_model_allowed_by_pattern("foo*bar", "foo*bar") # 47.1μs -> 1.69μs (2691% faster)
    codeflash_output = is_model_allowed_by_pattern("foo*bar", "foo\\*bar") # 36.6μs -> 871ns (4102% faster)

def test_long_model_and_pattern():
    # Long model and pattern strings
    model = "a" * 100 + "/" + "b" * 100
    pattern = "a" * 100 + "/*"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 110μs -> 2.41μs (4491% faster)
    pattern = "a" * 99 + "/*"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 94.0μs -> 1.24μs (7510% faster)

# ------------------------------
# 3. Large Scale Test Cases
# ------------------------------


def test_large_model_and_pattern():
    # Very long model and pattern with wildcard
    prefix = "a" * 500
    suffix = "b" * 499
    model = prefix + suffix
    pattern = prefix + "*"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 397μs -> 5.67μs (6915% faster)
    # Pattern too short, should not match
    pattern = prefix[:-1] + "*"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 363μs -> 4.04μs (8883% faster)

def test_large_number_of_models():
    # Test a pattern against many models
    pattern = "foo/bar*"
    models = [f"foo/bar{i}" for i in range(1000)]
    # All should match
    for model in models:
        codeflash_output = is_model_allowed_by_pattern(model, pattern) # 924μs -> 447μs (107% faster)
    # Add one that shouldn't match
    codeflash_output = is_model_allowed_by_pattern("foo/baz0", pattern) # 912ns -> 585ns (55.9% faster)

def test_performance_with_large_inputs():
    # Test performance with large model and pattern (not exceeding 1000 chars)
    model = "x" * 999
    pattern = "x" * 998 + "*"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 720μs -> 5.88μs (12172% faster)
    pattern = "x" * 997 + "*"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 690μs -> 4.34μs (15815% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-is_model_allowed_by_pattern-mhx24r7h and push.

codeflash-ai bot requested a review from mashraf-222 on Nov 13, 2025, 06:38
codeflash-ai bot added the labels "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) on Nov 13, 2025