⚡️ Speed up function `is_model_allowed_by_pattern` by 1,026% #447

codeflash-ai · 2025-11-13T06:38:00Z

📄 1,026% (10.26x) speedup for `is_model_allowed_by_pattern` in `litellm/proxy/auth/auth_checks.py`

⏱️ Runtime : 21.6 milliseconds → 1.92 milliseconds (best of 5 runs)

📝 Explanation and details

The optimization adds compiled regex caching to eliminate redundant regex compilation overhead. Instead of calling re.match() with a string pattern (which compiles the regex on every call), the optimized version:

Pre-compiles patterns using re.compile() and stores them in a module-level cache _pattern_cache
Reuses compiled patterns for repeated wildcard patterns, avoiding expensive regex compilation

Key Performance Impact:

The line profiler shows regex compilation (re.compile) now only happens 545 times instead of 4,086 times (87% reduction)
Each re.match() call becomes significantly faster when using pre-compiled patterns
Overall runtime improved from 21.6ms to 1.92ms (1026% speedup)

Why This Works:
Regex compilation is computationally expensive, involving parsing the pattern string and building a finite state machine. When the same wildcard patterns are used repeatedly (common in auth scenarios), caching the compiled regex objects eliminates this repeated overhead.

Real-World Benefits:
Based on the function references, is_model_allowed_by_pattern is called within loops in _model_matches_any_wildcard_pattern_in_list(), making it a hot path for model authorization checks. The test results show particularly dramatic improvements (3000%+ speedups) for complex patterns with multiple wildcards or special characters, which are common in model naming schemes like "bedrock/", "openai/gpt-", etc.

Test Case Performance:

Simple patterns (single wildcard): 100-200% speedup
Complex patterns with multiple wildcards or special chars: 3000-15000% speedup
Large-scale tests with repeated pattern usage: 5000%+ speedup

The optimization is most effective when the same wildcard patterns are used multiple times across authorization checks, which is the typical usage pattern in proxy authentication systems.

✅ Correctness verification report:

Test	Status
⚙️ Existing Unit Tests	🔘 None Found
🌀 Generated Regression Tests	✅ 4094 Passed
⏪ Replay Tests	🔘 None Found
🔎 Concolic Coverage Tests	🔘 None Found
📊 Tests Coverage	100.0%

🌀 Generated Regression Tests and Runtime

import re

# imports
import pytest  # used for our unit tests
from litellm.proxy.auth.auth_checks import is_model_allowed_by_pattern

# unit tests

# --- Basic Test Cases ---

def test_exact_match_with_wildcard():
    # Wildcard at the end matches any suffix
    codeflash_output = is_model_allowed_by_pattern("bedrock/anthropic.claude-3-5-sonnet-20240620", "bedrock/*") # 3.74μs -> 1.93μs (93.9% faster)
    # Wildcard at start matches any prefix
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4", "*gpt-4") # 47.0μs -> 1.11μs (4142% faster)
    # Wildcard only pattern matches everything
    codeflash_output = is_model_allowed_by_pattern("any/model", "*") # 2.27μs -> 720ns (215% faster)
    # Wildcard in the middle
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4-32k", "openai/*-32k") # 37.3μs -> 997ns (3642% faster)
    # Multiple wildcards
    codeflash_output = is_model_allowed_by_pattern("foo/bar/baz", "foo/*/baz") # 1.96μs -> 819ns (139% faster)

def test_no_wildcard_exact_match():
    # No wildcard, should always return False (per implementation)
    codeflash_output = not is_model_allowed_by_pattern("bedrock/anthropic.claude-3-5-sonnet-20240620", "bedrock/anthropic.claude-3-5-sonnet-20240620") # 484ns -> 552ns (12.3% slower)
    codeflash_output = not is_model_allowed_by_pattern("openai/gpt-4", "openai/gpt-4") # 327ns -> 320ns (2.19% faster)
    codeflash_output = not is_model_allowed_by_pattern("foo/bar", "foo/bar") # 235ns -> 226ns (3.98% faster)
    codeflash_output = not is_model_allowed_by_pattern("", "") # 209ns -> 213ns (1.88% slower)

def test_basic_non_matches():
    # Wildcard doesn't match
    codeflash_output = not is_model_allowed_by_pattern("openai/gpt-4", "bedrock/*") # 3.15μs -> 1.58μs (99.2% faster)
    codeflash_output = not is_model_allowed_by_pattern("bedrock/anthropic.claude-3-5-sonnet-20240620", "openai/*") # 1.55μs -> 734ns (112% faster)
    codeflash_output = not is_model_allowed_by_pattern("foo/bar/baz", "foo/*/qux") # 1.56μs -> 910ns (71.2% faster)
    codeflash_output = not is_model_allowed_by_pattern("foo/bar/baz", "bar/*") # 44.6μs -> 629ns (6983% faster)

# --- Edge Test Cases ---

def test_empty_strings():
    # Empty pattern with wildcard matches only empty model
    codeflash_output = is_model_allowed_by_pattern("", "*") # 3.16μs -> 1.62μs (95.0% faster)
    # Empty pattern without wildcard never matches
    codeflash_output = not is_model_allowed_by_pattern("", "") # 321ns -> 296ns (8.45% faster)
    # Model is empty, pattern is not
    codeflash_output = not is_model_allowed_by_pattern("", "foo/*") # 1.79μs -> 1.00μs (78.2% faster)
    # Pattern is empty, model is not
    codeflash_output = not is_model_allowed_by_pattern("foo", "") # 172ns -> 160ns (7.50% faster)

def test_only_wildcard_pattern():
    # Pattern is just "*", should match any model string
    codeflash_output = is_model_allowed_by_pattern("anything", "*") # 2.92μs -> 1.71μs (70.3% faster)
    codeflash_output = is_model_allowed_by_pattern("123", "*") # 1.33μs -> 772ns (71.9% faster)
    codeflash_output = is_model_allowed_by_pattern("", "*") # 930ns -> 561ns (65.8% faster)

def test_special_characters_in_model_and_pattern():
    # Model and pattern with regex special chars
    codeflash_output = is_model_allowed_by_pattern("foo.bar", "foo.*") # 3.22μs -> 1.94μs (66.3% faster)
    codeflash_output = is_model_allowed_by_pattern("foo+bar", "foo*bar") # 1.99μs -> 1.10μs (80.2% faster)
    codeflash_output = not is_model_allowed_by_pattern("foo/bar", "foo.bar*") # 52.3μs -> 939ns (5465% faster)
    # Pattern with multiple wildcards and special chars
    codeflash_output = is_model_allowed_by_pattern("foo.bar.baz", "foo*bar*baz") # 42.1μs -> 960ns (4290% faster)
    # Model with numbers and underscores
    codeflash_output = is_model_allowed_by_pattern("foo_123_bar", "foo*_bar") # 31.2μs -> 624ns (4892% faster)
    # Pattern with escaped characters
    codeflash_output = not is_model_allowed_by_pattern("foo[bar]", "foo*bar*") # 33.6μs -> 715ns (4599% faster)

def test_wildcard_at_various_positions():
    # Wildcard at start
    codeflash_output = is_model_allowed_by_pattern("gpt-4", "*gpt-4") # 3.36μs -> 1.90μs (77.2% faster)
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4", "*gpt-4") # 1.66μs -> 845ns (96.4% faster)
    # Wildcard at end
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4", "openai/*") # 1.59μs -> 881ns (80.6% faster)
    # Wildcard in the middle
    codeflash_output = is_model_allowed_by_pattern("foo123bar", "foo*bar") # 1.35μs -> 877ns (53.6% faster)
    # Multiple wildcards
    codeflash_output = is_model_allowed_by_pattern("foo123bar456baz", "foo*bar*baz") # 1.35μs -> 806ns (68.0% faster)
    # Non-matching
    codeflash_output = not is_model_allowed_by_pattern("foo123baz", "foo*bar*baz") # 1.09μs -> 700ns (55.6% faster)

def test_pattern_with_multiple_stars():
    # Multiple '*' wildcards
    codeflash_output = is_model_allowed_by_pattern("foo123bar456baz", "foo*bar*baz") # 3.28μs -> 1.83μs (79.4% faster)
    codeflash_output = is_model_allowed_by_pattern("fooXYZbarABCbaz", "foo*bar*baz") # 1.53μs -> 930ns (64.8% faster)
    codeflash_output = not is_model_allowed_by_pattern("fooXYZbazABCbar", "foo*bar*baz") # 1.28μs -> 804ns (58.8% faster)

def test_pattern_is_only_star():
    # Only star should match anything, including empty string
    codeflash_output = is_model_allowed_by_pattern("", "*") # 3.04μs -> 1.52μs (101% faster)
    codeflash_output = is_model_allowed_by_pattern("abc", "*") # 1.45μs -> 722ns (100% faster)
    codeflash_output = is_model_allowed_by_pattern("123", "*") # 1.05μs -> 603ns (73.5% faster)

def test_model_and_pattern_are_identical_but_pattern_has_star():
    # Pattern with star should match model if star is replaced by empty string
    codeflash_output = is_model_allowed_by_pattern("foo", "f*o") # 49.3μs -> 1.61μs (2970% faster)
    codeflash_output = is_model_allowed_by_pattern("foo", "*foo*") # 37.7μs -> 944ns (3893% faster)
    codeflash_output = is_model_allowed_by_pattern("foo", "foo*") # 2.24μs -> 822ns (172% faster)
    codeflash_output = is_model_allowed_by_pattern("foo", "*foo") # 29.2μs -> 829ns (3426% faster)
    # Should not match if pattern structure doesn't fit
    codeflash_output = not is_model_allowed_by_pattern("foo", "f*oo*bar") # 34.9μs -> 619ns (5538% faster)

# --- Large Scale Test Cases ---

def test_large_number_of_models_and_patterns():
    # Test matching a large list of models against a pattern with a wildcard
    pattern = "prefix/*/suffix"
    models = [f"prefix/{i}/suffix" for i in range(1000)]
    for m in models:
        codeflash_output = is_model_allowed_by_pattern(m, pattern) # 946μs -> 466μs (103% faster)
    # Test that models not matching the pattern fail
    for i in range(1000):
        codeflash_output = not is_model_allowed_by_pattern(f"prefix/{i}/not_suffix", pattern) # 846μs -> 425μs (99.1% faster)


def test_performance_many_patterns():
    # Test performance with many different patterns
    models = [f"model_{i}" for i in range(500)]
    patterns = [f"model_{i}*" for i in range(500)]
    for model, pattern in zip(models, patterns):
        codeflash_output = is_model_allowed_by_pattern(model, pattern) # 14.6ms -> 268μs (5317% faster)
    # Negative case: model does not match pattern
    for i in range(500):
        codeflash_output = not is_model_allowed_by_pattern(f"foo_{i}", patterns[i]) # 417μs -> 194μs (115% faster)

def test_large_wildcard_middle():
    # Wildcard in the middle of a long string
    model = "prefix_" + "x" * 400 + "_suffix"
    pattern = "prefix_*_suffix"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 53.9μs -> 3.24μs (1563% faster)
    # Non-matching case
    model = "prefix_" + "x" * 400 + "_not_suffix"
    codeflash_output = not is_model_allowed_by_pattern(model, pattern) # 2.92μs -> 1.25μs (134% faster)

# --- Mutation-sensitive cases ---

def test_mutation_sensitive_cases():
    # If implementation changes, these should fail
    # Wildcard should match zero or more characters
    codeflash_output = is_model_allowed_by_pattern("foo", "f*o") # 3.60μs -> 1.96μs (83.8% faster)
    codeflash_output = is_model_allowed_by_pattern("fo", "f*o") # 1.33μs -> 796ns (67.2% faster)
    codeflash_output = is_model_allowed_by_pattern("fbaro", "f*o") # 1.10μs -> 622ns (77.3% faster)
    codeflash_output = not is_model_allowed_by_pattern("fbar", "f*o") # 1.03μs -> 676ns (53.1% faster)
    # Wildcard should not match across slashes if not present
    codeflash_output = is_model_allowed_by_pattern("foo/bar", "foo/*") # 42.9μs -> 1.21μs (3444% faster)
    codeflash_output = not is_model_allowed_by_pattern("foo/bar/baz", "foo/*") # 1.83μs -> 689ns (165% faster)
    # Wildcard does not match empty string if pattern is not just "*"
    codeflash_output = not is_model_allowed_by_pattern("", "foo*") # 29.7μs -> 648ns (4484% faster)
    codeflash_output = not is_model_allowed_by_pattern("", "foo/*") # 1.28μs -> 461ns (179% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import re

# imports
import pytest  # used for our unit tests
from litellm.proxy.auth.auth_checks import is_model_allowed_by_pattern

# unit tests

# ------------------------------
# 1. Basic Test Cases
# ------------------------------

def test_exact_match_with_wildcard():
    # Pattern with wildcard at end
    codeflash_output = is_model_allowed_by_pattern("bedrock/anthropic.claude-3-5-sonnet-20240620", "bedrock/*") # 53.5μs -> 2.06μs (2494% faster)
    # Pattern with wildcard at start
    codeflash_output = is_model_allowed_by_pattern("anthropic.claude-3-5-sonnet-20240620", "*sonnet-20240620") # 41.4μs -> 1.24μs (3251% faster)
    # Pattern with wildcard in middle
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4-0125", "openai/*-0125") # 35.0μs -> 1.13μs (2992% faster)
    # Pattern with only wildcard
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4-0125", "*") # 23.8μs -> 780ns (2956% faster)

def test_non_matching_patterns():
    # No wildcard, should always return False
    codeflash_output = is_model_allowed_by_pattern("bedrock/anthropic.claude-3-5-sonnet-20240620", "bedrock/anthropic") # 527ns -> 595ns (11.4% slower)
    # Wildcard in pattern, but doesn't match model
    codeflash_output = is_model_allowed_by_pattern("openai/gpt-4-0125", "bedrock/*") # 2.75μs -> 1.31μs (110% faster)
    # Wildcard at end, but prefix doesn't match
    codeflash_output = is_model_allowed_by_pattern("bedrock/anthropic.claude", "openai/*") # 44.0μs -> 836ns (5165% faster)

def test_multiple_wildcards():
    # Multiple wildcards in pattern
    codeflash_output = is_model_allowed_by_pattern("foo/bar/baz", "foo/*/baz") # 47.2μs -> 2.01μs (2242% faster)
    codeflash_output = is_model_allowed_by_pattern("foo/bar/baz/qux", "foo/*/baz/*") # 40.4μs -> 1.30μs (2996% faster)
    # Wildcards don't match
    codeflash_output = is_model_allowed_by_pattern("foo/bar/baz", "foo/*/qux") # 30.2μs -> 719ns (4097% faster)

def test_empty_model_and_pattern():
    # Empty model and/or pattern
    codeflash_output = is_model_allowed_by_pattern("", "*") # 3.33μs -> 1.48μs (125% faster)
    codeflash_output = is_model_allowed_by_pattern("", "") # 226ns -> 304ns (25.7% slower)
    codeflash_output = is_model_allowed_by_pattern("foo", "") # 165ns -> 167ns (1.20% slower)
    codeflash_output = is_model_allowed_by_pattern("", "foo*") # 1.71μs -> 836ns (104% faster)

# ------------------------------
# 2. Edge Test Cases
# ------------------------------

def test_pattern_with_special_regex_chars():
    # Pattern with regex special characters that should be treated literally except '*'
    codeflash_output = is_model_allowed_by_pattern("foo.bar", "foo.*") # 45.5μs -> 2.01μs (2161% faster)
    codeflash_output = is_model_allowed_by_pattern("foo[bar]", "foo[*]") # 37.9μs -> 964ns (3831% faster)
    codeflash_output = is_model_allowed_by_pattern("foo(bar)", "foo(*)") # 42.7μs -> 1.01μs (4134% faster)
    # Should not match if pattern doesn't align
    codeflash_output = is_model_allowed_by_pattern("foo.bar", "foo?*") # 32.2μs -> 916ns (3410% faster)

def test_model_with_special_chars():
    # Model with special regex characters, pattern with wildcard
    codeflash_output = is_model_allowed_by_pattern("foo$bar^baz", "foo*$baz") # 48.9μs -> 1.88μs (2495% faster)
    codeflash_output = is_model_allowed_by_pattern("foo$bar^baz", "*^baz") # 32.8μs -> 919ns (3467% faster)
    codeflash_output = is_model_allowed_by_pattern("foo$bar^baz", "foo$*baz") # 30.2μs -> 737ns (3995% faster)

def test_pattern_with_only_wildcards():
    # Only wildcards in pattern
    codeflash_output = is_model_allowed_by_pattern("anything", "*") # 3.29μs -> 1.50μs (120% faster)
    codeflash_output = is_model_allowed_by_pattern("anything", "**") # 40.3μs -> 877ns (4498% faster)
    codeflash_output = is_model_allowed_by_pattern("anything", "***") # 33.5μs -> 818ns (4000% faster)
    # Model is empty, pattern is all wildcards
    codeflash_output = is_model_allowed_by_pattern("", "*") # 1.69μs -> 459ns (268% faster)

def test_pattern_with_adjacent_wildcards():
    # Adjacent wildcards should behave as a single large wildcard
    codeflash_output = is_model_allowed_by_pattern("foo/bar/baz", "foo/**/baz") # 51.7μs -> 2.37μs (2086% faster)
    codeflash_output = is_model_allowed_by_pattern("foo/baz", "foo/**/baz") # 2.44μs -> 668ns (265% faster)
    codeflash_output = is_model_allowed_by_pattern("foo/baz", "foo/*/baz") # 1.25μs -> 596ns (110% faster)

def test_unicode_and_case_sensitivity():
    # Unicode in model and pattern
    codeflash_output = is_model_allowed_by_pattern("foo/ßar", "foo/*") # 3.48μs -> 1.74μs (100% faster)
    codeflash_output = is_model_allowed_by_pattern("foo/ßar", "foo/ß*") # 35.9μs -> 1.09μs (3196% faster)
    # Case sensitivity
    codeflash_output = is_model_allowed_by_pattern("Foo/Bar", "foo/*") # 1.87μs -> 603ns (210% faster)

def test_pattern_with_escaped_star():
    # Star in model, literal in pattern
    codeflash_output = is_model_allowed_by_pattern("foo*bar", "foo*bar") # 47.1μs -> 1.69μs (2691% faster)
    codeflash_output = is_model_allowed_by_pattern("foo*bar", "foo\\*bar") # 36.6μs -> 871ns (4102% faster)

def test_long_model_and_pattern():
    # Long model and pattern strings
    model = "a" * 100 + "/" + "b" * 100
    pattern = "a" * 100 + "/*"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 110μs -> 2.41μs (4491% faster)
    pattern = "a" * 99 + "/*"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 94.0μs -> 1.24μs (7510% faster)

# ------------------------------
# 3. Large Scale Test Cases
# ------------------------------


def test_large_model_and_pattern():
    # Very long model and pattern with wildcard
    prefix = "a" * 500
    suffix = "b" * 499
    model = prefix + suffix
    pattern = prefix + "*"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 397μs -> 5.67μs (6915% faster)
    # Pattern too short, should not match
    pattern = prefix[:-1] + "*"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 363μs -> 4.04μs (8883% faster)

def test_large_number_of_models():
    # Test a pattern against many models
    pattern = "foo/bar*"
    models = [f"foo/bar{i}" for i in range(1000)]
    # All should match
    for model in models:
        codeflash_output = is_model_allowed_by_pattern(model, pattern) # 924μs -> 447μs (107% faster)
    # Add one that shouldn't match
    codeflash_output = is_model_allowed_by_pattern("foo/baz0", pattern) # 912ns -> 585ns (55.9% faster)

def test_performance_with_large_inputs():
    # Test performance with large model and pattern (not exceeding 1000 chars)
    model = "x" * 999
    pattern = "x" * 998 + "*"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 720μs -> 5.88μs (12172% faster)
    pattern = "x" * 997 + "*"
    codeflash_output = is_model_allowed_by_pattern(model, pattern) # 690μs -> 4.34μs (15815% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-is_model_allowed_by_pattern-mhx24r7h and push.

The optimization adds **compiled regex caching** to eliminate redundant regex compilation overhead. Instead of calling `re.match()` with a string pattern (which compiles the regex on every call), the optimized version: 1. **Pre-compiles patterns** using `re.compile()` and stores them in a module-level cache `_pattern_cache` 2. **Reuses compiled patterns** for repeated wildcard patterns, avoiding expensive regex compilation **Key Performance Impact:** - The line profiler shows regex compilation (`re.compile`) now only happens 545 times instead of 4,086 times (87% reduction) - Each `re.match()` call becomes significantly faster when using pre-compiled patterns - Overall runtime improved from 21.6ms to 1.92ms (1026% speedup) **Why This Works:** Regex compilation is computationally expensive, involving parsing the pattern string and building a finite state machine. When the same wildcard patterns are used repeatedly (common in auth scenarios), caching the compiled regex objects eliminates this repeated overhead. **Real-World Benefits:** Based on the function references, `is_model_allowed_by_pattern` is called within loops in `_model_matches_any_wildcard_pattern_in_list()`, making it a hot path for model authorization checks. The test results show particularly dramatic improvements (3000%+ speedups) for complex patterns with multiple wildcards or special characters, which are common in model naming schemes like "bedrock/*", "openai/gpt-*", etc. **Test Case Performance:** - Simple patterns (single wildcard): 100-200% speedup - Complex patterns with multiple wildcards or special chars: 3000-15000% speedup - Large-scale tests with repeated pattern usage: 5000%+ speedup The optimization is most effective when the same wildcard patterns are used multiple times across authorization checks, which is the typical usage pattern in proxy authentication systems.

codeflash-ai bot requested a review from mashraf-222 November 13, 2025 06:38

codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Nov 13, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

⚡️ Speed up function `is_model_allowed_by_pattern` by 1,026% #447

⚡️ Speed up function `is_model_allowed_by_pattern` by 1,026% #447

Uh oh!

codeflash-ai bot commented Nov 13, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

⚡️ Speed up function is_model_allowed_by_pattern by 1,026% #447

Are you sure you want to change the base?

⚡️ Speed up function is_model_allowed_by_pattern by 1,026% #447

Uh oh!

Conversation

codeflash-ai bot commented Nov 13, 2025

📄 1,026% (10.26x) speedup for is_model_allowed_by_pattern in litellm/proxy/auth/auth_checks.py

📝 Explanation and details

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

⚡️ Speed up function `is_model_allowed_by_pattern` by 1,026% #447

⚡️ Speed up function `is_model_allowed_by_pattern` by 1,026% #447

📄 1,026% (10.26x) speedup for `is_model_allowed_by_pattern` in `litellm/proxy/auth/auth_checks.py`