@aldehir aldehir commented Jan 1, 2026

I was testing ggml-org/llama.cpp#18353 and test-chat is unbearably slow on MSVC (always has been).

The cause is MSVC's std::regex engine. The regex_search() calls take a considerable amount of time during tokenization. I added some profiling and tested with the Apertus-8B-Instruct.jinja template:

[Profile] parseExpression: 4.93307 s (102 calls)
[Profile] non_text_regex: 0.0153389 s (319 calls)
[Profile] block_keyword_regex: 0.0048339 s (216 calls)
[Profile] block_close_regex: 0.0068446 s (216 calls)
[Profile] block_open_regex: 0.0817274 s (535 calls)
[Profile] expr_close_regex: 0.0035717 s (102 calls)
[Profile] expr_open_regex: 0.0604591 s (637 calls)
[Profile] comment_regex: 3.11919 s (639 calls)
[Profile] tokenize: 18.1967 s
Parsed template from: .\Apertus-8B-Instruct.jinja

18.2s to parse.
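The PR does not show how the `[Profile]` lines were produced. As a rough sketch only, per-label timings and call counts like these could be gathered with a scoped timer; `ProfileEntry`, `profile_table`, `ScopedProfile`, and `print_profile` are hypothetical names, not code from the repository.

```cpp
#include <chrono>
#include <cstdio>
#include <map>
#include <string>
#include <utility>

// Hypothetical profiling helper; the PR does not show its instrumentation.
struct ProfileEntry {
    double seconds = 0.0;  // accumulated wall-clock time
    long   calls   = 0;    // number of timed scopes
};

// Single shared table keyed by label (e.g. "comment_regex").
static std::map<std::string, ProfileEntry> & profile_table() {
    static std::map<std::string, ProfileEntry> table;
    return table;
}

// RAII timer: adds the elapsed time to its label when it goes out of scope.
class ScopedProfile {
public:
    explicit ScopedProfile(std::string label)
        : label_(std::move(label)), start_(std::chrono::steady_clock::now()) {}

    ~ScopedProfile() {
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        ProfileEntry & entry = profile_table()[label_];
        entry.seconds += elapsed.count();
        entry.calls   += 1;
    }

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Prints one line per label in a format similar to the logs in this PR.
static void print_profile() {
    for (const auto & kv : profile_table()) {
        std::printf("[Profile] %s: %g s (%ld calls)\n",
                    kv.first.c_str(), kv.second.seconds, kv.second.calls);
    }
}
```

Wrapping each `regex_search()` call in a block that declares a `ScopedProfile` with the regex's name would yield one accumulated entry per regex.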

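This PR adds std::regex_constants::match_continuous to the regex_search() calls. The flag tells the regex engine to match only at the start of the searched range, which is what the existing match.position() == 0 check already requires. Without it, a search that fails at the current position keeps scanning the rest of the input for a later match that the position check then rejects.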
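A minimal sketch of the pattern, not the actual diff: `match_at` is a hypothetical helper, and the `\{\{` pattern is only an illustrative stand-in for the tokenizer's real expressions. It shows `std::regex_search` being anchored to the current position with `match_continuous`.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper, not taken from the PR: returns the length of the
// match if `re` matches starting exactly at `pos`, or 0 otherwise.
static std::size_t match_at(const std::string & input, std::size_t pos,
                            const std::regex & re) {
    std::smatch m;
    auto first = input.begin() + static_cast<std::string::difference_type>(pos);
    // match_continuous restricts the search to matches that begin at `first`,
    // so the engine does not retry at every later offset in the input.
    if (std::regex_search(first, input.end(), m, re,
                          std::regex_constants::match_continuous)) {
        return static_cast<std::size_t>(m.length(0));
    }
    return 0;
}
```

With the flag set, a successful match always has `m.position() == 0`, so a separate position check becomes redundant but harmless.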

[Profile] parseExpression: 0.0225291 s (102 calls)
[Profile] non_text_regex: 0.013297 s (319 calls)
[Profile] block_keyword_regex: 0.0039072 s (216 calls)
[Profile] block_close_regex: 0.0052241 s (216 calls)
[Profile] block_open_regex: 0.0081572 s (535 calls)
[Profile] expr_close_regex: 0.0025207 s (102 calls)
[Profile] expr_open_regex: 0.0064239 s (637 calls)
[Profile] comment_regex: 0.0076846 s (639 calls)
[Profile] tokenize: 0.117085 s
Parsed template from: .\Apertus-8B-Instruct.jinja

0.12s, much better.

Collaborator

@CISC CISC left a comment

The model download CI failures are a PITA, but LGTM.

@CISC CISC merged commit 158c01c into ochafik:main Jan 2, 2026
2 of 28 checks passed