@MadLittleMods MadLittleMods commented Oct 23, 2025

Cheaper logcontext debug logs (pseudo_random_string(...))

Follow-up to #18966

During the weekly Backend team meeting, it was mentioned that random_string(...) was taking a significant amount of CPU on matrix.org. This makes sense as it relies on secrets.choice(...), which draws from the operating system's cryptographically secure random source and is comparatively expensive per call. And since #18966, we're calling random_string(...) as part of a bunch of logcontext utilities.

Since we don't need cryptographically secure random strings for our debug logs, this PR introduces a new pseudo_random_string(...) function built on random.choice(...), which draws from a pseudo-random number generator that is "both fast and threadsafe".

Dev notes

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct (run the linters)

return "".join(secrets.choice(_string_with_symbols) for _ in range(length))


def pseudo_random_string(length: int) -> str:
Contributor Author

I also considered naming this random_string_insecure_fast

🤷

Contributor

I think we should call this insecure_random_string. Being pseudo-random doesn't necessarily imply insecure; that's what CSPRNGs are for, after all.

It's maybe not a huge deal, but given the sheer hazard of using the wrong type of randomness in the wrong place, I much prefer the clear and simple insecure label, because it draws attention to an important (negative) caveat. In theory that could raise some alarm bells at a critical time during review.

On the other hand, I don't think it's important to say fast: if someone consciously thinks about the speed at PR review time, they can look it up. (But I'm also not against calling it 'fast', to be fair!)

Comment on lines 862 to 866
instance_id = pseudo_random_string(5)
calling_context = current_context()
logcontext_debug_logger.debug(
    "run_in_background(%s): called with logcontext=%s", instance_id, calling_context
)
Contributor Author

As another alternative, we could gate the random string creation behind if logcontext_debug_logger.isEnabledFor(logging.DEBUG)
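A rough sketch of what that gating could look like, so the ID is only generated when debug logging is actually on. The logger name, the `run_in_background_sketch` wrapper and the alphabet are hypothetical stand-ins, not the project's real code; `Logger.isEnabledFor` is the standard library call.

```python
import logging
import random
import string
from typing import Optional

# Hypothetical logger name, for illustration only.
logcontext_debug_logger = logging.getLogger("example.logcontext.debug")
logcontext_debug_logger.addHandler(logging.NullHandler())


def pseudo_random_string(length: int) -> str:
    # Non-cryptographic; the alphabet here is an assumption.
    return "".join(random.choice(string.ascii_letters) for _ in range(length))


def run_in_background_sketch(calling_context: str) -> Optional[str]:
    """Stand-in for the logging preamble; returns the ID, or None if skipped."""
    instance_id: Optional[str] = None
    # Only pay for the random string when the debug line will be emitted.
    if logcontext_debug_logger.isEnabledFor(logging.DEBUG):
        instance_id = pseudo_random_string(5)
        logcontext_debug_logger.debug(
            "run_in_background(%s): called with logcontext=%s",
            instance_id,
            calling_context,
        )
    return instance_id
```

One trade-off: every later debug line that reuses `instance_id` needs the same guard (or has to tolerate `None`), which adds branching that a cheap generator avoids.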

@MadLittleMods MadLittleMods marked this pull request as ready for review October 23, 2025 18:38
@MadLittleMods MadLittleMods requested a review from a team as a code owner October 23, 2025 18:39