Describe the performance issue
A pattern emerged in one deployment with a cluster of 3 CouchDB nodes, version 3.3.3 (medic version 4.15): one of the CouchDB nodes would start consuming alarming amounts of RAM (peaking above 200 GB) until the node was evicted by Kubernetes.
This persisted after numerous restarts.
This turned out to be caused by setting the log level on the troublesome node to `debug`, while the other nodes had the default CHT log level of `info`.
I have replicated this behavior locally with a clustered CouchDB at the same version and at the newest version, AND with a vanilla latest CouchDB in single-node mode.
This appears to occur ONLY if the logs are followed, which seems to be the case in our infrastructure, as the logs are streamed to Loki for observability purposes.
It also appears to be linked to having multiple `validate_doc_update` functions (which might not be directly correlated, just through the fact that they generate more log entries).
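For reference, per-node log levels can be inspected and reset through CouchDB's config API. A minimal sketch (host, admin credentials, and the node name in the last command are assumptions; adjust to your deployment):

```shell
# Assumed admin credentials and host.
COUCH=http://admin:password@localhost:5984

# List cluster nodes and print each node's effective log level.
for node in $(curl -s "$COUCH/_membership" | jq -r '.all_nodes[]'); do
  printf '%s: ' "$node"
  curl -s "$COUCH/_node/$node/_config/log/level"
done

# Reset a single node back to the CHT default of "info"
# (node name here is an example):
curl -s -X PUT "$COUCH/_node/couchdb@couchdb-1/_config/log/level" -d '"info"'
```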
Describe the improvement you'd like
As this appears to be an issue in CouchDB itself:

- Document that the `debug` log level should not be used in production, except for a limited time.
- Determine whether this is related to our `validate_doc_update` functions specifically, or whether any vdu function would yield the same result.
- Follow up through official CouchDB channels to get a fix, or at least an answer on whether this is expected behavior.
To Reproduce
Steps to record the performance metrics:
1. Launch a single-node or clustered CHT Core (any recent version).
2. Set the CouchDB config `log -> level` to `debug` on one node.
3. Tail that node's output.
4. Start any document-generating script (I used test-data-generator to create batches of 1_000_000 documents).
5. Run `docker stats <your container>` to see the gradual increase in memory footprint.
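The steps above can be sketched as shell commands. The container name, credentials, and target database are assumptions, and the `_bulk_docs` loop is a stand-in for test-data-generator:

```shell
COUCH=http://admin:password@localhost:5984   # assumed host and credentials
CONTAINER=couchdb                            # assumed container name

# Step 2: set the log level to debug (single-node shown; on a cluster,
# substitute that node's full Erlang name for _local).
curl -s -X PUT "$COUCH/_node/_local/_config/log/level" -d '"debug"'

# Step 3: follow that node's output -- the leak only appeared while the
# logs were actually being consumed.
docker logs -f "$CONTAINER" > /dev/null &

# Step 4: generate write load ("medic" database is an assumption).
for i in $(seq 1 1000); do
  curl -s -X POST "$COUCH/medic/_bulk_docs" \
    -H 'Content-Type: application/json' \
    -d '{"docs":[{"type":"data_record"},{"type":"data_record"}]}' > /dev/null
done

# Step 5: watch the container's memory footprint climb.
docker stats "$CONTAINER"
```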
Environment
Instance: local, production, vanilla
App: CouchDB
Version: My guess is that any CouchDB v3 will display this behavior; this means CHT > 4.4. I have not tested on CouchDB v2.
Additional context
Observed by @mrjones-plip in production. Thank you for your diligence!!