Conversation

@soifou
Collaborator

@soifou soifou commented Oct 16, 2025

No description provided.

@soifou
Collaborator Author

soifou commented Oct 16, 2025

Hey @mohammadfs243,

Could you please give this new PR a shot when you have a chance? This should improve memory usage significantly.

To test it with LazyVim, add the branch corresponding to this PR:

return {
    'saghen/blink.cmp',
    branch = 'perf/path-scan-optimize',
    -- ...
}

Thanks in advance!

@mohammadfs243

Hi @soifou
I tested that, but the memory usage is still around 2GiB for my 1M-file directory.

The other issue is the memory leak I mentioned in your other PR. Every time the completion is triggered, it adds to the memory usage.

@soifou
Collaborator Author

soifou commented Oct 16, 2025

Hmm... Are you sure you're testing it correctly?

Here is the RAM consumption before this PR:

2025-10-15_12-30-55.mp4

And with this PR:

2025-10-16_14-44-38.mp4

@mohammadfs243

I guess so; it's from my Lazy page in Neovim:
[screenshot of the Lazy page]

But in the video you're repeatedly triggering the completion; in the first 3 runs they look the same.

@soifou
Collaborator Author

soifou commented Oct 16, 2025

Could you provide a similar video with the RAM stats and the actions you perform when you observe this increase?

@mohammadfs243

Unfortunately I can't record the screen at the moment, but it goes as follows:

  • I start typing the path until it reaches that directory containing 1.3M files.
  • It takes about 15s to display the suggestions list (the dir is on an HDD).
  • RAM usage goes roughly like this (watching Ubuntu's system monitor)1:
    • 80MiB
    • 350MiB
    • 760MiB
    • 1.2GiB
    • 1.5GiB
    • 2GiB

Footnotes

  1. The RAM usage spike graph is not identical in each run, but I think that's because of how the system monitor works.

@soifou
Collaborator Author

soifou commented Oct 16, 2025

Wait, when you refer to 1.3M, is it actually... 1,300,000 files?? On an HDD 😄??
My test was actually with 200k (200,000) files on an SSD, which is not really the same.

If I get this right, I can now understand why this takes so much time: we keep all the files in memory and do the filtering on each keystroke.
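
As a rough illustration of that pattern (names like scan_dir are hypothetical, not the actual blink.cmp code):

-- Hypothetical sketch: the whole directory listing stays resident,
-- and every keystroke re-filters the full list.
local all_entries = scan_dir(path) -- e.g. 1.3M strings kept in memory

local function on_keystroke(prefix)
    local matches = {}
    for _, entry in ipairs(all_entries) do -- full O(n) pass per keystroke
        if vim.startswith(entry, prefix) then
            matches[#matches + 1] = entry
        end
    end
    return matches
end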

@Saghen
Owner

Saghen commented Oct 16, 2025

Yeah, we should re-introduce the limit, maybe at 10k files? 5b4055e

@soifou
Collaborator Author

soifou commented Oct 16, 2025

Limiting does not solve the problem, but short of a refactoring I guess this is an acceptable in-between solution! 10k is fine I guess (200k on SSD works too 😄), I'll add that.

In any case, this change still seems valid for the spotted memory leak, wdyt?

@mohammadfs243

@soifou It's not something that should happen often, but here I am 😄

I think @Saghen is right, there should be a limit on the number of suggestions loaded.

Add configurable `max_entries` and `chunk_size`, including validation
and user notification when the limit is reached.
@soifou
Collaborator Author

soifou commented Oct 17, 2025

Quick follow-up: I re-added the entry limit we talked about (default 10k) and made it configurable, along with the chunk size, for the brave souls who want to tweak it. Also added a user notification for when the limit kicks in.
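
For anyone following along, a sketch of how the new knobs might be set from a lazy.nvim spec — the exact option path (sources.providers.path.opts) and the chunk_size default are assumptions here, so check the PR diff for the real names and defaults:

return {
    'saghen/blink.cmp',
    branch = 'perf/path-scan-optimize',
    opts = {
        sources = {
            providers = {
                path = {
                    opts = {
                        max_entries = 10000, -- limit discussed above
                        chunk_size = 1000,   -- assumed default; see the PR diff
                    },
                },
            },
        },
    },
}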

@mohammadfs243

mohammadfs243 commented Oct 17, 2025

@soifou It works pretty well. Thanks for your effort 👍

Just one minor thing: how does the sorting of suggestions work? I'd expect the exact match to be in the top spot, but it isn't (there's a file named 36 in that directory):
[screenshot of the completion list]

PS: I think the memory leak is still there.

@soifou
Collaborator Author

soifou commented Oct 17, 2025

Since we are limiting the total number of entries and, as far as I know, the order during the scan can be arbitrary, this impacts which entries are available for suggestions. That could explain why you don't see the "36" file.
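
For illustration, a generic sketch of why a capped scan can miss an exact match — directory iteration (here via libuv's fs_scandir) yields entries in no guaranteed order, so once the cap is hit, the file being typed may simply never have been read. This is not the PR's actual code:

-- Generic sketch: stop reading once the cap is reached, regardless of
-- whether the entry the user is typing ("36") has been seen yet.
local entries, max_entries = {}, 10000
local handle = vim.uv.fs_scandir(dir) -- `dir` is a placeholder path

if handle then
    while #entries < max_entries do
        local name = vim.uv.fs_scandir_next(handle)
        if not name then break end
        entries[#entries + 1] = name
    end
end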

At what magnitude do you currently observe this memory leak?

@mohammadfs243

For the first invocation of suggestions the usage goes up to around 70MB, and if I re-trigger the suggestions it reaches ~105MB and stays there for further invocations of the completion list. It's negligible and may be related to Lua's GC.
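
(For what it's worth, a quick way to tell how much of that residual usage lives on the Lua heap — collectgarbage('count') reports Lua's allocation in KiB, so a before/after check around a completion trigger is a cheap sanity test. Generic snippet, not part of the PR:)

-- Print the Lua heap size (MiB) before and after a forced GC cycle.
local before = collectgarbage('count') / 1024
collectgarbage('collect')
local after = collectgarbage('count') / 1024
vim.notify(('Lua heap: %.1f MiB -> %.1f MiB after GC'):format(before, after))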

@soifou
Collaborator Author

soifou commented Oct 17, 2025

Yes, this is similar to what you can observe in my 2nd video. After the second iteration, the RAM consumption stabilizes.

Seems fine to me in this scenario.

@mohammadfs243

mohammadfs243 commented Oct 17, 2025

Yeah, not everyone has a 1.3M-file directory, and if they do, it's only around 100MB 😄

@Saghen
Owner

Saghen commented Oct 22, 2025

Sorry it took so long to review this.

I don't think we should expose chunk_size, as it's more of an internal optimization and users shouldn't need to change it.

The collectgarbage looks good, although perhaps we should only do it when the list is large, to avoid an unnecessary microstutter?

I'm not sure I understand the lib._candidates change: anything referencing the old table, even after lib._candidates is replaced with a new table, should keep the old table alive. What's the memory usage difference before and after that change?
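
A minimal sketch of the conditional collection suggested above — the threshold and the entries variable are placeholders, not names from the PR:

-- Only pay for a full GC cycle when the scan was actually large, so
-- small directories don't get a microstutter on every completion.
local GC_THRESHOLD = 100000 -- placeholder value, not from the PR

if #entries > GC_THRESHOLD then
    collectgarbage('collect')
end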

@soifou
Collaborator Author

soifou commented Oct 23, 2025

Thanks for the feedback.

Since I made max_entries configurable, I assumed chunk_size should be as well, given the validation check (max_entries must be greater than chunk_size). I was more concerned about hardcoding that value in the validation than about letting users tweak it. Do you think we should revert and make both internal again, as you originally had them?

Regarding lib._candidates, I thought I noticed an improvement while testing on my old machine, but now, testing on my laptop while traveling, I don't see any significant difference anymore... I'll remove that.
