-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve readDir
performance
#7314
Comments
I'm a bit skeptical about Having said that, |
Technically this is already the case, but with a small constant factor.
It seems that we've overlooked this. "Implementation 2" seems to be already implemented then. The |
Yeah it is already linear time today. And even if Nixpkgs continues to not use it, it should still be better if we can make it better. Finally, if we ever do #4090 we can simply stat paths (or the lazy trees equivalent) when attributes are accessed, fully lazy again! Basically, lazy vs eager (and thus |
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/2022-11-21-nixpkgs-architecture-team-meeting-18/23393/1 |
Other than the size of |
The nixpkgs record (of thunks) is still constructed, which is linear in respect to the top level namespace. Reading all the dir entries and creating thunks is also linear with respect to the top level namespace.
And neither is this! We don't need to care what is inside each directory, that work is deferred in a thunk. |
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/2022-12-05-nixpkgs-architecture-team-meeting-20/23740/1 |
#7447 does the first part of this. |
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/readdir64.c;h=7952da5c27555e44f3970629f8cfd39cb42425f3;hb=HEAD#l42 shows that glibc We didn't get into empirical benchmarking that much today, but @gropotkin did say things seemed plenty fast on Linux in general, and the problem might be macOS only. We'll do more exploring over this I suppose next time. |
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/2022-12-12-nixpkgs-architecture-team-meeting-21/23967/1 |
I agree that (2) is not needed. I'm pretty sure my intent with the reference to the syscall documentation was just to show that it's possible to get the info efficiently. When |
Is your feature request related to a problem? Please describe.
builtins.readDir
currently performs stat calls for each directory entry. This is bad enough that @aakropotkin works aroundreadDir
with a generated file. A fastreadDir
may allow Nixpkgs to replaceall-packages.nix
over time; a project that the Nixpkgs Architecture Team is researching.Describe the solution you'd like
Implementation 1, posix compatible:
Fill the attrset with thunks that call
pathType
#3096Implementation 2, linux optimized:
Use
getdents
orgetdents64
Describe alternatives you've considered
Avoid
readDir
on large directories, generate a list of files. ThereadDir
function is not crucial for the architecture team proposal.Additional context
nixpkgs-architecture/simple-package-paths#17 (comment)
The text was updated successfully, but these errors were encountered: