Lazily parse index per-chromosome #108

Merged
merged 2 commits into from
Feb 28, 2025

Conversation

cmdcolin
Collaborator

@cmdcolin cmdcolin commented Feb 27, 2025

This reduces memory usage in the web worker for large index files, e.g. large BAI files.

Webworker heap snapshot:

Before this PR: very slow to even create a snapshot, 478 MB total
After this PR: very fast to create a snapshot, 149 MB total

[image: screenshot showing before and after snapshots]

This PR does not try to cache the return value of the per-chromosome parse, so it could be worth profiling whether caching would pay off, but I think the memory savings help on their own.
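For readers unfamiliar with the idea, here is a minimal sketch of lazy per-chromosome parsing. It is not the code in this PR. It assumes the standard BAI layout from the SAM/BAM specification, and every name in it (`scanRefOffsets`, `parseRefAt`, `getRefIndex`, `VirtualOffset`, `RefIndex`) is hypothetical. The first pass only records where each reference's section starts and allocates nothing per bin; a section is parsed into objects only when that reference is requested, and the result is not cached.

```typescript
// Hypothetical types for this sketch (not the library's API).
interface VirtualOffset { blockPosition: number; dataPosition: number }
interface Chunk { beg: VirtualOffset; end: VirtualOffset }
interface RefIndex { bins: Map<number, Chunk[]>; linear: VirtualOffset[] }

// A BAM virtual offset is a little-endian uint64: the low 16 bits are the
// offset inside the decompressed block, the high 48 bits are the file offset
// of the compressed block. 48 bits fit exactly in a JS number.
function readVirtualOffset(view: DataView, pos: number): VirtualOffset {
  const dataPosition = view.getUint16(pos, true)
  const blockPosition =
    view.getUint16(pos + 2, true) + view.getUint32(pos + 4, true) * 65536
  return { blockPosition, dataPosition }
}

// Pass 1: walk the file once, recording the byte offset of each reference's
// section. Bins and chunks are skipped over, not materialized.
function scanRefOffsets(view: DataView): number[] {
  const nRef = view.getInt32(4, true) // bytes 0-3 hold the "BAI\1" magic
  const offsets: number[] = []
  let pos = 8
  for (let r = 0; r < nRef; r++) {
    offsets.push(pos)
    const nBin = view.getInt32(pos, true)
    pos += 4
    for (let b = 0; b < nBin; b++) {
      const nChunk = view.getInt32(pos + 4, true)
      pos += 8 + 16 * nChunk // bin id + n_chunk + two uint64 per chunk
    }
    const nIntv = view.getInt32(pos, true)
    pos += 4 + 8 * nIntv // n_intv + one uint64 per linear-index entry
  }
  return offsets
}

// Pass 2, on demand: fully parse a single reference's section.
function parseRefAt(view: DataView, start: number): RefIndex {
  let pos = start
  const bins = new Map<number, Chunk[]>()
  const nBin = view.getInt32(pos, true)
  pos += 4
  for (let b = 0; b < nBin; b++) {
    const bin = view.getUint32(pos, true)
    const nChunk = view.getInt32(pos + 4, true)
    pos += 8
    const chunks: Chunk[] = []
    for (let c = 0; c < nChunk; c++) {
      chunks.push({
        beg: readVirtualOffset(view, pos),
        end: readVirtualOffset(view, pos + 8),
      })
      pos += 16
    }
    bins.set(bin, chunks)
  }
  const nIntv = view.getInt32(pos, true)
  pos += 4
  const linear: VirtualOffset[] = []
  for (let i = 0; i < nIntv; i++) {
    linear.push(readVirtualOffset(view, pos))
    pos += 8
  }
  return { bins, linear }
}

// Re-parses on every call (no caching), so only the raw bytes and the small
// offsets array stay resident between queries.
function getRefIndex(
  view: DataView,
  offsets: number[],
  refId: number,
): RefIndex | undefined {
  const start = offsets[refId]
  return start === undefined ? undefined : parseRefAt(view, start)
}
```

The trade-off in this sketch is repeated parsing work per query in exchange for a much smaller heap; adding a small per-reference cache on top of `getRefIndex` would be the natural thing to profile.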

@cmdcolin
Collaborator Author

The BAM file used here has a 60 MB BAI file.

@cmdcolin cmdcolin force-pushed the mem_usage_lazy_index branch from fb7354e to 275cda9 Compare February 27, 2025 21:59
@cmdcolin
Collaborator Author

(This was somewhat of an exercise in understanding heap snapshots, as a lead-up to tackling the memory leak from layout in GMOD/jbrowse-components#4860.)

@cmdcolin cmdcolin force-pushed the mem_usage_lazy_index branch 2 times, most recently from 3d0ed98 to ec8b88c Compare February 28, 2025 02:53
@cmdcolin cmdcolin force-pushed the mem_usage_lazy_index branch from ec8b88c to 7f2974b Compare February 28, 2025 03:01
@cmdcolin cmdcolin merged commit 2559241 into master Feb 28, 2025
1 check passed
@cmdcolin cmdcolin deleted the mem_usage_lazy_index branch February 28, 2025 03:02
@cmdcolin
Collaborator Author

This PR also gets fairly close to allowing "BAI indexes" (an index of the BAI file itself), because it introduces the concept of parsing each chromosome via a byte offset into the file. An external BAI index can now provide those byte offsets.

https://github.com/hammerlab/bai-indexer
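A rough sketch of how externally supplied offsets could be used, under stated assumptions: `offsets` is a plain array with one ascending start offset per reference, and `RangeReader` is a hypothetical callback that returns a byte range of the index file (for example via an HTTP range request). Neither is the bai-indexer output format nor an API of this library; the names are made up for illustration.

```typescript
// Hypothetical reader: resolves to bytes [start, end) of the index file.
type RangeReader = (start: number, end: number) => Promise<Uint8Array>

// Byte range of one reference's section, given per-reference start offsets
// and the total size of the index file. For the last reference the range
// runs to the end of the file, which may include a few trailing bytes that
// are not part of that section.
function refByteRange(
  offsets: number[],
  refId: number,
  indexSize: number,
): { start: number; end: number } | undefined {
  const start = offsets[refId]
  if (start === undefined) {
    return undefined
  }
  const next = offsets[refId + 1]
  return { start, end: next === undefined ? indexSize : next }
}

// Fetch only the slice for the requested reference instead of the whole file.
async function readRefSection(
  read: RangeReader,
  offsets: number[],
  refId: number,
  indexSize: number,
): Promise<Uint8Array | undefined> {
  const range = refByteRange(offsets, refId, indexSize)
  return range ? read(range.start, range.end) : undefined
}
```

With something like this, a query against one chromosome would only need that chromosome's slice of the index rather than the full 60 MB file.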
