JBrowse issue - Visualization of whole-contig alignment to a reference genome #4877
Comments
thanks for posting this. so, multi-part thing

Part 0. A workaround
This error does not happen in Firefox, so you could try that if really needed. However, our desktop app uses Electron, which is Chrome-based, and has the issue.

Part 1. The error you ran into
The error you are seeing is basically from trying to allocate an array to create a string bigger than 512MB. I have written about these in funny blog posts, as we often run into these types of things trying to do big data stuff in the browser: https://cmdcolin.github.io/posts/2021-08-15-map-limit

It happens because basically the entirety of chromosome 1 gets loaded into memory as a single string (which is UTF-16 btw, so two bytes per letter), so it hits that limit. If we just used a Uint8Array, then it could be 1 byte per character, and could be longer than 512MB. We could possibly even make seq a zero-copy view over the BAM backing data.

Part 2. Other references
these assembly-to-assembly alignment BAMs are tricky
another thread from the IGV forum for the same file

Part 3. What JBrowse 2 likes to use for assembly-to-assembly alignment
For assembly-to-assembly alignment, JBrowse actually likes PAF. PAF is a good output format for assembly-to-assembly alignment, and can be loaded as a "SyntenyTrack" in JBrowse, which allows you to use it in the dotplot and 'synteny' views. It does not encode the whole SEQ field, so it is much smaller. Note that currently we do not have a way to encode sequence differences with PAF, but if we add "cs" tag support, then this would work #3378
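To make the Uint8Array idea from Part 1 concrete, here is a rough sketch in TypeScript. It is not the actual bam-js code; the function names `decodeSeqToBytes` and `sliceToString` are made up for illustration. It only relies on the BAM spec's 4-bit packing of SEQ (two bases per byte, high nibble first, alphabet `=ACMGRSVTWYHKDBN`). The decoded bases go into a byte array at 1 byte per base, and a string is only built for the small window that actually needs to be shown. Note this still copies the data, so it is not the zero-copy variant.

```typescript
// Sketch only: hypothetical helper names, not the bam-js API.
// BAM stores SEQ as 4-bit codes, two bases per byte, using this alphabet.
const SEQ_ALPHABET = '=ACMGRSVTWYHKDBN'

// Lookup table from 4-bit code to ASCII byte.
const CODE_TO_ASCII = new Uint8Array(16)
for (let i = 0; i < 16; i++) {
  CODE_TO_ASCII[i] = SEQ_ALPHABET.charCodeAt(i)
}

// Decode a packed SEQ field into one ASCII byte per base.
// A Uint8Array is not subject to the engine's maximum string length.
function decodeSeqToBytes(packed: Uint8Array, seqLength: number): Uint8Array {
  const out = new Uint8Array(seqLength)
  for (let i = 0; i < seqLength; i++) {
    const byte = packed[i >> 1]
    // Even positions live in the high nibble, odd positions in the low nibble.
    const code = (i & 1) === 0 ? byte >> 4 : byte & 0x0f
    out[i] = CODE_TO_ASCII[code]
  }
  return out
}

// Build a string only for a small window, e.g. the bases currently on screen.
function sliceToString(seq: Uint8Array, start: number, end: number): string {
  let s = ''
  const stop = Math.min(end, seq.length)
  for (let i = Math.max(0, start); i < stop; i++) {
    s += String.fromCharCode(seq[i])
  }
  return s
}

// Usage: "ACGTN" packed as A=1, C=2, G=4, T=8, N=15.
const packedExample = new Uint8Array([0x12, 0x48, 0xf0])
const decodedExample = decodeSeqToBytes(packedExample, 5)
console.log(sliceToString(decodedExample, 0, 5))
```

The tradeoff is that anything downstream expecting `seq` to be a string would need to change, or would need to call something like `sliceToString` on just the region it cares about.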
and note: you can convert this via a chain of BAM->SAM->PAF using paftools :) https://www.biostars.org/p/479287/#9474845

I'll keep this open as there could be something that could help, but in general, I would say that BAM/CRAM is not the optimal assembly-to-assembly alignment format; PAF is probably preferred. Note that PAF tracks can also be loaded in a normal linear genome view just like a BAM file, so there shouldn't be too much loss of functionality, and once the cs tag is added, SNPs can be rendered on them too.
Hi, @cmdcolin, Thanks for the quick reply. I am looking into PAF files. Good to know that they can be loaded in the linear genome view as well. I tried to convert the BAM to a PAF, but I probably need to double-check that I did all the steps correctly, as I am having some issues visualizing it with the synteny view.
feel free to let me know what issue you run into. the most common reason that a PAF file "doesn't show up" is because the assembly names are entered backwards (query vs target)
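A quick way to check the query vs target question is to look at the names in the PAF itself. In the PAF format, column 1 is the query sequence name and column 6 is the target sequence name. Below is a small TypeScript sketch that lists the distinct names on each side; `parsePafLine` and `summarizeNames` are hypothetical helpers, not part of JBrowse, and the sample line is made up.

```typescript
// Sketch only: hypothetical helpers for inspecting a PAF file.
// The first 12 PAF columns are mandatory; only the first 9 are parsed here.
interface PafRecord {
  queryName: string
  queryLength: number
  queryStart: number
  queryEnd: number
  strand: string
  targetName: string
  targetLength: number
  targetStart: number
  targetEnd: number
}

function parsePafLine(line: string): PafRecord {
  const f = line.split('\t')
  if (f.length < 12) {
    throw new Error(`expected at least 12 tab-separated columns, got ${f.length}`)
  }
  return {
    queryName: f[0],
    queryLength: Number(f[1]),
    queryStart: Number(f[2]),
    queryEnd: Number(f[3]),
    strand: f[4],
    targetName: f[5],
    targetLength: Number(f[6]),
    targetStart: Number(f[7]),
    targetEnd: Number(f[8]),
  }
}

// Collect the distinct sequence names seen on the query and target sides.
function summarizeNames(pafText: string): { queryNames: string[]; targetNames: string[] } {
  const queries = new Set<string>()
  const targets = new Set<string>()
  for (const line of pafText.split('\n')) {
    if (line.trim() === '') continue
    const rec = parsePafLine(line)
    queries.add(rec.queryName)
    targets.add(rec.targetName)
  }
  return { queryNames: Array.from(queries), targetNames: Array.from(targets) }
}

// Usage with a made-up record: a contig (query) aligned to chr1 (target).
const samplePaf = 'contig1\t1000\t0\t900\t+\tchr1\t248956422\t100\t1000\t880\t900\t60\n'
const nameSummary = summarizeNames(samplePaf)
console.log(nameSummary.queryNames, nameSummary.targetNames)
```

If the target names look like your reference chromosomes and the query names look like your contigs, then the reference assembly belongs on the target side when configuring the track.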
Yes, the issue was more that, if the sample assembly is not in the assembly list, then the reference assembly gets selected for both query and target. You need to manually change the query assembly in the Settings. Thanks for the support. Feel free to close when the BAM issue is resolved.
will be tracked in GMOD/bam-js#106
Hi, I am trying to visualize whole-contig alignment to a reference genome, e.g., hg38. One example BAM file is https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/analysis/NIST_HG002_DraftBenchmark_defrabbV0.019-20241113/dipcall_output/GRCh38_HG2-T2TQ100-V1.1_dipcall-z2k.hap1.bam
I got this error from JBrowse, here is the stack trace: