A library for working with LC/MS runs.
The following example works on ALL versions of mzXML and mzML (including support for compressed peak data).
  require "ms/msrun"

  file = "file.mzXML"  # works identically for "file.mzML"
  Ms::Msrun.open(file) do |ms|
    # Run level information:
    ms.start_time             # in seconds, gives date for mzML
    ms.end_time               # in seconds, returns nil in mzML
    ms.scan_count             # number of scans
    ms.scan_count(1)          # number of MS scans
    ms.scan_count(2)          # number of MS/MS scans, etc.
    ms.parent_basename_noext  # "file" (as recorded _in the xml_)
    ms.filename               # "file.mzXML"

    # Random scan access (blazing fast)
    ms.scan(22)               # a scan object

    # Complete scan access
    ms.each do |scan|
      scan.num       # scan number
      scan.ms_level  # ms_level
      scan.time      # retention time in seconds
      scan.start_mz  # the first m/z value, returns nil in mzML
      scan.end_mz    # the last m/z value, returns nil in mzML

      # Precursor information
      pr = scan.precursor  # an Ms::Precursor object
      pr.mz
      pr.intensity         # does fast binary search if info not already given
      pr.parent            # the parent scan
      pr.charge_states     # Array of possible charge states

      # Spectral information
      spectrum = scan.spectrum
      spectrum.mzs          # Array of m/z values
      spectrum.intensities  # Array of intensity values
      spectrum.peaks do |mz, inten|
        puts "#{mz} #{inten}"  # print each peak on its own line
      end
    end

    # supports pre-filtering for faster access

    ## get just precursor info:
    ms.each(:ms_level => 2, :spectrum => false) {|scan| scan.precursor }

    ## get just level one spectra:
    ms.each(:ms_level => 1, :precursor => false) {|scan| scan.spectrum }
  end

  # Quicker way to get at the scans:
  Ms::Msrun.foreach("file.mzXML") {|scan| puts scan.num }  # do something with each scan
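As background for the note about compressed peak data, here is a minimal sketch of how mzXML-style peaks can be decoded, assuming the usual mzXML convention: base64 text holding big-endian floats as interleaved m/z and intensity pairs, optionally zlib-compressed. The helper `decode_mzxml_peaks` is hypothetical and is not part of ms-msrun; the library handles this decoding internally.

```ruby
require "zlib"

# Hypothetical helper for illustration only; not part of the ms-msrun API.
# Decodes base64 text into [[mz, intensity], ...] pairs.
def decode_mzxml_peaks(base64_text, precision: 32, compressed: false)
  bytes = base64_text.unpack1("m")                   # base64 decode
  bytes = Zlib::Inflate.inflate(bytes) if compressed # undo zlib compression
  directive = precision == 64 ? "G*" : "g*"          # big-endian double / single
  bytes.unpack(directive).each_slice(2).to_a
end

# Round trip: encode two peaks the same way, then decode them again.
peaks   = [[100.0, 5.0], [200.5, 12.25]]
packed  = peaks.flatten.pack("g*")
encoded = [Zlib::Deflate.deflate(packed)].pack("m0")
decoded = decode_mzxml_peaks(encoded, precision: 32, compressed: true)
decoded.each { |mz, inten| puts "#{mz} #{inten}" }
```

mzML stores m/z and intensity values in separate binary arrays, so a decoder for that format would differ from this sketch.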
It can also convert mzXML or mzML files to mgf or ms2:
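  require "ms/msrun"

  mzml_file = "file.mzML"
  Ms::Msrun.open(mzml_file) do |ms|
    # write the output next to the input, swapping the extension
    ms2_file = mzml_file.chomp(".mzML") + ".ms2"
    ms.to_ms2(:output => ms2_file)
  end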
The same conversion is available through the command-line program ms_to_search.rb:
"usage: <file>.mz[XML | ML] ... <type>"
Other output formats may be added in future versions.
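For readers unfamiliar with the mgf target format, the sketch below builds one MS/MS spectrum as a Mascot Generic Format record. The helper `mgf_record` and the sample values are hypothetical, chosen only to show the shape of a record; the converter in ms-msrun may emit different or additional fields.

```ruby
# Hypothetical helper for illustration only; not part of the ms-msrun API.
# Returns one MGF record as a String.
def mgf_record(title:, precursor_mz:, charge:, peaks:)
  lines = [
    "BEGIN IONS",
    "TITLE=#{title}",
    "PEPMASS=#{precursor_mz}",
    "CHARGE=#{charge}+"
  ]
  peaks.each { |mz, inten| lines << "#{mz} #{inten}" }  # one peak per line
  lines << "END IONS"
  lines.join("\n") + "\n"
end

record = mgf_record(
  title: "scan 22",          # made-up sample values
  precursor_mz: 445.12,
  charge: 2,
  peaks: [[100.0, 5.0], [200.5, 12.25]]
)
puts record
```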
- Fast: Uses Nokogiri and a dash of regular expressions to achieve very fast random access to scans (also supports accessing all scans or subsets of scans).
- Unified: One interface for all formats.
- Lazy evaluation at scan and spectrum level: Scans are only read from IO when requested, and spectra are only decoded when explicitly accessed.
- Extensively tested: Before a release, the parser must pass an extensive specification for each file version (about 1500 tests in total).
- Long-term support: We will continue to support newer versions and fix any bugs or edge cases that are found. Please alert us to any mzXML or mzML file that is not parsed correctly.
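The random-access and lazy-evaluation ideas in the list above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the ms-msrun implementation: the classes `ScanIndex` and `LazyScan` are hypothetical, the sample XML is a simplified stand-in for mzXML with flat (non-nested) scans, and the index is built with a single regular-expression pass rather than with Nokogiri or the file's own index.

```ruby
require "stringio"

# Hypothetical class: a scan that reads and decodes only on demand.
class LazyScan
  attr_reader :num

  def initialize(io, num, offset, length)
    @io, @num, @offset, @length = io, num, offset, length
  end

  def loaded?
    !@xml.nil?
  end

  # The scan's XML is read from IO only on first request.
  def xml
    @xml ||= begin
      @io.seek(@offset)
      @io.read(@length)
    end
  end

  # Peaks are decoded only when accessed (base64, big-endian 32-bit floats).
  def peaks
    @peaks ||= xml[/<peaks[^>]*>([^<]*)<\/peaks>/, 1]
               .unpack1("m").unpack("g*").each_slice(2).to_a
  end
end

# Hypothetical class: maps scan numbers to byte offsets for random access.
class ScanIndex
  def initialize(io)
    @io = io
    @offsets = {}
    data = io.read.b  # binary copy, so match positions are byte offsets
    data.scan(/<scan num="(\d+)".*?<\/scan>/m) do
      m = Regexp.last_match
      @offsets[m[1].to_i] = [m.begin(0), m[0].bytesize]
    end
  end

  def scan(num)
    offset, length = @offsets.fetch(num)
    LazyScan.new(@io, num, offset, length)
  end
end

def encode(pairs)
  [pairs.flatten.pack("g*")].pack("m0")
end

doc = <<~XML
  <msRun>
  <scan num="1"><peaks>#{encode([[100.0, 5.0]])}</peaks></scan>
  <scan num="2"><peaks>#{encode([[200.5, 12.25], [300.0, 1.0]])}</peaks></scan>
  </msRun>
XML

index = ScanIndex.new(StringIO.new(doc))
scan  = index.scan(2)   # nothing is read for this scan yet
before = scan.loaded?
scan.peaks.each { |mz, inten| puts "#{mz} #{inten}" }
after = scan.loaded?
```

Keeping only offsets in the index means that asking for one scan costs a seek and a short read, regardless of how many scans precede it in the file.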
gem install ms-msrun
See LICENSE