A set of libraries for peptide-centric mass spectrometry calculations, built to handle very complex peptidoforms in a sensible way. The libraries are centered around the following HUPO-PSI standards:
- ProForma: a standard notation for proteo/peptidoforms allowing for highly complex definitions
- mzSpecLib: a standard notation for spectral libraries
- mzPAF: a standard notation for peak fragment annotations
- mzTab: a standard notation for matched peptidoforms from database and de novo searches
For support of the raw-data-centered HUPO-PSI standards (e.g. mzML, USI), see mzdata.
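To give a feel for the ProForma notation listed above, here is a minimal standalone sketch that computes the monoisotopic mass of a peptidoform written with bracketed mass shifts, such as `EM[+15.9949]EVEES[-79.9663]PEK`. It does not use the mzcore API: the function names `residue_mass` and `peptidoform_mass` are hypothetical, and the parser assumes only the 20 standard amino acids with mass-shift modifications on residues. Named modifications, terminal modifications, cross-links, and glycans are all out of scope for this sketch.

```rust
/// Monoisotopic mass of water in Da, added once for the peptide termini.
const WATER: f64 = 18.010565;

/// Monoisotopic residue mass in Da for the 20 standard amino acids.
fn residue_mass(aa: char) -> Option<f64> {
    let mass = match aa {
        'G' => 57.02146,
        'A' => 71.03711,
        'S' => 87.03203,
        'P' => 97.05276,
        'V' => 99.06841,
        'T' => 101.04768,
        'C' => 103.00919,
        'L' | 'I' => 113.08406,
        'N' => 114.04293,
        'D' => 115.02694,
        'Q' => 128.05858,
        'K' => 128.09496,
        'E' => 129.04259,
        'M' => 131.04049,
        'H' => 137.05891,
        'F' => 147.06841,
        'R' => 156.10111,
        'Y' => 163.06333,
        'W' => 186.07931,
        _ => return None,
    };
    Some(mass)
}

/// Hypothetical helper: parse a tiny ProForma subset (residues plus
/// `[+x]` / `[-x]` mass shifts) and return the monoisotopic mass in Da.
fn peptidoform_mass(input: &str) -> Result<f64, String> {
    let mut mass = WATER;
    let mut residues = 0usize;
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        if c == '[' {
            if residues == 0 {
                return Err("modification before the first residue".to_string());
            }
            // Collect everything up to the closing bracket.
            let mut text = String::new();
            let mut closed = false;
            for m in chars.by_ref() {
                if m == ']' {
                    closed = true;
                    break;
                }
                text.push(m);
            }
            if !closed {
                return Err("unclosed '['".to_string());
            }
            if !(text.starts_with('+') || text.starts_with('-')) {
                return Err(format!("only mass shifts are supported, got '{text}'"));
            }
            let delta: f64 = text
                .parse()
                .map_err(|_| format!("invalid mass shift '{text}'"))?;
            if !delta.is_finite() {
                return Err(format!("invalid mass shift '{text}'"));
            }
            mass += delta;
        } else {
            mass += residue_mass(c).ok_or_else(|| format!("unknown residue '{c}'"))?;
            residues += 1;
        }
    }
    if residues == 0 {
        return Err("empty sequence".to_string());
    }
    Ok(mass)
}

fn main() {
    for input in ["PEPTIDE", "EM[+15.9949]EVEES[-79.9663]PEK"] {
        match peptidoform_mass(input) {
            Ok(mass) => println!("{input}: {mass:.4} Da"),
            Err(error) => println!("{input}: error: {error}"),
        }
    }
}
```

The full ProForma 2.0 grammar is far larger than this subset, which is the reason to use a dedicated parser for real data.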
- mzcore
  - Read ProForma sequences (complete 2.0 specification supported: 'level 2-ProForma + top-down compliant + cross-linking compliant + glycans compliant + mass spectrum compliant')
  - Extensive use of uom for compile time unit checking
  - Exhaustively fuzz tested for reliability (using cargo-afl)
  - Extensive support for glycans, including generating bitmap and vector images
- mzannotate
  - Generate theoretical fragments with control over the fragmentation model from any ProForma peptidoform
  - Generate theoretical fragments for chimeric spectra
  - Generate theoretical fragments for cross-links (also disulfides)
  - Generate theoretical fragments for modifications of unknown position
  - Generate peptide backbone (a, b, c, x, y, and z) and satellite ion fragments (d, v, and w)
  - Generate glycan fragments (B, Y, and internal fragments)
  - Integrated with mzdata
  - Read and write mzSpecLib and mzPAF
  - Match spectra to the generated fragments
- mzalign
  - Align peptides based on mass
  - Consecutive alignment of one sequence on a stretch of multiple sequences
  - Indexed alignment for fast alignment of big datasets
- imgt
  - Fast access to the IMGT database of antibody germlines
- mzident
  - Read multiple identified peptide file formats (among others: mzTab, Fasta, MaxQuant, MSFragger, Novor, OPair, Peaks, and Sage)
- rustyms-py
  - Python bindings are provided for several core components of the libraries. Go to the Python documentation for more information.
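As an illustration of what generating peptide backbone fragments involves, the following standalone sketch computes singly charged b and y ion m/z values for an unmodified peptide. It is not the mzannotate API and covers none of its fragmentation models. The names `residue_mass`, `FragmentPair`, and `b_y_ions` are hypothetical, and the sketch assumes monoisotopic masses, charge 1+, and the 20 standard amino acids only.

```rust
/// Monoisotopic mass of water in Da.
const WATER: f64 = 18.010565;
/// Mass of a proton in Da.
const PROTON: f64 = 1.007276;

/// Monoisotopic residue mass in Da for the 20 standard amino acids.
fn residue_mass(aa: char) -> Option<f64> {
    let mass = match aa {
        'G' => 57.02146,
        'A' => 71.03711,
        'S' => 87.03203,
        'P' => 97.05276,
        'V' => 99.06841,
        'T' => 101.04768,
        'C' => 103.00919,
        'L' | 'I' => 113.08406,
        'N' => 114.04293,
        'D' => 115.02694,
        'Q' => 128.05858,
        'K' => 128.09496,
        'E' => 129.04259,
        'M' => 131.04049,
        'H' => 137.05891,
        'F' => 147.06841,
        'R' => 156.10111,
        'Y' => 163.06333,
        'W' => 186.07931,
        _ => return None,
    };
    Some(mass)
}

/// Hypothetical record: the b and y ion of the same length `index`.
#[derive(Debug, Clone, PartialEq)]
struct FragmentPair {
    index: usize,
    b: f64,
    y: f64,
}

/// Singly charged b and y ion m/z values for lengths 1..n-1.
/// Returns `None` if the sequence contains an unknown residue.
fn b_y_ions(sequence: &str) -> Option<Vec<FragmentPair>> {
    let masses: Vec<f64> = sequence
        .chars()
        .map(residue_mass)
        .collect::<Option<Vec<f64>>>()?;
    let n = masses.len();
    let mut ions = Vec::new();
    for i in 1..n {
        // b ion: first i residues plus a proton.
        let prefix: f64 = masses[..i].iter().sum();
        // y ion: last i residues plus water plus a proton.
        let suffix: f64 = masses[n - i..].iter().sum();
        ions.push(FragmentPair {
            index: i,
            b: prefix + PROTON,
            y: suffix + WATER + PROTON,
        });
    }
    Some(ions)
}

fn main() {
    match b_y_ions("PEPTIDE") {
        Some(ions) => {
            for ion in &ions {
                println!("b{0}: {1:.4}  y{0}: {2:.4}", ion.index, ion.b, ion.y);
            }
        }
        None => println!("sequence contains an unknown residue"),
    }
}
```

Modifications, higher charge states, neutral losses, and the other ion series listed above would each add terms to these sums.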
These are the main libraries. This contains all source code, databases (Unimod etc.), and example data.
Some examples of how to use the libraries are provided here; see the readme file in the examples themselves for more details.
The harness to fuzz test the libraries for increased stability; see the readme for more details.
This Rust library provides Python bindings (using PyO3) for rustyms.
Using rustyms-generate-databases, the definitions for the databases can be updated. See the readme for the download locations of all databases. Then run cargo run -p rustyms-generate-databases from the root folder of this repository.
Using rustyms-generate-imgt, the definitions for the germlines can be updated. Download the imgt.dat.Z file from https://www.imgt.org/download/LIGM-DB/imgt.dat.Z, put it in the rustyms-generate-imgt/data directory, and unpack it. Then run cargo run -p rustyms-generate-imgt from the root folder of this repository.
Any contribution is welcome (especially adding or fixing documentation, as that is very hard to do as the main developer).