MS Annika

MS Annika is a crosslink search engine based on MS Amanda, aimed at identifying crosslinks of cleavable and non-cleavable crosslinkers from MS2 and MS3 spectra.

MS Annika supports identification of protein-protein crosslinks from a variety of crosslinking reagents, including cleavable and non-cleavable crosslinkers, and various acquisition workflows such as CID/HCD/stepped-HCD MS2 and MS2-MS3 acquisition.

MS Annika features an efficient algorithm for crosslink identification that allows for analysis of proteome-wide studies of both cleavable and non-cleavable crosslinkers on standard computer hardware.

MS Annika comes as a user-friendly plugin for the proteomics platform Proteome Discoverer, with a graphical user interface for easy setup and extensive documentation for customization.

You can read more about MS Annika here:

| Description | URL |
|---|---|
| MS Annika 3.0 Publication | doi.org/10.1038/s42004-024-01386-x |
| MS Annika 2.0 Publication | doi.org/10.1021/acs.jproteome.3c00325 |
| MS Annika 1.0 Publication | doi.org/10.1021/acs.jproteome.0c01000 |
| IMP PD Nodes Website | ms.imp.ac.at/?action=ms-annika |

This repository contains the latest release versions of MS Annika:

| Version | Download URL |
|---|---|
| Latest MS Annika 3.0 version for Proteome Discoverer 3.2 | download |
| Latest MS Annika 3.0 version for Proteome Discoverer 3.1 | download |
| Latest MS Annika 2.0 version for Proteome Discoverer 3.0 | download |
| Latest MS Annika 2.0 version for Proteome Discoverer 2.5 | download |
| Latest MS Annika 1.0 version for Proteome Discoverer 2.4 | download |
| Latest MS Annika 1.0 version for Proteome Discoverer 2.3 | download |

A list of changes in every version can be found in HISTORY.md.

Important

Please note that only MS Annika 3.0 (and above) supports identification of non-cleavable crosslinks and only MS Annika 2.0 (and above) supports identification from MS2-MS3-based acquisition workflows!

Installation

MS Annika is a plug-in for the proteomics software Proteome Discoverer by Thermo Fisher Scientific. Installation of MS Annika requires:

  • Installation of Proteome Discoverer (can be downloaded for free from here)
    • (Requirements of Proteome Discoverer apply)
  • Installation of MS Annika via the installer available from this repository
    • There are no additional requirements other than Proteome Discoverer
    • Users will be asked to accept the MS Annika license agreement (MS Annika is licensed as freeware)
    • The typical installation of MS Annika does not take longer than 5 minutes

The tutorial also covers the installation in detail.

Usage

MS Annika makes use of the workflow interface in Proteome Discoverer, which should be straightforward to use. Step-by-step instructions for people unfamiliar with Proteome Discoverer are given in the tutorial. The sections below also give an overview of parameters, results, example workflows and example data. An analysis with MS Annika typically takes a few minutes for small samples and up to a few hours for larger samples and proteome-wide searches. Please also refer to the specific sections for Astral and timsTOF data if you are analyzing such data.

Parameters & Results

Please refer to the MS Annika User Manual for a detailed description of all MS Annika parameters and all result tables. For further downstream analysis of MS Annika results we recommend taking a look at MS Annika Extensions.

Tutorial

A tutorial on how to use MS Annika 3.0 can be found here: Text / Video

Example Files

Example files to try MS Annika 3.0 can either be downloaded from PRIDE or directly here:

  • Minimal example for a cleavable crosslink MS2 search: MGF + fasta
  • RAW file for a non-cleavable crosslink MS2 search: RAW
  • RAW file for a cleavable crosslink MS3 search: RAW
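
Before starting a search with the minimal MGF + fasta example, it can be useful to check that the downloaded files are complete. The following is a minimal sketch, not part of MS Annika: it assumes the standard MGF layout (each spectrum opens with a `BEGIN IONS` line) and the standard FASTA layout (each entry opens with a `>` header line). The function names and the tiny in-memory inputs are hypothetical and only serve as illustration.

```python
from typing import Iterable


def count_mgf_spectra(lines: Iterable[str]) -> int:
    """Count spectra in MGF text by counting 'BEGIN IONS' lines."""
    return sum(1 for line in lines if line.strip() == "BEGIN IONS")


def count_fasta_entries(lines: Iterable[str]) -> int:
    """Count protein entries in FASTA text by counting '>' header lines."""
    return sum(1 for line in lines if line.startswith(">"))


# Made-up miniature inputs, for illustration only.
example_mgf = """BEGIN IONS
TITLE=scan_1
PEPMASS=500.25
CHARGE=3+
100.1 200.0
END IONS
BEGIN IONS
TITLE=scan_2
PEPMASS=612.33
CHARGE=4+
150.2 80.0
END IONS
""".splitlines()

example_fasta = """>protein_A
MKTAYIAKQR
>protein_B
GSHMLEDPVK
""".splitlines()

n_spectra = count_mgf_spectra(example_mgf)
n_proteins = count_fasta_entries(example_fasta)
print(f"{n_spectra} spectra, {n_proteins} protein entries")
```

Both functions accept any iterable of lines, so an open file handle can be passed in place of the in-memory lists.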

Example Workflows

Example workflows that can be used in Proteome Discoverer:

  • Proteome Discoverer 3.0 / 3.1 / 3.2:
    • DSS/BS3 MS2 search (CID, ETD, HCD, stepped HCD): pdAnalysis / zip
    • DSS/BS3 MS2 search (for large datasets and proteome-wide searches, CID, ETD, HCD, stepped HCD): pdAnalysis / zip
    • DSSO MS2 search (CID, ETD, HCD, stepped HCD): pdAnalysis / zip
    • DSSO MS2-MS3 search (MS3 recorded in the orbitrap): pdAnalysis / zip
    • DSSO MS2-MS3 search (MS3 recorded in the ion trap): pdAnalysis / zip
    • DSBSO MS2 search (CID, ETD, HCD, stepped HCD): pdAnalysis / zip
    • DSBSO MS2-MS3 search (MS3 recorded in the orbitrap): pdAnalysis / zip
    • DSBSO MS2-MS3 search (MS3 recorded in the ion trap): pdAnalysis / zip
  • Proteome Discoverer 2.5:
    • DSSO MS2 search (CID, ETD, HCD, stepped HCD): pdAnalysis / zip
    • DSSO MS2-MS3 search (MS3 recorded in the orbitrap): pdAnalysis / zip
    • DSSO MS2-MS3 search (MS3 recorded in the ion trap): pdAnalysis / zip

The provided workflows also require the installation of MS Amanda, which can be downloaded here.

The general processing workflow for almost any crosslink search is depicted here.

For MS2 searches (CID, ETD, HCD, stepped HCD) it can also be beneficial to employ the IMP MS2 Spectrum Processor node. An example workflow for Proteome Discoverer 3.0 is given here:

  • DSSO MS2 search with IMP MS2 Spectrum Processor: pdAnalysis / zip

This workflow additionally requires the installation of the IMP MS2 Spectrum Processor node beforehand, which can be directly downloaded from here (Proteome Discoverer 3.0).

Note

When starting these workflows you might get a warning in Proteome Discoverer that certain parameters do not exist, even though all parameters are set in the workflow. This happens because different MS Annika versions have different parameter sets. You can safely ignore these warnings!

Support for Astral Data

To process crosslinking data from Astral instruments we recommend using MS Annika 3.0 v3.0.5 or later (e.g. latest). Although in principle all MS Annika versions support Astral data, earlier versions need substantial memory to process RAW files with more than 100,000 spectra (128 GB or more is recommended). MS Annika 3.0.5 and later are memory-optimized and run on standard commodity hardware. We also recommend disabling the following parameter:

  • MS Annika Detector Node:
    • Doublet Selection:
      • Try infer missing charge states: False (If this parameter is not visible, please check that Show Advanced Parameters is on).
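
The version and memory guidance for Astral data can be expressed as a small check. This is only a sketch of the rule of thumb stated in this section, not an MS Annika API: the function names and constants are hypothetical, and the thresholds (v3.0.5, 100 000 spectra) are taken from the recommendation above.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn 'v3.0.5' or '3.0.5' into (3, 0, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


# Thresholds from the recommendation in this section (names are hypothetical).
MEMORY_OPTIMIZED_SINCE = (3, 0, 5)
LARGE_FILE_SPECTRA = 100_000


def needs_high_memory(version: str, n_spectra: int) -> bool:
    """True if a pre-3.0.5 version is used on a file with more than 100 000 spectra."""
    is_old = parse_version(version) < MEMORY_OPTIMIZED_SINCE
    return is_old and n_spectra > LARGE_FILE_SPECTRA


# 176242 is the spectrum count of one of the Astral benchmark files.
print(needs_high_memory("v3.0.4", 176242))
print(needs_high_memory("v3.0.7", 176242))
```

Comparing versions as integer tuples rather than as strings avoids misordering releases such as 3.0.10 and 3.0.5.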

Recommendations for Astral Data

We recommend running searches on Astral data in Proteome Discoverer 3.1 using MS Annika 3.0 v3.0.7. For your convenience, we also supply several analysis templates for Astral searches. Please note that all of these workflows additionally require the installation of MS Amanda, which can be downloaded here or installed via the Proteome Discoverer Third-Party installer.

We generally also recommend deisotoping spectra e.g. via the IMP MS2 Spectrum Processor. The following workflows can be used with the IMP MS2 Spectrum Processor node:

Expected Performance on Astral Data

Estimating performance is hard without knowing the specific sample and data analysis hardware, but a full human-proteome-wide search should take roughly 2 to 3 hours per file on modern hardware.

The table below lists some benchmarks using data from this publication:

Results

| Filename | Filesize | Nr. Of MS2 Spectra | Crosslinker | IMP MS2 Spectrum Processor | Protein DB Size | Protein DB Description | Analysis Mode | Nr. Of Crosslinks @ 1% FDR | Runtime | Runtime Per File |
|---|---|---|---|---|---|---|---|---|---|---|
| 20240926_Astral_Neo1_Mueller_MS_TechHub_IMP_THIDDIAXL001_Cas9_DSSO_500ng_FAIMS_001.raw | 5.57 GB | 176242 | DSSO | Yes | 116 | Cas9 + crapome | Sequential | 981 | 0h 52min | 0h 52min |
| 20240926_Astral_Neo1_Mueller_MS_TechHub_IMP_THIDDIAXL001_Cas9_DSSO_500ng_FAIMS_001.raw | 5.57 GB | 176242 | DSSO | No | 116 | Cas9 + crapome | Sequential | 878 | 0h 50min | 0h 50min |
| 20240926_Astral_Neo1_Mueller_MS_TechHub_IMP_THIDDIAXL001_Cas9_DSSO_500ng_FAIMS_001.raw | 5.57 GB | 176242 | DSSO | Yes | 20328 | Cas9 + Human SwissProt | Sequential | 506 | 3h 23min | 3h 23min |
| 20240926_Astral_Neo1_Mueller_MS_TechHub_IMP_THIDDIAXL001_Cas9_DSSO_500ng_FAIMS_001.raw | 5.57 GB | 176242 | DSSO | No | 20328 | Cas9 + Human SwissProt | Sequential | 381 | 2h 00min | 2h 00min |
| 20250605_Astral2_NEO1_Mueller_MS_TechHub_IMP_THIDRV001_Cas9_PhoX_500ng_FAIMS_001.raw | 4.70 GB | 175791 | PhoX | Yes | 116 | Cas9 + crapome | Sequential | 1475 | 0h 26min | 0h 26min |
| 20250605_Astral2_NEO1_Mueller_MS_TechHub_IMP_THIDRV001_Cas9_PhoX_500ng_FAIMS_001.raw | 4.70 GB | 175791 | PhoX | No | 116 | Cas9 + crapome | Sequential | 1420 | 0h 25min | 0h 25min |
| 20250605_Astral2_NEO1_Mueller_MS_TechHub_IMP_THIDRV001_Cas9_PhoX_500ng_FAIMS_001.raw | 4.70 GB | 175791 | PhoX | Yes | 20328 | Cas9 + Human SwissProt | Sequential | 484 | 2h 29min | 2h 29min |
| 20250605_Astral2_NEO1_Mueller_MS_TechHub_IMP_THIDRV001_Cas9_PhoX_500ng_FAIMS_001.raw | 4.70 GB | 175791 | PhoX | No | 20328 | Cas9 + Human SwissProt | Sequential | 422 | 2h 48min | 2h 48min |
| 20250605_Astral2_NEO1_Mueller_MS_TechHub_IMP_THIDRV001_Cas9_PhoX_250ng_FAIMS_003.raw | 3.94 GB | 172343 | PhoX | Yes | 20328 | Cas9 + Human SwissProt | Parallel | 483 | 4h 40min | 1h 14min |
| 20250605_Astral2_NEO1_Mueller_MS_TechHub_IMP_THIDRV001_Cas9_PhoX_500ng_FAIMS_001.raw | 4.70 GB | 175791 | PhoX | Yes | 20328 | Cas9 + Human SwissProt | Parallel | 484 | 4h 54min | 1h 14min |
| 20250605_Astral2_NEO1_Mueller_MS_TechHub_IMP_THIDRV001_Cas9_PhoX_500ng_FAIMS_002.raw | 4.89 GB | 176110 | PhoX | Yes | 20328 | Cas9 + Human SwissProt | Parallel | 344 | 4h 51min | 1h 14min |
| 20250605_Astral2_NEO1_Mueller_MS_TechHub_IMP_THIDRV001_Cas9_PhoX_500ng_FAIMS_003.raw | 3.00 GB | 134395 | PhoX | Yes | 20328 | Cas9 + Human SwissProt | Parallel | 38 | 3h 27min | 1h 14min |
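
The "Runtime Per File" value for the parallel runs is not explained in the benchmark itself. One reading that matches the numbers is that the batch finishes when its slowest file finishes, and that this wall-clock time is divided by the number of files searched together. The sketch below reproduces that arithmetic; the interpretation is an assumption, and the helper names are hypothetical.

```python
import re


def parse_runtime(text: str) -> int:
    """Convert a runtime such as '4h 54min' into whole minutes."""
    match = re.fullmatch(r"\s*(\d+)h\s*(\d+)min\s*", text)
    if match is None:
        raise ValueError(f"unrecognised runtime: {text!r}")
    hours, minutes = (int(group) for group in match.groups())
    return hours * 60 + minutes


def spectra_per_minute(n_spectra: int, runtime: str) -> float:
    """Average number of MS2 spectra processed per minute of runtime."""
    return n_spectra / parse_runtime(runtime)


# Runtimes of the four files searched in parallel, copied from the benchmark.
parallel_runtimes = ["4h 40min", "4h 54min", "4h 51min", "3h 27min"]

# Assumption: the batch is done when the slowest file is done.
wall_clock = max(parse_runtime(runtime) for runtime in parallel_runtimes)
per_file = wall_clock / len(parallel_runtimes)
print(f"wall clock: {wall_clock} min, per file: {per_file} min")

# Throughput of the sequential PhoX search against Cas9 + Human SwissProt.
print(f"{spectra_per_minute(175791, '2h 29min'):.0f} spectra per minute")
```

Under this reading the per-file time comes out at 73.5 minutes, which agrees with the reported 1h 14min after rounding.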

Hardware

The system we tested this on was a desktop PC with the following hardware:

  • MB: ASUS ROG Strix B650E-I
  • CPU: AMD Ryzen 7900X [12 cores @ 4.7 GHz base / 5.6 GHz boost]
  • RAM: Kingston 64 GB DDR5 RAM [5600 MT/s, 36 CAS]
  • GPU: ASUS Dual [Nvidia] GeForce RTX 4060 Ti OC [16 GB VRAM]*
  • SSD/HDD: Corsair MP600 Pro NH 2 TB NVMe SSD [PCIe 4.0]
  • OS: Windows 11 Pro 64-bit (10.0, Build 22631)

*Note: Dual is part of the name, this is a single graphics card!

Support for MGF and timsTOF Data

The following MS Annika versions support MGF* and timsTOF** data input:

*MS Annika 3.0 only supports MS2 search for MGF files since MGF files don't contain sufficient MS3 information.
**optionally requires installation of the Bruker Ion Mobility reader to display ion mobilities in Proteome Discoverer, the node is not needed for crosslink search.
***requires installation of the IMP MS2 Spectrum Processor node.

Getting Help

In case something isn't working or if you need any help with MS Annika or one of the MS Annika extensions, please don't hesitate to reach out to us. You can open up an issue here or start a discussion there. We are usually fast to respond on GitHub and other users might be able to help too! Alternatively, you can always drop us an email at the addresses below.

Known Issues

List of known issues

Contributing & Source Code

The MS Annika codebase contains proprietary code and therefore can't be made open source. If you want to contribute to the development of MS Annika please contact us and we are happy to team up!

Citing

If you are using MS Annika please cite one of the following publications:

Contact
