Method | Open Access

DIA-MS2pep: a library-free framework for comprehensive peptide identification from data-independent acquisition data

Junjie Hou1, Jifeng Wang2, Fuquan Yang3,4, Tao Xu1,4
1 National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
2 Laboratory of Proteomics, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
3 Laboratory of Protein and Peptide Pharmaceuticals & Laboratory of Proteomics, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
4 College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China

Abstract

Identifying peptides directly from data-independent acquisition (DIA) data remains challenging because the MS/MS spectra are highly multiplexed. Spectral library-based peptide detection is sensitive, but it is limited by the depth of the library and mutes the discovery potential of DIA data. Here we present DIA-MS2pep, a library-free framework for comprehensive peptide identification from DIA data. DIA-MS2pep uses a data-driven algorithm that demultiplexes MS/MS spectra from the fragment data, without the need for a precursor. Combined with a database search that uses a large precursor mass tolerance, DIA-MS2pep can identify peptides and their modified forms. We demonstrate the performance of DIA-MS2pep by comparing its accuracy and sensitivity of peptide identification with those of conventional library-free tools on publicly available DIA datasets from a variety of samples, including HeLa cell lysates, phosphopeptides, and plasma. Compared with data-dependent acquisition-based spectral libraries, spectral libraries built directly from DIA data with DIA-MS2pep improve the accuracy and reproducibility of the quantitative proteome.
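
To make the idea of a large precursor mass tolerance search concrete, the following Python sketch shows only the precursor-level step: a candidate peptide is kept when its mass lies within a wide window of the observed mass, and the mass difference is compared with known modification shifts. It is an illustration under stated assumptions, not the DIA-MS2pep implementation. The names `Candidate` and `open_search`, the 500 Da window, the 0.01 Da shift tolerance, and the candidate masses are all hypothetical; the two modification shifts are standard monoisotopic values. A real search would also score fragment ions, which is omitted here.

```python
from dataclasses import dataclass

# Monoisotopic mass shifts (Da) of two common modifications.
KNOWN_SHIFTS = {"Phospho": 79.96633, "Acetyl": 42.01057}


@dataclass
class Candidate:
    sequence: str
    mass: float  # theoretical mass of the unmodified peptide, in Da


def open_search(observed_mass, candidates, window=500.0, shift_tol=0.01):
    """Keep candidates within a wide mass window and annotate the mass difference.

    The window and tolerance defaults are illustrative assumptions.
    """
    hits = []
    for cand in candidates:
        delta = observed_mass - cand.mass
        if abs(delta) > window:
            continue  # outside the wide precursor tolerance
        if abs(delta) <= shift_tol:
            annotation = "unmodified"
        else:
            annotation = "unknown shift"
            for name, shift in KNOWN_SHIFTS.items():
                if abs(delta - shift) <= shift_tol:
                    annotation = name
        hits.append((cand.sequence, round(delta, 4), annotation))
    return hits


# Hypothetical candidates; the masses are made up for the example.
candidates = [Candidate("PEPTIDEA", 1000.0), Candidate("PEPTIDEB", 2000.0)]
hits = open_search(1079.9663, candidates)
print(hits)
```

With a narrow tolerance the first candidate would be rejected because its unmodified mass does not match the observed mass; the wide window keeps it and lets the mass difference point to a possible modified form.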

Biophysics Reports
Pages 253-268
Cite this article:
Hou J, Wang J, Yang F, et al. DIA-MS2pep: a library-free framework for comprehensive peptide identification from data-independent acquisition data. Biophysics Reports, 2022, 8(5-6): 253-268. https://doi.org/10.52601/bpr.2022.220011

Received: 28 May 2022
Accepted: 06 June 2022
Published: 25 July 2022
© The Author(s) 2022

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
