Proteogenomic analysis of Mycobacterium tuberculosis by high resolution mass spectrometry.
Article Details
- Citation
Kelkar DS, Kumar D, Kumar P, Balakrishnan L, Muthusamy B, Yadav AK, Shrivastava P, Marimuthu A, Anand S, Sundaram H, Kingsbury R, Harsha HC, Nair B, Prasad TS, Chauhan DS, Katoch K, Katoch VM, Kumar P, Chaerkady R, Ramachandran S, Dash D, Pandey A
Proteogenomic analysis of Mycobacterium tuberculosis by high resolution mass spectrometry.
Mol Cell Proteomics. 2011 Dec;10(12):M111.011627. doi: 10.1074/mcp.M111.011445. Epub 2011 Oct 3.
- PubMed ID
- 21969609
- Abstract
The genome sequencing of H37Rv strain of Mycobacterium tuberculosis was completed in 1998 followed by the whole genome sequencing of a clinical isolate, CDC1551 in 2002. Since then, the genomic sequences of a number of other strains have become available making it one of the better studied pathogenic bacterial species at the genomic level. However, annotation of its genome remains challenging because of high GC content and dissimilarity to other model prokaryotes. To this end, we carried out an in-depth proteogenomic analysis of the M. tuberculosis H37Rv strain using Fourier transform mass spectrometry with high resolution at both MS and tandem MS levels. In all, we identified 3176 proteins from Mycobacterium tuberculosis representing ~80% of its total predicted gene count. In addition to protein database search, we carried out a genome database search, which led to identification of ~250 novel peptides. Based on these novel genome search-specific peptides, we discovered 41 novel protein coding genes in the H37Rv genome. Using peptide evidence and alternative gene prediction tools, we also corrected 79 gene models. Finally, mass spectrometric data from N terminus-derived peptides confirmed 727 existing annotations for translational start sites while correcting those for 33 proteins. We report creation of a high confidence set of protein coding regions in Mycobacterium tuberculosis genome obtained by high resolution tandem mass-spectrometry at both precursor and fragment detection steps for the first time. This proteogenomic approach should be generally applicable to other organisms whose genomes have already been sequenced for obtaining a more accurate catalogue of protein-coding genes.
DrugBank Data that Cites this Article
- Polypeptides
Name | UniProt ID
Possible cellulase CelA1 (Endoglucanase) (Endo-1,4-beta-glucanase) (FI-cmcase) (Carboxymethyl cellulase) | Q79G13
Cyclopropane mycolic acid synthase MmaA2 | Q79FX6
Cyclase | O07732
Conserved protein | O53240
Hydroxymycolate synthase MmaA4 | Q79FX8
DNA-directed RNA polymerase subunit beta' | P9WGY7
Probable arabinosyltransferase A | P9WNL9
2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase | P9WNC7
Probable arabinosyltransferase B | P9WNL7
Enoyl-[acyl-carrier-protein] reductase [NADH] | P9WGR1
Catalase-peroxidase | P9WIE5
Probable arabinosyltransferase C | P9WNL5
Cyclopropane mycolic acid synthase 2 | P9WPB5
Thymidylate kinase | P9WKE1
6,7-dimethyl-8-ribityllumazine synthase | P9WHE9
Deoxyuridine 5'-triphosphate nucleotidohydrolase | P9WNS5
Nicotinate-nucleotide pyrophosphorylase [carboxylating] | P9WJJ7
Cell division protein FtsZ | P9WN95
NADPH-ferredoxin reductase FprA | P9WIQ3
Pantothenate synthetase | P9WIL5
Purine nucleoside phosphorylase | P9WP01
Serine/threonine-protein kinase PknB | P9WI81
3-alpha-(or 20-beta)-hydroxysteroid dehydrogenase | P9WGT1
Diacylglycerol acyltransferase/mycolyltransferase Ag85C | P9WQN9
Putative 4-hydroxy-4-methyl-2-oxoglutarate aldolase | P9WGY3
3-oxoacyl-[acyl-carrier-protein] synthase 3 | P9WNG3
2-isopropylmalate synthase | P9WQB3
Malate synthase G | P9WK17
Isocitrate lyase | P9WKK7
3-dehydroquinate dehydratase | P9WPX7
Mycocyclosin synthase | P9WPP7
Dihydrofolate reductase | P9WNX1
Dihydropteroate synthase | P9WND1
4-hydroxy-tetrahydrodipicolinate reductase | P9WP23
D-3-phosphoglycerate dehydrogenase | P9WNX3
Cyclopropane mycolic acid synthase 3 | P9WPB3
Glutamine synthetase | P9WN39
Group 1 truncated hemoglobin GlbN | P9WN25
Aminoglycoside 2'-N-acetyltransferase | P9WQG9
Inositol-3-phosphate synthase | P9WKI1
Citrate lyase subunit beta-like protein | P9WPE1
Guanylate kinase | P9WKE9
Mycothiol acetyltransferase | P9WJM7
1,4-dihydroxy-2-naphthoyl-CoA synthase | P9WNP5
HTH-type transcriptional regulator EthR | P9WMC1
Ribose-5-phosphate isomerase B | P9WKD7
NAD(P)H dehydrogenase (quinone) | P9WHH7
UDP-galactopyranose mutase | P9WIQ1
Probable thiol peroxidase | P9WG35
dTDP-4-dehydrorhamnose 3,5-epimerase | P9WH11
Proteasome subunit alpha | P9WHU1
Cyclopropane mycolic acid synthase 1 | P9WPB7
Adenosylhomocysteinase | P9WGV3
Serine/threonine-protein kinase PknG | P9WI73
Octanoyltransferase | P9WK83
ATP-dependent dethiobiotin synthetase BioD | P9WPQ5
Cytochrome P450 130 | P9WPN5
Methionine aminopeptidase 2 | P9WK19
(2Z,6E)-farnesyl diphosphate synthase | P9WFF5
Decaprenyl diphosphate synthase | P9WFF7
4,5:9,10-diseco-3-hydroxy-5,9,17-trioxoandrosta-1(10),2-diene-4-oate hydrolase | P9WNH5
Probable L-lysine-epsilon aminotransferase | P9WQ77
Proteasome subunit beta | P9WHT9
R2-like ligand binding oxidase | P9WH69
Peptide deformylase | P9WIJ3
Iron-dependent extradiol dioxygenase | P9WNW7
3-oxoacyl-[acyl-carrier-protein] synthase 1 | P9WQD9
Secreted chorismate mutase | P9WIB9
Arylamine N-acetyltransferase | P9WJI5
Deazaflavin-dependent nitroreductase | P9WP15
Nucleoid-associated protein Lsr2 | P9WIP7
Uncharacterized MFS-type transporter EfpA | P9WJY5
16S/23S rRNA (cytidine-2'-O)-methyltransferase TlyA | P9WJ63
Probable fatty acid synthase Fas (Fatty acid synthetase) | P95029