Signal sequence and keyword trap in silico for selection of full-length human cDNAs encoding secretion or membrane proteins from oligo-capped cDNA libraries.
Article Details
- Citation
Otsuki T, Ota T, Nishikawa T, Hayashi K, Suzuki Y, Yamamoto J, Wakamatsu A, Kimura K, Sakamoto K, Hatano N, Kawai Y, Ishii S, Saito K, Kojima S, Sugiyama T, Ono T, Okano K, Yoshikawa Y, Aotsuka S, Sasaki N, Hattori A, Okumura K, Nagai K, Sugano S, Isogai T
Signal sequence and keyword trap in silico for selection of full-length human cDNAs encoding secretion or membrane proteins from oligo-capped cDNA libraries.
DNA Res. 2005;12(2):117-26.
- PubMed ID
- 16303743
- Abstract
We have developed an in silico method of selection of human full-length cDNAs encoding secretion or membrane proteins from oligo-capped cDNA libraries. Fullness rates were increased to about 80% by combination of the oligo-capping method and ATGpr, software for prediction of translation start point and the coding potential. Then, using 5'-end single-pass sequences, cDNAs having the signal sequence were selected by PSORT ('signal sequence trap'). We also applied 'secretion or membrane protein-related keyword trap' based on the result of BLAST search against the SWISS-PROT database for the cDNAs which could not be selected by PSORT. Using the above procedures, 789 cDNAs were primarily selected and subjected to full-length sequencing, and 334 of these cDNAs were finally selected as novel. Most of the cDNAs (295 cDNAs: 88.3%) were predicted to encode secretion or membrane proteins. In particular, 165 (80.5%) of the 205 cDNAs selected by PSORT were predicted to have signal sequences, while 70 (54.2%) of the 129 cDNAs selected by 'keyword trap' preserved the secretion or membrane protein-related keywords. Many important cDNAs were obtained, including transporters, receptors, and ligands, involved in significant cellular functions. Thus, an efficient method of selecting secretion or membrane protein-encoding cDNAs was developed by combining the above four procedures.
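The two-stage selection described in the abstract (signal sequence trap first, keyword trap only for clones the first stage missed) can be illustrated with a minimal Python sketch. It does not call PSORT or BLAST: the `has_signal_sequence` flag and the `blast_hit_descriptions` list are hypothetical stand-ins for their outputs, and the `KEYWORDS` tuple is an assumed example, since the abstract does not list the keywords actually used.

```python
from dataclasses import dataclass, field

# Assumed example keywords; the abstract does not enumerate the real list.
KEYWORDS = ("membrane", "secreted", "receptor", "transporter")


@dataclass
class Clone:
    clone_id: str
    # Stand-in for a PSORT prediction on the 5'-end single-pass sequence.
    has_signal_sequence: bool
    # Stand-in for descriptions of SWISS-PROT hits returned by BLAST.
    blast_hit_descriptions: list = field(default_factory=list)


def select_clone(clone, keywords=KEYWORDS):
    """Return the trap that selects the clone, or None if neither does."""
    if clone.has_signal_sequence:
        return "signal_sequence_trap"
    # The keyword trap is applied only to clones not selected above.
    for description in clone.blast_hit_descriptions:
        text = description.lower()
        if any(keyword in text for keyword in keywords):
            return "keyword_trap"
    return None


def partition(clones):
    """Group selected clone IDs by the trap that picked them."""
    selected = {"signal_sequence_trap": [], "keyword_trap": []}
    for clone in clones:
        route = select_clone(clone)
        if route is not None:
            selected[route].append(clone.clone_id)
    return selected
```

In this sketch a clone is counted under exactly one trap, which mirrors how the abstract reports the 205 PSORT-selected and 129 keyword-selected cDNAs as separate groups summing to 334.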
DrugBank Data that Cites this Article
- Polypeptides
| Name | UniProt ID |
| --- | --- |
| Retinol dehydrogenase 13 | Q8NBN7 |
| Cathepsin B | P07858 |
| Synaptic vesicle glycoprotein 2A | Q7L0J3 |
| Corticoliberin | P06850 |
| 5'-nucleotidase | P21589 |
| Tubulin beta-4A chain | P04350 |
| ADP-ribose pyrophosphatase, mitochondrial | Q9BW91 |
| Syndecan-2 | P34741 |
| Choline transporter-like protein 2 | Q8IWA5 |
| Prolyl 3-hydroxylase 1 | Q32P28 |
| Fatty acid desaturase 2 | O95864 |
| Succinate dehydrogenase [ubiquinone] cytochrome b small subunit, mitochondrial | O14521 |
| Mitochondrial dicarboxylate carrier | Q9UBX3 |
| Thioredoxin domain-containing protein 12 | O95881 |
| All-trans-retinol 13,14-reductase | Q6NUM9 |
| Retinol dehydrogenase 11 | Q8TC12 |
| Estradiol 17-beta-dehydrogenase 11 | Q8NBQ5 |
| Transmembrane anterior posterior transformation protein 1 homolog | Q6NXT6 |
| Pannexin-1 | Q96RD7 |
| NADH-cytochrome b5 reductase 1 | Q9UHQ9 |
| Carboxypeptidase Q | Q9Y646 |
| Cytochrome P450 4F12 | Q9HCS2 |
| Solute carrier family 2, facilitated glucose transporter member 6 | Q9UGQ3 |
| Solute carrier family 2, facilitated glucose transporter member 11 | Q9BYW1 |
| Elongation of very long chain fatty acids protein 5 | Q9NYP7 |
| Solute carrier family 43 member 3 | Q8NBI5 |
| Sodium-coupled neutral amino acid transporter 1 | Q9H2H9 |
| Prostaglandin-H2 D-isomerase | P41222 |
| Zinc transporter ZIP1 | Q9NY26 |
| V-type proton ATPase subunit S1 | Q15904 |
| UbiA prenyltransferase domain-containing protein 1 | Q9Y5Z9 |
| CD276 antigen | Q5ZPR3 |