[Biopython] Sequence annotation (Features)

Anirban Bhattachariya anbhat at utu.fi
Sat Jan 16 08:38:28 UTC 2010


Hi,

I'm trying to download a protein sequence object (by ID or accession number) and then print all of its variant sequences from its features and annotations. I'm using pseudocholinesterase (http://www.uniprot.org/uniprot/P06276) as an example since it has a lot of natural variants.

The problem is that when I try to access the features, it reports "0 features". How can I access the features in a Swiss-Prot file, as I can with the GenBank file format (see section 4.6 of the tutorial)?

Here is my code:

from Bio import ExPASy
from Bio import SeqIO
from Bio import SeqFeature

handle = ExPASy.get_sprot_raw("P06276")
seq_record = SeqIO.read(handle, "swiss")
handle.close()
print seq_record.id
print seq_record.name
print seq_record.description
print repr(seq_record.seq)
print "Length %i" % len(seq_record)
print seq_record.annotations["keywords"]
print len(seq_record)
print "%i features" % (len(seq_record.features))

output:
P06276
CHLE_HUMAN
RecName: Full=Cholinesterase; EC=3.1.1.8; AltName: Full=Acylcholine acylhydrolase; AltName: Full=Choline esterase II; AltName: Full=Butyrylcholine esterase; AltName: Full=Pseudocholinesterase; Flags: Precursor;
Seq('MHSKVTIICIRFLFWFLLLCMLIGKSHTEDDIIIATKNGKVRGMNLTVFGGTVT...VGL', ProteinAlphabet())
Length 602
['3D-structure', 'Complete proteome', 'Direct protein sequencing', 'Disease mutation', 'Disulfide bond', 'Glycoprotein', 'Hydrolase', 'Polymorphism', 'Serine esterase', 'Signal']
602
0 features
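
To make clearer what I'm after, here is a rough, standard-library-only sketch that pulls VARIANT entries straight out of the raw flat-file text. It is only a stopgap and rests on assumptions: that features sit on fixed-column "FT" lines (key in columns 6-13, followed by the from and to positions and a description), and that continuation lines leave the key columns blank. The names iter_variants and SAMPLE_TEXT are my own, and the sample lines are invented for illustration, not copied from P06276.

```python
# Stopgap sketch: collect VARIANT features from Swiss-Prot flat-file text.
# Assumption: fixed-column "FT" lines (key, from, to, description).
# SAMPLE_TEXT is invented for illustration; it is NOT real P06276 data.

SAMPLE_TEXT = """\
ID   EXAMPLE_HUMAN           Reviewed;         602 AA.
FT   SIGNAL        1     28
FT   VARIANT      40     40       A -> T (hypothetical example).
FT   VARIANT     100    100       G -> S (another hypothetical
FT                                example).
FT   DISULFID     93    120
SQ   SEQUENCE   602 AA;
"""


def iter_variants(text):
    """Yield (start, end, description) string tuples for VARIANT features."""
    current = None
    for line in text.splitlines():
        # key is None for non-FT lines and "" for FT continuation lines.
        key = line[5:13].strip() if line.startswith("FT") else None
        if key == "" and current is not None:
            # Continuation line: extend the description of the open VARIANT.
            current[2] += " " + line[13:].strip()
            continue
        # Any other line ends the feature currently being collected.
        if current is not None:
            yield tuple(current)
            current = None
        if key == "VARIANT":
            fields = line[13:].split(None, 2)
            fields += [""] * (3 - len(fields))  # description may be missing
            current = fields
    if current is not None:
        yield tuple(current)


for start, end, description in iter_variants(SAMPLE_TEXT):
    print("%s..%s: %s" % (start, end, description))
```

Positions are kept as strings on purpose, since I'm not sure they are always plain integers. With a real record, the text returned by ExPASy.get_sprot_raw("P06276") would be passed in instead of SAMPLE_TEXT, but I would much rather get the same information through seq_record.features.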


Thanks in advance.

-Anirban






