[Biopython] How to print variants?

Anirban Bhattachariya anbhat at utu.fi
Sun Jan 17 21:48:27 UTC 2010


Hi,

Suppose we want to study how mutations/SNPs affect binding or some other biochemical reaction. Let's also assume that we have a motif, or several motifs, that we want to test against the variants. The variants are listed as features in the sequence file, but only the original protein sequence is given. To test motifs against the variants, we need the complete protein sequence for each one. Say our protein has 75 variants: then we need the original plus 75 variant protein sequences to test with the motifs. My intention is to build a list of those 75 protein sequences.


For example, by indexing the feature list I can print individual features:
print(seq_record.features[5])
print(seq_record.features[13])

Output:
location: [28:602]
ref: None:None
strand: None
qualifiers:
    Key: experiment, Value: ['experimental evidence, no additional details recorded']
    Key: gene, Value: ['BCHE']
    Key: gene_synonym, Value: ['CHE1']
    Key: note, Value: ['Cholinesterase. /FTId=PRO_0000008613.']
    Key: region_name, Value: ['Mature chain']
 type: Region

location: [31:32]
ref: None:None
strand: None
qualifiers:
    Key: experiment, Value: ['experimental evidence, no additional details recorded']
    Key: gene, Value: ['BCHE']
    Key: gene_synonym, Value: ['CHE1']
    Key: note, Value: ['Missing (in BChE deficiency). /FTId=VAR_040011.']
    Key: region_name, Value: ['Variant']
    Seq('I', IUPACProtein()) type: Region


Now I want to print only the features that are variants (in the example above, the second one, seq_record.features[13]). In other words, I only want to print features with "Key: region_name, Value: ['Variant']" and ignore all other features.
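
One way this filtering could be sketched is a check on the region_name qualifier of each feature. The helper name variant_features and the SimpleNamespace stand-in objects are made up for illustration; the sketch only assumes that each feature has a .qualifiers dictionary whose values are lists of strings, as in the output above.

```python
from types import SimpleNamespace


def variant_features(features):
    """Return only the features whose region_name qualifier is 'Variant'."""
    return [
        feature
        for feature in features
        # Qualifier values are lists of strings; a missing key means "not a variant".
        if "Variant" in feature.qualifiers.get("region_name", [])
    ]


# Stand-ins for two entries of seq_record.features (hypothetical demo objects
# that only mimic the .qualifiers attribute shown in the printed output).
demo_features = [
    SimpleNamespace(qualifiers={
        "region_name": ["Mature chain"],
        "note": ["Cholinesterase. /FTId=PRO_0000008613."],
    }),
    SimpleNamespace(qualifiers={
        "region_name": ["Variant"],
        "note": ["Missing (in BChE deficiency). /FTId=VAR_040011."],
    }),
]

for feature in variant_features(demo_features):
    print(feature.qualifiers["note"][0])
```

With a parsed record, the same call would presumably be variant_features(seq_record.features), and each returned feature could then be printed as before.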



For the final part, I want to print the complete protein sequence with the variant applied.

For example:

location: [55:56]
ref: None:None
strand: None
qualifiers:
   Key: experiment, Value: ['experimental evidence, no additional details recorded']
   Key: gene, Value: ['BCHE']
   Key: gene_synonym, Value: ['CHE1']
   Key: note, Value: ['F -> I (in BChE deficiency). /FTId=VAR_040013.']
   Key: region_name, Value: ['Variant']
   Seq('F', IUPACProtein()) type: Region

The location is [55:56], and there is also this line:
Key: note, Value: ['F -> I (in BChE deficiency). /FTId=VAR_040013.']
It says that F in the original sequence is changed to I in the variant sequence.
So I need the complete protein sequence with I instead of F at position 55
(the 0-based index given by the location).
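
A minimal sketch of building such a variant sequence, assuming the note always starts with either "X -> Y" or "Missing" as in the examples above. The function name apply_variant, the regular expression, and the toy sequence are hypothetical; notes in other formats are skipped by returning None rather than guessed at.

```python
import re

# Matches notes that begin like "F -> I (in BChE deficiency)."
SUBSTITUTION = re.compile(r"^([A-Z]+) -> ([A-Z]+)")


def apply_variant(sequence, start, end, note):
    """Return the sequence with one variant applied to the slice [start:end].

    Returns None when the note is not understood or does not agree with
    the residues found in the original sequence.
    """
    sequence = str(sequence)
    if note.startswith("Missing"):
        replacement = ""  # deletion: the residues are simply dropped
    else:
        match = SUBSTITUTION.match(note)
        if match is None:
            return None
        original, replacement = match.groups()
        if sequence[start:end] != original:
            return None  # note disagrees with the sequence; skip it
    return sequence[:start] + replacement + sequence[end:]


# Toy sequence for illustration only (not the real BCHE protein).
toy = "MKTFAIL"
print(apply_variant(toy, 3, 4, "F -> I (in BChE deficiency). /FTId=VAR_040013."))
print(apply_variant(toy, 3, 4, "Missing (in BChE deficiency). /FTId=VAR_040011."))
```

For a real feature, the arguments would presumably come from int(feature.location.start), int(feature.location.end) and feature.qualifiers["note"][0] (this assumes a Biopython version where the location positions convert to int), and the results for all variant features could be collected into the list of variant proteins.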



Thanks,
Anirban


