[BioPython] I don't understand why SeqRecord.feature is a list
Peter
biopython at maubp.freeserve.co.uk
Thu Jul 5 09:33:26 UTC 2007
Giovanni Marco Dall'Olio wrote:
> Let's have a look at your example:
> - we have a list of features like this:
> list_features = ['GTAAGT', 'TACTAAC', 'TGT']
>
> - then we specify the meaning of these features in another dictionary:
> splicesignal5 = list_features[0]
> polypirimidinetract = list_features[1]
> splicesignal3 = list_features[2]
>
> python passes the variables by value: this means that if you change
> one of the values in the list_features list, then you have to update
> all the variables which refer to it manually.
>
> >>> list_features = ['GTAAGT', 'TACTAAC', 'TGT']
> >>> splicesignal5 = list_features[0]
> >>> print splicesignal5
> 'GTAAGT'
> >>> list_features[0] = 'TTTTTTT'
> >>> print splicesignal5
> 'GTAAGT' # wrong!
> >>> splicesignal5 = list_features[0] # have to update all the variables which refer to list_features manually
> >>> print splicesignal5
> 'TTTTTTT'
>
> This is why I prefer to save the positions of the features instead of
> their values:
> >>> list_features = ['GTAAGT', 'TACTAAC', 'TGT']
> >>> dict_aliases = {'splicesignal5': [0], 'polypirimidinetract' : [1],
> ...                 'splicesignal3': [2]}
> >>> def get_feature(feature_name):
> ...     return list_features[dict_aliases[feature_name]] # (this code doesn't work)
...
> Another option could be to use references to memory positions instead
> of dictionary keys, but I don't know how to implement this in python,
> and I'm not sure it would be computationally convenient.
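For what it's worth, the position-based lookup quoted above seems to fail only because the dictionary values are one-element lists ([0]) rather than plain integers, and a list cannot be used as a list index. A minimal sketch, reusing your names and assuming integer positions are what you intended:

```python
# Sketch only: store integer positions (not one-element lists) as aliases.
list_features = ['GTAAGT', 'TACTAAC', 'TGT']
dict_aliases = {'splicesignal5': 0,
                'polypirimidinetract': 1,
                'splicesignal3': 2}

def get_feature(feature_name):
    """Return the current value stored at the aliased position."""
    return list_features[dict_aliases[feature_name]]

before = get_feature('splicesignal5')
list_features[0] = 'TTTTTTT'
# The lookup goes through the list each time, so no manual update is needed.
after = get_feature('splicesignal5')
print(before)
print(after)
```

The catch is that the positions go stale if you insert or delete entries in the list, which is one reason I would lean towards objects instead.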
Have you considered making "feature objects", where each object can hold
multiple pieces of information such as a name, alias, and type, as well
as the sequence data itself? You may wish to create your own class here,
or try to use the existing Biopython SeqFeature object.
You could then use a list to hold your feature objects, or perhaps a
dictionary keyed on the alias. Or both.
e.g.
class Feature:
    # Very simple class which could be extended
    def __init__(self, seq_string):
        self.seq = seq_string

    def __repr__(self):
        # id(self) is used to show the memory location (in hex), just
        # to show the difference between two instances with the same seq
        return "Feature(%s) instance at %s" \
               % (self.seq, hex(id(self)))

list_features = [Feature('GTAAGT'),
                 Feature('TACTAAC'),
                 Feature('TGT')]
splicesignal5 = list_features[0]
print(splicesignal5)
print(list_features[0])

print("EDITING first object in the list:")
list_features[0].seq = 'TTTTTTT'
print(splicesignal5)  # changed, now TTTTTTT
print(list_features[0])

print("REPLACING first object in the list:")
list_features[0] = Feature('GGGGGG')
print(splicesignal5)  # still points to old object, TTTTTTT
print(list_features[0])
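
To illustrate the "or both" idea, here is a rough sketch of keeping the same objects in a list and in a dictionary keyed on the alias. The alias names come from your example; dict_features is just a name I made up, and the class is repeated in cut-down form so the snippet runs on its own:

```python
class Feature:
    """Same minimal class as above, repeated so this runs on its own."""
    def __init__(self, seq_string):
        self.seq = seq_string

list_features = [Feature('GTAAGT'), Feature('TACTAAC'), Feature('TGT')]

# The dictionary values are the *same* objects the list holds, not copies.
aliases = ['splicesignal5', 'polypirimidinetract', 'splicesignal3']
dict_features = dict(zip(aliases, list_features))

# Editing an object in place is visible through both containers.
list_features[0].seq = 'TTTTTTT'
print(dict_features['splicesignal5'].seq)

# Replacing a list entry does not touch the dictionary, so update both.
list_features[0] = Feature('GGGGGG')
dict_features['splicesignal5'] = list_features[0]
print(dict_features['splicesignal5'].seq)
```

As long as you edit the objects rather than replace them, the list and the dictionary stay in step; a replacement has to be made in both places.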
--
I'm not sure if that is closer to what you wanted, or not.
Peter