[Biopython] SeqRecord subclassing or composition
Peter Cock
p.j.a.cock at googlemail.com
Wed Mar 9 04:04:26 EST 2011
On Wed, Mar 9, 2011 at 3:07 AM, Uri Laserson <laserson at mit.edu> wrote:
> I am trying to implement a data type for my work. Each object will have a
> sequence (derived from a single read) and lots of annotations and features.
> However, I want to implement some extra interface that is problem-specific
> to make my analysis more convenient.
>
> I am debating whether to subclass SeqRecord and simply implement the extra
> interface or define a new object that wraps a SeqRecord object and pass on
> the subset of native SeqRecord calls and/or simply access the underlying
> SeqRecord directly.
>
> One additional factor is that I want to be able to read/write INSDC-style
> files for the data (e.g., GenBank). Therefore, if I use the SeqIO parser,
> it will return native SeqRecords. If I go the inheritance route, how do I
> cast a SeqRecord object to my new subclass?
There is (currently at least) no option in SeqIO parse/read
to override the use of the SeqRecord object. So you'd need
code to 'upgrade' a SeqRecord into your class. Probably
the simplest route would be for its __init__ method to
take a single argument (a SeqRecord). Then you could
have:
def my_parse(...):
    for seq_record in SeqIO.parse(...):
        yield MyClass(seq_record)

def my_read(...):
    return MyClass(SeqIO.read(...))

etc
> So, I am debating between inheritance
>
> class ImmuneChain(SeqRecord):
>     def __init__(self, *args, **kw):
>         SeqRecord.__init__(self,*args,**kw)
>
> # But how do I cast a SeqRecord to an ImmuneChain?
Unless you modify the methods/attributes too much, an
ImmuneChain subclass of SeqRecord should be usable
as is with SeqIO.write etc. You don't need to 'cast'.
Also note the above __init__ method can be more specific:
you might have, say, 10 init args for ImmuneChain, only
some of which you pass to the SeqRecord init.
You could even have a single __init__ argument of a
SeqRecord, and copy all its attributes.
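As a rough sketch of that last idea, assuming the standard record
attributes (id, name, description, dbxrefs, features, annotations,
letter_annotations) are all you need to carry over. The chain_type
argument is made up for illustration, and the try/except fallback is
only a stand-in so the snippet runs where Biopython is not installed:

```python
try:
    from Bio.Seq import Seq
    from Bio.SeqRecord import SeqRecord
except ImportError:
    # Stand-in (NOT the real class) so the sketch runs without Biopython.
    # It mimics only the attributes used below.
    Seq = str

    class SeqRecord(object):
        def __init__(self, seq, id="<unknown id>", name="<unknown name>",
                     description="<unknown description>"):
            self.seq = seq
            self.id = id
            self.name = name
            self.description = description
            self.dbxrefs = []
            self.features = []
            self.annotations = {}
            self.letter_annotations = {}


class ImmuneChain(SeqRecord):
    """Hypothetical subclass built from an existing SeqRecord."""

    def __init__(self, record, chain_type=None):
        SeqRecord.__init__(self, record.seq, id=record.id,
                           name=record.name,
                           description=record.description)
        # Shallow copies, so adding or removing entries on the
        # ImmuneChain leaves the original record's containers alone.
        self.dbxrefs = list(record.dbxrefs)
        self.features = list(record.features)
        self.annotations = dict(record.annotations)
        self.letter_annotations = dict(record.letter_annotations)
        # Problem-specific extra state (assumed name, for illustration).
        self.chain_type = chain_type
```

With something like this, MyClass in my_parse and my_read above would
simply be ImmuneChain, and the result is still a SeqRecord as far as
isinstance checks are concerned.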
> or composition
>
> class ImmuneChain(object):
>     def __init__(self, *args, **kw):
>         if isinstance(args[0],SeqRecord):
>             self._record = args[0]
>         else:
>             # Initialize the underlying SeqRecord manually
>             self._record.seq = ...
With the above approach you'd have to pass the
private record to SeqIO.write etc (anything which
needs a SeqRecord). That could be done inside
methods of the ImmuneChain object (e.g. you
could expose the format method of the SeqRecord).
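A minimal sketch of that wrapper, assuming you only want to expose a
few SeqRecord members. The record and id properties are illustrative
choices of mine, not anything Biopython requires:

```python
class ImmuneChain(object):
    """Hypothetical wrapper that holds a SeqRecord instead of subclassing."""

    def __init__(self, record):
        self._record = record

    @property
    def record(self):
        """The wrapped SeqRecord, for anything that needs the real thing."""
        return self._record

    @property
    def id(self):
        """Forward one chosen attribute from the wrapped record."""
        return self._record.id

    def format(self, format_name):
        """Expose the wrapped record's format method."""
        return self._record.format(format_name)
```

Output would then go through the wrapped object, e.g.
SeqIO.write(chain.record, handle, "genbank"), since SeqIO itself
knows nothing about ImmuneChain.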
>
> Any thoughts?
>
You could alternatively go for a procedural style where
you write your code as functions taking SeqRecord
objects (perhaps expecting particular information in
the annotation).
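For example, something along these lines, where the "chain_type"
annotation key and both function names are invented for illustration.
The functions only rely on the record having id and annotations
attributes, as a SeqRecord does:

```python
def get_chain_type(record, key="chain_type"):
    """Return the chain type stored in record.annotations.

    The key name is an assumption; use whatever your records carry.
    """
    try:
        return record.annotations[key]
    except KeyError:
        raise ValueError("record %s has no %r annotation"
                         % (record.id, key))


def count_by_chain_type(records):
    """Tally records by their (assumed) chain_type annotation."""
    counts = {}
    for record in records:
        chain_type = get_chain_type(record)
        counts[chain_type] = counts.get(chain_type, 0) + 1
    return counts
```

Because these take plain SeqRecord objects, they can be fed directly
from SeqIO.parse with no wrapper or subclass in between.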
Peter