[Biopython-dev] [Biopython] Bio.Sequencing.Ace
Jose Blanca
jblanca at btc.upv.es
Mon Jun 29 15:16:06 UTC 2009
> Are you using Bio.Sequencing.Ace in your code, or did you write a whole
> new parser instead?
I wrote one, because I wanted to be able to get one particular contig, or just
the contig names or the read names. But I don't think that is a problem. I guess
that the Biopython parser could easily be adapted to do that.
> Now that I have been using Ace files in my own work, I've been meaning
> to look over your stuff. In some ways, a contig class can be seen as a
> generalisation of a multiple sequence alignment class. Certainly this is
> something we should improve in Biopython (as you might gather from
> some of the enhancement bugs on bugzilla, I have lots of ideas for the
> current alignment class), and I'm sure you have some great ideas too.
I think this is the main deviation from Biopython. The contig class is
similar to an alignment class; in fact, my contig classes should be compatible
with your newly proposed alignment API.
alignment:
seq1 +++++++++>
seq2 +++++++++>
seq3 +++++++++>
contig:
seq1 ++++>
seq2     +++++>
seq3   ++++++>
Basically every read has a different coordinate system in the contig case.
What I've done is to create a class named LocatableSequence that is a
container for sequence objects. It works like:
>>> seq1 = 'ATCG'
>>> locseq1 = locate_sequence(seq1, location=10)
>>> locseq1[10] == 'A'
In that way the contig is a list of LocatableSequences and the coordinate
system transformations are done by the LocatableSequences, not by the contig.
The LocatableSequences also allow for masks.
The LocatableSequence works with any sequence-like object: str, Seq,
SeqRecord, list, etc.
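To make the coordinate translation concrete, here is a minimal sketch of the idea. It is not the code in LocatableSequence.py: the class name matches the one above, but the constructor arguments, the `end` property and the `column` helper are assumptions made up for this illustration, and only integer indexes are handled.

```python
class LocatableSequence:
    """Wrap a sequence-like object and place it at an offset in a parent
    coordinate system (hypothetical sketch, the real class may differ)."""

    def __init__(self, sequence, location=0):
        self.sequence = sequence    # str, list, Seq, ... anything indexable
        self.location = location    # parent coordinate of the first item

    @property
    def end(self):
        # Last parent coordinate covered by the sequence (inclusive).
        return self.location + len(self.sequence) - 1

    def __getitem__(self, index):
        # Translate the parent (contig) coordinate into the wrapped
        # sequence's own coordinate system.
        if not self.location <= index <= self.end:
            raise IndexError("position %d is outside the sequence" % index)
        return self.sequence[index - self.location]


def column(contig, index):
    """Return what each read has at contig position `index`
    (None for reads that do not cover it). Hypothetical helper."""
    result = []
    for read in contig:
        try:
            result.append(read[index])
        except IndexError:
            result.append(None)
    return result


locseq1 = LocatableSequence('ATCG', location=10)
contig = [
    locseq1,
    LocatableSequence('TCGA', location=11),
    LocatableSequence(list('GG'), location=0),
]
print(locseq1[10])          # A
print(column(contig, 12))   # ['C', 'C', None]
```

The contig here is just a plain list; each read answers for its own offset, so the contig never has to know where a read starts.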
There's also a Location class that represents a fragment of a sequence. My
Location class is more limited than the one in the Biopython SeqFeature. In
my case the start and end should be integers. I use this class to represent
the region not masked in the sequence and the Location of the sequence inside
the LocatableSequence.
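A rough sketch of what such a restricted Location could look like, and how it can describe the unmasked region of a read. This is an illustration only: the inclusive `end`, the method names and the `unmasked` helper are assumptions, not the actual implementation.

```python
class Location:
    """A fragment of a sequence delimited by integer start and end.

    Hypothetical sketch: both ends are treated as inclusive here.
    """

    def __init__(self, start, end):
        if not (isinstance(start, int) and isinstance(end, int)):
            raise TypeError("start and end must be integers")
        if start > end:
            raise ValueError("start must not be greater than end")
        self.start = start
        self.end = end

    def __contains__(self, index):
        # True if the position falls inside the fragment.
        return self.start <= index <= self.end

    def __len__(self):
        return self.end - self.start + 1


def unmasked(sequence, mask):
    """Return the part of `sequence` covered by the Location `mask`,
    i.e. the region that is not masked (hypothetical helper)."""
    return sequence[mask.start:mask.end + 1]


mask = Location(1, 2)
print(unmasked('ATCG', mask))   # TC
print(0 in mask)                # False
print(len(mask))                # 2
```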
Take a look at Contig.py and LocatableSequence.py; these contain the most
relevant classes for this.
Best regards,
--
Jose M. Blanca Postigo
Instituto Universitario de Conservacion y
Mejora de la Agrodiversidad Valenciana (COMAV)
Universidad Politecnica de Valencia (UPV)
Edificio CPI (Ciudad Politecnica de la Innovacion), 8E
46022 Valencia (SPAIN)
Tlf.:+34-96-3877000 (ext 88473)