[Biopython-dev] [Biopython] Bio.Sequencing.Ace

David Winter winda002 at student.otago.ac.nz
Wed Jul 1 02:13:17 EDT 2009


Peter Cock wrote:
> On Tue, Jun 30, 2009 at 9:31 AM, Jose Blanca<jblanca at btc.upv.es> wrote:
>   
>>> What I was thinking of was a contig class as an alignment subclass,
>>> holding a list of SeqRecord objects and offsets.
>> I thought about that implementation and I created some code. The
>> problem I found with that approach is that the contig class code got
>> too messy.
>>     
>
> A simple masked sequence class would also be useful for Roche SFF
> files which hold sequencing reads (of about 500bp) with start and end
> trim points. This is a use case separate from the location offset in an
> alignment - so I'm not convinced it makes sense to do both in one
> class.
>
> Perhaps having the contig class hold a list of (masked) SeqRecord
> objects, their offset, and their direction would work?
>
>   
That sounds like the most intuitive way for the class to work from a 
user's perspective.
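
As a rough sketch of what that might look like (every class and attribute 
name here is hypothetical, and plain strings stand in for SeqRecord objects 
so the example runs without Biopython; this is not an existing Biopython 
API):

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Translation table for complementing DNA (unambiguous bases plus N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


@dataclass
class MaskedRead:
    """A read with trim points (hypothetical stand-in for a masked SeqRecord)."""
    name: str
    seq: str
    clip_start: int = 0               # index of the first kept base (0-based)
    clip_end: Optional[int] = None    # one past the last kept base; None = end

    def trimmed(self) -> str:
        """Return only the unmasked part of the read."""
        end = len(self.seq) if self.clip_end is None else self.clip_end
        return self.seq[self.clip_start:end]


@dataclass
class PlacedRead:
    """A masked read plus where and how it sits in the contig."""
    read: MaskedRead
    offset: int      # contig column where the trimmed read starts
    strand: int = 1  # +1 = as stored, -1 = reverse complement

    def aligned(self) -> str:
        """Trimmed sequence in contig orientation."""
        s = self.read.trimmed()
        if self.strand == -1:
            s = s.translate(_COMPLEMENT)[::-1]
        return s


@dataclass
class Contig:
    """A list of reads with their offsets and directions."""
    name: str
    reads: List[PlacedRead] = field(default_factory=list)

    def add(self, read: MaskedRead, offset: int, strand: int = 1) -> None:
        self.reads.append(PlacedRead(read, offset, strand))

    def rows(self, gap: str = "-") -> List[str]:
        """Pad every read to the contig width, one alignment row per read."""
        width = max((p.offset + len(p.aligned()) for p in self.reads), default=0)
        return [(gap * p.offset + p.aligned()).ljust(width, gap)
                for p in self.reads]
```

Keeping the trim points on the read and the offset and direction on the 
placement means the masked read could be reused on its own for the SFF 
case, without knowing anything about alignments.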

>>> One important thing I think we should do BEFORE adding any contig
>>> class to Biopython, is get it working with at least one other contig file
>>>
>>>       
>> Well, in fact my contig class is modeled after the caf file format.
>> The ace parsing was just an afterthought, my primary interest
>> was the caf format.
>>     
>
> Well, as the CAF file format was an extension of the ACE format,
> perhaps a third contig format would be worth looking at before
> considering if a contig class would be sufficiently general.
>   
I came across this page somewhere in my travels; it gives a quick 
description of a few contig file formats:
http://www.cbcb.umd.edu/research/contig_representation.shtml

At a glance I think all of them could be treated with a similar approach 
to the one described above.

David