[Biopython-dev] [Biopython] Bio.Sequencing.Ace

Jose Blanca jblanca at btc.upv.es
Mon Jun 29 10:25:30 EDT 2009


Hi:
I'm doing similar things, but I took a slightly different approach. Instead of 
using the Ace parser API, I've created a contig class, and my parsers return 
contig objects. You can take a look at the code at:
http://bioinf.comav.upv.es/svn/biolib/biolib/src/
(By the way, if you find any code in that library interesting for Biopython, I 
would be delighted to add it to Biopython.)

In my library, parsing an ace or a caf file works like this:
>>> fhand = open('example3.ace', 'r')
>>> ace_parser = get_parser(fhand, format='ace')
>>> for contig in ace_parser:
...     print(contig)
You can also get a particular contig by giving its name:
>>> ace_parser.contigs('contig_name')

The contigs are like a list of sequences with a consensus property.
>>> contig[0] #the first sequence
>>> contig[1] #the second sequence
>>> contig.consensus #the consensus

The sequence and quality of every read are also accessible:
>>> read0 = contig[0]
>>> read0.seq
>>> read0.qual
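
A minimal sketch of how such a list-like contig could be modelled. This is not
the actual biolib code: the class names `Read` and `Contig` and their
constructor arguments are hypothetical, and only the indexing, `consensus`,
`seq` and `qual` behaviour follows the description above.

```python
class Read(object):
    """Hypothetical read: a sequence plus one quality value per base."""

    def __init__(self, name, seq, qual):
        if len(seq) != len(qual):
            raise ValueError("seq and qual must have the same length")
        self.name = name
        self.seq = seq
        self.qual = qual


class Contig(object):
    """Hypothetical contig: behaves like a list of reads, plus a consensus."""

    def __init__(self, reads, consensus):
        self._reads = list(reads)
        self.consensus = consensus

    def __len__(self):
        return len(self._reads)

    def __getitem__(self, index):
        # Integer indexes return a single read, as in contig[0].
        return self._reads[index]


# Toy data, invented for illustration only.
contig = Contig(
    reads=[Read("read0", "ACGT", [30, 31, 32, 33]),
           Read("read1", "CGTA", [20, 21, 22, 23])],
    consensus="ACGTA",
)
read0 = contig[0]
```

With this layout, `contig[0].seq` and `contig[0].qual` give the bases and
qualities of the first read, and `contig.consensus` gives the consensus.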

There are in fact two different coordinate systems, the contig one and the 
read one (because every read starts at a different place and can be 
reversed). To access a read in its own coordinate system, you have to 
ask for the sequence property of the read.
The Contig and LocatableSequence classes can do more than this. For instance, 
the contig accepts 2-D indexes and returns new contigs, columns, rows, 
subcontigs, etc.
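
To make the two coordinate systems concrete, here is a rough sketch of how a
contig position could be mapped onto a read. It is only an illustration under
stated assumptions: `LocatedRead`, `read_index`, `base_at` and `column` are
hypothetical names, not the biolib LocatableSequence API, positions are
0-based, and gaps/padding are ignored.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


class LocatedRead(object):
    """Hypothetical read placed on a contig (0-based, no padding handled)."""

    def __init__(self, seq, start, reverse=False):
        self.seq = seq          # bases in the read's own coordinates
        self.start = start      # contig position of the leftmost covered base
        self.reverse = reverse  # True if the read aligns as reverse complement

    @property
    def end(self):
        """Contig position one past the last covered base."""
        return self.start + len(self.seq)

    def read_index(self, contig_pos):
        """Translate a contig position into an index into self.seq."""
        if not self.start <= contig_pos < self.end:
            raise IndexError("contig position not covered by this read")
        offset = contig_pos - self.start
        if self.reverse:
            # A reversed read is traversed from its last base to its first.
            return len(self.seq) - 1 - offset
        return offset

    def base_at(self, contig_pos):
        """Base seen at a contig position, in contig orientation."""
        base = self.seq[self.read_index(contig_pos)]
        return COMPLEMENT[base] if self.reverse else base


def column(reads, contig_pos):
    """Bases of every read that covers one contig position."""
    return [r.base_at(contig_pos) for r in reads
            if r.start <= contig_pos < r.end]


# Toy data, invented for illustration only.
forward = LocatedRead("ACGT", start=0)
rev = LocatedRead("TACG", start=1, reverse=True)
reads = [forward, rev]
```

Here `rev.seq` stays in read coordinates, while `rev.base_at(1)` answers in
contig coordinates; `column(reads, 2)` collects one contig column across all
reads that cover it.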

If you find those classes interesting, take a look at the code and also at 
the tests. There is not much documentation, but there are many tests.
Best regards,

Jose Blanca

On Monday 29 June 2009 12:49:39 Fungazid wrote:
> David hi,
>
> Many many thanks for the diagram.
> I'm not sure I understand the differences between
> contig.af[readn].padded_start,  and contig.bs[readn].padded_start, and
> other unknown parameters. I'll try to compare to the Ace format
>
> Avi
>
> --- On Mon, 6/29/09, Peter <biopython at maubp.freeserve.co.uk> wrote:
> > From: Peter <biopython at maubp.freeserve.co.uk>
> > Subject: Re: [Biopython] Bio.Sequencing.Ace
> > To: "David Winter" <winda002 at student.otago.ac.nz>
> > Cc: biopython at lists.open-bio.org
> > Date: Monday, June 29, 2009, 10:26 AM
> > On Mon, Jun 29, 2009 at 6:19 AM, David Winter
> > <winda002 at student.otago.ac.nz> wrote:
> > > Quoting Peter <biopython at maubp.freeserve.co.uk>:
> > >> The top level properties are simple enough - but I find drilling
> > >> down into the reads a bit more tricky. In general the Ace parser is
> > >> a bit non-obvious without knowing the Ace format. Having some
> > >> __str__ and __repr__ methods defined on the objects returned
> > >> would be very nice - I may get time to work on this later this year.
> > >> Anyone else interested in this drop us an email.
> > >>
> > >> Peter
> > >
> > > I had a scrawled diagram of the contig class next to me when I was using
> > > it more frequently - it was easy enough to reproduce digitally
> > >
> > > http://biopython.org/wiki/Ace_contig_class
> > >
> > > Hopefully it helps make sense of where all the data is. I've added a couple
> > > of very brief examples there for now - will expand it when I get a chance.
> > >
> > > David
> >
> > This could get turned into a docstring/doctest for the Ace parser :)
> >
> > Peter
> > _______________________________________________
> > Biopython mailing list  -  Biopython at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biopython



-- 
Jose M. Blanca Postigo
Instituto Universitario de Conservacion y
Mejora de la Agrodiversidad Valenciana (COMAV)
Universidad Politecnica de Valencia (UPV)
Edificio CPI (Ciudad Politecnica de la Innovacion), 8E
46022 Valencia (SPAIN)
Tlf.:+34-96-3877000 (ext 88473)


