[BioPython] Parsing ACE files

Peter biopython at maubp.freeserve.co.uk
Mon Nov 10 11:15:52 UTC 2008


> Here is something I wrote some time back I hope it still works:
>
> from Bio.Sequencing import Ace
> aceparser = Ace.ACEParser()
> fn = '/mnt/hda2/bio/836CLEAN-100.fasta.cap.ace'
> acefilerecord = aceparser.parse(open(fn))
> # For each contig:
> for ctg in acefilerecord.contigs:
>    ....

I guess I'm the bearer of bad news - the ACEParser object (with its
iterator method) was deprecated in Biopython 1.48, in favour of two
simple functions, read and parse (the DEPRECATED file didn't
mention this, an oversight I've just rectified).  Your code needs a
small update:

from Bio.Sequencing import Ace
fn = '/mnt/hda2/bio/836CLEAN-100.fasta.cap.ace'
acefilerecord = Ace.read(open(fn))
# For each contig:
for ctg in acefilerecord.contigs:
   print '=========================================='
   print 'Contig name: %s'%ctg.name
   print 'Bases: %s'%ctg.nbases
   print 'Reads: %s'%ctg.nreads
   print 'Segments: %s'%ctg.nsegments
   print 'Sequence: %s'%ctg.sequence
   print 'Quality: %s'%ctg.quality
   # For each read in contig:
   for read in ctg.reads:
       print 'Read name: %s'%read.rd.name
       print 'Align start: %s'%read.qa.align_clipping_start
       print 'Align end: %s'%read.qa.align_clipping_end
       print 'Read sequence: %s'%read.rd.sequence
       print '=========================================='

If you try the old code on Biopython 1.48 or 1.49b you should get a
deprecation warning suggesting this change.

Or, you can use Ace.parse(open(fn)) to iterate over the contigs
directly (assuming you don't care about the WA, CT, RT and WR tags
which may be at the end of the file).
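
For what it's worth, here is a rough sketch of the Ace.parse route. It
assumes the records it yields carry the same name, nbases and nreads
attributes as the contigs in the Ace.read example. The helper names
contig_summaries and summarise_ace_file are made up for illustration and
are not part of Biopython:

```python
def contig_summaries(contigs):
    """Yield one summary line per contig record.

    Accepts any iterable of objects with name, nbases and nreads
    attributes, such as the records yielded by Ace.parse.
    """
    for ctg in contigs:
        yield 'Contig %s: %s bases, %s reads' % (ctg.name, ctg.nbases, ctg.nreads)


def summarise_ace_file(filename):
    """Return the summary lines for every contig in an ACE file.

    Hypothetical convenience wrapper, not a Biopython function.
    """
    # Imported here so contig_summaries can be used without Biopython.
    from Bio.Sequencing import Ace
    handle = open(filename)
    try:
        # Ace.parse yields the contigs one at a time instead of
        # building the whole-file record that Ace.read returns.
        return list(contig_summaries(Ace.parse(handle)))
    finally:
        handle.close()
```

Calling summarise_ace_file(fn) and printing each returned line gives a
one-line overview per contig. If you need the file-level tags as well,
stick with Ace.read.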

Peter


