[Biopython-dev] Accessing additional fields in ABI files

David Bulger davidabulger at gmail.com
Thu Jul 10 16:05:42 UTC 2014


You are welcome to use the A6_1-DB3.ab1 test file. It is from a sequencing
test on Amplicon 6 (A6_1) of the daf-2 gene in C. elegans to confirm a
homozygous mutation after backcrossing the strain. DB3 stands for David
Bulger's third primer, which was the primer used for sequencing.

However, the main reason I wanted access to the raw trace data was to
evaluate heterozygous mutations, since the Phred2 quality scores and the
default base calls cannot be used for this purpose. It might therefore be
better to use a file with a heterozygous mutation (gk169915) as the test
file (attached).

2_5-DB29.ab1
Amplicon 2 (2_5) of the C. elegans daf-2 gene, sequenced with David
Bulger's primer 29 (DB29)

Many thanks to Mike for your help getting this problem with the Biopython
AbiIO out into the open. Peter and Bow, many thanks for taking the time to
look into this problem and come up with a creative solution. The new
dictionary in SeqRecord sounds like a much more flexible option than the
simple additions of 'extra tags' for DATA9-DATA12 and PLOC1.
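
For anyone wondering what I would do with the raw data, here is a rough
sketch of the kind of check I have in mind. It assumes the new dictionary
ends up under record.annotations["abif_raw"], that DATA9-DATA12 hold the
four processed traces in the base order given by the FWO_1 tag (usually
GATC), and that PLOC1 holds the peak locations. The function name, the
0.3 threshold, and the demo numbers are made up for illustration and are
not part of Biopython.

```python
def secondary_peak_calls(raw, min_ratio=0.3):
    """Flag called peaks whose second-strongest channel is at least
    min_ratio times the strongest channel (possible heterozygous sites).

    `raw` is a plain dict shaped like the proposed raw-data dictionary
    (assumed keys: DATA9..DATA12, PLOC1, and optionally FWO_1).
    """
    order = raw.get("FWO_1", "GATC")  # assumed default base order
    if isinstance(order, bytes):
        order = order.decode("ascii")
    channels = [raw["DATA%d" % n] for n in (9, 10, 11, 12)]

    flagged = []
    for index, loc in enumerate(raw["PLOC1"]):
        # Intensity of each channel at this peak location, strongest first.
        heights = sorted(
            ((trace[loc], base) for trace, base in zip(channels, order)),
            reverse=True,
        )
        (top, top_base), (second, second_base) = heights[0], heights[1]
        if top > 0 and second >= min_ratio * top:
            flagged.append((index, top_base, second_base))
    return flagged


# Synthetic data only: two called peaks, the second with overlapping A/T.
demo_raw = {
    "FWO_1": "GATC",
    "DATA9": (0, 900, 0, 10, 0),    # G channel
    "DATA10": (0, 5, 0, 500, 0),    # A channel
    "DATA11": (0, 8, 0, 450, 0),    # T channel
    "DATA12": (0, 2, 0, 12, 0),     # C channel
    "PLOC1": (1, 3),
}
print(secondary_peak_calls(demo_raw))

# With a real trace (requires Biopython; the key name is an assumption):
#   from Bio import SeqIO
#   record = SeqIO.read("2_5-DB29.ab1", "abi")
#   print(secondary_peak_calls(record.annotations["abif_raw"]))
```

A fixed ratio is a crude criterion, but it shows why the per-channel
intensities at each peak location are needed in addition to the called
bases.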

All the Best,
David


On Thu, Jul 10, 2014 at 11:25 AM, Peter Cock <p.j.a.cock at googlemail.com>
wrote:

> Great - I've checked in the initial work to the main branch,
>
> https://github.com/biopython/biopython/commit/b20f3641a5eaae5df1d25de5ece8b7d8db441e3e
>
> Peter
>
> On Thu, Jul 10, 2014 at 9:40 AM, Mike Cariaso <cariaso at gmail.com> wrote:
> > no thanks are necessary (quite the reverse), but they are welcomed.
> >
> >
> > On Thu, Jul 10, 2014 at 9:25 AM, Peter Cock <p.j.a.cock at googlemail.com>
> > wrote:
> >>
> >> Hello from the pre-BOSC Codefest, where Bow and I have been
> >> looking at the Bio.SeqIO ABI parser in response to an email from
> >> Mike & David (CC'd, see below).
> >>
> >> All the different versions of the ABI capillary sequencers record
> >> additional tags to the binary file which would record all sorts of
> >> extra information like voltages and the raw colour data.
> >>
> >> Mike & David wanted access to some of this data, but the SeqIO
> >> parser was not exposing it. Our proposal is to add a new dictionary
> >> to the SeqRecord's annotations containing all the raw data so that
> >> advanced users can do further processing.
> >>
> >> I'm going to work on this code today, adding a few more tests etc.
> >>
> >> Mike & David - are you happy to be thanked by name in the
> >> commit comment (e.g. "With input from ...")?
> >>
> >> Mike - I am intending to incorporate your test file A6_1-DB3.ab1
> >> into the Biopython unit test collection.
> >>
> >> Thanks,
> >>
> >> Peter
> >>
> >> ---------- Forwarded message ----------
> >> <snip>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2_5-DB29.ab1
Type: application/octet-stream
Size: 295746 bytes
Desc: not available
URL: <http://mailman.open-bio.org/pipermail/biopython-dev/attachments/20140710/83c49e25/attachment-0001.obj>

