[Biopython-dev] SeqIO Abi Parser

Peter Cock p.j.a.cock at googlemail.com
Fri Jul 29 08:14:06 EDT 2011


On Fri, Jul 29, 2011 at 12:34 PM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
> Hi Peter,
> Thanks for explaining. I understand why we should stick to the stored
> sequence id. In this case, we can use the filename as SeqRecord.name as
> well. Regarding BioPerl, I don't have it installed myself -- but I took a
> quick look at their source and it seems they also use the stored sequence ID
> as their main identifier instead of the filename. If the stored sequence ID
> is not present, it's "(unknown)" in their case.

OK good, that means Biopython, BioPerl and EMBOSS should be
consistent :)
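
For anyone following along, here is a rough sketch of the naming rule we have settled on: the ID stored in the file becomes the record's main identifier, and the filename is used as the name. This is not the parser code. The helper choose_identifiers and the constant UNKNOWN_ID are hypothetical names made up for illustration, the "(unknown)" fallback simply mirrors the BioPerl behaviour described above, and stripping the directory and extension from the filename is my own assumption.

```python
import os

# Placeholder used when the file has no stored sequence ID.
# Assumed for illustration; it mirrors the BioPerl behaviour described above.
UNKNOWN_ID = "(unknown)"


def choose_identifiers(stored_id, filename):
    """Return (record_id, record_name) for a single trace file.

    stored_id -- the sequence ID read from inside the file (may be None or empty)
    filename  -- path of the trace file on disk
    """
    # Prefer the ID stored in the file; fall back to the placeholder.
    cleaned = stored_id.strip() if stored_id else ""
    record_id = cleaned if cleaned else UNKNOWN_ID

    # Use the bare filename (no directory, no extension) as the name.
    # Dropping the extension is an assumption, not something agreed above.
    record_name = os.path.splitext(os.path.basename(filename))[0]
    return record_id, record_name
```

The two values would then be assigned to SeqRecord.id and SeqRecord.name respectively.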

> As for concatenation, I don't think it's possible. The official spec from
> ABI does not mention anything about combining ABI records. Plus, the file
> structure itself does not allow multiple sequences to be stored.

OK good, I didn't think it was allowed but wanted to check.

> I'll look at test_SeqIO.py over the weekend. I think it'll have
> something to do with some ambiguous DNA bases stored in the ABI files.
> Regards,

Some of the alphabet stuff is a bit nasty - so please feel free to ask
or get me to help.

Peter

