[Biopython-dev] Parsing PAML supplementary output
Peter Cock
p.j.a.cock at googlemail.com
Tue Oct 11 06:13:03 EDT 2011
On Tue, Oct 11, 2011 at 11:01 AM, Brandon Invergo <b.invergo at gmail.com> wrote:
>
>> Some of those examples don't really look like PHYLIP anymore to me.
>>
>> If there is any simple change to allow the current parser to cope
>> with (but ignore) any extra meta data like this, that sounds sensible
>> (with unit tests of course - grin).
>
> Agreed, it can get quite messy, though look at the link I provided; even
> the PHYLIP-specific example that they give includes some supplementary
> info at the top, as well as a tree at the bottom:
>
> 4 40 W
> W 0101001111 0101110101 0101110011
> 1101010110
> dmras1 GTCGTCGTTG GACCTGGAGG CGTGGGCAAG
>
> spras GTAGTTGTAG GAGATGGTGG TGTTGGTAAA
> scras1 GTAGTTGTCG GTGGAGGTGG CGTTGGTAAA
> scras2 GTCGTCGTTG GTGGTGGTGG TGTTGGTAAA
> TCCGCGCTCA
> AGTGCTTTGA
> TCTGCTTTAA
> TCTGCTTTGA
> 1
> ((dmras1,ddrasa),((hschras,spras),(scras1,scras2)));
>
I would consider that to be a meta file containing a PHYLIP
alignment and a tree, but in itself it isn't a PHYLIP alignment.
That looks like exactly the kind of issue NEXUS was designed
to solve: how to embed alignments, trees and other stuff into
a single plain text file for input into a phylogenetic tool.
Doesn't PHYLIP have an XML format these days? Trying
to parse something like that text (without a formal standard)
seems like a painful exercise and a long-term maintenance
headache.
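To illustrate the NEXUS point, here is a rough sketch of how the same
kind of content could be kept in clearly delimited blocks. The taxon
names and the first ten bases are taken from the quoted example; the
tree, the NEXUS_TEXT string and the split_nexus_blocks helper are made
up for illustration and are not part of Biopython. It only uses the
standard library and ignores NEXUS comments and quoted tokens, so
treat it as a toy rather than a parser.

```python
import re

# Hypothetical NEXUS file: sequences are truncated to ten bases and the
# tree is invented to cover only the four taxa shown here.
NEXUS_TEXT = """#NEXUS
BEGIN DATA;
  DIMENSIONS NTAX=4 NCHAR=10;
  FORMAT DATATYPE=DNA;
  MATRIX
    dmras1 GTCGTCGTTG
    spras  GTAGTTGTAG
    scras1 GTAGTTGTCG
    scras2 GTCGTCGTTG
  ;
END;
BEGIN TREES;
  TREE example = ((dmras1,spras),(scras1,scras2));
END;
"""

# Matches "BEGIN <name>; ... END;" and captures the name and the body.
BLOCK_RE = re.compile(r"BEGIN\s+(\w+)\s*;(.*?)END\s*;",
                      re.IGNORECASE | re.DOTALL)


def split_nexus_blocks(text):
    """Return {upper-case block name: block body} for each block found."""
    return {name.upper(): body.strip()
            for name, body in BLOCK_RE.findall(text)}


blocks = split_nexus_blocks(NEXUS_TEXT)
for name, body in blocks.items():
    # Show each block name with the first line of its body.
    print(name, "->", body.splitlines()[0])
```

Because each block is explicitly named, a tool can pick out the
alignment and skip (or separately parse) the tree without guessing
where one ends and the other begins. For real files Biopython's
Bio.Nexus module would be the place to start.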
Peter