[Biopython-dev] Parsing PAML supplementary output

Peter Cock p.j.a.cock at googlemail.com
Mon Oct 10 12:21:52 UTC 2011


On Mon, Oct 10, 2011 at 11:36 AM, Brandon Invergo <b.invergo at gmail.com> wrote:
> Hi all,
> I've received a request to implement the parsing of the main
> supplementary output files of the PAML programs ('rst' files). I can't
> submit a bug on Bugzilla, so I'll just announce my intention to work on
> this here on the list.

That's because we moved to Redmine. There should have
been a link on the old Bugzilla page, but anyway it's here:
https://redmine.open-bio.org/projects/biopython

> One question though. The rst file for baseml includes an alignment which
> is in the Phylip sequential format. I thought that it would be nice to
> parse that directly into a Biopython MultipleSeqAlignment. It's my
> understanding that Biopython only supports the interleaved format. Would
> it be worth it for me to extend that functionality to include the
> sequential format or would it be preferable to convert the alignments to
> be interleaved within the parser itself?
>
> Regards,
> Brandon Invergo

If you can extend the current PHYLIP parser (strict or relaxed)
to cover interleaved and sequential, that would be nice. For
strict mode at least, we can in principle follow whatever the
original PHYLIP tools do to detect this automatically. It may
be safer to make the format explicit though: from what I recall,
without seeing the PHYLIP implementation's source code it was
not obvious how to detect this reliably.
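
To illustrate the explicit approach, here is a minimal sketch of a
reader that is told up front that the input is sequential, so it never
has to guess. It assumes strict PHYLIP (names in the first 10 columns)
and does not use any existing Biopython code; the function name
parse_sequential_phylip and its name_width parameter are hypothetical.

```python
def parse_sequential_phylip(text, name_width=10):
    """Parse strict *sequential* PHYLIP text into (name, sequence) pairs.

    Hypothetical helper for illustration only. The caller states that the
    layout is sequential; nothing here tries to detect the layout.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty input")

    # Header line: number of sequences, then number of alignment columns.
    n_seqs, n_cols = (int(field) for field in lines[0].split()[:2])

    records = []
    pos = 1
    for _ in range(n_seqs):
        if pos >= len(lines):
            raise ValueError("fewer sequences than the header declares")
        # Strict PHYLIP: the name occupies the first name_width characters.
        name = lines[pos][:name_width].strip()
        seq = "".join(lines[pos][name_width:].split())
        pos += 1
        # In the sequential layout, a record's continuation lines follow
        # it directly, so keep reading until the declared length is met.
        while len(seq) < n_cols and pos < len(lines):
            seq += "".join(lines[pos].split())
            pos += 1
        if len(seq) != n_cols:
            raise ValueError(
                "sequence %r has %d columns, expected %d"
                % (name, len(seq), n_cols)
            )
        records.append((name, seq))
    return records


example = (
    "2 12\n"
    "seq_one   ACGTAC\n"
    "GTACGT\n"
    "seq_two   TTTTGG\n"
    "GGCCCC\n"
)
records = parse_sequential_phylip(example)
```

Note that an interleaved reader given the same text would take the
third line as the start of the second record, which is the kind of
ambiguity that makes automatic detection unreliable.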

Peter


