[Biopython] Codeml parser in Biopython?

Peter biopython at maubp.freeserve.co.uk
Tue Sep 14 05:04:56 EDT 2010


Hi Anastasia,

On Tue, Sep 14, 2010 at 9:02 AM, natassa <natassa_g_2000 at yahoo.com> wrote:
> Hi Peter,
>
>>
>> Could you post a short example of the kind of output you are looking at?
>>
>
> Here is an example output, but this can differ depending on the model used
> (there are several models for Branch, Site, BranchSite, but all are pretty
> standard)
>

Thanks - that looks possible to parse, but not very easy (especially if the
codeml output changes slightly between versions).

>>
>> Can you get codeml to output what you need in another format, such as NEXUS?
>>
>
> Haven't tried that, but as you can see, this is a very verbose output and
> NEXUS does not seem an option.

At first glance, the NEXUS format could hold a lot of that information.
Another possibility might be phyloXML. However, you are at the mercy
of the codeml tool and what it supports. It might be worth politely asking
the author(s) about supporting one of these more standard formats as
an optional output.

> Ultimately, I want to parse this to get all the information I need in a
> tabulated file. I am still working out what exactly I need (there are standard
> values to get out, such as lnL, branch length, dN/dS, but it also depends on the
> type of downstream analysis). I will now work on the pypaml class and modify the
> original code to make it more generic (it seems that it only works for Site
> Models).

Note that Ziheng Yang's pypaml code is licensed under the GPL v3, so
unless he agrees to re-license it we cannot include it in Biopython.
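
For the tabulated file you describe, a small regex-based script written
from scratch may be enough to start with. The following is only a minimal
sketch, not a general codeml parser: it assumes the log-likelihood line
looks like "lnL(ntime: 11  np: 13):  -1234.567890  +0.000000", which may
differ between codeml versions and models. The names LNL_RE, extract_lnl
and to_tsv, and the sample text, are made up for illustration.

```python
import csv
import io
import re

# Assumed shape of the log-likelihood line in the codeml output file:
#   lnL(ntime: 11  np: 13):  -1234.567890      +0.000000
# This pattern is an assumption and may need adjusting for your version.
LNL_RE = re.compile(r"lnL\(ntime:\s*(\d+)\s+np:\s*(\d+)\):\s+(-?\d+\.\d+)")


def extract_lnl(text):
    """Return one dict per lnL line found in the given output text."""
    rows = []
    for match in LNL_RE.finditer(text):
        rows.append({
            "ntime": int(match.group(1)),
            "np": int(match.group(2)),
            "lnL": float(match.group(3)),
        })
    return rows


def to_tsv(rows):
    """Format the extracted rows as tab-separated text with a header."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=["ntime", "np", "lnL"],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample text, for illustration only (not real codeml output).
sample = (
    "TREE #  1:  (1, 2, 3);\n"
    "lnL(ntime: 11  np: 13):  -1234.567890      +0.000000\n"
)

rows = extract_lnl(sample)
table = to_tsv(rows)
print(table)
```

Branch lengths and dN/dS would need their own patterns, and those
sections vary more between the Branch, Site and BranchSite models.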

> Will let you know, was just wondering if there was already a solution. There is
> one in BioPerl, but I heard it is very slow and in any case, I don't understand
> much of Perl....

I don't know much Perl either ;)

Peter

