[Biopython] Nexus parsing

Mon Feb 9 14:54:24 UTC 2015

On Mon, Feb 9, 2015 at 1:37 PM, Tiago Antao <tra at popgen.net> wrote:
> Hi,
>
> I am trying to parse a (heavily annotated) Nexus file with Bio.Phylo.
> The file is from a paper in Science
> http://www.sciencemag.org/content/345/6202/1369/suppl/DC1 available here
> http://www.sciencemag.org/content/suppl/2014/08/27/science.1259657.DC1/1259657_file_s2.zip
> and called
> trees/ebola.raxml.tree
>
> I am able to parse this with DendroPy just fine, but not with Bio.Phylo.
>
> The error that I get is:
>
> hdl = Phylo.read('trees/ebola.raxml.tree', 'nexus')
>
> /home/tra/Dropbox/soft/biopython/Bio/Nexus/Trees.pyc in _get_values(self, text)
>     161             if nc_end == -1:
>     162                 raise TreeError('Error in tree description: Found %s without matching %s'
> --> 163                                 % (NODECOMMENT_START, NODECOMMENT_END))
>     164             nodecomment = text[nc_start:nc_end + 1]
>     165             text = text[:nc_start] + text[nc_end + 1:]
>
> TreeError: Error in tree description: Found [& without matching ]
>
>
> Any ideas would be most appreciated, thanks.
> Tiago

That sounds like a nice reproducible test case. Can you find the
mismatched tags in the raw data? My guess, without checking, is
that this is due to unexpected line wrapping.
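
For hunting down the mismatch in the raw data, something along these
lines might help. It is only a rough sketch, not what Bio.Nexus does
internally: the function name find_unmatched_comments is made up here,
the "[&" and "]" markers are taken from the error message above, and
nested brackets are not handled. The small example at the end shows how
a comment wrapped over two lines looks fine when the whole text is
scanned but looks broken when each line is scanned separately.

```python
# Markers taken from the TreeError message; everything else is illustrative.
NODECOMMENT_START = "[&"
NODECOMMENT_END = "]"


def find_unmatched_comments(text):
    """Return 1-based (line, column) positions of each '[&' with no later ']'.

    Hypothetical helper for inspecting a tree file by hand; it does not
    understand nested brackets or quoted labels.
    """
    unmatched = []
    pos = 0
    while True:
        start = text.find(NODECOMMENT_START, pos)
        if start == -1:
            break
        end = text.find(NODECOMMENT_END, start)
        if end == -1:
            # No closing bracket anywhere after this opening marker.
            line = text.count("\n", 0, start) + 1
            col = start - (text.rfind("\n", 0, start) + 1) + 1
            unmatched.append((line, col))
            pos = start + len(NODECOMMENT_START)
        else:
            pos = end + 1
    return unmatched


# A made-up tree string whose node comment is wrapped across two lines.
wrapped = "tree t = (A[&rate=0.1,\nheight=2.0]:1.0,B:2.0);"

# Scanning the whole text finds a closing bracket for the comment.
whole = find_unmatched_comments(wrapped)

# Scanning line by line reports the opening marker on line 1 as unmatched.
per_line = [find_unmatched_comments(line) for line in wrapped.splitlines()]
```

To check the real file, read trees/ebola.raxml.tree into a string and
pass it to the function. An empty result for the whole file together
with hits for individual lines would point towards line wrapping.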

Maybe file this on GitHub?

Peter