[Biopython-dev] Fwd: [biopython] Newick parser (#156)
Peter Cock
p.j.a.cock at googlemail.com
Fri Feb 8 15:21:46 UTC 2013
Eric,
Could you take a look at this please?
Thanks,
Peter
---------- Forwarded message ----------
From: Ben Morris <notifications at github.com>
Date: Fri, Feb 8, 2013 at 3:12 PM
Subject: [biopython] Newick parser (#156)
To: biopython/biopython <biopython at noreply.github.com>
In light of three issues with the Newick parser:
https://redmine.open-bio.org/issues/3409
https://redmine.open-bio.org/issues/3386
https://redmine.open-bio.org/issues/3407
this is a rewrite of the parser from scratch. It supports quoted node
labels and can handle support values either as they were previously handled
or from square-bracketed comments, as requested by Arlin. Additionally,
it's consistently quite fast:
[image: newick_parse_times] <https://f.cloud.github.com/assets/544977/139616/fac0df38-71fe-11e2-91a8-a95ba7c6340b.png>
The unit tests still pass with these changes, and I'm now able to parse
trees that previously raised exceptions.
------------------------------
You can merge this Pull Request by running
git pull https://github.com/bendmorris/biopython newick
Or view, comment on, or merge it at:
https://github.com/biopython/biopython/pull/156
Commit Summary
- A more efficient implementation of a Newick parser (linear time vs.
  quadratic) that makes only a single pass over the text and handles quoted
  labels correctly.
- Implementing support values and fixing an issue when the external
  parentheses are missing.
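
To illustrate the single-pass idea in the summary above, here is a minimal sketch. It is not the code from this pull request: TOKEN_RE, new_node and parse_newick are hypothetical names, the nested-dict tree is a stand-in for Bio.Phylo's clade objects, and there is no error handling for malformed input. A regular expression splits the text into tokens (quoted labels, square-bracketed comments, structural characters, bare words), and a stack of open clades lets each token be processed once.

```python
import re

# Hypothetical token pattern (an assumption, not taken from the pull request).
TOKEN_RE = re.compile(
    r"'(?:[^']|'')*'"       # quoted label; a doubled '' is an escaped quote
    r"|\[[^\]]*\]"          # square-bracketed comment
    r"|[(),:;]"             # structural characters
    r"|[^\s()\[\],:;']+"    # unquoted label or number
)


def new_node():
    """Return an empty node; a plain dict stands in for a clade object."""
    return {"name": None, "length": None, "comment": None, "children": []}


def parse_newick(text):
    """Parse one Newick tree into nested dicts in a single pass."""
    root = current = new_node()
    stack = []              # ancestors of `current`, outermost first
    expect_length = False   # True right after a ':' token
    for match in TOKEN_RE.finditer(text):
        token = match.group()
        if token == "(":
            # Open a new clade: descend into its first child.
            child = new_node()
            current["children"].append(child)
            stack.append(current)
            current = child
        elif token == ",":
            if not stack:
                # External parentheses are missing: wrap what has been
                # read so far in a new root.
                root = new_node()
                root["children"].append(current)
                stack.append(root)
            # Start the next sibling under the same parent.
            current = new_node()
            stack[-1]["children"].append(current)
        elif token == ")":
            # Close the clade; labels that follow belong to the parent.
            current = stack.pop()
        elif token == ":":
            expect_length = True
        elif token == ";":
            break
        elif token.startswith("["):
            # Keep the comment text; the caller decides whether it
            # holds a support value.
            current["comment"] = token[1:-1]
        elif expect_length:
            current["length"] = float(token)
            expect_length = False
        else:
            if token.startswith("'"):
                # Strip the outer quotes and unescape doubled quotes.
                token = token[1:-1].replace("''", "'")
            current["name"] = token
    return root


tree = parse_newick("(('it''s a leaf':1.0,B:2.0)95:0.5,C)[&support=0.8];")
inner = tree["children"][0]
print(inner["name"], inner["length"], tree["comment"])
```

In this sketch an internal label such as 95 and a bracketed comment are both stored as plain strings, so either one could be interpreted as a support value by the caller. Because every token is visited once and each stack operation takes constant time, the running time grows linearly with the length of the input.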
File Changes
- *M* Bio/Phylo/NewickIO.py <https://github.com/biopython/biopython/pull/156/files#diff-0> (198)
Patch Links:
- https://github.com/biopython/biopython/pull/156.patch
- https://github.com/biopython/biopython/pull/156.diff