[Biopython-dev] Fwd: [Open-bio-l] Proposed BLAST XML Changes
Wibowo Arindrarto
w.arindrarto at gmail.com
Tue Mar 18 09:52:29 UTC 2014
Hi Peter, everyone,
Thanks for the heads up. If implemented as proposed, the updates will
change our underlying SearchIO model (as well as the blast-xml parser
itself), by allowing Hit retrieval using multiple different keys.
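To make the multiple-key retrieval concrete, here is a minimal sketch of a container in which one hit can be looked up by its primary ID or by any alternative ID. The class name MultiKeyHits and its methods are hypothetical, for illustration only; this is not the current SearchIO API.

```python
class MultiKeyHits:
    """Hypothetical container: a hit is reachable by its primary ID or any alias."""

    def __init__(self):
        self._hits = {}     # primary ID -> hit object
        self._aliases = {}  # alternative ID -> primary ID

    def add(self, primary_id, hit, alt_ids=()):
        # Refuse any key that is already used, so every lookup stays unambiguous.
        for key in (primary_id, *alt_ids):
            if key in self._hits or key in self._aliases:
                raise ValueError("key %r is already in use" % (key,))
        self._hits[primary_id] = hit
        for alt in alt_ids:
            self._aliases[alt] = primary_id

    def __getitem__(self, key):
        # Resolve an alias to its primary ID, then fetch the hit.
        return self._hits[self._aliases.get(key, key)]

    def __contains__(self, key):
        return key in self._hits or key in self._aliases

    def __len__(self):
        # Count hits, not keys.
        return len(self._hits)
```

The point of the sketch is that the number of keys and the number of hits diverge, which is what our model does not currently account for.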
I have a feeling it will be difficult to jam all the new changes into
a backwards-compatible parser. One way to make it transparent to users
is to use the underlying DTD to do validation before parsing (for the
two BLAST DTDs, use the one which the file can be validated against).
However, this comes at a price. Since the elementtree bundled with the
standard library doesn't seem to support validation, we would have to use
another library (lxml is my choice). This means adding a third-party
dependency which requires compiling (lxml is partly written in C).
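A rough sketch of the validate-then-pick idea with lxml follows. The two DTD strings are tiny stand-ins written for this example, not the real NCBI BLAST DTDs, and the function name pick_format is hypothetical.

```python
import io

from lxml import etree  # third-party dependency, not in the standard library

# Stand-in DTDs for illustration only; NOT the real NCBI BLAST DTDs.
OLD_DTD = "<!ELEMENT BlastOutput (#PCDATA)>"
NEW_DTD = "<!ELEMENT BlastOutput2 (#PCDATA)>"


def pick_format(xml_bytes, dtds):
    """Return the name of the first DTD that the document validates against."""
    root = etree.fromstring(xml_bytes)
    for name, dtd_text in dtds:
        dtd = etree.DTD(io.StringIO(dtd_text))
        if dtd.validate(root):
            return name
    raise ValueError("document does not validate against any known DTD")
```

Note that the whole file has to be read and validated before any parsing starts, which is an extra cost on top of the new dependency.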
The other option is to introduce a new format name (e.g.
'blast-xml2'), which makes the user responsible for knowing which
BLAST XML he/she is parsing. It feels more explicit this way, so I am
leaning towards this option, despite 'blast-xml2' not sounding very
nice to me ;).
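For comparison, the explicit-name option could look roughly like the sketch below, using only the standard library. The registry, the parse function, and the two placeholder parsers are hypothetical and only report the root tag; they are not the actual SearchIO internals.

```python
import io
import xml.etree.ElementTree as ET


def _parse_old(handle):
    # Placeholder for the existing parser: just report the root tag.
    return ("blast-xml", ET.parse(handle).getroot().tag)


def _parse_new(handle):
    # Placeholder for a parser of the proposed format.
    return ("blast-xml2", ET.parse(handle).getroot().tag)


# The user-supplied format name selects the parser; nothing is guessed.
_PARSERS = {"blast-xml": _parse_old, "blast-xml2": _parse_new}


def parse(handle, fmt):
    try:
        parser = _PARSERS[fmt]
    except KeyError:
        raise ValueError("unknown format %r" % (fmt,)) from None
    return parser(handle)


result = parse(io.BytesIO(b"<BlastOutput2/>"), "blast-xml2")
```

No validation step and no extra dependency are needed here, at the cost of the user having to name the right format.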
Any other thoughts?
Best,
Bow
On Mon, Mar 17, 2014 at 7:35 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Hi all,
>
> Bow (regarding SearchIO) and others should probably read this...
>
> I've commented, see also:
> http://blastedbio.blogspot.co.uk/2014/02/blast-xml-output-needs-more-love-from.html
>
> Peter
>
>
> ---------- Forwarded message ----------
> From: Maloney, Christopher (NIH/NLM/NCBI) [C] <maloneyc at ncbi.nlm.nih.gov>
> Date: Mon, Mar 17, 2014 at 5:17 PM
> Subject: [Open-bio-l] Proposed BLAST XML Changes
> To: "open-bio-l at lists.open-bio.org" <open-bio-l at lists.open-bio.org>
>
>
> We are not directly soliciting comments, but if anyone would like to
> make any technical or programmatic suggestions, there is a link from
> which anyone may comment in the document.
>
> ftp://ftp.ncbi.nlm.nih.gov/blast/documents/NEWXML/ProposedBLASTXMLChanges.pdf
>
> Thank you.
>
>
> P.S. Please re-post this to other lists that might have interested readers.
>
> Chris Maloney
> NIH/NLM/NCBI (Contractor)
> Building 45, 5AN.24D-22
> 301-594-2842
>
>
> _______________________________________________
> Open-Bio-l mailing list
> Open-Bio-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/open-bio-l
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev