[Biojava-l] SAX parser demo
jinchen at ufl.edu
Sun Jun 29 15:00:14 EDT 2003
Sorry to bother you with this topic again. I have included one sample in my zip
file. My own simple XML parser works on this sample XML file. However, I get the
following error when I try to use your parser:
staxenv org.biojava.bio.program.sax.blastxml.BlastXMLParser at 1cd2e5f
org.xml.sax.SAXParseException: The markup declarations contained or pointed to by the document type declaration must be well-formed.
    at org.apache.xerces.framework.XMLParser.reportError(XMLParser.java:1060)
    at org.apache.xerces.framework.XMLDTDScanner.reportFatalXMLError(XMLDTDScanner.java:651)
    at org.apache.xerces.framework.XMLDTDScanner.scanDecls(XMLDTDScanner.java:1523)
    at org.apache.xerces.framework.XMLDocumentScanner.scanDoctypeDecl(XMLDocumentScanner.java:2199)
    at org.apache.xerces.framework.XMLDocumentScanner.access$0(XMLDocumentScanner.java:2152)
    at org.apache.xerces.framework.XMLDocumentScanner$PrologDispatcher.dispatch(XMLDocumentScanner.java:883)
    at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:381)
    at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:952)
    at org.biojava.bio.program.sax.blastxml.BlastXMLParserFacade.parse(BlastXMLParserFacade.java:167)
    at BlastParser3.main(BlastParser3.java:47)
Could you let me know why this happens?
Thanks,
Jin
Quoting David Huen <david.huen at ntlworld.com>:
> Hi,
> OK, I have uploaded a demo to CVS. It is at biojava-live/demos/blastxml.
> It's just a plain ripoff of Mark Schreiber's demo in Biojava In Anger
> ported to use the BlastXML parser. You will need to do a "cvs update -d"
> to create the new directories for the demos and for the DTD directory.
>
> I have added a facade to the BlastXML parsing framework. The facade is
> called BlastXMLParserFacade and is used identically to the way the existing
> BlastLikeSAXParser is used with blast text output. I think this will make
> it easier for users all round: that both have the same interface. You can
> look in that class to see how the BJ parsing framework is actually set up.
>
> I won't have more time available to work on this for a bit but bug reports
> are welcome for eventual fixes. As previously mentioned, running multiple
> sequence queries on a database with NCBI blast concatenates all the Blast
> XML outputs, resulting in an almighty, completely non-XML-compliant file
> (multiple <?xml ...?> and <!DOCTYPE ...> declarations, for example).
> Parsing those requires a hack I have previously described, but it is ugly,
> ugly, ugly. Maybe the latest NCBI version has fixed this problem,
> but I haven't looked.
>
> Best wishes,
> David Huen
> P.S. It is really really bedtime, guys.....
> P.P.S. There is an ugly entity resolver hack I will need to clean up later
> too.
>
>
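
For readers hitting the same two problems, here is a minimal sketch in plain Java (standard SAX classes only, no BioJava). It is not the hack David describes and not the entity resolver in BlastXMLParserFacade; the class ConcatenatedBlastXml and its methods are hypothetical names made up for illustration. It assumes that each concatenated report begins with its own "<?xml" declaration, and that the DTD content is not needed for a non-validating parse, so an empty DTD is substituted instead of fetching the one named in the DOCTYPE. An unreadable or malformed DTD is one plausible cause of the "markup declarations ... must be well-formed" error above, since the trace shows it being raised while the DOCTYPE is scanned.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; not part of BioJava.
final class ConcatenatedBlastXml {

    /**
     * Splits concatenated output into one string per XML document.
     * Assumption: every document starts with an "<?xml" declaration and
     * that text never occurs elsewhere (for example as "<?xml-stylesheet").
     */
    static List<String> split(String text) {
        List<String> docs = new ArrayList<>();
        int start = text.indexOf("<?xml");
        while (start >= 0) {
            int next = text.indexOf("<?xml", start + 5);
            String doc = (next < 0) ? text.substring(start)
                                    : text.substring(start, next);
            docs.add(doc.trim());
            start = next;
        }
        return docs;
    }

    /** Parses one document with SAX and counts start tags named 'name'. */
    static int countElements(String doc, final String name) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Assumption: the DTD is not needed, so every external entity
        // (including the DTD named in the DOCTYPE) resolves to empty text.
        reader.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader("")));
        final int[] count = {0};
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        });
        reader.parse(new InputSource(new StringReader(doc)));
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String one = "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE BlastOutput SYSTEM \"NCBI_BlastOutput.dtd\">\n"
            + "<BlastOutput><Hit/><Hit/></BlastOutput>\n";
        // Simulate two reports written back to back into one file.
        for (String doc : split(one + one)) {
            System.out.println(countElements(doc, "Hit") + " hits");
        }
    }
}
```

Splitting on the declaration is deliberately crude; it keeps each piece a well-formed document so that any SAX-based parser can be run once per piece. Supplying a local copy of the real DTD from the resolver, rather than an empty one, would be the better choice if validation or DTD-defined entities matter.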