[Biojava-dev] biojava3-ws alternate NCBIQBlastService implementation
Gediminas Rimša
gediminas.rimsa at gmail.com
Sat Feb 11 22:25:10 UTC 2012
Hi,
the new implementation of NCBIQBlastService is now on biojava-live.
There is a simple usage example: see the demo.NCBIQBlastServiceDemo class.
Also, I couldn't find much about parsing Blast XML results in Java when
I needed it, so here's a short guide for that. It might not be pretty,
but it worked for me :)
Step 1. Acquire Blast output in XML format (for example from
NCBIQBlastService). It will start like this (note the root element
"BlastOutput"):
<?xml version="1.0"?>
<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN"
"NCBI_BlastOutput.dtd">
<BlastOutput>
...
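Before generating any classes, it can be worth confirming that a file really has "BlastOutput" as its root element. Below is a minimal sketch using only the JDK's SAX parser. The class and method names (BlastRootCheck, rootElementName) are made up for illustration and are not part of BioJava. It substitutes an empty DTD for every external entity, so the schema files from Step 2 are not needed for this check.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: reports the root element name of an XML document. */
public class BlastRootCheck {

    public static String rootElementName(Reader xml) throws Exception {
        final String[] root = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Substitute an empty DTD so nothing is read from disk or network.
                return new InputSource(new StringReader(""));
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (root[0] == null) {
                    root[0] = qName; // the first element seen is the root
                }
            }
        };
        // SAXParser.parse registers the handler as the entity resolver too.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(xml), handler);
        return root[0];
    }
}
```

For example, rootElementName(new FileReader("blast-results-file.xml")) should return "BlastOutput" for a file like the one above. The sketch parses the whole document, which is fine for a quick check but wasteful for very large result files.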
Step 2. Acquire the referenced schema files - you will need
NCBI_BlastOutput.dtd, NCBI_BlastOutput.mod.dtd and NCBI_Entity.mod.dtd
(they can be found on the NCBI site or attached to this message).
Step 3. Use XJC to generate Java classes from XML schema. I used
Maven's JAXB plugin:
<plugin>
  <groupId>org.jvnet.jaxb2.maven2</groupId>
  <artifactId>maven-jaxb2-plugin</artifactId>
  <version>0.8.0</version>
  <executions>
    <execution>
      <goals>
        <goal>generate</goal>
      </goals>
      <configuration>
        <!-- package name for generated classes -->
        <generatePackage>ncbi.blast.result.generated</generatePackage>
        <generateDirectory>${basedir}/src/main/java</generateDirectory>
        <schemaLanguage>dtd</schemaLanguage>
        <schemaIncludes>
          <!-- main schema file location
               (here: /src/main/resources/outputSchema/NCBI_BlastOutput.dtd) -->
          <value>outputSchema/NCBI_BlastOutput.dtd</value>
        </schemaIncludes>
      </configuration>
    </execution>
  </executions>
  <dependencies>
    <dependency>
      <groupId>org.jvnet.jaxb2-commons</groupId>
      <artifactId>property-listener-injector</artifactId>
      <version>1.0</version>
    </dependency>
  </dependencies>
</plugin>
Alternatively, you can do it from the command line; for an example, see:
http://plindenbaum.blogspot.com/2010/11/blastxmlannotations.html
Step 4. Put all 3 schema files next to the generated classes (this,
together with a custom EntityResolver in the next step, is done so that
you don't have to copy the schema files to every directory in which you
want to process blast output XML files).
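A quick way to confirm that the schema files really ended up next to the generated classes is to look them up as classpath resources. The sketch below does that. SchemaFileCheck and missingResources are hypothetical names used only for illustration. One assumption worth stating: with a default Maven layout, non-Java files under src/main/java are not copied to the classpath, so the DTDs would typically go under src/main/resources in the directory matching the generated package.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists resources that cannot be found next to a class. */
public class SchemaFileCheck {

    public static List<String> missingResources(Class<?> anchor, String... names) {
        List<String> missing = new ArrayList<>();
        for (String name : names) {
            // A relative name is resolved against the package of the anchor class.
            if (anchor.getResource(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

Calling SchemaFileCheck.missingResources(BlastOutput.class, "NCBI_BlastOutput.dtd", "NCBI_BlastOutput.mod.dtd", "NCBI_Entity.mod.dtd") should return an empty list when all three files are in place.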
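Step 5. Create a BlastOutput object representing the root XML element
(the code below belongs inside a method that returns BlastOutput):

    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;
    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.Unmarshaller;
    import javax.xml.transform.Source;
    import javax.xml.transform.sax.SAXSource;
    import org.xml.sax.EntityResolver;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.XMLReaderFactory;

    JAXBContext jc = JAXBContext.newInstance(BlastOutput.class);
    Unmarshaller u = jc.createUnmarshaller();
    XMLReader xmlreader = XMLReaderFactory.createXMLReader();
    xmlreader.setFeature("http://xml.org/sax/features/namespaces", true);
    xmlreader.setFeature("http://xml.org/sax/features/namespace-prefixes", true);
    // Load the DTDs from the classpath (next to BlastOutput.class)
    // instead of from the directory of the XML file being parsed.
    xmlreader.setEntityResolver(new EntityResolver() {
        public InputSource resolveEntity(String publicId, String systemId)
                throws SAXException, IOException {
            String file = null;
            if (systemId.contains("NCBI_BlastOutput.dtd")) {
                file = "NCBI_BlastOutput.dtd";
            } else if (systemId.contains("NCBI_Entity.mod.dtd")) {
                file = "NCBI_Entity.mod.dtd";
            } else if (systemId.contains("NCBI_BlastOutput.mod.dtd")) {
                file = "NCBI_BlastOutput.mod.dtd";
            }
            if (file == null) {
                // Unknown entity: returning null lets the parser use its
                // default resolution instead of failing on a null resource name.
                return null;
            }
            return new InputSource(BlastOutput.class.getResourceAsStream(file));
        }
    });
    InputSource input = new InputSource(
            new FileReader(new File("blast-results-file.xml")));
    Source source = new SAXSource(xmlreader, input);
    return (BlastOutput) u.unmarshal(source);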
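If the same resolver is needed in more than one place, it could be factored into a small named class. What follows is only a sketch of that idea. ClasspathDtdResolver and resourceNameFor are names invented here, and matching on the end of the system ID (rather than on a substring) is my own choice. For any entity it does not recognise it returns null, which asks the parser to fall back to its default resolution.

```java
import java.io.InputStream;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical resolver: serves the three NCBI DTD files from the classpath. */
public class ClasspathDtdResolver implements EntityResolver {

    private static final String[] KNOWN = {
        "NCBI_BlastOutput.dtd", "NCBI_BlastOutput.mod.dtd", "NCBI_Entity.mod.dtd"
    };

    private final Class<?> anchor; // class whose package holds the DTD files

    public ClasspathDtdResolver(Class<?> anchor) {
        this.anchor = anchor;
    }

    /** Maps a system ID to one of the known file names, or null if none matches. */
    public static String resourceNameFor(String systemId) {
        if (systemId == null) {
            return null;
        }
        for (String name : KNOWN) {
            if (systemId.endsWith(name)) {
                return name;
            }
        }
        return null;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String name = resourceNameFor(systemId);
        if (name == null) {
            return null; // not one of ours: use the parser's default behaviour
        }
        InputStream in = anchor.getResourceAsStream(name);
        return in == null ? null : new InputSource(in);
    }
}
```

With this in place, the anonymous class in Step 5 would become xmlreader.setEntityResolver(new ClasspathDtdResolver(BlastOutput.class)).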
Step 6. Use the created blastOutput like any other Java object. For
example, if you want to get the number of Blast iterations, you can do
it like this:
blastOutput.getBlastOutputIterations().getIteration().size()
And that's about it. Hope this helps someone.
Gediminas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.open-bio.org/pipermail/biojava-dev/attachments/20120212/80b196e2/attachment-0006.html>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.open-bio.org/pipermail/biojava-dev/attachments/20120212/80b196e2/attachment-0007.html>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.open-bio.org/pipermail/biojava-dev/attachments/20120212/80b196e2/attachment-0008.html>