[Biojava-dev] NullPointerException from BlastSAXParser.java
Sicotte, Hugues (NIH/NCI)
sicotteh at mail.nih.gov
Fri Oct 7 13:27:01 EDT 2005
I've been through this before, when I was working for NCBI.
The answer was that the text output of BLAST was never a supported format.
The only supported format is the XML Blast Output.
http://ccgb.umn.edu/~crow/projects/xmlblast/example.html
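
For anyone who wants to read the XML output without a BLAST-specific parser, a minimal sketch in plain Java (JAXP DOM) could look like the following. The class name BlastXmlHits and its method are made up for illustration and are not part of BioJava; the element names (Hit, Hit_id, Hit_def, Hsp_bit-score, Hsp_evalue) are the ones used by the NCBI BlastOutput XML format. Only the first HSP of each hit is reported. In later blastall releases the XML report is selected with -m 7; whether 2.0.11 supports that option is not verified here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper (not a BioJava class): one summary line per <Hit>. */
final class BlastXmlHits {

    static List<String> hitSummaries(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Xerces-specific feature, also honoured by the JDK's built-in parser.
            // Assumption: it is acceptable to skip fetching the DTD named in the DOCTYPE.
            factory.setFeature(
                "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            List<String> summaries = new ArrayList<>();
            NodeList hits = doc.getElementsByTagName("Hit");
            for (int i = 0; i < hits.getLength(); i++) {
                Element hit = (Element) hits.item(i);
                summaries.add(firstText(hit, "Hit_id") + " " + firstText(hit, "Hit_def")
                    + " bits=" + firstText(hit, "Hsp_bit-score")
                    + " e=" + firstText(hit, "Hsp_evalue"));
            }
            return summaries;
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers need no throws clause.
            throw new IllegalStateException("Could not parse BLAST XML", e);
        }
    }

    /** Text of the first descendant with the given tag, or "" if there is none. */
    private static String firstText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }
}
```

Because the values are looked up by element name, a report with no hits simply yields an empty list instead of failing on a missing marker line.
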
Also, when parsing a file that contains multiple BLAST reports,
breaking on "Searching..." is not a good idea:
if the parameters are wrong or the query sequence is
too low in complexity, the program does not emit this string.
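
As an illustration of splitting on something other than "Searching", here is a rough sketch that starts a new report at each program banner line (for example "BLASTP 2.0.11 [Jan-20-2000]"). BlastReportSplitter is a hypothetical name, not a BioJava class, and the sketch assumes that every report in the file begins with such a banner; any text before the first banner is dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper (not a BioJava class): splits concatenated text reports. */
final class BlastReportSplitter {

    // Banner line such as "BLASTP 2.0.11 [Jan-20-2000]" or "TBLASTN 2.2.3 [...]".
    // Assumption: each report starts with a line of this shape.
    private static final Pattern BANNER =
        Pattern.compile("T?BLAST[NPX] \\d+\\.\\d+\\.\\d+.*");

    static List<String> split(String concatenated) {
        List<String> reports = new ArrayList<>();
        StringBuilder current = null;
        for (String line : concatenated.split("\\r?\\n")) {
            if (BANNER.matcher(line).matches()) {
                // A banner closes the previous report (if any) and opens a new one.
                if (current != null) {
                    reports.add(current.toString());
                }
                current = new StringBuilder();
            }
            if (current != null) {
                current.append(line).append('\n');
            }
        }
        if (current != null) {
            reports.add(current.toString());
        }
        return reports;
    }
}
```

Each returned string can then be handed to a parser on its own, so a report that never prints "Searching..." does not get merged into its neighbour.
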
Hugues Sicotte
-----Original Message-----
From: W. Eric Trull [mailto:wetrull at yahoo.com]
Sent: Friday, October 07, 2005 12:05 PM
To: biojava-dev at biojava.org
Cc: mark.schreiber at novartis.com
Subject: Re: [Biojava-dev] NullPointerException from BlastSAXParser.java
Should I raise this as an issue with NCBI? It seems to make writing
parsing routines more difficult.
Thanks.
-Eric Trull
--- mark.schreiber at novartis.com wrote:
> Looks like there might be a difference in the Windows output. I will try
> to take a look at this over the next few days. Probably need to change the
> BlastSAXParser to look for something other than Searching so that this
> will get parsed as well.
>
> - Mark
>
>
>
>
>
> "W. Eric Trull" <wetrull at yahoo.com>
> 10/06/2005 11:01 PM
>
>
> To: biojava-dev at biojava.org
> cc: Mark Schreiber/GP/Novartis at PH
> Subject: Re: [Biojava-dev] NullPointerException from
> BlastSAXParser.java
>
>
> Hello Mark,
>
> Here is what I've done, using NCBI Blast 2.0.11, Windows XP, JDK 1.4.2
>
> 1. Downloaded the PDB's pdb_seqres.txt
> 2. Created a blast database (after changing the deflines):
> C:\blast-2.0.11\formatdb.exe
> -t "PDB"
> -i blast\pdb_seqres.txt
> -l blast\pdb_formatdb.log
> -o T
> -n blast\pdb
> 3. BLASTed 26SPS9_Hs:
> C:\blast-2.0.11\blastall.exe
> -p blastp
> -d blast\pdb
> -i 26SPS9_Hs.fasta
> -o 26SPS9_Hs.blast
> 4. Tried to parse 26SPS9_Hs.blast using the class shown in BioJava in Anger
> and BlastEcho, both of which give me the NullPointerException. The beginning
> of the 26SPS9_Hs.blast file is shown below; the entire file is attached.
>
> Please let me know if you see anything obviously wrong with the way I'm doing
> the BLAST. I'm going to cvs checkout the BioJava source code and have a look
> at the JUnit test later today.
>
> Thanks!
>
> -Eric Trull
>
> -------- 26SPS9_Hs.blast --------
> BLASTP 2.0.11 [Jan-20-2000]
>
>
> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
> "Gapped BLAST and PSI-BLAST: a new generation of protein database search
> programs", Nucleic Acids Res. 25:3389-3402.
>
> Query= 26SPS9_Hs
> (176 letters)
>
> Database: PDB
> 78,094 sequences; 17,596,117 total letters
>
>
>
>                                                    Score    E
> Sequences producing significant alignments:        (bits) Value
>
> pdb|1UFM|A Cop9 Complex Subunit 4                     39  0.003
> .
> .
> .
> -------- 26SPS9_Hs.blast --------
>
>
> --- mark.schreiber at novartis.com wrote:
>
> > Hello -
> >
> > This is very odd.
> >
> > The JUnit tests currently pass using the files in
> > /tests/files/org/biojava/bio/programs/ssbind These BLAST files all have
> > the string "Searching....". Maybe there is a variation in the windows
> > output?
> >
> > Can you post at least the header of your output to the list (preferably an
> > entire example output)?
> >
> > - Mark
> >
> >
> >
> >
> >
> > "W. Eric Trull" <wetrull at yahoo.com>
> > Sent by: biojava-dev-bounces at portal.open-bio.org
> > 10/06/2005 06:11 AM
> >
> >
> > To: biojava-dev at biojava.org
> > cc: (bcc: Mark Schreiber/GP/Novartis)
> > Subject: [Biojava-dev] NullPointerException from
> > BlastSAXParser.java
> >
> >
> > Hello all,
> >
> > I'm new to the list, but have done as much archive searching, Google
> > searching, and debugging as I can on the problem I describe here.
> >
> > I'm trying to parse NCBI BLAST output (as shown in BioJava in Anger), but
> > keep getting a NullPointerException. One of my searches turned up using
> > BlastEcho to debug the problem, but that also throws the
> > NullPointerException:
> >
> > startSearch
> > SearchProp: program: ncbi-blastp
> > SearchProp: version: 2.0.11
> > java.lang.NullPointerException
> >     at org.biojava.bio.program.sax.BlastSAXParser.interpret(BlastSAXParser.java:215)
> >     at org.biojava.bio.program.sax.BlastSAXParser.parse(BlastSAXParser.java:164)
> >     at org.biojava.bio.program.sax.BlastLikeSAXParser.onNewDataSet(BlastLikeSAXParser.java:311)
> >     at org.biojava.bio.program.sax.BlastLikeSAXParser.interpret(BlastLikeSAXParser.java:274)
> >     at org.biojava.bio.program.sax.BlastLikeSAXParser.parse(BlastLikeSAXParser.java:160)
> >     at com.pfizer.search.sequence.BlastEcho.echo(BlastEcho.java:42)
> >     at com.pfizer.search.sequence.BlastEcho.main(BlastEcho.java:88)
> > Exception in thread "main"
> >
> > Stepping through the code in a debugger shows that the while loop added in
> > revision 1.13 of
> > /biojava-live/src/org/biojava/bio/program/sax/BlastSAXParser.java (fixed
> > truncation of database id) reads all the lines without ever matching the
> > "Searching" string. At first I thought it was because I was using a later
> > version of BLAST, but then I tried 2.0.11 and 2.2.3 (supported version), and
> > they also result in a NullPointerException. In the BLAST output for the
> > various versions I never see a "Searching" string anywhere. I've tried all
> > the -m options as well, without success.
> >
> > Is there an NCBI BLAST option that I need to be using? I'm running on Windows
> > XP (during development) - is the UNIX version output different?
> >
> > Thanks.
> >
> > -Eric Trull
> >
> >
> > _______________________________________________
> > biojava-dev mailing list
> > biojava-dev at biojava.org
> > http://biojava.org/mailman/listinfo/biojava-dev
> >
> >
> >
> >