[Biojava-l] Parsing Genbank-sequences from NCBI
Seth Johnson
johnson.biotech at gmail.com
Sat Aug 12 17:17:57 UTC 2006
More problems with parsing nucleotide sequences from NCBI. Apparently,
there's an odd dbxref tag on some of the sequences submitted by ATCC that
causes an exception. I've run into two so far, but I'm sure there are more:
AA343569.1
AA325485.1
Exceptions produced are as follows:
--------------------------------------------------------------
Trying to get: AA343569.1
org.biojava.bio.BioException: Failed to read Genbank sequence
    at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:157)
    at exonhit.parsers.EventParser.getSeqFromNCBI(EventParser.java:250)
    at exonhit.parsers.EventParser.insertRglrSE(EventParser.java:197)
    at exonhit.parsers.EventParser.createSpliceEvents(EventParser.java:105)
    at exonhit.parsers.EventParser.main(EventParser.java:310)
Caused by: org.biojava.bio.BioException: Could not read sequence
    at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:112)
    at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:153)
    ... 4 more
Caused by: org.biojava.bio.seq.io.ParseException: Bad dbxref found: ATCC (inhost):145151, accession:AA343569
    at org.biojavax.bio.seq.io.GenbankFormat.readRichSequence(GenbankFormat.java:438)
    at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:109)
    ... 5 more
Java Result: -1
=========================================================
Trying to get: AA325485.1
org.biojava.bio.BioException: Failed to read Genbank sequence
    at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:157)
    at exonhit.parsers.EventParser.getSeqFromNCBI(EventParser.java:250)
    at exonhit.parsers.EventParser.insertRglrSE(EventParser.java:197)
    at exonhit.parsers.EventParser.createSpliceEvents(EventParser.java:105)
    at exonhit.parsers.EventParser.main(EventParser.java:312)
Caused by: org.biojava.bio.BioException: Could not read sequence
    at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:112)
    at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:153)
    ... 4 more
Caused by: org.biojava.bio.seq.io.ParseException: Bad dbxref found: ATCC (inhost):125990, accession:AA325485
    at org.biojavax.bio.seq.io.GenbankFormat.readRichSequence(GenbankFormat.java:438)
    at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:109)
    ... 5 more
Java Result: -1
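
In both traces the useful message ("Bad dbxref found: ...") is two levels down, in the innermost "Caused by". Until the parser itself handles this tag, a rough sketch like the one below can at least let a batch run continue and record which accessions fail and why. It uses only the standard library; the class name RootCauseDemo, the helper names, and the stubbed fetch() are made up for illustration. In real code fetch() would be replaced by the actual call that retrieves the sequence, and the caught type would likely be BioException.

```java
import java.util.ArrayList;
import java.util.List;

final class RootCauseDemo {

    /** Follows getCause() links down to the innermost exception. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    /**
     * Hypothetical stand-in for the real NCBI fetch. It always fails with
     * three nested exceptions, mimicking the shape of the traces above.
     * The innermost message is simulated, not real parser output.
     */
    static String fetch(String accession) throws Exception {
        Exception parse = new Exception("simulated parse failure for " + accession);
        Exception read = new Exception("Could not read sequence", parse);
        throw new Exception("Failed to read Genbank sequence", read);
    }

    /** Tries every accession and records the innermost message of each failure. */
    static List<String> collectFailures(String[] accessions) {
        List<String> failures = new ArrayList<>();
        for (String accession : accessions) {
            try {
                fetch(accession);
            } catch (Exception e) {
                // Log and skip instead of aborting the whole run.
                failures.add(accession + ": " + rootCause(e).getMessage());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        String[] accessions = {"AA343569.1", "AA325485.1"};
        for (String line : collectFailures(accessions)) {
            System.out.println(line);
        }
    }
}
```

This only works around the problem on the caller's side; the affected records are still skipped rather than parsed.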
--
View this message in context: http://www.nabble.com/Parsing-Genbank-sequences-from-NCBI-tf2052235.html#a5777810
Sent from the BioJava forum at Nabble.com.