[Biojava-l] GenBank XML File Parse Error
Toralf Kirsten
tkirsten at izbi.uni-leipzig.de
Fri Jan 23 10:35:47 EST 2004
Hi,
I have to extract data from GenBank XML files and am using the BioJava
API for this purpose, but I get a parser error:
java.lang.StringIndexOutOfBoundsException: String index out of range: 12
at java.lang.String.substring(String.java:1477)
at org.biojava.bio.seq.io.GenbankContext.processHeaderLine(GenbankContext.java:621)
at org.biojava.bio.seq.io.GenbankContext.processLine(GenbankContext.java:263)
at org.biojava.bio.seq.io.GenbankFormat.readSequence(GenbankFormat.java:144)
at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:100)
rethrown as org.biojava.bio.BioException: Could not read sequence
at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:103)
at de.izbi.gbm.logistics.GenBankBioJavaImporter.readFile(GenBankBioJavaImporter.java:41)
at de.izbi.gbm.gui.GenBankBaseFrame.actionPerformed(GenBankBaseFrame.java:134)
at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1764)
at javax.swing.AbstractButton$ForwardActionEvents.actionPerformed(AbstractButton.java:1817)
at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:419)
at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:257)
at javax.swing.AbstractButton.doClick(AbstractButton.java:289)
at javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:1109)
at javax.swing.plaf.basic.BasicMenuItemUI$MouseInputHandler.mouseReleased(BasicMenuItemUI.java:943)
at java.awt.Component.processMouseEvent(Component.java:5093)
at java.awt.Component.processEvent(Component.java:4890)
at java.awt.Container.processEvent(Container.java:1566)
at java.awt.Component.dispatchEventImpl(Component.java:3598)
at java.awt.Container.dispatchEventImpl(Container.java:1623)
at java.awt.Component.dispatchEvent(Component.java:3439)
at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:3450)
at java.awt.LightweightDispatcher.processMouseEvent(Container.java:3165)
at java.awt.LightweightDispatcher.dispatchEvent(Container.java:3095)
at java.awt.Container.dispatchEventImpl(Container.java:1609)
at java.awt.Window.dispatchEventImpl(Window.java:1585)
at java.awt.Component.dispatchEvent(Component.java:3439)
at java.awt.EventQueue.dispatchEvent(EventQueue.java:450)
at java.awt.EventDispatchThread.pumpOneEventForHierarchy(EventDispatchThread.java:197)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:144)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:136)
at java.awt.EventDispatchThread.run(EventDispatchThread.java:99)
The program is simple. The user chooses the path and file name with the
FileChooser component. I then open the file and use the Sequence and
Annotation classes, as shown in the method attached below, which is taken
from an extended file class.
What I need are the sequence data of the GenBank entry (accession,
sequence, etc.) and also the data for its features (start and end
position, subtype such as tRNA, CDS, etc.).
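To make the target data concrete, here is a minimal sketch that pulls the accession, the sequence, and the feature key with start and end positions out of an XML record using only the JDK's DOM parser, not BioJava. It assumes the files follow NCBI's GBSeq XML layout (element names such as GBSeq_primary-accession, GBFeature_key, GBInterval_from); that is an assumption about the input, since this message does not show the files. The class name GbSeqXmlSketch, its helpers, and the sample record (accession XX000001 included) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class GbSeqXmlSketch {

    /** One feature interval: feature key (e.g. CDS, tRNA) plus start and end. */
    public static final class FeatureSpan {
        public final String key;
        public final int from;
        public final int to;

        FeatureSpan(String key, int from, int to) {
            this.key = key;
            this.from = from;
            this.to = to;
        }
    }

    /** Invented sample record; element names assume the GBSeq XML layout. */
    public static final String SAMPLE =
        "<GBSet><GBSeq>"
        + "<GBSeq_primary-accession>XX000001</GBSeq_primary-accession>"
        + "<GBSeq_feature-table>"
        + "<GBFeature><GBFeature_key>CDS</GBFeature_key><GBFeature_intervals>"
        + "<GBInterval><GBInterval_from>1</GBInterval_from>"
        + "<GBInterval_to>9</GBInterval_to></GBInterval>"
        + "</GBFeature_intervals></GBFeature>"
        + "<GBFeature><GBFeature_key>tRNA</GBFeature_key><GBFeature_intervals>"
        + "<GBInterval><GBInterval_from>10</GBInterval_from>"
        + "<GBInterval_to>18</GBInterval_to></GBInterval>"
        + "</GBFeature_intervals></GBFeature>"
        + "</GBSeq_feature-table>"
        + "<GBSeq_sequence>atggccattgtaatgggc</GBSeq_sequence>"
        + "</GBSeq></GBSet>";

    /** Parses an XML string into a DOM document. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Text of the first descendant element with the given tag, or null. */
    public static String firstText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    /** Collects one FeatureSpan per interval of every feature in the record. */
    public static List<FeatureSpan> features(Element gbSeq) {
        List<FeatureSpan> spans = new ArrayList<>();
        NodeList feats = gbSeq.getElementsByTagName("GBFeature");
        for (int i = 0; i < feats.getLength(); i++) {
            Element feature = (Element) feats.item(i);
            String key = firstText(feature, "GBFeature_key");
            NodeList intervals = feature.getElementsByTagName("GBInterval");
            for (int j = 0; j < intervals.getLength(); j++) {
                Element interval = (Element) intervals.item(j);
                String from = firstText(interval, "GBInterval_from");
                String to = firstText(interval, "GBInterval_to");
                // Skip intervals that lack explicit from/to positions.
                if (from != null && to != null) {
                    spans.add(new FeatureSpan(key, Integer.parseInt(from), Integer.parseInt(to)));
                }
            }
        }
        return spans;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        Element gbSeq = (Element) doc.getElementsByTagName("GBSeq").item(0);
        System.out.println("Accession: " + firstText(gbSeq, "GBSeq_primary-accession"));
        System.out.println("Sequence:  " + firstText(gbSeq, "GBSeq_sequence"));
        for (FeatureSpan span : features(gbSeq)) {
            System.out.println(span.key + " " + span.from + ".." + span.to);
        }
    }
}
```

Real files may carry a DOCTYPE declaration, in which case the default parser will try to load the referenced DTD, and a DOM tree may be too heavy for very large files. A streaming parser would be the usual alternative there.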
Any hints are welcome.
Thanks Tori
---------------------
// Imports used by this excerpt:
//   java.util.NoSuchElementException
//   org.biojava.bio.Annotation, org.biojava.bio.BioException
//   org.biojava.bio.seq.Sequence, org.biojava.bio.seq.SequenceIterator
//   org.biojava.bio.seq.io.SeqIOTools

public GenBankBioJavaImporter(String path, String fileName, Connection genDbCon) {
    super();
    super.setPath(path);
    super.setFileName(fileName);
}

public boolean readFile() {
    if (!super.createInputFile()) return false;
    // read the GenBank file; fileReaderHandler is a BufferedReader
    SequenceIterator sequences = SeqIOTools.readGenbank(super.fileReaderHandler);
    // iterate through the sequences
    while (sequences.hasNext()) {
        try {
            Sequence seq = sequences.nextSequence();
            // do stuff with the sequence
            System.out.println("Info: " + seq.getName() + ", " + seq.getURN() + ", "
                    + seq.countFeatures());
            Annotation anno = seq.getAnnotation();
            // anno.getProperty()
        } catch (BioException ex) {
            // not in GenBank format
            ex.printStackTrace();
            super.closeInputFile();
            return false;
        } catch (NoSuchElementException ex) {
            // request for more sequence when there isn't any
            ex.printStackTrace();
            super.closeInputFile();
            return false;
        }
    }
    super.closeInputFile();
    return true;
}
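
The trace above fails in String.substring with index 12 inside GenbankContext.processHeaderLine, reached through the SeqIOTools.readGenbank call in readFile(). One possible reading, which this message does not confirm, is that this reader expects the GenBank flat-file layout (header keywords in a fixed-width column, data starting at column 13) and is being fed XML lines. The following minimal sketch shows how a reader could be checked before a parser is chosen. The class GenBankFormatSniffer and its Layout enum are hypothetical names and are not part of BioJava.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class GenBankFormatSniffer {

    /** Hypothetical labels for the layouts this sketch distinguishes. */
    public enum Layout { XML, FLAT_FILE, UNKNOWN }

    /**
     * Looks at the first non-blank line and then rewinds the reader, so a
     * parser can still consume the input from the start. If more than 8192
     * characters precede the end of that line, reset() may fail with an
     * IOException.
     */
    public static Layout sniff(BufferedReader reader) throws IOException {
        reader.mark(8192);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // skip leading blank lines
                }
                if (trimmed.startsWith("<")) {
                    return Layout.XML; // XML declaration or root element
                }
                if (trimmed.startsWith("LOCUS")) {
                    return Layout.FLAT_FILE; // first keyword of a flat-file entry
                }
                return Layout.UNKNOWN;
            }
            return Layout.UNKNOWN; // empty input
        } finally {
            reader.reset();
        }
    }

    public static void main(String[] args) throws IOException {
        BufferedReader xml = new BufferedReader(
            new StringReader("<?xml version=\"1.0\"?>\n<GBSet>\n"));
        BufferedReader flat = new BufferedReader(
            new StringReader("LOCUS       XX000001\nDEFINITION  made-up entry\n"));
        System.out.println(sniff(xml));
        System.out.println(sniff(flat));
    }
}
```

In readFile(), such a check could run on fileReaderHandler before the parser call, so that an XML file is reported as such instead of surfacing as a StringIndexOutOfBoundsException from the flat-file reader.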