[Biojava-l] [biojavax] EMBL parser error
Richard Holland
richard.holland at ebi.ac.uk
Fri Apr 7 12:48:46 UTC 2006
Sorry, my bad: it was an off-by-one error.
Please check it out from CVS again and see if it works now.
cheers,
Richard
PS. I don't have any EMBL files to test with at the moment, otherwise I'd
check it myself... :)
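
To illustrate the kind of bug involved, here is a minimal sketch of parsing
the body of a DT line with a regular expression. It is not the actual
EMBLFormat code: the class name DtLineSketch, the pattern, and the group
layout are all assumptions made for this example. With a layout like this,
reading the group one position too early returns the release number ("86")
where the event text is expected, and asking for a group index beyond
groupCount() makes Matcher.group throw an IndexOutOfBoundsException.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch only; this is not the biojavax EMBLFormat implementation. */
final class DtLineSketch {

    // Assumed layout of a DT line body, e.g. "19-JAN-2006 (Rel. 86, Created)".
    // Capturing groups: 1 = date, 2 = release number, 3 = event text.
    private static final Pattern DT = Pattern.compile(
            "^(\\d{2}-[A-Z]{3}-\\d{4}) \\(Rel\\. (\\d+), (.+)\\)$");

    /** Returns {date, release, event} for a DT line body (the "DT" tag already stripped). */
    static String[] parse(String dtBody) {
        Matcher m = DT.matcher(dtBody.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised DT line: " + dtBody);
        }
        // Group numbering starts at 1; group(0) is the whole match.
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    /** True if the pattern has a capturing group with this index (0 counts as the whole match). */
    static boolean hasGroup(int index) {
        return index >= 0 && index <= DT.matcher("").groupCount();
    }

    public static void main(String[] args) {
        String[] bodies = {
            "19-JAN-2006 (Rel. 86, Created)",
            "19-JAN-2006 (Rel. 86, Last updated, Version 1)"
        };
        for (String body : bodies) {
            String[] parts = parse(body);
            System.out.println(parts[0] + " | " + parts[1] + " | " + parts[2]);
        }
        // Checking the index first avoids the IndexOutOfBoundsException.
        System.out.println("has group 5? " + hasGroup(5));
    }
}
```

Checking the index against groupCount() before calling group(n), or using
named groups, is one way to make this sort of slip fail early and visibly.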
On Fri, 2006-04-07 at 14:18 +0200, Morgane THOMAS-CHOLLIER wrote:
> I now get another error message with the same file:
>
> Exception in thread "main" org.biojava.bio.BioException: Could not read sequence
>     at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:111)
>     at org.embnet.be.biojavax.tryout.EMBLParseTest.main(EMBLParseTest.java:34)
> Caused by: java.lang.IndexOutOfBoundsException: No group 5
>     at java.util.regex.Matcher.group(Matcher.java:355)
>     at org.biojavax.bio.seq.io.EMBLFormat.readRichSequence(EMBLFormat.java:271)
>     at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:108)
>     ... 1 more
>
> Here is the complete file, for info:
>
> ID DQ158013 standard; genomic DNA; VRT; 118 BP.
> XX
> AC DQ158013;
> XX
> SV DQ158013.1
> XX
> DT 19-JAN-2006 (Rel. 86, Created)
> DT 19-JAN-2006 (Rel. 86, Last updated, Version 1)
> XX
> DE Triturus helveticus clone Thel.b9 HOXB9 (Hoxb9) gene, partial cds.
> XX
> KW .
> XX
> OS Triturus helveticus (palmate newt)
> OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Amphibia;
> OC Batrachia; Caudata; Salamandroidea; Salamandridae; Triturus.
> XX
> RN [1]
> RP 1-118
> RX DOI; 10.1016/j.ympev.2005.08.012.
> RX PUBMED; 16198128.
> RA Mannaert A., Roelants K., Bossuyt F., Leyns L.;
> RT "A PCR survey for posterior Hox genes in amphibians";
> RL Mol. Phylogenet. Evol. 38(2):449-458(2006).
> XX
> RN [2]
> RP 1-118
> RA Mannaert A., Roelants K., Bossuyt F., Leyns L.;
> RT ;
> RL Submitted (09-AUG-2005) to the EMBL/GenBank/DDBJ databases.
> RL Biology Department, Vrije Universiteit Brussel, Pleinlaan 2, Brussels 1050,
> RL Belgium
> XX
> FH Key Location/Qualifiers
> FH
> FT source 1..118
> FT /organism="Triturus helveticus"
> FT /mol_type="genomic DNA"
> FT /clone="Thel.b9"
> FT /db_xref="taxon:256425"
> FT gene <1..>118
> FT /gene="Hoxb9"
> FT /note="Hoxb-9"
> FT mRNA <1..>118
> FT /gene="Hoxb9"
> FT /product="HOXB9"
> FT CDS <1..>118
> FT /codon_start=2
> FT /gene="Hoxb9"
> FT /product="HOXB9"
> FT /db_xref="UniProtKB/TrEMBL:Q2LK47"
> FT /protein_id="ABA39736.1"
> FT /translation="KYQTLELEKEFLFNMYLTRDRRHEVARLLNLSERQVKIW"
> XX
> SQ Sequence 118 BP; 28 A; 35 C; 37 G; 18 T; 0 other;
> caaataccag acgctggagc tggagaagga gttcctgttc aacatgtacc tcacccggga 60
> ccgcaggcac gaggtggccc ggctgctgaa cctcagcgag cgccaggtca agatctgg 118
> //
>
> Thanks for helping,
>
> Morgane.
>
> Richard Holland wrote:
>
> >That was indeed a bug. I have made a change to the date parsing in
> >EMBLFormat and committed it to CVS. Could you test it for me please?
> >
> >cheers,
> >Richard
> >
> >On Fri, 2006-04-07 at 11:20 +0200, Morgane THOMAS-CHOLLIER wrote:
> >
> >
> >>Hello,
> >>
> >>I am using biojavax, which I checked out from CVS today, to parse
> >>an EMBL file exported from the EBI SRS server.
> >>
> >>I ran into this error:
> >>
> >>Exception in thread "main" org.biojava.bio.BioException: Could not read sequence
> >>    at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:111)
> >>    at org.embnet.be.biojavax.tryout.EMBLParseTest.main(EMBLParseTest.java:34)
> >>Caused by: org.biojava.bio.seq.io.ParseException: Bad date type found: 86
> >>    at org.biojavax.bio.seq.io.EMBLFormat.readRichSequence(EMBLFormat.java:278)
> >>    at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:108)
> >>    ... 1 more
> >>
> >>The EMBL file is:
> >>
> >>ID DQ158013 standard; genomic DNA; VRT; 118 BP.
> >>XX
> >>AC DQ158013;
> >>XX
> >>SV DQ158013.1
> >>XX
> >>DT 19-JAN-2006 (Rel. 86, Created)
> >>DT 19-JAN-2006 (Rel. 86, Last updated, Version 1)
> >>XX
> >>DE Triturus helveticus clone Thel.b9 HOXB9 (Hoxb9) gene, partial cds.
> >>
> >>Removing the two lines that comprise the date information resolves the
> >>problem.
> >>
> >>Thanks,
> >>
> >>Morgane.
> >>
> >>
> >>
>
> --
> **********************************************************
> Morgane THOMAS-CHOLLIER, PHD Student
>
> Vrije Universiteit Brussels (VUB)
> Laboratory of Cell Genetics
> Pleinlaan 2
> 1050 Brussels
> Belgium
>
--
Richard Holland
European Bioinformatics Institute
Wellcome Trust Genome Campus, Hinxton
Cambridge CB10 1SD, UK
Tel: +44-(0)1223-494416