[Biojava-l] need help

Francois Pepin fpepin at cs.mcgill.ca
Thu Feb 20 22:44:45 EST 2003


This is a guess more than anything else (but probably a good one):

I don't think that the fasta reader accepts spaces in the sequence.

This explains the error message, which says that the tokenization doesn't
contain the character ' ' (a space).

I don't remember the specifications for the fasta format, but I don't
think it includes having spaces in the sequence.
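
If the spaces are indeed the problem, one workaround is to strip them from the
file before handing it to BioJava. The following is only a rough sketch using
the standard Java library, not anything from BioJava itself; the class name
FastaSpaceStripper and the helper cleanLine are made up for illustration, and
it assumes that every line not starting with '>' is sequence data.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class (not part of BioJava): copies a FASTA file,
// removing whitespace from the sequence lines.
class FastaSpaceStripper {

    // Header lines (starting with '>') are returned unchanged; any other
    // line has all whitespace removed so only sequence symbols remain.
    static String cleanLine(String line) {
        if (line.startsWith(">")) {
            return line;
        }
        return line.replaceAll("\\s+", "");
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java FastaSpaceStripper <in.fas> <out.fas>");
            System.exit(1);
        }
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                String cleaned = cleanLine(line);
                // Skip lines that were blank or contained only whitespace.
                if (!cleaned.isEmpty()) {
                    out.write(cleaned);
                    out.newLine();
                }
            }
        }
    }
}
```

Something like "java FastaSpaceStripper imsIND020717.fas cleaned.fas" followed
by running ReadFasta on cleaned.fas should then show whether the spaces were
the cause (the output file name is just an example).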

François

-----Original Message-----
From: biojava-l-bounces at biojava.org
[mailto:biojava-l-bounces at biojava.org] On Behalf Of zhao guijun
Sent: February 17, 2003 21:00
To: Biojava-l at biojava.org
Subject: [Biojava-l] need help


Hello, could anyone tell me why I get these errors?

 /usr/java2/bin/java ReadFasta imsIND020717.fas DNA
 
org.biojava.bio.symbol.IllegalSymbolException: This tokenization doesn't contain character:  
        at org.biojava.bio.seq.io.CharacterTokenization.parseTokenChar(CharacterTokenization.java:166)
        at org.biojava.bio.seq.io.CharacterTokenization$TPStreamParser.characters(CharacterTokenization.java:237)
        at org.biojava.bio.seq.io.FastaFormat.readSequenceData(FastaFormat.java:150)
        at org.biojava.bio.seq.io.FastaFormat.readSequence(FastaFormat.java:114)
        at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:100)
rethrown as org.biojava.bio.BioException: Could not read sequence
        at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:103)
        at org.biojava.bio.seq.io.SeqIOTools.readFasta(SeqIOTools.java:244)
        at ReadFasta.main(ReadFasta.java:17)

The format of the input file (imsIND020717.fas) is as follows:

>gnl|IMS-JST_InsDel_IND|IMS-JST075587_Major allelePos=61 total 
>gnl|len=121|AB014087.1 Pos=16412^16413|||4475009|
AGGCCTTTCT CAAAGTGGAA GTCTCATCCT CACTTCTCTG GTTACAGTGC TGGGCCATGG 
G
TAACTTACAA GGCTTAGCAG GAACTGTCTG CGCACTCCCC CTTCCTGCCC ACTACCTTGT 

>gnl|IMS-JST_InsDel_IND|IMS-JST075587_Minor allelePos=60^61 total 
>gnl|len=120|AB014087.1 Pos=16412^16413|||4475009|
AGGCCTTTCT CAAAGTGGAA GTCTCATCCT CACTTCTCTG GTTACAGTGC TGGGCCATGG 
TAACTTACAA GGCTTAGCAG GAACTGTCTG CGCACTCCCC CTTCCTGCCC ACTACCTTGT 

Best Regards,

guijun



_______________________________________________
Biojava-l mailing list  -  Biojava-l at biojava.org
http://biojava.org/mailman/listinfo/biojava-l



