[Biojava-dev] Bug in PHYLIPFileBuilder?
Mark Schreiber
markjschreiber at gmail.com
Thu Jun 7 10:46:47 UTC 2007
Hi -
BioJava's phylogenetics support is very experimental right now.
However, this would be something that our Google Summer of Code
student might want to take a look at.
- Mark
On 6/7/07, Felipe Albrecht <felipe.albrecht at gmail.com> wrote:
> Sorry,
> in my example file, replace the line
> builder.setSitesCount(seq.seqString().length());
> with
> builder.setSitesCount(l.get(l.size()-1).seqString().length());
>
> Thanks
>
> Felipe Albrecht
>
>
> On 6/7/07, Felipe Albrecht <felipe.albrecht at gmail.com> wrote:
> > Hello, I'm trying to convert a protein multiple alignment from FASTA
> > format to PHYLIP format.
> >
> > The source is:
> >
> > BufferedReader br = new BufferedReader(new FileReader(args[0]));
> > PHYLIPFileBuilder builder = new PHYLIPFileBuilder();
> > RichSequenceIterator richSequenceIterator =
> >     IOTools.readFastaProtein(br, null);
> > List<Sequence> l = new LinkedList<Sequence>();
> > Sequence seq = null;
> > while (richSequenceIterator.hasNext()) {
> >     l.add(richSequenceIterator.nextSequence());
> > }
> > builder.startFile();
> > builder.setSequenceCount(l.size());
> > builder.setSitesCount(seq.seqString().length());
> > for (Sequence sequence : l) {
> >     builder.setCurrentSequenceName(sequence.getName());
> >     builder.receiveSequence(sequence.seqString());
> > }
> > builder.endFile();
> >
> > As I said, my input data is a protein multiple alignment.
> >
> > Running this code produces the following stack trace:
> >
> > Exception in thread "main" org.biojava.bio.BioError: Something has
> > gone badly wrong with DNA
> > at org.biojava.bio.seq.DNATools.createDNASequence(DNATools.java:199)
> > at org.biojava.bio.seq.DNATools.createGappedDNASequence(DNATools.java:207)
> > at org.biojavax.bio.phylo.io.phylip.PHYLIPFileBuilder.createSequences(PHYLIPFileBuilder.java:121)
> > at org.biojavax.bio.phylo.io.phylip.PHYLIPFileBuilder.buildAlignment(PHYLIPFileBuilder.java:94)
> > at org.biojavax.bio.phylo.io.phylip.PHYLIPFileBuilder.endFile(PHYLIPFileBuilder.java:63)
> > at FastaParser.main(FastaParser.java:54)
> > Caused by: org.biojava.bio.symbol.IllegalSymbolException: This
> > tokenization doesn't contain character: 'Q'
> > at org.biojava.bio.seq.io.CharacterTokenization.parseTokenChar(CharacterTokenization.java:175)
> > at org.biojava.bio.seq.io.CharacterTokenization$TPStreamParser.characters(CharacterTokenization.java:246)
> > at org.biojava.bio.symbol.SimpleSymbolList.<init>(SimpleSymbolList.java:178)
> > at org.biojava.bio.seq.DNATools.createDNA(DNATools.java:173)
> > at org.biojava.bio.seq.DNATools.createDNASequence(DNATools.java:195)
> > ... 5 more
> >
> > IMHO, the bug is at line 121 of PHYLIPFileBuilder.java, in the method
> > createSequences, which does:
> > try {
> >     DNATools.createGappedDNASequence(sequence, name);
> > } catch (IllegalSymbolException e) {
> >     isDNA = false;
> > }
> >
> > This code expects DNATools.createGappedDNASequence to throw an
> > IllegalSymbolException, but tracing that call through DNATools.java,
> > line 198 reads:
> > } catch (BioException se) {
> >     throw new BioError("Something has gone badly wrong with DNA", se);
> > }
> > Since IllegalSymbolException is a subclass of BioException, it is
> > caught there and wrapped in a new BioError, which
> > PHYLIPFileBuilder.createSequences does not catch.
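To illustrate the wrapping described above, here is a minimal
self-contained sketch. The nested classes are hypothetical stand-ins that
only mimic the names and relationships discussed in this thread
(IllegalSymbolException extending BioException, and BioError assumed to be
an unchecked Error); they are not the BioJava classes, and parseDna is an
invented helper. It suggests why the caller's catch block never runs once
the exception has been wrapped.

```java
final class WrappedExceptionDemo {
    // Stand-in for BioException (checked). NOT the real BioJava class.
    static class BioException extends Exception {
        BioException(String message) { super(message); }
    }

    // Stand-in for IllegalSymbolException, a subclass of BioException.
    static class IllegalSymbolException extends BioException {
        IllegalSymbolException(String message) { super(message); }
    }

    // Stand-in for BioError, assumed here to be an unchecked Error.
    static class BioError extends Error {
        BioError(String message, Throwable cause) { super(message, cause); }
    }

    // Invented helper: rejects any character outside a gapped DNA alphabet.
    static void parseDna(String s) throws IllegalSymbolException {
        for (int i = 0; i < s.length(); i++) {
            char c = Character.toUpperCase(s.charAt(i));
            if ("ACGT-".indexOf(c) < 0) {
                throw new IllegalSymbolException("Unknown character: '" + c + "'");
            }
        }
    }

    // Mirrors the quoted pattern: catch the superclass, rethrow as an Error.
    static void createDnaSequence(String s) throws IllegalSymbolException {
        try {
            parseDna(s);
        } catch (BioException e) {
            // IllegalSymbolException is a BioException, so it lands here.
            throw new BioError("Something has gone badly wrong with DNA", e);
        }
    }

    // Mirrors the caller: it waits for an exception that never arrives.
    static String detect(String s) {
        try {
            createDnaSequence(s);
            return "DNA";
        } catch (IllegalSymbolException e) {
            return "not DNA"; // unreachable in practice for bad input
        }
    }

    public static void main(String[] args) {
        System.out.println(detect("ACGT-"));
        try {
            detect("MKQ");
        } catch (BioError e) {
            System.out.println("Escaped as BioError, cause: "
                    + e.getCause().getClass().getSimpleName());
        }
    }
}
```

In this sketch, detect("MKQ") does not return "not DNA"; the BioError
propagates past the catch block to whoever called detect.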
> >
> >
> > I worked around this problem by adding:
> > } catch (IllegalSymbolException ie) {
> >     throw ie;
> > }
> >
> > in createDNASequence, but that is only a workaround so that the
> > exception can be caught.
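A rough sketch of how such a workaround behaves, again using hypothetical
stand-in classes rather than the real BioJava ones (parseDna and isDna are
invented for illustration). The point it tries to show is that the more
specific catch has to appear before catch (BioException ...), and that the
caller's catch then works as intended.

```java
final class RethrowWorkaroundDemo {
    // Stand-ins only; they mimic the class relationships from this thread.
    static class BioException extends Exception {
        BioException(String message) { super(message); }
    }

    static class IllegalSymbolException extends BioException {
        IllegalSymbolException(String message) { super(message); }
    }

    static class BioError extends Error {
        BioError(String message, Throwable cause) { super(message, cause); }
    }

    // Invented helper: fails on any character outside a gapped DNA alphabet.
    static void parseDna(String s) throws BioException {
        for (int i = 0; i < s.length(); i++) {
            char c = Character.toUpperCase(s.charAt(i));
            if ("ACGT-".indexOf(c) < 0) {
                throw new IllegalSymbolException("Unknown character: '" + c + "'");
            }
        }
    }

    static void createDnaSequence(String s) throws IllegalSymbolException {
        try {
            parseDna(s);
        } catch (IllegalSymbolException ie) {
            // The subclass catch must come first, or the code will not compile.
            throw ie;
        } catch (BioException se) {
            // Any other BioException is still treated as an internal failure.
            throw new BioError("Something has gone badly wrong with DNA", se);
        }
    }

    // Caller in the style of createSequences: now its catch is reached.
    static boolean isDna(String s) {
        try {
            createDnaSequence(s);
            return true;
        } catch (IllegalSymbolException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isDna("ACGT-"));
        System.out.println(isDna("MKQ"));
    }
}
```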
> >
> > A better solution is to check the type of the sequence. Is there a
> > method to discover whether a sequence is DNA, RNA, protein, or
> > invalid? If so, it should be used here. Also, exceptions should be
> > used only when something exceptional occurs, not for flow control.
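As a sketch of what such a check might look like without exceptions, the
following classifies a sequence string by testing its characters against
fixed letter sets. Everything here (the class, the Kind enum, the letter
sets, the guess method) is hypothetical and not part of BioJava. It is only
a heuristic: a string such as "ACGT" is also a valid protein, so the order
of the checks decides ties, and ambiguity codes are not handled.

```java
final class SequenceTypeGuess {
    enum Kind { DNA, RNA, PROTEIN, UNKNOWN }

    // Assumed letter sets; real alphabets also include ambiguity symbols.
    private static final String GAP = "-";
    private static final String DNA_LETTERS = "ACGT";
    private static final String RNA_LETTERS = "ACGU";
    private static final String PROTEIN_LETTERS = "ACDEFGHIKLMNPQRSTVWY";

    // True if every character of seq (case-insensitive) occurs in allowed.
    static boolean onlyContains(String seq, String allowed) {
        for (int i = 0; i < seq.length(); i++) {
            char c = Character.toUpperCase(seq.charAt(i));
            if (allowed.indexOf(c) < 0) {
                return false;
            }
        }
        return true;
    }

    // Tries the narrowest alphabet first, so nucleotide-only input is
    // reported as DNA or RNA rather than as protein.
    static Kind guess(String seq) {
        if (seq.isEmpty()) {
            return Kind.UNKNOWN;
        }
        if (onlyContains(seq, DNA_LETTERS + GAP)) {
            return Kind.DNA;
        }
        if (onlyContains(seq, RNA_LETTERS + GAP)) {
            return Kind.RNA;
        }
        if (onlyContains(seq, PROTEIN_LETTERS + GAP)) {
            return Kind.PROTEIN;
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(guess("ACGT-ACGT"));
        System.out.println(guess("MKQLV-"));
    }
}
```

A builder could call something like guess once per sequence and pick the
matching factory, keeping exceptions for genuinely unexpected input.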
> >
> > PS: I'm using the BioJava source code downloaded today from
> > http://www.biojava.org/download/bj15b/all/biojava-1.5-beta2.tar.gz
> >
> > Thanks, and I look forward to your opinions.
> >
> > Felipe Albrecht
> >