[Biojava-dev] SymbolPropertyTableIterator for AAindex files

Martin Szugat Martin.Szugat at GMX.net
Fri Sep 2 19:09:29 EDT 2005


Hi!

I've implemented a stream reader for AAindex files (amino acid indices and
similarity matrices, http://www.genome.ad.jp/dbget/aaindex.html) called
AAindexStreamReader. It implements an interface called
SymbolPropertyTableIterator, which iterates over SymbolPropertyTable objects.
The iterator follows BioJava conventions and is fully documented. The
AAindexStreamReader in fact returns AAindex objects. AAindex is derived from
SimpleSymbolPropertyTable and provides additional methods to set and
retrieve the information stored in an AAindex file (in the AAindex1
format), such as a hashtable of similar amino acid indices and their
correlation coefficients.
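
To illustrate the iterator idea, here is a minimal sketch of reading
AAindex1-style entries one at a time. It is not the attached
AAindexStreamReader and does not use BioJava: the class names
AAindexEntrySketch and AAindexReaderSketch are hypothetical, the sample
entry is invented, and the line layout (H, D, C and I lines, entries ended
by "//", "NA" for missing values) reflects my understanding of the AAindex1
format rather than anything stated in this mail.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for one AAindex1 entry (not the attached AAindex class). */
class AAindexEntrySketch {
    String accession = "";                                          // H line
    String description = "";                                        // D line(s)
    final Map<String, Double> correlations = new LinkedHashMap<>(); // C line(s)
    final Map<Character, Double> values = new LinkedHashMap<>();    // I block
}

/** Iterates over entries of an AAindex1-style stream; each entry ends with "//". */
class AAindexReaderSketch implements Iterator<AAindexEntrySketch> {
    // Assumed residue order of the two value rows ("A/L R/K ..." header).
    private static final String ROW1 = "ARNDCQEGHI";
    private static final String ROW2 = "LKMFPSTWYV";

    private final BufferedReader in;
    private AAindexEntrySketch next;

    AAindexReaderSketch(BufferedReader in) {
        this.in = in;
        this.next = readEntry(); // read ahead so hasNext() is cheap
    }

    public boolean hasNext() { return next != null; }

    public AAindexEntrySketch next() {
        if (next == null) throw new NoSuchElementException();
        AAindexEntrySketch current = next;
        next = readEntry();
        return current;
    }

    /** Reads up to the next "//" line; returns null when the stream is exhausted. */
    private AAindexEntrySketch readEntry() {
        try {
            AAindexEntrySketch entry = null;
            char section = ' ';
            int row = 0;
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("//")) {
                    if (entry != null) return entry;
                    continue;
                }
                if (line.trim().isEmpty()) continue;
                if (entry == null) entry = new AAindexEntrySketch();
                // Lines starting with a blank continue the previous section.
                boolean continuation = line.charAt(0) == ' ';
                if (!continuation) { section = line.charAt(0); row = 0; }
                String body = continuation ? line.trim() : line.substring(1).trim();
                String[] t = body.split("\\s+");
                switch (section) {
                    case 'H': entry.accession = body; break;
                    case 'D': entry.description = (entry.description + " " + body).trim(); break;
                    case 'C': // pairs of "accession coefficient"
                        for (int i = 0; i + 1 < t.length; i += 2)
                            entry.correlations.put(t[i], Double.parseDouble(t[i + 1]));
                        break;
                    case 'I': // the I line itself is only the column header
                        if (!continuation || row >= 2) break;
                        String residues = (row++ == 0) ? ROW1 : ROW2;
                        for (int i = 0; i < t.length && i < residues.length(); i++)
                            entry.values.put(residues.charAt(i),
                                t[i].equals("NA") ? Double.NaN : Double.parseDouble(t[i]));
                        break;
                    default: break; // other line types are ignored in this sketch
                }
            }
            return entry; // last entry without "//", or null at end of stream
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String data = "H DEMO000001\nD Invented demo index\nC DEMO000002    0.949\n"
            + "I    A/L     R/K     N/M     D/F     C/P     Q/S     E/T     G/W     H/Y     I/V\n"
            + "     1.0     2.0     3.0     4.0     5.0     6.0     7.0     8.0     9.0    10.0\n"
            + "    11.0    12.0      NA    14.0    15.0    16.0    17.0    18.0    19.0    20.0\n//\n";
        Iterator<AAindexEntrySketch> it = new AAindexReaderSketch(new BufferedReader(new StringReader(data)));
        while (it.hasNext()) {
            AAindexEntrySketch e = it.next();
            System.out.println(e.accession + ": " + e.values.size() + " values, "
                + e.correlations.size() + " correlated indices");
        }
    }
}
```

The read-ahead in the constructor is one possible design: it lets hasNext()
answer without touching the stream, at the cost of parsing the first entry
eagerly.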

I hope you find these classes useful and integrate them into BioJava. If
you have further questions or if some changes are needed, don't hesitate to
contact me! I'd really like to see these classes in BioJava ;)

In addition, there are a few more classes that might be useful. First,
there is an interface called SymbolPropertyTableDB (by analogy with the
SequenceDB interface) and a simple implementation called
SimpleSymbolPropertyTableDB (what a long name!).
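
As a rough picture of what such a database could look like, here is a
small map-backed sketch. It is not the attached SymbolPropertyTableDB or
SimpleSymbolPropertyTableDB: every type and method name below
(PropertyTableSketch, PropertyTableDBSketch, names(), getTable(),
tableIterator(), addTable()) is a hypothetical stand-in chosen for
illustration, and the table type is reduced to a name plus a per-residue
value.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical, simplified stand-in for a symbol property table. */
interface PropertyTableSketch {
    String getName();
    double getDoubleValue(char residue);
}

/** Hypothetical read-only view of a collection of named tables. */
interface PropertyTableDBSketch {
    Set<String> names();
    PropertyTableSketch getTable(String name);
    Iterator<PropertyTableSketch> tableIterator();
}

/** Map-backed implementation that keeps tables in insertion order. */
class MapPropertyTableDBSketch implements PropertyTableDBSketch {
    private final Map<String, PropertyTableSketch> tables = new LinkedHashMap<>();

    /** Adds a table under its own name, replacing any table with the same name. */
    void addTable(PropertyTableSketch table) {
        tables.put(table.getName(), table);
    }

    public Set<String> names() {
        return Collections.unmodifiableSet(tables.keySet());
    }

    public PropertyTableSketch getTable(String name) {
        PropertyTableSketch table = tables.get(name);
        if (table == null) throw new NoSuchElementException("No table named " + name);
        return table;
    }

    public Iterator<PropertyTableSketch> tableIterator() {
        // Wrap the values so callers cannot remove tables through the iterator.
        return Collections.unmodifiableCollection(tables.values()).iterator();
    }
}
```

A reader such as the one described above could fill a database like this by
calling addTable() for each table it returns, after which tables can be
looked up by name.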

Finally, there is a class called ClassificationFastaDescriptionLineParser
which extends SequenceBuilderFilter and extracts a classification value
(e.g. SCOP or CATH) from the description line of FASTA entries. The
classification must be the second item in the description line, after the
name. The ClassificationFastaDescriptionLineParser should be used in
conjunction with the FastaDescriptionLineParser.
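
The "second item after the name" rule can be sketched in plain Java as
follows. This is only an illustration of the rule, not the attached parser:
the class ClassificationSketch and the method classificationOf() are
hypothetical, items are assumed to be separated by whitespace, and the
sample header is invented.

```java
/** Hypothetical helper illustrating "classification is the second item". */
class ClassificationSketch {

    /**
     * Returns the second whitespace-separated item of a FASTA description
     * line, or null if the line has fewer than two items. A leading '>' is
     * tolerated and ignored.
     */
    static String classificationOf(String descriptionLine) {
        String s = descriptionLine.trim();
        if (s.startsWith(">")) s = s.substring(1).trim();
        if (s.isEmpty()) return null;
        // Limit of 3: name, classification, and the untouched remainder.
        String[] items = s.split("\\s+", 3);
        return items.length >= 2 ? items[1] : null;
    }

    public static void main(String[] args) {
        // Invented header: name, then a SCOP-like classification, then free text.
        System.out.println(classificationOf(">d1abca_ a.1.1.1 (A:) example protein"));
    }
}
```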

I've implemented all these classes for an open source project called BioWeka
(http://www.bioweka.org). It's an extension to the Weka data mining
framework for bioinformaticians and biologists. And of course, it relies on
BioJava. In this sense, thanks for your fine work!

Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AAindexStreamReader.java
Type: text/java
Size: 8019 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biojava-dev/attachments/20050903/2fb3aabf/AAindexStreamReader-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ClassificationFastaDescriptionLineParser.java
Type: text/java
Size: 3844 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biojava-dev/attachments/20050903/2fb3aabf/ClassificationFastaDescriptionLineParser-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AAindex.java
Type: text/java
Size: 6686 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biojava-dev/attachments/20050903/2fb3aabf/AAindex-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SimpleSymbolPropertyTableDB.java
Type: text/java
Size: 5018 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biojava-dev/attachments/20050903/2fb3aabf/SimpleSymbolPropertyTableDB-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SymbolPropertyTableDB.java
Type: text/java
Size: 2039 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biojava-dev/attachments/20050903/2fb3aabf/SymbolPropertyTableDB-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SymbolPropertyTableIterator.java
Type: text/java
Size: 1757 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biojava-dev/attachments/20050903/2fb3aabf/SymbolPropertyTableIterator-0001.bin

