[Biojava-dev] Newbie design question

Chris Abajian chrisa at espressosoftware.com
Thu Jul 17 18:08:02 EDT 2003


Hi.  I'm new to biojava but have already found it useful.  I was hoping
to contribute some code & wanted suggestions on the design and
implementation.

I recently ported sputnik (http://abajian.net/sputnik) to Java, mostly
because I got tired of people emailing me with problems compiling and
running it on different platforms.  Sputnik is a utility to search DNA
sequences for microsatellite repeats (which are basically repeated
patterns of 2-5 nucleotides with the occasional error, e.g.
"CACACACACCACACACAACCACACACA").
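
To make the idea of a repeat concrete, here is a minimal Java sketch that
finds only exact tandem repeats of a 2-5 nucleotide unit.  It is a
simplified illustration, not the sputnik algorithm: it tolerates no errors
at all, and the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration: exact tandem repeats only (no errors tolerated),
// so this is NOT sputnik's algorithm. All names here are hypothetical.
final class ExactRepeatScan {

    /**
     * Scans dna left to right and returns one "start:unit:copies" string per
     * run of at least minCopies copies of a 2-5 base unit (start is 0-based).
     * Note that homopolymer runs such as "AAAAAA" are reported as unit "AA".
     */
    static List<String> scan(String dna, int minCopies) {
        List<String> hits = new ArrayList<String>();
        int i = 0;
        while (i < dna.length()) {
            int bestUnit = 0;
            int bestCopies = 0;
            for (int unit = 2; unit <= 5; unit++) {
                if (i + unit > dna.length()) {
                    break; // not enough sequence left for even one copy
                }
                int j = i + unit;
                // Extend while each base equals the base one unit earlier.
                while (j < dna.length() && dna.charAt(j) == dna.charAt(j - unit)) {
                    j++;
                }
                int copies = (j - i) / unit;
                // Keep the unit size that covers the longest stretch.
                if (copies >= minCopies && copies * unit > bestCopies * bestUnit) {
                    bestUnit = unit;
                    bestCopies = copies;
                }
            }
            if (bestUnit > 0) {
                hits.add(i + ":" + dna.substring(i, i + bestUnit) + ":" + bestCopies);
                i += bestUnit * bestCopies; // skip past the reported run
            } else {
                i++;
            }
        }
        return hits;
    }
}
```

Calling ExactRepeatScan.scan(sequence, 3) on a sequence string would list
every exact run of three or more copies; handling the "occasional error"
requires a scoring scheme that this sketch leaves out.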

I got two surprises:

- On a large test file, the Sun 1.4 JVM outperformed the C (gcc) version
by almost exactly 2x (CPU time).  This is with a direct port of the
algorithm, no attempt at optimization, and the same unloaded CPU/OS.
Java rocks.

- The biojava FASTA file parsing was really easy to use and SymbolList
was measurably faster than my own trivial StringBuffer implementation.

What's the best way to return the results of the search?  Adding
Features?  The original utility produced an ugly & difficult to parse
report to stdout.  I can imagine at least three consumers of the
results: other biojava apps (i.e. a Java class), a display utility, and
possibly an XML format file to be imported into other systems.
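
As a straw man for discussion, one plain result object could feed all three
kinds of output.  The sketch below is only an assumption about what such an
object might look like: RepeatHit, its fields, and the XML element and
attribute names are invented here and are not part of biojava.

```java
// Hypothetical result type; the class, field, and XML names are illustrative
// assumptions, not biojava API and not the original sputnik report format.
final class RepeatHit {
    final String seqName; // name of the sequence the repeat was found in
    final int start;      // 1-based, inclusive
    final int end;        // 1-based, inclusive
    final String unit;    // the repeated pattern, e.g. "CA"

    RepeatHit(String seqName, int start, int end, String unit) {
        this.seqName = seqName;
        this.start = start;
        this.end = end;
        this.unit = unit;
    }

    /** One tab-delimited line, easy to parse from a report on stdout. */
    String toTabLine() {
        return seqName + "\t" + start + "\t" + end + "\t" + unit;
    }

    /** One self-closing XML element for import into other systems. */
    String toXml() {
        return "<repeat seq=\"" + escape(seqName) + "\" start=\"" + start
                + "\" end=\"" + end + "\" unit=\"" + escape(unit) + "\"/>";
    }

    // Minimal escaping of the characters that are unsafe in attribute values.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

Other Java code would use the object directly, while a display utility or
an exporter would call toTabLine() or toXml().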

Where does it belong in the class hierarchy (if it does)?  I realize
that these are questions I could answer myself, but the docs are a bit
daunting and even then it takes a while to learn the local "style" of a
package.  What would be a good interface for the SatelliteFinder class? 
I'm hoping to get some design advice at this stage to make it more
intuitive and consistent with the rest of biojava.
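
To give the question something concrete to shoot at, here is one possible
shape for the interface, using a callback so the caller decides what to do
with each hit.  Every name below (RepeatListener, repeatFound, find,
DinucleotideFinder) is hypothetical, and the toy implementation reports
only exact dinucleotide repeats rather than doing what sputnik does.

```java
// Hypothetical names throughout; none of these are existing biojava types.
interface RepeatListener {
    /** Called once per repeat; start is 0-based inclusive, end is exclusive. */
    void repeatFound(int start, int end, String unit);
}

interface SatelliteFinder {
    /** Scans dna and reports every repeat it finds to the listener. */
    void find(CharSequence dna, RepeatListener listener);
}

// Toy implementation: exact repeats of a 2-base unit, at least 3 copies,
// skipping homopolymer runs such as "AAAAAA".
final class DinucleotideFinder implements SatelliteFinder {
    public void find(CharSequence dna, RepeatListener listener) {
        int i = 0;
        while (i + 2 <= dna.length()) {
            int j = i + 2;
            // Extend while each base equals the base two positions earlier.
            while (j < dna.length() && dna.charAt(j) == dna.charAt(j - 2)) {
                j++;
            }
            int copies = (j - i) / 2;
            if (copies >= 3 && dna.charAt(i) != dna.charAt(i + 1)) {
                listener.repeatFound(i, i + copies * 2,
                        dna.subSequence(i, i + 2).toString());
                i += copies * 2; // continue after the reported run
            } else {
                i++;
            }
        }
    }
}
```

A listener could add Features to a sequence, print a report, or write XML,
which keeps the search itself independent of the output format.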

OK, I'm lazy and I want someone to tell me what to do ;-)  Reply offline
if not of general interest.

Thanks.


-- 
Chris Abajian
Espresso Software Development, L.L.C.
http://espressosoftware.com
206.910.4903

Espresso Software Development provides software development and
consulting services. We develop, deploy and support scalable,
multi-tiered, high-availability web, e-commerce and data-processing
applications.




