[Biojava-dev] Proposed change to RichFormat interface

mark.schreiber at novartis.com mark.schreiber at novartis.com
Wed Jun 7 06:02:51 UTC 2006


That might be a more elegant solution.

We could even make the InputStream implement RichSeqIOListener, so that it 
both sends data to the RichFormat and listens to what the RichFormat 
makes of that data.

The InputStreamIOListener could note when the RichFormat emits a 
startXXX() event, record the line number, and start buffering all the data 
sent as the readLine() requests are made (while still passing it on to the 
RichFormat). When the RichFormat emits the corresponding endXXX() event, 
the buffer can be cleared and the process starts again.

The only problem might be what to do when the RichFormat consumes data in 
between emitting events (which is allowed).
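
For illustration, the buffering idea could be sketched roughly as below. 
This is only a sketch under assumptions: RecordingLineReader, onStart() and 
onEnd() are hypothetical names standing in for the startXXX()/endXXX() 
callbacks, not existing BioJava API. Lines consumed between events are 
counted but not buffered, which is just one possible answer to the problem 
mentioned above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical sketch: passes lines through to the parser while keeping a
 * copy of every line read since the last start event.
 */
class RecordingLineReader extends BufferedReader {
    private final List<String> buffer = new ArrayList<>();
    private int lineNumber = 0;        // lines handed out so far
    private int startLine = -1;        // value of lineNumber at the last start event
    private boolean recording = false; // true between a start and its matching end

    RecordingLineReader(Reader in) {
        super(in);
    }

    @Override
    public String readLine() throws IOException {
        String line = super.readLine();
        if (line != null) {
            lineNumber++;              // always count, even outside start/end pairs
            if (recording) {
                buffer.add(line);      // keep a copy for later bug reports
            }
        }
        return line;
    }

    /** Stand-in for a startXXX() event: remember the position, begin buffering. */
    void onStart() {
        startLine = lineNumber;
        buffer.clear();
        recording = true;
    }

    /** Stand-in for the matching endXXX() event: drop the buffer. */
    void onEnd() {
        buffer.clear();
        recording = false;
    }

    int lineNumber() { return lineNumber; }

    int startLine() { return startLine; }

    List<String> buffered() { return Collections.unmodifiableList(buffer); }
}
```

If parsing fails between onStart() and onEnd(), buffered() and startLine() 
would give the offending text and roughly where it began in the file.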

- Mark





Michael Heuer <heuermh at acm.org>
Sent by: Michael Heuer <heuermh at shell3.shore.net>
06/07/2006 01:51 PM

 
        To:     mark.schreiber at novartis.com
        cc:     biojava-dev at biojava.org
        Subject:        Re: [Biojava-dev] Proposed change to RichFormat interface


Mark Schreiber wrote:

> Hi all -
>
> I would like to propose a change to the RichFormat interface. I think we
> should do this now as we haven't done a stable biojavax roll out yet so
> interface changes should still be allowed. The additional methods would be:
>
> public String currentLine();
> public int currentLineNumber();
>
> This would make debugging a lot easier, it would also make construction of
> a RichSeqIOListener that logs and debugs much easier. I was trying to do
> this a while back. I started a background process that parsed 6GB of
> genbank records looking for records that failed. It worked ok but would be
> much better with the ability to query the RichFormat in the above way. We
> might even be able to make it a utility that people could run on suspect
> files and generate standard bug reports to make it easier for us to debug
> the parser code.
>
> What do people think??
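
To make the quoted proposal concrete, here is a rough sketch of how a 
logging helper might use the two accessors. Everything except the two 
method signatures is an assumption: LineTracking and ParseFailureLog are 
hypothetical names, and the accessors are split into their own interface 
here only so the example compiles without BioJava.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical: the two proposed accessors, isolated for illustration. */
interface LineTracking {
    String currentLine();
    int currentLineNumber();
}

/** Hypothetical helper that collects one report entry per parse failure. */
class ParseFailureLog {
    private final List<String> reports = new ArrayList<>();

    /** Record where the format was when the failure happened. */
    void record(LineTracking format, Exception cause) {
        reports.add("line " + format.currentLineNumber()
                + ": " + cause.getMessage()
                + " | " + format.currentLine());
    }

    List<String> reports() {
        return reports;
    }
}
```

A utility run over suspect files could then print reports() as the basis 
of a bug report.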

Another possibility would be to leave this sort of progress tracking up
to the client, in that they could wrap the InputStream in something like
a CountingInputStream before passing it to the parser(s):

http://jakarta.apache.org/commons/io/api-release/org/apache/commons/io/input/CountingInputStream.html
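
As a rough sketch of that wrapping approach using only the JDK: the 
ByteCountingInputStream class below is a hypothetical minimal stand-in 
written for this example, not the Commons IO class linked above, and it 
does not try to handle mark()/reset().

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in: counts the bytes read or skipped through it. */
class ByteCountingInputStream extends FilterInputStream {
    private long count = 0;

    ByteCountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count++;                 // one byte delivered
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n;              // n bytes delivered; -1 means end of stream
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        count += skipped;            // skipped bytes also advance the position
        return skipped;
    }

    /** Bytes pulled from the underlying stream so far. */
    long getByteCount() {
        return count;
    }
}
```

One caveat: if the parser puts a BufferedReader on top of the stream, the 
count reflects what has been pulled from the stream, which can run ahead 
of the line the parser is actually working on.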

   michael






