[Biojava-dev] Proposed change to RichFormat interface
mark.schreiber at novartis.com
Thu Jun 8 01:03:22 UTC 2006
Very cool!
Can you put this example in the cookbook?
- Mark
Richard Holland <richard.holland at ebi.ac.uk>
Sent by: biojava-dev-bounces at lists.open-bio.org
06/07/2006 08:36 PM
To: Mark Schreiber <mark.schreiber at novartis.com>
cc: biojava-dev <biojava-dev at biojava.org>, Michael Heuer <heuermh at acm.org>,
Michael Heuer <heuermh at shell3.shore.net>
Subject: Re: [Biojava-dev] Proposed change to RichFormat interface
Hi guys.
See org.biojavax.seq.io.DebuggingRichSeqIOListener.
It extends BufferedInputStream, so it can be used to wrap a normal
InputStream before being passed around.
It also implements RichSeqIOListener.
The idea is that you do something like this:
    Namespace ns = RichObjectFactory.getDefaultNamespace();
    InputStream is = new FileInputStream("myFastaFile.fasta");
    FASTAFormat format = new FASTAFormat();
    DebuggingRichSeqIOListener debug = new DebuggingRichSeqIOListener(is);
    BufferedReader br = new BufferedReader(new InputStreamReader(debug));
    SymbolTokenization symParser = format.guessSymbolTokenization(debug);
    format.readRichSequence(br, symParser, debug, ns);
This will then dump out everything as it is read, and all events as they
happen in-line with the input as it is interpreted.
Hope this helps.
cheers,
Richard
On Wed, 2006-06-07 at 14:02 +0800, mark.schreiber at novartis.com wrote:
> That might be a more elegant solution.
>
> Could even make the InputStream implement RichSeqIOListener, so that it
> would be sending data to the RichFormat and listening to what the
> RichFormat makes of the data.
>
> The InputStreamIOListener could note when the RichFormat emits a
> startXXX() event, record the line number, and start buffering all the data
> sent as the readLine() requests are made (while also sending it to the
> RichFormat). When the RichFormat emits the corresponding endXXX() event,
> the buffer can be cleared and the process starts again.
>
> The only problem might be what to do when the RichFormat consumes data in
> between emitting events (which is allowed).
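
The buffering idea above can be sketched with plain JDK classes. This is only an illustration under stated assumptions, not BioJava code: SectionListener, BufferingLineSource and BufferingDemo are hypothetical names, and the two-method listener is a stand-in for the much larger RichSeqIOListener interface. The sketch also assumes the parser reads the triggering line before it emits the start event, so the buffer is seeded with the line already consumed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the startXXX()/endXXX() callbacks of a listener;
// the real RichSeqIOListener interface has many more methods.
interface SectionListener {
    void startSection(String name);
    void endSection(String name);
}

// Hands lines to a parser and, as a listener, hears what the parser made of them.
class BufferingLineSource extends BufferedReader implements SectionListener {
    private final StringBuilder buffer = new StringBuilder();
    private String lastLine = null;    // most recent line handed to the parser
    private int lineNumber = 0;        // 1-based number of lastLine
    private int sectionStartLine = -1; // line number recorded at the start event
    private boolean inSection = false;

    BufferingLineSource(Reader in) {
        super(in);
    }

    @Override
    public String readLine() throws IOException {
        String line = super.readLine();
        if (line != null) {
            lineNumber++;
            lastLine = line;
            if (inSection) {
                buffer.append(line).append('\n');
            }
        }
        return line;
    }

    @Override
    public void startSection(String name) {
        // Assumption: the triggering line was read before this event fired,
        // so start the buffer with it.
        inSection = true;
        sectionStartLine = lineNumber;
        buffer.setLength(0);
        if (lastLine != null) {
            buffer.append(lastLine).append('\n');
        }
    }

    @Override
    public void endSection(String name) {
        inSection = false;
        buffer.setLength(0);
    }

    int sectionStartLine() { return sectionStartLine; }
    String buffered() { return buffer.toString(); }
}

class BufferingDemo {
    // Toy parser: a '>' line opens a section, a blank line closes it.
    static String run() {
        StringBuilder report = new StringBuilder();
        try (BufferingLineSource src =
                 new BufferingLineSource(new StringReader(">seq1\nACGT\nTTGA\n\n"))) {
            String line;
            while ((line = src.readLine()) != null) {
                if (line.startsWith(">")) {
                    src.startSection("sequence");
                } else if (line.isEmpty()) {
                    // Snapshot what was buffered before the end event clears it.
                    report.append(src.sectionStartLine()).append(':').append(src.buffered());
                    src.endSection("sequence");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return report.toString();
    }
}
```

Lines read while no section is open are counted but not buffered, which is one possible answer to the data-between-events problem. A debugging tool might prefer to keep them in a separate buffer.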
>
> - Mark
>
> Michael Heuer <heuermh at acm.org>
> Sent by: Michael Heuer <heuermh at shell3.shore.net>
> 06/07/2006 01:51 PM
>
>
> To: mark.schreiber at novartis.com
> cc: biojava-dev at biojava.org
> Subject: Re: [Biojava-dev] Proposed change to RichFormat interface
>
>
> Mark Schreiber wrote:
>
> > Hi all -
> >
> > I would like to propose a change to the RichFormat interface. I think we
> > should do this now, as we haven't done a stable biojavax rollout yet, so
> > interface changes should still be allowed. The additional methods would be:
> >
> > public String currentLine();
> > public int currentLineNumber();
> >
> > This would make debugging a lot easier; it would also make construction of
> > a RichSeqIOListener that logs and debugs much easier. I was trying to do
> > this a while back. I started a background process that parsed 6GB of
> > GenBank records looking for records that failed. It worked OK but would be
> > much better with the ability to query the RichFormat in the above way. We
> > might even be able to make it a utility that people could run on suspect
> > files to generate standard bug reports, making it easier for us to debug
> > the parser code.
> >
> > What do people think??
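
As a rough illustration of what the two proposed methods could report, here is a minimal sketch built on the JDK's java.io.LineNumberReader. TrackingReader and TrackingDemo are hypothetical names and not part of BioJava; the proposal itself puts the methods on RichFormat, whereas this sketch puts them on the reader that a format would read from.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper: remembers the last line handed to the parser.
class TrackingReader extends LineNumberReader {
    private String currentLine = null;

    TrackingReader(Reader in) {
        super(in);
    }

    @Override
    public String readLine() throws IOException {
        String line = super.readLine();
        if (line != null) {
            currentLine = line;
        }
        return line;
    }

    // Same names as the proposed methods, for illustration only.
    public String currentLine() { return currentLine; }
    public int currentLineNumber() { return getLineNumber(); }
}

class TrackingDemo {
    // Toy parser that rejects any line starting with "BAD" and reports where it was.
    static String run() {
        try (TrackingReader in =
                 new TrackingReader(new StringReader("LOCUS X\nBAD LINE\nORIGIN\n"))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("BAD")) {
                    return in.currentLineNumber() + ": " + in.currentLine();
                }
            }
            return "ok";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```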
>
> Another possibility would be to leave this sort of progress tracking up
> to the client, in that they could wrap the InputStream in something like
> a CountingInputStream before passing it to the parser(s):
>
> http://jakarta.apache.org/commons/io/api-release/org/apache/commons/io/input/CountingInputStream.html
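
For readers without Commons IO at hand, the client-side wrapping idea can be sketched with the JDK alone. LineCountingInputStream and CountingDemo below are hypothetical names, and this is not the Commons IO implementation; the sketch counts newlines as well as bytes, since line numbers are what the debugging use case needs.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper: counts bytes and '\n' bytes as they are pulled through.
// skip() and mark()/reset() are not accounted for in this sketch.
class LineCountingInputStream extends FilterInputStream {
    private long bytes = 0;
    private long newlines = 0;

    LineCountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            bytes++;
            if (b == '\n') {
                newlines++;
            }
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        for (int i = 0; i < n; i++) {
            if (buf[off + i] == '\n') {
                newlines++;
            }
        }
        if (n > 0) {
            bytes += n;
        }
        return n;
    }

    long byteCount() { return bytes; }
    long newlineCount() { return newlines; }
}

class CountingDemo {
    // Drains a small in-memory stream through the wrapper and reports the totals.
    static String run() {
        byte[] data = "a\nbc\n".getBytes(StandardCharsets.US_ASCII);
        try (LineCountingInputStream in =
                 new LineCountingInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[4];
            while (in.read(buf, 0, buf.length) != -1) {
                // Nothing to do: the wrapper counts as a side effect of reading.
            }
            return in.byteCount() + ":" + in.newlineCount();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

One caveat: if the parser reads through a BufferedReader, the counts reflect what has been pulled into the reader's buffer, which can run ahead of the line the parser is actually interpreting.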
>
> michael
> _______________________________________________
> biojava-dev mailing list
> biojava-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-dev
--
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416