[Biojava-l] BioJava-X parsing of RichSequences

Thu May 4 09:02:10 UTC 2006

I have added the capability to guess the format of streams, and read
directly from them. See RichSequence.IOTools.readStream() for details.

It is in CVS (biojava-live) now.

cheers,
Richard

On Wed, 2006-05-03 at 19:54 +0200, Egon Willighagen wrote:
> On Wednesday 03 May 2006 18:50, Francois Pepin wrote:
> > How much of the file is generally being read before the guess is made?
> > I'm thinking very little is needed, especially compared to how much
> > memory Java usually takes.
> 
> Generally not much. Jmol uses 16384 bytes.
> 
> > It would not be very difficult to save that first part of the stream and
> > then play it back once the guess is made.
> 
> See how Jmol does it:
> 
> http://svn.sourceforge.net/viewcvs.cgi/jmol/trunk/Jmol/src/org/jmol/adapter/smarter/Resolver.java?view=markup
> 
> > I kind of like the idea of using streams, in cases where you are not
> > reading from a file. Having to write everything to a temporary file to
> > satisfy the API isn't a very appealing solution, I think.
> >
> > I could code something up if people are interested.
> 
> An additional advantage is that you get .gz support in one go:
> 
>       // t is the caller's InputStream; abMagic receives the first bytes
>       byte[] abMagic = new byte[4];
>       BufferedInputStream bis = new BufferedInputStream((InputStream)t, 8192);
>       InputStream is = bis;
>       bis.mark(5);
>       int countRead = bis.read(abMagic, 0, 4);
>       bis.reset();
>       // 0x1F 0x8B is the gzip magic number
>       if (countRead == 4 &&
>           abMagic[0] == (byte)0x1F && abMagic[1] == (byte)0x8B)
>         is = new GZIPInputStream(bis);
> 
> where t is your InputStream, and "is" is the stream to use after the gzip
> check/unzip. For the full working code, see the Jmol source again:
> 
> http://svn.sourceforge.net/viewcvs.cgi/jmol/trunk/Jmol/src/org/jmol/viewer/FileManager.java?view=markup
> 
> Egon
> 
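
For anyone following the thread, the peek-and-replay idea discussed above
can be sketched roughly as follows. This is only an illustration, not the
code that went into BioJava or Jmol: the class name StreamSniffer, its
methods, and the format heuristics (a leading ">" for FASTA, "LOCUS" for
GenBank, "ID   " for EMBL-style files) are assumptions made for the
example. The peek size of 16384 bytes is the figure quoted for Jmol.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of BioJava or Jmol.
final class StreamSniffer {
    // Number of bytes examined before guessing (assumed; Jmol's figure).
    static final int PEEK_LIMIT = 16384;

    // Reads up to n bytes, then rewinds so the caller sees them again.
    static byte[] peek(BufferedInputStream bis, int n) throws IOException {
        bis.mark(n);
        byte[] buf = new byte[n];
        int total = 0;
        while (total < n) {
            int r = bis.read(buf, total, n - total);
            if (r < 0) break;               // end of stream reached early
            total += r;
        }
        bis.reset();
        return Arrays.copyOf(buf, total);
    }

    // Wraps the stream in a GZIPInputStream if it starts with 0x1F 0x8B.
    static BufferedInputStream unwrapGzip(InputStream in) throws IOException {
        BufferedInputStream bis = new BufferedInputStream(in, 8192);
        byte[] magic = peek(bis, 2);
        if (magic.length == 2
                && magic[0] == (byte) 0x1F && magic[1] == (byte) 0x8B) {
            return new BufferedInputStream(new GZIPInputStream(bis), 8192);
        }
        return bis;
    }

    // Guesses a format name from the first bytes without consuming them.
    static String guessFormat(BufferedInputStream bis) throws IOException {
        String head = new String(peek(bis, PEEK_LIMIT),
                StandardCharsets.ISO_8859_1).replaceFirst("^\\s+", "");
        if (head.startsWith(">")) return "FASTA";
        if (head.startsWith("LOCUS")) return "GENBANK";
        if (head.startsWith("ID   ")) return "EMBL";
        return "UNKNOWN";
    }

    public static void main(String[] args) throws IOException {
        // Build a small gzipped FASTA record in memory as test input.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(">seq1\nACGT\n".getBytes(StandardCharsets.ISO_8859_1));
        }
        BufferedInputStream in =
                unwrapGzip(new ByteArrayInputStream(bos.toByteArray()));
        System.out.println(guessFormat(in));   // guess from peeked bytes
        System.out.println((char) in.read());  // stream is still at its start
    }
}
```

Because the guess relies on mark()/reset() rather than a temporary file,
the same stream can be handed straight to whichever parser matches the
guess. The mark is only valid while no more than PEEK_LIMIT bytes have
been read, so the peek size bounds the extra memory used.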
-- 
Richard Holland (BioMart Team)
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
UNITED KINGDOM
Tel: +44-(0)1223-494416