[Bioperl-l] Announcing Bio::SFF

Leon Timmermans l.m.timmermans at students.uu.nl
Mon Dec 19 15:48:34 UTC 2011


On Mon, Dec 19, 2011 at 3:31 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> I presume that's what Roche uses if they keep the index on disk.
>
> The alternative is to load the index into RAM, which is really fast.
> You just open the SFF, read the header, seek to the index, load
> the index. Without the index, you have to scan the entire SFF file
> to find each record and its offset - which is much slower.
>

That's what I'm doing now. It's much faster, but it still takes a
noticeable amount of time on large files.
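
For reference, the "read the header, seek to the index, load the index"
approach quoted above could be sketched roughly as follows in Python. This
assumes the standard 31-byte big-endian SFF common header layout;
read_raw_index is a hypothetical helper name, not part of Bio::SFF or
Biopython, and the index bytes are returned undecoded because their internal
layout depends on the tool that wrote the file.

```python
import io
import struct

# Fixed-size part of the SFF common header (big-endian, 31 bytes):
# magic, version, index_offset, index_length, number_of_reads,
# header_length, key_length, flows_per_read, flowgram_format_code
COMMON_HEADER = struct.Struct(">4s4sQIIHHHB")


def read_raw_index(handle):
    """Return (number_of_reads, raw index bytes or None) from a seekable handle.

    Hypothetical helper for illustration only.
    """
    handle.seek(0)
    fields = COMMON_HEADER.unpack(handle.read(COMMON_HEADER.size))
    magic, _version, index_offset, index_length, number_of_reads = fields[:5]
    if magic != b".sff":
        raise ValueError("not an SFF file")
    if index_offset == 0 or index_length == 0:
        # No index block: the caller has to scan every read record instead.
        return number_of_reads, None
    # Jump straight to the index block and load it into memory in one read.
    handle.seek(index_offset)
    return number_of_reads, handle.read(index_length)
```

With a real file this would be called as read_raw_index(open(path, "rb")),
where path is whatever SFF file is at hand. Decoding the returned bytes into
read names and offsets is left out, since that part is format-specific.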

> Have you looked at the sample SFF data in Biopython? Please
> use them for the BioPerl unit tests (we've been talking about a
> cross project collection of test data files like this), the README
> file should be self-explanatory:
> https://github.com/biopython/biopython/tree/master/Tests/Roche
>

Yeah, I'm using those now (
https://github.com/Leont/bio-sff/blob/master/t/reader.t). I must say there
were some interesting corner cases in them.

Leon


