[Biopython-dev] [Bug 2837] Reading Roche 454 SFF sequence read files in Bio.SeqIO

bugzilla-daemon at portal.open-bio.org
Fri Sep 4 10:23:26 UTC 2009


http://bugzilla.open-bio.org/show_bug.cgi?id=2837





------- Comment #6 from biopython-bugzilla at maubp.freeserve.co.uk  2009-09-04 06:23 EST -------
I've been working on the Roche SFF indexes, and via their tools have discovered
there are at least two index block formats used:

Most SFF files I have looked at have an index block which starts ".mft1.00"
(short for Manifest v1.00 is my guess) and which holds both an XML "manifest"
(i.e. metadata) and a read offset index.

You can also get SFF files where the index block starts ".srt1.00" (Short Read
Table v1.00 maybe?) which have just an index.
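
As a rough illustration of telling the two variants apart, the sketch below
looks only at the first eight bytes of an index block. The function name and
the returned labels are made up for this example, and it assumes the caller
has already located and read the index block from the SFF file.

```python
def classify_index_block(block: bytes) -> str:
    """Guess the index block flavour from its leading 8-byte tag.

    Hypothetical helper; the labels returned are illustrative only.
    """
    magic = block[:8]
    if magic == b".mft1.00":
        # XML manifest plus a read offset index
        return "manifest+index"
    if magic == b".srt1.00":
        # read offset index only
        return "index-only"
    return "unknown"
```

Anything else is reported as "unknown", since there may be further index
block formats beyond these two.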

The index details themselves are the same in both cases, and support
arbitrary read name lengths. The offset is in base 255 (not 256), apparently so
that byte 255 (0xFF) can be used as a separator character. For typical Roche
SFF files, the read names are 14 characters, and the index uses 20 bytes per
read.
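
To make the base 255 idea concrete, here is a minimal sketch of decoding such
an index. The per-entry layout is an assumption on my part: read name, a null
byte, four base 255 digits (most significant first), then the 0xFF separator.
That adds up to the 20 bytes per read seen for 14 character names, but the
exact layout and digit order are not confirmed above. All function names are
hypothetical.

```python
def decode_base255(digits: bytes) -> int:
    """Decode base 255 digits, assumed most significant first."""
    value = 0
    for d in digits:
        if d == 0xFF:
            # 0xFF is reserved as the separator, so it is never a digit
            raise ValueError("0xFF is not a valid base 255 digit")
        value = value * 255 + d
    return value


def encode_base255(value: int, width: int = 4) -> bytes:
    """Encode a non-negative integer as `width` base 255 digits."""
    digits = []
    for _ in range(width):
        value, d = divmod(value, 255)
        digits.append(d)
    if value:
        raise ValueError("value does not fit in the requested width")
    return bytes(reversed(digits))


def parse_index_entries(data: bytes):
    """Yield (read_name, offset) pairs from the entries of an index.

    Assumed entry layout: name + NUL + 4 base 255 digits + 0xFF.
    """
    pos = 0
    while pos < len(data):
        nul = data.index(b"\x00", pos)       # end of the read name
        name = data[pos:nul].decode("ascii")
        digits = data[nul + 1:nul + 5]
        if len(digits) != 4 or data[nul + 5:nul + 6] != b"\xff":
            raise ValueError("malformed index entry for %r" % name)
        yield name, decode_base255(digits)
        pos = nul + 6
```

Because the name is delimited by the null byte rather than a fixed width, the
same loop copes with arbitrary read name lengths.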

