[emboss-dev] GFF3 in EMBOSS

Pjotr Prins pjotr.public78 at thebird.nl
Thu Aug 12 07:57:55 EDT 2010


On Thu, Aug 12, 2010 at 11:52:23AM +0100, Peter Rice wrote:
> We are looking into storing data structures for large datasets on disk -  
> not only for features but also for next-generation mapped reads.

That is a great idea! The first quick win is not to load sequence
data into memory, but to fetch it on demand using a seek index,
something that BioPerl has.
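
Roughly, a seek index could look like this in Python. This is only a
sketch of the general idea, not BioPerl's or EMBOSS's implementation;
the function names are made up for illustration, and it assumes a
plain FASTA file opened in binary mode:

```python
def build_seek_index(handle):
    """Scan a FASTA file once and map each record ID to the byte
    offset of its header line. No sequence data is kept in memory."""
    index = {}
    while True:
        offset = handle.tell()          # position before reading the line
        line = handle.readline()
        if not line:                    # end of file
            break
        if line.startswith(b">"):
            fields = line[1:].split()
            if fields:                  # ID is the first word of the header
                index[fields[0].decode("ascii")] = offset
    return index


def fetch_sequence(handle, index, seq_id):
    """Seek to one record and read only that record's sequence lines."""
    handle.seek(index[seq_id])
    handle.readline()                   # skip the header line
    parts = []
    for line in iter(handle.readline, b""):
        if line.startswith(b">"):       # next record starts here
            break
        parts.append(line.strip())
    return b"".join(parts).decode("ascii")
```

With a file opened as open(path, "rb"), the index is built once and
each sequence is then read from disk only when it is asked for. The
index itself is small (one integer per record), so it could also be
saved next to the data file and reused.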

> Can you give an example of the input you are trying to handle?

I am dealing with worms - WormBase uses GFF3 for some worms. EMBOSS
is already memory efficient compared to BioRuby/Python/Perl, so I
am thinking of a BioLib mapping. A writeup is here:

  http://thebird.nl/biolib/Adding_BioLib_EMBOSS_GFF3_Support.html

> I hope to explore these issues at the GMOD meeting in Cambridge (UK) soon.

It makes sense for (desktop) genome browsers, for one.

Pj.

