[Bioperl-l] Questions about Bio::Index and Bio::DB

morgarws@mh.us.sbphrd.com morgarws@mh.us.sbphrd.com
Tue, 02 Apr 2002 16:54:38 -0500


I have a few questions about these two modules as ways of getting random 
access into FASTA files:

1. A GenBank FASTA file normally has the gi and gb identifiers concatenated 
together in the header line, so it appears that Bio::Index can only access a 
sequence by the concatenated ID. Is this the expected behavior?

2. Bio::DB::Fasta seems to be able to generate an index (via the -makeid 
option) for one of the IDs but not for all that appear, i.e. the routine 
specified with the -makeid option is supposed to return a scalar. Is it 
planned, or in the works, to allow the -makeid routine to return a list of 
IDs under which the sequence is indexed?

3. Both the latest version of WashU BLAST and (I believe, though I haven't 
checked for myself) NCBI BLAST can generate an index file for a FASTA file, 
which can then be used to randomly access the sequences in that file (WashU 
provides this with its xdformat and xdget commands). Is anybody working on 
adding support for these indexes to either Bio::Index or Bio::DB? And if 
someone wanted to do it, which module would it make the most sense to add 
it to?
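
To make questions 1 and 2 concrete, here is a minimal sketch of what I mean by 
indexing one record under several IDs. It is written in Python purely for 
illustration and is not BioPerl code. The function names, the list of database 
tags, and the accession numbers in the usage note are all hypothetical, and 
the header parsing assumes the simple "gi|number|gb|accession|" layout.

```python
import io

# Assumed (incomplete) set of database tags that may appear in a header ID.
DB_TAGS = {"gi", "gb", "emb", "dbj", "ref", "sp", "pir", "pdb"}


def split_defline_ids(defline):
    """Return the concatenated ID plus each individual ID found in it."""
    words = defline.lstrip(">").split()
    if not words:
        return []
    primary = words[0]            # e.g. "gi|12345|gb|AF000001.1|"
    ids = [primary]
    for field in primary.split("|"):
        # Keep non-empty fields that are not database tags.
        if field and field.lower() not in DB_TAGS and field not in ids:
            ids.append(field)
    return ids


def build_index(handle):
    """Map every ID of every record to the byte offset of its header line."""
    index = {}
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line.startswith(b">"):
            for seq_id in split_defline_ids(line.decode("ascii", "replace")):
                # The first record seen for an ID wins.
                index.setdefault(seq_id, offset)
    return index


def fetch(handle, index, seq_id):
    """Seek to the record indexed under seq_id and return its sequence."""
    handle.seek(index[seq_id])
    handle.readline()             # skip the header line
    chunks = []
    for line in iter(handle.readline, b""):
        if line.startswith(b">"):
            break
        chunks.append(line.strip())
    return b"".join(chunks).decode("ascii")
```

With a binary file handle (or an io.BytesIO for testing), a record whose 
header starts with "gi|12345|gb|AF000001.1|" could then be fetched by the full 
concatenated ID, by "12345", or by "AF000001.1". That is the behavior a 
list-returning -makeid routine would give.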

Thanks in advance,

Bill Morgart
morgarws@molbio.sbphrd.com