[Bioperl-l] Reading all sequences using Bio::DB::Flat in SwissProt file

Chris Mungall cjm at fruitfly.org
Fri Jan 21 12:32:33 EST 2005


Brian,

Unfortunately, the id_parser method isn't supported in
Bio::Index::Swissprot.

Even if it were, I don't think it would be sufficient here - Kenny needs to
index using the feature fields. This implies that the search key wouldn't
be unique, and Bio::Index::Abstract requires a unique key for the index.
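
To make the uniqueness problem concrete, here is a toy sketch in Python
(not BioPerl). The accessions and feature keys are made up for
illustration, and I am not claiming this is how Bio::Index works
internally. It only shows the general point: a map that holds one record
per key cannot answer "which entries have this feature?", while a map
from each key to a list of records can.

```python
from collections import defaultdict

# Hypothetical (accession, feature key) pairs; not real Swiss-Prot data.
records = [
    ("ACC001", "SIGNAL"),
    ("ACC002", "SIGNAL"),
    ("ACC002", "TRANSMEM"),
    ("ACC003", "TRANSMEM"),
]

# One record per key: a later entry with the same key replaces the
# earlier one, so most matches are lost.
unique_index = {}
for accession, feature_key in records:
    unique_index[feature_key] = accession

# Non-unique (inverted) index: every key keeps all matching records.
inverted_index = defaultdict(list)
for accession, feature_key in records:
    inverted_index[feature_key].append(accession)
```

With the second structure, looking up a feature key returns every
accession that carries it, which is what a feature search needs.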

Flexible indexing and retrieval such as this is best handled using some
generic, non-BioPerl-specific solution: an RDB, an XML DB, SRS, Lucene,
LuceGene, etc.
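
As a rough idea of what the RDB route might look like, here is a minimal
sketch using Python's standard sqlite3 module. The table layout, column
names, function names and sample rows are all assumptions made up for
illustration; extracting the features from the Swiss-Prot file is left
out entirely.

```python
import sqlite3


def build_index(conn, entries):
    """Store (accession, feature_key, start, end) tuples in a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feature ("
        " accession TEXT NOT NULL,"
        " feature_key TEXT NOT NULL,"
        " start_pos INTEGER,"
        " end_pos INTEGER)"
    )
    # A plain, non-unique index: many rows may share one feature key.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS feature_key_idx"
        " ON feature (feature_key)"
    )
    conn.executemany("INSERT INTO feature VALUES (?, ?, ?, ?)", entries)
    conn.commit()


def accessions_with(conn, feature_key):
    """Return the sorted accessions that have the given feature key."""
    rows = conn.execute(
        "SELECT DISTINCT accession FROM feature"
        " WHERE feature_key = ? ORDER BY accession",
        (feature_key,),
    )
    return [row[0] for row in rows]


# Hypothetical sample rows; a real index would live in a file on disk
# rather than in memory, so it is built once and reused.
conn = sqlite3.connect(":memory:")
build_index(conn, [
    ("ACC001", "SIGNAL", 1, 22),
    ("ACC002", "SIGNAL", 1, 18),
    ("ACC002", "TRANSMEM", 40, 60),
    ("ACC003", "TRANSMEM", 15, 35),
])
signal_accessions = accessions_with(conn, "SIGNAL")
```

The accessions that come back could then be fed to whatever
by-accession lookup is already in use, so the flat file only has to be
scanned once, when the table is loaded.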

I forgot to mention Don Gilbert's LuceGene in my original reply - it's a
fairly sane open-source alternative to SRS. It handles lots of
bioinformatics file formats (I'm not sure about swissprot, but I'm sure it
could be added).

See:
http://www.gmod.org/lucegene/index.shtml

Cheers
Chris

On Fri, 21 Jan 2005, Brian Osborne wrote:

> Kenny,
>
> Did you take a look at Bio/Index/Swissprot.pm? What's important for you will
> be building the index using the keys you're interested in as opposed to the
> default key, using the id_parser method. See the Bio::Index section in the
> bptutorial for an example.
>
> Brian O.
>
> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org
> [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Daily,
> Kenneth Michael
> Sent: Wednesday, January 19, 2005 11:49 AM
> To: bioperl-l at portal.open-bio.org
> Subject: [Bioperl-l] Reading all sequences using Bio::DB::Flat in
> SwissProtfile
>
>
> I want to work with a local copy of the SwissProt database, and need to
> search through all of the entries. I only see methods to return sequences by
> accession. However, I cannot use just FASTA format of the SwissProt records,
> as I need to use the feature fields. What I need to learn is how to do a DB
> search on the features field of the SwissProt records, if it's possible.
> Would there be any advantage to doing it with the DB instead of just using
> SeqIO as an input stream? I think it might, since every time I want to do a
> search I must read in the entire file again, which is very costly. Thank
> you.
>
> Kenny Daily
> Indiana University
> School of Informatics
> kmdaily [at] indiana [dot] edu
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

