[Bioperl-l] Beginner Script Error

Robert Bradbury robert.bradbury at gmail.com
Mon Sep 14 19:34:52 UTC 2009


On 9/13/09, Cavin Ward-Caviness <cavin.wardcaviness at gmail.com> wrote:

> $seq = get_sequence('swiss',"ROA1_HUMAN");

Well, I haven't looked at the documentation or the source, but the
working code I have that does something similar is:
    # database options include: Swissprot, EMBL, GenBank and RefSeq
    $seq_object = get_sequence('swissprot', $seqname);

I think the database names have to match a specific string but may not
need to match its case.  Sequence names also tend to be specific to a
database's format, so my own "general" fetch function catches exceptions
and then tries other databases, for example if the name looks like a PDB
identifier.  I'm not sure whether there is a library function that
fetches a "general" sequence based on the sequence name format.
Presumably one could do something like this with some kind of
"prioritized" list of databases to go through, e.g. GenBank, EMBL,
SwissProt, RefSeq, PDB, JDB, JGI, Broad, NCBI, C. elegans, Drosophila,
Yeast, and other organism-specific databases.  It might be nice if there
were a "general" BioPerl function that would do this based on sequence
name format, locality (fetch from the nearest database), and how
up to date each database is.  Ultimately one might like to have a kind of
sequence "rsync" function of the form UpdateSequence(SeqName, prefDb,
last-update-date, update-size, update-md5sum, ...) that would perform
inexpensive network-based updates for gene sets of interest.  I'm
presuming that many sequence entries in active databases undergo
periodic updates, and thus one might be interested in weekly
or monthly "local" db updates.
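
A rough sketch of the "try a prioritized list of databases" idea, in
case it helps.  The name fetch_with_fallback is made up for
illustration and is not a BioPerl function; the fetcher is passed in as
a code reference, on the assumption that it throws an exception on
failure the way get_sequence does.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): try each database in
# priority order and return (sequence, database name) for the first
# fetch that succeeds, or an empty list if every database fails.
# $fetch is called as $fetch->($db, $id) and is assumed to die on failure.
sub fetch_with_fallback {
    my ($id, $databases, $fetch) = @_;
    for my $db (@$databases) {
        # eval traps the exception so the loop can move on to the next db
        my $seq = eval { $fetch->($db, $id) };
        return ($seq, $db) if defined $seq;
        warn "fetch of $id from $db failed: "
            . ($@ || "no sequence returned\n");
    }
    return;
}

# Possible wiring with Bio::Perl (needs BioPerl and network access):
#   use Bio::Perl;    # exports get_sequence
#   my ($seq, $db) = fetch_with_fallback('ROA1_HUMAN',
#       [qw(swissprot genbank embl refseq)], \&get_sequence);
```

The priority order is just the order of the list, so a check on the
name format (for example, something that looks like a PDB identifier)
could reorder or trim the list before the call.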

Robert


