[Bioperl-l] Walking multiple bioentries using bioperl-db
Jay Hannah
jay at jays.net
Wed Jul 19 09:43:52 EDT 2006
Howdy --
I'm using bioperl-db + biosql-schema + MySQL.
I can now successfully build a biosql-schema instance in MySQL, load
taxonomy, then use bioperl-db to load a GenBank file from disk, committing
the sequences I want. For a given accession number + version + namespace,
I can tell bioperl-db to delete that from MySQL and it does. Yay!! I'll be
throwing a "Using bioperl-db" document onto the wiki over the next week.
What I am currently baffled by:
How do I ask bioperl-db to walk over multiple bioentries in my database so
I can do things with them? The simplest possible example: print a list of
all bioentries in my database.
It is trivially easy to just query MySQL directly, but if I'm reading and
understanding the documentation correctly, bioperl-db intends to be
database schema and RDBMS agnostic. In that case, I should use bioperl-db
to walk my records. So, how do I do that?
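To be concrete about the direct route I'd like to avoid, here is a rough
sketch using plain DBI against the stock biosql-schema tables bioentry and
biodatabase. The database name, host, user, and password are placeholders
I made up for illustration, and this bypasses bioperl-db entirely, so it
is tied to both MySQL and the schema:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Placeholder connection details: adjust database, host, user, password.
my $dsn = "DBI:mysql:database=biosql;host=localhost";
my $dbh = DBI->connect($dsn, "user", "password", { RaiseError => 1 });

# In biosql-schema, bioentry.biodatabase_id points at the biodatabase
# (namespace) row that the entry was loaded under.
my $sth = $dbh->prepare(q{
    SELECT d.name, e.accession, e.version
    FROM bioentry e
    JOIN biodatabase d ON d.biodatabase_id = e.biodatabase_id
    ORDER BY d.name, e.accession, e.version
});
$sth->execute;

# Print one line per bioentry: namespace, then accession.version
while (my ($namespace, $accession, $version) = $sth->fetchrow_array) {
    print "$namespace\t$accession.$version\n";
}

$dbh->disconnect;
```

That works, but it is exactly the schema-specific code I assume
bioperl-db is meant to spare me from writing.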
Is Bio::DB::Query::BioQuery the way to do this? The only way?
If so then can someone help me understand the datacollections() and
where() methods?
perldoc Bio::DB::Query::BioQuery
    # all mouse sequences loaded under namespace ensembl that
    # have receptor in their description
    $query->datacollections(["Bio::PrimarySeqI e",
                             "Bio::Species=>Bio::PrimarySeqI sp",
                             "BioNamespace=>Bio::PrimarySeqI db"]);
    $query->where(["sp.binomial like 'Mus *'",
                   "e.desc like '*receptor*'",
                   "db.namespace = 'ensembl'"]);

    # all mouse sequences loaded under namespace ensembl that
    # have receptor in their description, and that also have a
    # cross-reference with SWISS as the database
    $query->datacollections(["Bio::PrimarySeqI e",
                             "Bio::Species=>Bio::PrimarySeqI sp",
                             "BioNamespace=>Bio::PrimarySeqI db",
                             "Bio::Annotation::DBLink xref",
I'm bewildered by this API. Please forgive my ignorance.
1) How do I get *all* bioentries out of my database?
2) Say I did want just the "namespace" 'Pico' (one of my
biodatabase.name values). Where did
    "BioNamespace=>Bio::PrimarySeqI db"]);
come from? How was I supposed to figure out the left hand side of that
mapping? The right hand side? If that line wasn't sitting in that document,
was there a way for me to figure it out as a *user* of bioperl-db? Or
would I need to be a *programmer* of bioperl-db reading source to figure
this out? Where did
    "db.namespace = 'ensembl'"]);
come from? Again, do I have to read source code to know how to invoke
that magic?
Sorry if I sound like a jerk. That is not my intention. Hopefully I can
document the answers for future bioperl-db'ers.
Thanks in advance,
j
my current plaything: http://openlab.jays.net