[Bioperl-l] Bio::Graphics

Lincoln Stein lstein@cshl.org
Tue, 5 Feb 2002 08:12:52 -0500


I appreciate your letting me check in DB::GFF.  I suppose it sticks
out a bit from the rest of Bioperl, but I am changing the
documentation to expose the Bioperl API and mute the Ace API aspects
of the thing.  Bio::Graphics shouldn't stick out since it uses
Bio::SeqI slavishly.

On the subject of the BioSQL database, Chris Mungall's Gadfly API
happened to be close enough to DB::GFF that the Berkeley group was
able to get the generic genome browser running on top of it pretty
quickly.  Since BioSQL is now morphing into a more general-purpose
annotation database, I would like to explain the parts of the DB::GFF
API that the genome browser depends on and that are extensions to
Bio::SeqIO.  If you happen to borrow these API components for BioSQL,
then the browser will run on top of BioSQL from day one.

 $db = Bio::DB::GFF->new(...);          # bioperl compliant
 $seq    = $db->get_Seq_by_id($id);             # bioperl compliant
 $seq    = $db->get_Seq_by_acc($id);            # bioperl compliant
 $seq    = $db->get_Seq_by_XXX($id);            # bioperl compliant
 $stream = $db->get_Stream_by_id([$id,...]);    # "bioperl compliant"
 $stream = $db->get_Stream_by_batch([$id,...]); # "bioperl compliant"
 $seq    = $stream->next_seq;                   # bioperl compliant
 # NOTE: actually, neither of the get_Stream_by_XXX calls is part of
 #       RandomAccessI; DB::SwissProt uses the batch form and DB::GenBank
 #       uses the by_id() form
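
The stream convention above (call next_seq until nothing comes back) can be sketched with a toy class. ToyStream and its contents are hypothetical stand-ins, not part of Bioperl; the sketch only assumes that an exhausted stream returns undef.

```perl
use strict;
use warnings;

# Hypothetical stand-in for whatever a get_Stream_by_XXX() call returns.
package ToyStream;

sub new {
    my ($class, @items) = @_;
    return bless { items => [@items] }, $class;
}

# Hand back the next item, or undef once the stream is exhausted.
sub next_seq {
    my $self = shift;
    return shift @{ $self->{items} };
}

package main;

my $stream = ToyStream->new('seq_A', 'seq_B', 'seq_C');
my @seen;
while (defined(my $seq = $stream->next_seq)) {
    push @seen, $seq;    # a real caller would work with a Bio::SeqI object here
}
print join(',', @seen), "\n";
```

The point of the pattern is that the caller never needs to hold the whole result set in memory at once.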

 # My extensions

 # Title: segment()
 # Construct a Bio::SeqI object based on the name of a landmark,
 # and optionally the start and/or end of the segment to retrieve.
 # Whatever needs to be done to span the assembly happens at this
 # point.  Think of this as a lightweight make_virtual_contig() 
 # call.

 $segment = $db->segment(-name=>$name,-start=>$start,-end=>$end);

 # Title: absolute()
 # Toggle on and off relative coordinate addressing.  Segments
 # start out as relative to the landmark named in the segment()
 # call; passing a true flag to absolute will force coordinates
 # derived from the segment to be absolute to the highest container.
 # (if you don't want to implement relative coordinate addressing, 
 # then just make everything absolute by default).

 $segment->absolute([$flag]);
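
One way to picture relative versus absolute addressing is a segment that remembers where its landmark sits in the top-level container. This is only a sketch: ToySegment, its offset field and the numbers are invented for illustration and say nothing about how DB::GFF stores coordinates.

```perl
use strict;
use warnings;

# Hypothetical toy segment; not the Bio::DB::GFF implementation.
package ToySegment;

sub new {
    my ($class, %args) = @_;
    my $self = {
        offset   => $args{offset},    # absolute position of the landmark's base 1
        start    => $args{start},     # landmark-relative start
        end      => $args{end},       # landmark-relative end
        absolute => 0,                # segments start out relative
    };
    return bless $self, $class;
}

# Get or set the addressing mode, mirroring absolute([$flag]).
sub absolute {
    my ($self, $flag) = @_;
    $self->{absolute} = $flag ? 1 : 0 if defined $flag;
    return $self->{absolute};
}

sub start { my $self = shift; return $self->_map($self->{start}) }
sub end   { my $self = shift; return $self->_map($self->{end}) }

# Shift a relative position into container coordinates when absolute is on.
sub _map {
    my ($self, $pos) = @_;
    return $self->{absolute} ? $pos + $self->{offset} - 1 : $pos;
}

package main;

my $segment = ToySegment->new(offset => 10_001, start => 1, end => 500);
printf "relative: %d..%d\n", $segment->start, $segment->end;
$segment->absolute(1);
printf "absolute: %d..%d\n", $segment->start, $segment->end;
```

With these made-up numbers the first line reports 1..500 and the second 10001..10500. An implementation that skips relative addressing would simply behave as if the flag were always true.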

 # Title: features()
 # This is also called all_SeqFeatures() for Bioperl compatibility
 # but it has different calling conventions.  It returns all
 # features that overlap the segment, optionally filtering
 # them by their type.  The type is a string, and can be
 # a DAML/OIL path once Mike Ashburner, Suzi and I publish
 # the DAS feature ontology.

 @features = $segment->features(-type=>['type1','type2','type3'],
                                @other_options_you_dont_care_about);
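
The "overlaps the segment, optionally filtered by type" rule can be sketched over plain hashes. The feature records and the overlapping_features() helper are hypothetical; a real backend would do this in the database rather than in a grep.

```perl
use strict;
use warnings;

# Hypothetical toy data: plain hashes instead of real feature objects.
my @all_features = (
    { type => 'exon',   start => 100, end => 200 },
    { type => 'intron', start => 201, end => 400 },
    { type => 'exon',   start => 401, end => 450 },
    { type => 'repeat', start => 900, end => 950 },
);

# Return the features that overlap [-start, -end], optionally
# restricted to the types listed in -type.
sub overlapping_features {
    my (%args) = @_;
    my ($start, $end) = @args{qw(-start -end)};
    my %wanted = map { $_ => 1 } @{ $args{-type} || [] };

    return grep {
        $_->{start} <= $end && $_->{end} >= $start    # any overlap at all
            && (!%wanted || $wanted{ $_->{type} })    # no -type means every type
    } @all_features;
}

my @hits = overlapping_features(-start => 150, -end => 420, -type => ['exon']);
print "$_->{type} $_->{start}..$_->{end}\n" for @hits;
```

Note that overlap is not containment: the first exon starts before the query range and is still returned.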

 # Title: get_feature_stream()
 # As above, but fetches a sequence stream.  This is also called
 # get_seq_stream() for Bio::SeqIO compatibility, but it takes
 # different arguments, so I thought it best to rename it.
 $stream = $segment->get_feature_stream(-type=>['type1','type2','type3'],
                                        @other_options_you_dont_care_about);

 # Title: contained_features(), contained_in(),
 #        get_contained_features_stream(), get_contained_in_stream()
 # These retrieve features based on other types of relative location
 # information

 # Title: $db->features(), $db->get_feature_stream()...
 # You can call the features() family directly on the
 # database object, to suck out all its features...

 # Title: text_search()
 # No database inspired by NCBI would be complete without a full-text search...
 $stream = $db->get_text_search_stream('text to search');

 # Title: attribute searches
 # Simple attribute search.  Keys of the hash are the attribute
 # names, values are desired values to match.  All matches are
 # exact strings, and multiple attributes are ANDed together.
 # (I'm not particularly enthusiastic about this; it's a hack)
 $stream = $db->get_feature_stream(-attributes=> \%attribute_hash);
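
The matching rule described for attribute searches (exact string comparison, all attributes ANDed) can be sketched as follows. The records, attribute names and the matches_all() helper are made up for illustration and are not DB::GFF internals.

```perl
use strict;
use warnings;

# Hypothetical feature records with free-form attributes.
my @features = (
    { name => 'f1', attributes => { Gene => 'abc-1', Note => 'confirmed' } },
    { name => 'f2', attributes => { Gene => 'abc-1', Note => 'predicted' } },
    { name => 'f3', attributes => { Gene => 'xyz-2', Note => 'confirmed' } },
);

# True only if every requested attribute is present and string-equal.
sub matches_all {
    my ($feature, $wanted) = @_;
    for my $key (keys %$wanted) {
        my $value = $feature->{attributes}{$key};
        return 0 unless defined $value && $value eq $wanted->{$key};
    }
    return 1;
}

my %attribute_hash = (Gene => 'abc-1', Note => 'confirmed');
my @matched = grep { matches_all($_, \%attribute_hash) } @features;
print "$_->{name}\n" for @matched;
```

Only f1 satisfies both conditions here; f2 and f3 each match one attribute and are rejected.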

 # Pass thru a SQL query to the database.  Must be done very
 # carefully in order to reconstruct the objects properly...
 $stream = $db->get_feature_stream(-query=>'SQL QUERY');

Lincoln

Ewan Birney writes:
 > On Mon, 4 Feb 2002, Lincoln Stein wrote:
 > 
 > > Actually Bio::Graphics does introduce a dependency on the GD module, which is 
 > > usually found on Linux distributions, but is not universal.  I hadn't thought of 
 > > that.
 > > 
 > > I could add Bio::Graphics to the existing bioperl-gui package, but I'm not so 
 > > keen to do that since it is hidden on the FTP site.  From the bioperl.org 
 > > main page it takes two clicks and a cut-and-paste to get to the bioperl-gui 
 > > distribution.  Yargs.
 > 
 > 
 > It is a debatable point. The GD dependency is not as great as the Tk dependency
 > the rest of -gui has, and is more "server side" (an argument for putting it
 > into bioperl-live). But splitting things up is good; otherwise we just
 > have a behemoth of a system, although it is relatively well structured
 > directory-wise.
 > 
 > 
 > Of course, we let Lincoln check in DB::GFF because we like him and wanted
 > him to contribute so....
 > 
 > 
 > Oh Vey. I don't know.
 > 
 > 
 > I vote marginally for putting it into bioperl-live

-- 
========================================================================
Lincoln D. Stein                           Cold Spring Harbor Laboratory
lstein@cshl.org                                   Cold Spring Harbor, NY
Positions available at my lab: see http://stein.cshl.org/#hire
========================================================================