[Bioperl-l] my bioperl-db hacks

Hilmar Lapp hlapp at gmx.net
Tue Dec 30 14:14:09 EST 2003


Note that bioperl-db 0.1 has been outdated for about a year now. It  
won't work with the present biosql schema either. In order to use 0.1  
you will also need to use a pre-Singapore version of biosql.

The current and interoperating versions of bioperl-db and biosql are  
the respective cvs HEADs.

	-hilmar

On Tuesday, December 30, 2003, at 09:00  AM, T.D. Houfek wrote:

> I'm monkeying around with bioperl-db 0.1, trying to see what I can get
> it to do.  I set about following some instructions that tell
> you how to use the "load_seqdatabase.pl" script to fill your bioperl
> database with sequence from a swissprot release file.  (I am using
> sprot42.dat).  This did not work for me initially, but I made some
> vicious hacks to the code and now the script seems to work more or
> less.  It's this "more or less" I'd like comments on... I suspect other
> things may have broken because of what I have done, and that someone who
> knows the code can help me to find a more stable solution.
>
> I think the problem arises when, in parsing the sprot42.dat file,
> Bioperl encounters a record with a feature whose location must be
> expressed as a Bio::Location::Fuzzy object.  The inline documentation of
> biosqldb-mysql indicates that Fuzzy objects are not supported yet
> (but gives you an idea of where you could start if you wished to do so).
>
> Anyway, I first encountered an exception around line 169 of
> Bio/DB/SQL/SeqLocationAdaptor.pm, where a check is made (via
> $location->isa()) to see whether $location is the right kind of object.
>
> I just added the Fuzzy objects to the list of invited guests:
>
> # --start snippet ---------------------
>         if( $location->isa('Bio::Location::SplitLocationI') ) {
>             my $rank = 1;
>             foreach my $sub ( $location->sub_Location ) {
>                 $self->_store_component($sub,$seqfeature_id,$rank);
>                 $rank++;
>             }
>         } elsif( $location->isa('Bio::Location::Simple') ) {
>             $self->_store_component($location,$seqfeature_id,1);
>         } elsif( $location->isa('Bio::Location::Fuzzy') ) {
>             $self->_store_component($location,$seqfeature_id,1);
>         } else {
>             # Method calls do not interpolate inside a string, so concatenate.
>             $self->throw("Not a simple location nor a split nor a fuzzy. "
>                          . "Says it's a " . $location->type . ".  Yikes");
>         }
> # -- end snippet ----------------------
>
>
> Once I fixed this, the only thing that broke was around line 208.
> Probably as a normal consequence of supporting Fuzzy locations (but of
> course I mention it in case it is bad behavior), some locations passing
> through this section of code were missing either starts or ends.  The
> $start and $end variables were set to the null string, and the SQL insert
> statement they were passed into failed.  Failure in depositing one entry
> would terminate the script (but did not undo prior inserts).
>
> With a two-line hack circa 208 I sidestepped outright failures.  I just
> forced uninitialized endpoints to be zero:
>
> 	# -- start snippet -------
>
> 	unless ($end) { $end=0; }       ## ADDED THESE TWO
> 	unless ($start) { $start=0; }   ## LINES HERE
>
>    	my $sth = $self->prepare("insert into seqfeature_location
> (seqfeature_location_id,seqfeature_id,seq_start,seq_end,seq_strand,location_rank)
> VALUES (NULL,$seqfeature_id,$start,$end,$strand,$rank)");
>
> 	# -- end snippet ---------
>
> Of course all I have really done is provide for a completely buggy
> persistence of Fuzzy objects.
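
A possible alternative to forcing the endpoints to zero, sketched here under
assumptions: if the seq_start and seq_end columns accept NULL (not checked
against the bioperl-db 0.1 schema), a missing endpoint could be bound as NULL
through placeholders instead of being interpolated into the SQL string.  The
helper name location_insert is hypothetical and is not part of bioperl-db; it
only builds the statement and the bind list, and it leaves out the
auto-increment seqfeature_location_id column.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of bioperl-db): build a parameterized INSERT
# for one location component.  Missing endpoints become undef, which DBI
# binds as SQL NULL, so an unknown endpoint is not stored as the number 0.
# Assumption: seq_start and seq_end are nullable in the target schema.
sub location_insert {
    my ($seqfeature_id, $start, $end, $strand, $rank) = @_;

    # Treat undef and the empty string as "unknown"; keep a real 0 as-is.
    for ($start, $end) {
        $_ = undef unless defined $_ && length $_;
    }

    my $sql = 'INSERT INTO seqfeature_location '
            . '(seqfeature_id, seq_start, seq_end, seq_strand, location_rank) '
            . 'VALUES (?, ?, ?, ?, ?)';

    return ($sql, $seqfeature_id, $start, $end, $strand, $rank);
}

# Example: a location whose start is missing (empty string).
my ($sql, @bind) = location_insert(42, '', 120, 1, 1);
print defined $bind[1] ? "start=$bind[1]\n" : "start=NULL\n";
```

With a DBI handle, the returned values would be used as
$dbh->prepare($sql)->execute(@bind).  This only avoids the failed insert; it
does not record the fuzziness of the location (for example its minimum and
maximum positions), so it is still not real support for Fuzzy objects.
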
>
> My guess is that SeqLocationAdaptor needs to be upgraded to handle the
> Fuzzy locations that Bioperl wants to make out of the Swissprot input.
> Is anyone already undertaking this?  Does anyone have any insight about
> what problems this hack of mine will cause downstream?
>
>
> -------------------------------
> T.D. Houfek
> (email sound-alike: tdhoufek-AT-unity-DOT-ncsu-DOT-edu)
> bioinformatics development lead
> Tobacco Genome Initiative
> North Carolina State University
> -------------------------------
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------



