[Bioperl-l] Re: Bio::DB::GFF trouble + Blast question

Venky Nandagopal venky at OCF.Berkeley.EDU
Thu Jul 3 16:33:26 EDT 2003


Thanks Scott, that was indeed the problem. I dropped and rebuilt the
database and it seems to work fine (I think) now, although I can't figure
out how the database got screwed up in the first place. Is there a way to
check that the database has been loaded completely/correctly?

Venky
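
A rough sketch of one such check, built on the fdna query quoted below. It
lists the chunk offsets a window should contain and reports any that are
absent or empty. The 2000-base chunk size and the fref/foffset columns come
from Scott's message; the fdna column name, the assumption that chunks start
at multiples of the chunk size, and the DSN on the command line are my own
guesses and need checking against the actual schema.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Chunk size taken from Scott's message (2000-base rows in fdna).
my $CHUNK = 2000;

# Offsets expected in the window [lo, hi], inclusive.
# Assumption: chunks start at multiples of the chunk size.
sub expected_offsets {
    my ($lo, $hi, $chunk) = @_;
    # Round lo up to the next multiple of the chunk size.
    my $first = int(($lo + $chunk - 1) / $chunk) * $chunk;
    my @offsets;
    for (my $o = $first; $o <= $hi; $o += $chunk) {
        push @offsets, $o;
    }
    return @offsets;
}

# Expected offsets that have no row, or an empty row, in fdna.
# Assumption: the sequence column is named fdna.
sub missing_chunks {
    my ($dbh, $fref, $lo, $hi, $chunk) = @_;
    my $rows = $dbh->selectall_arrayref(
        'SELECT foffset, LENGTH(fdna) FROM fdna '
      . 'WHERE fref = ? AND foffset >= ? AND foffset <= ?',
        undef, $fref, $lo, $hi);
    my %len = map { $_->[0] => $_->[1] } @$rows;
    return grep { !defined $len{$_} || $len{$_} == 0 }
           expected_offsets($lo, $hi, $chunk);
}

# Only touch the database when a DSN is given, e.g.
#   perl check_fdna.pl dbi:mysql:fly user password
# (script name and DSN are hypothetical).
if (@ARGV) {
    require DBI;
    my ($dsn, $user, $pass) = @ARGV;
    my $dbh = DBI->connect($dsn, $user, $pass, { RaiseError => 1 });
    my @missing = missing_chunks($dbh, '2L', 17414000, 17430000, $CHUNK);
    print @missing ? "missing or empty chunks at: @missing\n"
                   : "all chunks present\n";
    $dbh->disconnect;
}
```

This only covers the one window around CG6667. Checking a whole chromosome
would mean looping over its full length, and the last chunk of a reference
sequence may legitimately be shorter than 2000 bases, which is why the sketch
tests for presence rather than exact length.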

On 03 Jul 2003 16:21:15 -0400, Scott Cain <cain at cshl.org> wrote:

> Venky,
>
> Fair enough--if I change my example script to have the @genes line like
> yours, it works for CG6667 and several other nearby numbers.
>
> I am more inclined to distrust the database itself.  Perhaps a row in
> the database is corrupted.  Try these queries:
>
> mysql> select * from fgroup where gname='CG6667';
> +------+--------+--------+
> | gid  | gclass | gname  |
> +------+--------+--------+
> | 8185 | Gene   | CG6667 |
> +------+--------+--------+
>
> mysql> select * from fdata where gid=8185; (using the gid from above)
> +-------+------+----------+----------+---------------+---------+--------+---------+--------+------+---------------+--------------+
> | fid   | fref | fstart   | fstop    | fbin          | ftypeid | fscore | fstrand | fphase | gid  | ftarget_start | ftarget_stop |
> +-------+------+----------+----------+---------------+---------+--------+---------+--------+------+---------------+--------------+
> | 23255 | 2L   | 17414916 | 17428443 | 100000.000174 |       5 |   NULL | -       | NULL   | 8185 |          NULL |         NULL |
> +-------+------+----------+----------+---------------+---------+--------+---------+--------+------+---------------+--------------+
>
> select * from fdna where fref='2L' and foffset>=17414000 and 
> foffset<=17430000;
>
> which should give 9 rows of dna chunks that are 2000 bases long.
>
> If any of your results are different from mine (excluding ids), then I
> think your database has a problem.
>
> Scott
>
>
>
> On Thu, 2003-07-03 at 15:56, Venky Nandagopal wrote:
>> Scott,
>>
>> Thanks for the reply. I should have been more careful with my email -- 
>> my script actually has the following lines
>>
>> @genes = $db->get_feature_by_name("Gene" => $gene_id);
>> print $genes[0]->seq;
>>
>> The script works for every other CG number I've tried --- it only fails
>> for CG6667, which makes me think that there must be something weird
>> going on with Bio::DB::GFF, not the script.
>>
>> Venky
>>
>>
>>
>> On 03 Jul 2003 13:46:08 -0400, Scott Cain <cain at cshl.org> wrote:
>>
>> > Venky,
>> >
>> > It is not all that clear to me why in this case you need to, but you
>> > need to specify the class of the object, in this case 'Gene'.
>> >
>> > Here is an example script that works for me:
>> >
>> > #!/usr/bin/perl
>> > use strict;
>> > use Bio::DB::GFF;
>> > my $db = Bio::DB::GFF->new(-adaptor => 'dbi::mysql',
>> >                            -dsn     => 'fly');
>> > my @genes = $db->get_feature_by_name(-class=>'Gene',-name=>'CG6665');
>> > print $genes[0]->seq,"\n";
>> >
>> > Scott
>> >
>> > On Thu, 2003-07-03 at 11:58, bioperl-l-request at portal.open-bio.org
>> > wrote:
>> >> Message: 12
>> >> Date: Thu, 03 Jul 2003 01:43:57 -0700
>> >> From: Venky Nandagopal <venky at OCF.Berkeley.EDU>
>> >> Subject: [Bioperl-l] Bio::DB::GFF trouble + Blast question
>> >> To: bioperl-l at portal.open-bio.org
>> >> Message-ID: <oprrp7vjpxqwe008 at mail.ocf.berkeley.edu>
>> >> Content-Type: text/plain; charset=utf-8; format=flowed
>> >>
>> >> Hi,
>> >>
>> >> I have a couple of problems: (1) I use a database created using
>> >> process_gadfly.pl to access the D.mel genome, via Bio::DB::GFF. I have a
>> >> utility script that returns the sequence of a gene given the CG number,
>> >> using
>> >>     @genes = get_feature_by_name(CG####);
>> >>     print $genes[0]->seq;
>> >> This script seems to work fine for most CG numbers, except for CG6667,
>> >> which is the ID for the dorsal gene. For some reason, no sequence is
>> >> returned by the seq() method. The gene object is not undefined though,
>> >> since $genes[0]->asString returns "gene:gadfly(CG6667)"; similarly the
>> >> start, end, strand methods work fine. I have tried getting transcripts
>> >> instead of the gene etc etc, but CG6667 refuses to yield any sequence.
>> >> Can anyone provide an explanation for this?
>> >>
>> >>
>> >> (2) This is not directly connected to Bioperl, but BLAST reports sometimes
>> >> provide Expect values in the form "Expect(3)=0.0". What does the 3 refer
>> >> to? Sometimes it says "Expect(7+)=1e-23".
>> >>
>> >>
>> >> Thanks
>> >> Venky
>> >


