[Bioperl-l] question of seq() method of Bio::DB::GFF and Bio::PrimarySeq

Florent Angly florent.angly at gmail.com
Thu Jan 27 18:16:45 UTC 2011


Hi Shuai,
Scott answered your question, but you can find more information 
pertinent to your question here:
http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Getting_Sequences
Best,
Florent

On 27/01/11 08:37, Scott Cain wrote:
> Hi Shuai,
>
> The seq method does not return a sequence string, but a sequence
> object (Bio::Seq), on which you can perform many methods, including
> just getting the sequence string, but you can also doing things like
> translating and getting the reverse complement.  To get the DNA
> string, call the seq() method on your Bio::Seq object:
>
>    my $upstream->seq->seq;
>
> The first invocation of seq is calling the seq method in Bio::DB::GFF,
> which returns the Bio:Seq object, and the second call of seq calls the
> seq method in Bio::Seq, which returns a string.  The dna method in
> Bio::DB::GFF is a short cut for doing exactly this (I think--it's been
> a while since I've used that).
>
> Scott
>
>
> 2011/1/27 Zhan, Shuai<Shuai.Zhan at umassmed.edu>:
>> Dear BIOPERL developers,
>>
>> I'm a post doc in UMASS Medical School.
>>
>> Recently I carried out some genome-wide annotation work, but faced some difficulty with using Bio::DB::GFF to fetch DNA sequences, whose seq() method is required by another critical softwares.
>>
>> I found the key issue is that the method of seq() of Bio::DB::GFF (inherited from Bio::PrimarySeq?) can't exactly return the actual sequence string but something like reference. This question made me upset for several days and stoped my progress, but I have not found any related help from google or my friends. So I'm assuming you could give me some valuable information.
>>
>> My perl is 5.8.8 and system is CentOS 5.3.
>> I loaded gff file with annotation information both of CDS and reference by bp_load_gff.pl. The raw fasta sequence file was also loaded together.
>>
>> I wrote a test script just like your instruction on website.
>> [code]
>> use Bio::DB::GFF;
>> my $db = Bio::DB::GFF->new(-dsn =>  'dbi:mysql:brenneri', -user =>  'me', -password =>  '123',);
>> my $contig0 = $db->segment(-class =>  'Sequence', -name =>  'Contig0');
>> my $subcontig = $contig0->subseq(1,100);
>> print "returned by seq(): ", $upstream->seq, "\n";
>> print "returned by dna(): ", $upstream->dna, "\n\n";
>> my $transcripts = $db->segment(-class =>  'Transcript', -name =>  'fgp17221.t1');
>> my @exons = $transcripts->features('CDS');
>> print "returned by seq(): ", $exons[0]->seq "\n";
>> print "returned by seq(): ", $exons[0]->dna "\n\n";
>> my $upstream = $exons[0]->subseq(-20,0);
>> print "returned by seq(): ", $upstream->seq "\n";
>> print "returned by seq(): ", $upstream->dna "\n\n";
>> [/code]
>>
>> The result as follows:
>> $perl test.pl
>> returned by seq(): Bio::PrimarySeq=HASH(0x1afa40d0)
>> returned by dna(): GTAGATCACTTTTTATTCTCGAAGAATAATTTTTGAGCTATGTAAAAGCGTGTATACCTCCCAGTTTTCCGGTTAAAAGTCCAGGATCGCATGTCTCATG
>>
>> returned by seq(): Bio::PrimarySeq=HASH(0x1afa3ab0)
>> returned by dna(): AAGGAGCTCAACTTCAAACACCAAACACTTTGAACCAACATTTTTCTGGAGAAAACGTAAGAATTGAAAAGTCAAGAATGAATCCACATGTGAAAATTGAACAGAGAT
>> CAATGGCAATGGGTGGAGGAGGAGGGGATGAACAAATGAAAAACTTTACGGAAATGACGAATGAAGAACTCCGTGAGCGGTTAATGAAAATGCAAATGGATATGCAGAATCTTC
>> AAATGGCAATGG
>>
>> returned by seq(): Bio::PrimarySeq=HASH(0x1afa3fb0)
>> returned by dna(): GGAGTTTGGACCGTGTTTCAG
>>
>> Because the method of dna() could exactly return the actual sequence, so I think my loaded database should work. But I have no idea of why the method seq() missed the sequence.
>>
>> I'd greatly appreciate any help.
>>
>> Sincerly,
>> Shuai
>> Jan 27 2010
>>
>> 364 Plantation Str.
>> MA 01605
>> UMASS MED
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
>




More information about the Bioperl-l mailing list