[Bioperl-l] Bio::DB::SeqFeature sequences with no identifier?

Scott Cain scott at scottcain.net
Thu May 15 14:21:42 UTC 2014


Hi Mark,

The sequence has to be identified in some way, either with an explicit GFF
line like in my example, with a ##sequence-region directive (which I don't
care for but should work for you), or as part of a Target attribute (as
part of a match/similarity search result).  I'd say this isn't terribly
explicit in the spec but is in the examples.
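
As a rough sketch, assuming the 16-base SEQ1 from your a.fas, either of
these additions to b.gff should give the loader a named reference sequence
(columns are tab-separated in a real file; "contig" is just the type from
my earlier example, not a requirement):

    SEQ1    .    contig    1    16    .    .    .    Name=SEQ1

or, as a directive near the top of the file:

    ##sequence-region SEQ1 1 16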

Scott



On Thu, May 15, 2014 at 4:59 AM, Mark Wilkinson <markw at illuminae.com> wrote:

>  It doesn't.  Thanks!  I'll try that when I get in tomorrow.
>
> (is that a part of the GFF3 spec, or is it a BioPerl "thing"?  is it
> documented?)
>
> M
>
>
>
>
> On 14/05/2014 4:19 PM, Scott Cain wrote:
>
> Hi Mark,
>
>  Does your GFF have a line that identifies the reference sequence, like
> this:
>
>  SEQ1   .    contig    1    123456    .    .    .    Name=SEQ1
>
>  If not, that could be the problem.
>
>  Scott
>
>
>
> On Wed, May 14, 2014 at 5:22 AM, Mark Wilkinson <markw at illuminae.com> wrote:
>
>> Hi all BioPerlers!
>>
>> I'm confused by something.  In the scenario below I have a Fasta file and
>> a GFF file:
>>
>> =========
>> File:  a.fas
>>
>> >SEQ1
>> AAAATTTTCCCCGGGG
>>
>> =========
>> File:  b.gff
>>
>> SEQ1    hit1    match_part    1    5    .    .    .    .
>> SEQ1    hit2    match_part    6    10    .    .    .    .
>> =========
>>
>> I load them into a seqfeature DB:
>>
>> bp_seqfeature_load.pl -d dbi:mysql:seqdb -c -u root -p pass  a.fas b.gff
>>
>> I then explore the data as follows:
>>
>> use Bio::DB::SeqFeature::Store;
>>
>> my $db = Bio::DB::SeqFeature::Store->new(
>>     -adaptor => 'DBI::mysql',
>>     -dsn     => 'dbi:mysql:seqdb',
>>     -user => 'root',
>>     -password => 'pass');
>>
>> my $iterator = $db->get_seq_stream();
>> while (my $feature = $iterator->next_seq){
>>     print $feature->seq->seq;
>>     # THE SEQUENCE IS PRINTED
>>     print " comes from sequence named ";
>>     print $feature->seq->id;
>>     #  THE METHOD ABOVE RETURNS UNDEF
>> }
>>
>> my $seq = $db->segment('SEQ1');
>>      # $seq is undef, NOTHING IS RETURNED!?!?
>>
>> ============
>>
>> This is all very confusing.  It seems that the feature knows what
>> sequence it is attached to, because it gives me the correct string of
>> letters, but it doesn't know what the name of that sequence is... and in
>> fact, calling the sequence by name returns undef.
>>
>> Is this a bug, or is there a reason for this "disconnect" between a
>> sequence and its name?
>>
>> Help appreciated!
>>
>> Mark
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
>
>
>  --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                          scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/)                     216-392-3087
> Ontario Institute for Cancer Research
>


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                          scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research


