[Bioperl-l] Bio::DB::SeqFeature sequences with no identifier?

Scott Cain scott at scottcain.net
Wed May 14 10:19:15 EDT 2014


Hi Mark,

Does your GFF have a line that identifies the reference sequence, like this:

SEQ1   .    contig    1    123456    .    .    .    Name=SEQ1

If not, that could be the problem.
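
One way to check for such a line is sketched below in plain Perl, with no BioPerl calls. It is only an illustration under a few assumptions: the helper name seqids_without_reference_line is made up for this example, the GFF is tab-delimited with nine columns, and a "reference line" is taken to mean any feature whose Name or ID attribute equals its own seqid. The sample data mirrors the two match_part lines from the original message.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the seqids in GFF3 text
# that have no feature line whose Name or ID attribute equals the seqid.
# Assumes tab-delimited, nine-column GFF lines.
sub seqids_without_reference_line {
    my ($gff_text) = @_;
    my (%seen, %named);
    for my $line (split /\n/, $gff_text) {
        last if $line =~ /^##FASTA/;       # embedded sequence data follows
        next if $line =~ /^\s*(#|$)/;      # skip comments and blank lines
        my @col = split /\t/, $line;
        next unless @col == 9;             # ignore malformed lines
        my ($seqid, $attrs) = @col[0, 8];
        $seen{$seqid} = 1;
        for my $pair (split /;/, $attrs) {
            my ($key, $value) = split /=/, $pair, 2;
            next unless defined $value;    # "." means no attributes
            $named{$seqid} = 1
                if ($key eq 'Name' || $key eq 'ID') && $value eq $seqid;
        }
    }
    my @missing = grep { !$named{$_} } sort keys %seen;
    return @missing;
}

# The two feature lines from b.gff, rebuilt with real tabs.
my $gff = join "\n",
    join("\t", 'SEQ1', 'hit1', 'match_part', 1, 5,  '.', '.', '.', '.'),
    join("\t", 'SEQ1', 'hit2', 'match_part', 6, 10, '.', '.', '.', '.');

print "No reference line for: $_\n" for seqids_without_reference_line($gff);
```

If SEQ1 is reported, adding a line such as the contig example above (with an end coordinate of 16 to match the sequence in a.fas) and reloading would be the thing to try.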

Scott



On Wed, May 14, 2014 at 5:22 AM, Mark Wilkinson <markw at illuminae.com> wrote:

> Hi all BioPerlers!
>
> I'm confused by something.  In the scenario below I have a Fasta file and
> a GFF file:
>
> =========
> File:  a.fas
>
> >SEQ1
> AAAATTTTCCCCGGGG
>
> =========
> File:  b.gff
>
> SEQ1    hit1    match_part    1    5    .    .    .    .
> SEQ1    hit2    match_part    6    10    .    .    .    .
> =========
>
> I load them into a seqfeature DB:
>
> bp_seqfeature_load.pl -d dbi:mysql:seqdb -c -u root -p pass  a.fas b.gff
>
> I then explore the data as follows:
>
> use Bio::DB::SeqFeature::Store;
>
> my $db = Bio::DB::SeqFeature::Store->new(
>     -adaptor => 'DBI::mysql',
>     -dsn     => 'dbi:mysql:seqdb',
>     -user => 'root',
>     -password => 'pass');
>
> my $iterator = $db->get_seq_stream();
> while (my $feature = $iterator->next_seq){
>     print $feature->seq->seq;
>     # THE SEQUENCE IS PRINTED
>     print " comes from sequence named ";
>     print $feature->seq->id;
>     #  THE METHOD ABOVE RETURNS UNDEF
> }
>
> my $seq = $db->segment('SEQ1');
>      # $seq is undef, NOTHING IS RETURNED!?!?
>
> ============
>
> This is all very confusing.  It seems that the feature knows what sequence
> it is attached to, because it gives me the correct string of letters, but
> it doesn't know what the name of that sequence is... and in fact, calling
> the sequence by name returns undef.
>
> Is this a bug, or is there a reason for this "disconnect" between a
> sequence and its name?
>
> Help appreciated!
>
> Mark
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>



-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                          scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research

