[Bioperl-l] SearchIO to GFF (was: Getting 'features' from SearchIO?)

Dan Bolser dan.bolser at gmail.com
Tue May 12 09:17:56 EDT 2009


2009/5/12 Chris Fields <cjfields at illinois.edu>:
> Fixed that in svn.  We're all still learning the ropes...

In that case, I'm seeing multiple instances of...

Argument "" isn't numeric in addition (+) at Bio/Search/SearchUtils.pm line 256
Argument "" isn't numeric in addition (+) at Bio/Search/SearchUtils.pm line 412
Argument "" isn't numeric in addition (+) at Bio/Search/SearchUtils.pm line 429
Argument "" isn't numeric in addition (+) at Bio/Search/SearchUtils.pm line 465
Argument "" isn't numeric in addition (+) at Bio/Search/SearchUtils.pm line 473
Argument "" isn't numeric in addition (+) at Bio/Search/SearchUtils.pm line 494
Argument "" isn't numeric in addition (+) at Bio/Search/SearchUtils.pm line 502


Hmm... I was about to go on to complain about the weird GFF that I was
seeing, but suddenly it looks OK. My bioperl install must think your
standing over my shoulder and is therefore behaving itself!


Thanks again for all the help,
Dan.




> chris
>
> On May 12, 2009, at 5:55 AM, Dan Bolser wrote:
>
>> 2009/5/12 Dan Bolser <dan.bolser at gmail.com>:
>>>
>>> Unfortunately bp_search2gff.pl is giving me errors:
>>>
>>> bp_search2gff.pl --version 3   -i BlastResults/blast_table_filtered -f
>>> blasttable   -o BlastResults/blast_table_filtered.gff   -t hit
>>> --match   --target   --component
>>>
>>> --------------------- WARNING ---------------------
>>> MSG: Removing score value(s)
>>> ---------------------------------------------------
>>> Can't locate object method "remove_tags" via package
>>> "Bio::SeqFeature::Similarity" at
>>> /local/Scratch/dbolser/perl5/lib/perl5/Bio/SeqFeature/Generic.pm line
>>> 393, <GEN1> line 5.
>>
>>
>> I'm just learning the ropes...
>>
>> --- ~/perl5/lib/perl5/Bio/SeqFeature/Generic.pm~   2009-05-11
>> 15:25:55.000000000 +0100
>> +++ ~/perl5/lib/perl5/Bio/SeqFeature/Generic.pm    2009-05-12
>> 11:52:41.000000000 +0100
>> @@ -390,7 +390,7 @@
>>        }
>>        if ($self->has_tag('score')) {
>>            $self->warn("Removing score value(s)");
>> -            $self->remove_tags('score');
>> +            $self->remove_tag('score');
>>        }
>>        $self->add_tag_value('score',$value);
>>    }
>>
>>
>>
>>
>>
>>> Anyone seen this before?
>>>
>>> Cheers,
>>> Dan.
>>>
>>>
>>>
>>> 2009/5/12 Dan Bolser <dan.bolser at gmail.com>:
>>>>
>>>> Thanks for the info guys, I think I was naively hoping that the
>>>> feature would know how to cast itself as a 'SeqFeature' (GFF).
>>>>
>>>> I think I understand the problem better now, so I'll try to summarise:
>>>>
>>>> There is no standard way to encode a HSP as a feature (not least
>>>> because there are two choices about which sequence (query or the hit)
>>>> it should be attached to). BioPerl will try, but the result will not
>>>> be "well structured" SeqFeatures or "well formed" GFF.
>>>>
>>>>
>>>> From what I read I guess it should be possible to standardize this
>>>> mapping (based on something in one of the examples or the 'search2gff'
>>>> script), assuming you specify weather you want features put on the
>>>> query or on the hit.
>>>>
>>>> At some point last year I was trying out the bp_search2gff.pl and my
>>>> own code to write a GFF file for loading and viewing by Gbrowse. At
>>>> that time I gave up, as nothing seemed to be working. I was hoping
>>>> that doing this at a lower level (i.e. never writing any GFF myself)
>>>> it would stand a better chance of working.
>>>>
>>>> Also I was thinking that Gbrowse, if given a SeqFeature::Store, could
>>>> autoconfigure its interface to some degree. I guess its back to the
>>>> docs ;-)
>>>>
>>>>
>>>>
>>>> I'll keep trying and see if I can get anywhere.
>>>>
>>>> Thanks again,
>>>> Dan.
>>>>
>>>>
>>>>
>>>> References for the above:
>>>>
>>>> 2009/5/11 Jason Stajich <jason at bioperl.org>:
>>>>
>>>>> otherwise you need to be converting the HSPs into seqfeatures with the
>>>>> right associated information (i.e. the tag/value pairs that are in the 9th
>>>>> column) in order to have well structured data in the database.
>>>>
>>>>> You can get the individual features from the feature pair with
>>>>> $hsp->query  or $hsp->hit  which can also be passed to a GFF writer (or call
>>>>> $hsp->hit->gff_string).   Note that since the data storage is not structured
>>>>> in a GFF3 like-way this won't immediately produce well formed GFF3 for the
>>>>> 9th column.
>>>>
>>>>
>>>> 2009/5/11 Chris Fields <cjfields at illinois.edu>:
>>>>
>>>>> The main problem is the mapping is subjective based on what your
>>>>> reference sequence is within the BLAST run (e.g. whether it is the query or
>>>>> the hit), and is something that can't be automatically discerned.  I ended
>>>>> up rolling my own with SeqFeature::Store (just mapped the relevant data to
>>>>> Bio::DB::SeqFeatures), but I have long wanted to fix up the relevant scripts
>>>>> to integrate my changes in, just haven't had the time
>>>>
>>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>



More information about the Bioperl-l mailing list