[Bioperl-l] SearchIO to GFF (was: Getting 'features' fromSearchIO?)

Chris Fields cjfields at illinois.edu
Tue May 12 10:04:26 EDT 2009


More complicated than that, I'm afraid.  We should try to fix that at  
the source of the problem.

This appears to stem from SearchUtils HSP tiling, which in turn  
utilizes HSPI::matches(), which in turn checks num_identical/ 
num_conserved.  My guess is, since this is blasttable format, one of  
these isn't set and thus is returning the wrong value.  I'll attempt  
to track it down today, but it may take some time.

chris

On May 12, 2009, at 8:29 AM, Mark A. Jensen wrote:

> This sounds like a
>
> $sum = eval join( '+', @a);
>
> problem, which can be fixed with
>
> $sum = eval join('+', map { $_ || () } @a) ;
>
> MAJ
> ----- Original Message ----- From: "Dan Bolser" <dan.bolser at gmail.com>
> To: "Chris Fields" <cjfields at illinois.edu>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Tuesday, May 12, 2009 9:17 AM
> Subject: Re: [Bioperl-l] SearchIO to GFF (was: Getting 'features'  
> fromSearchIO?)
>
>
> 2009/5/12 Chris Fields <cjfields at illinois.edu>:
>> Fixed that in svn. We're all still learning the ropes...
>
> In that case, I'm seeing multiple instances of...
>
> Argument "" isn't numeric in addition (+) at Bio/Search/ 
> SearchUtils.pm line 256
> Argument "" isn't numeric in addition (+) at Bio/Search/ 
> SearchUtils.pm line 412
> Argument "" isn't numeric in addition (+) at Bio/Search/ 
> SearchUtils.pm line 429
> Argument "" isn't numeric in addition (+) at Bio/Search/ 
> SearchUtils.pm line 465
> Argument "" isn't numeric in addition (+) at Bio/Search/ 
> SearchUtils.pm line 473
> Argument "" isn't numeric in addition (+) at Bio/Search/ 
> SearchUtils.pm line 494
> Argument "" isn't numeric in addition (+) at Bio/Search/ 
> SearchUtils.pm line 502
>
>
> Hmm... I was about to go on to complain about the weird GFF that I was
> seeing, but suddenly it looks OK. My bioperl install must think your
> standing over my shoulder and is therefore behaving itself!
>
>
> Thanks again for all the help,
> Dan.
>
>
>
>
>> chris
>>
>> On May 12, 2009, at 5:55 AM, Dan Bolser wrote:
>>
>>> 2009/5/12 Dan Bolser <dan.bolser at gmail.com>:
>>>>
>>>> Unfortunately bp_search2gff.pl is giving me errors:
>>>>
>>>> bp_search2gff.pl --version 3 -i BlastResults/blast_table_filtered  
>>>> -f
>>>> blasttable -o BlastResults/blast_table_filtered.gff -t hit
>>>> --match --target --component
>>>>
>>>> --------------------- WARNING ---------------------
>>>> MSG: Removing score value(s)
>>>> ---------------------------------------------------
>>>> Can't locate object method "remove_tags" via package
>>>> "Bio::SeqFeature::Similarity" at
>>>> /local/Scratch/dbolser/perl5/lib/perl5/Bio/SeqFeature/Generic.pm  
>>>> line
>>>> 393, <GEN1> line 5.
>>>
>>>
>>> I'm just learning the ropes...
>>>
>>> --- ~/perl5/lib/perl5/Bio/SeqFeature/Generic.pm~ 2009-05-11
>>> 15:25:55.000000000 +0100
>>> +++ ~/perl5/lib/perl5/Bio/SeqFeature/Generic.pm 2009-05-12
>>> 11:52:41.000000000 +0100
>>> @@ -390,7 +390,7 @@
>>> }
>>> if ($self->has_tag('score')) {
>>> $self->warn("Removing score value(s)");
>>> - $self->remove_tags('score');
>>> + $self->remove_tag('score');
>>> }
>>> $self->add_tag_value('score',$value);
>>> }
>>>
>>>
>>>
>>>
>>>
>>>> Anyone seen this before?
>>>>
>>>> Cheers,
>>>> Dan.
>>>>
>>>>
>>>>
>>>> 2009/5/12 Dan Bolser <dan.bolser at gmail.com>:
>>>>>
>>>>> Thanks for the info guys, I think I was naively hoping that the
>>>>> feature would know how to cast itself as a 'SeqFeature' (GFF).
>>>>>
>>>>> I think I understand the problem better now, so I'll try to  
>>>>> summarise:
>>>>>
>>>>> There is no standard way to encode a HSP as a feature (not least
>>>>> because there are two choices about which sequence (query or the  
>>>>> hit)
>>>>> it should be attached to). BioPerl will try, but the result will  
>>>>> not
>>>>> be "well structured" SeqFeatures or "well formed" GFF.
>>>>>
>>>>>
>>>>> From what I read I guess it should be possible to standardize this
>>>>> mapping (based on something in one of the examples or the  
>>>>> 'search2gff'
>>>>> script), assuming you specify weather you want features put on the
>>>>> query or on the hit.
>>>>>
>>>>> At some point last year I was trying out the bp_search2gff.pl  
>>>>> and my
>>>>> own code to write a GFF file for loading and viewing by Gbrowse.  
>>>>> At
>>>>> that time I gave up, as nothing seemed to be working. I was hoping
>>>>> that doing this at a lower level (i.e. never writing any GFF  
>>>>> myself)
>>>>> it would stand a better chance of working.
>>>>>
>>>>> Also I was thinking that Gbrowse, if given a SeqFeature::Store,  
>>>>> could
>>>>> autoconfigure its interface to some degree. I guess its back to  
>>>>> the
>>>>> docs ;-)
>>>>>
>>>>>
>>>>>
>>>>> I'll keep trying and see if I can get anywhere.
>>>>>
>>>>> Thanks again,
>>>>> Dan.
>>>>>
>>>>>
>>>>>
>>>>> References for the above:
>>>>>
>>>>> 2009/5/11 Jason Stajich <jason at bioperl.org>:
>>>>>
>>>>>> otherwise you need to be converting the HSPs into seqfeatures  
>>>>>> with the
>>>>>> right associated information (i.e. the tag/value pairs that are  
>>>>>> in the 9th
>>>>>> column) in order to have well structured data in the database.
>>>>>
>>>>>> You can get the individual features from the feature pair with
>>>>>> $hsp->query or $hsp->hit which can also be passed to a GFF  
>>>>>> writer (or call
>>>>>> $hsp->hit->gff_string). Note that since the data storage is not  
>>>>>> structured
>>>>>> in a GFF3 like-way this won't immediately produce well formed  
>>>>>> GFF3 for the
>>>>>> 9th column.
>>>>>
>>>>>
>>>>> 2009/5/11 Chris Fields <cjfields at illinois.edu>:
>>>>>
>>>>>> The main problem is the mapping is subjective based on what your
>>>>>> reference sequence is within the BLAST run (e.g. whether it is  
>>>>>> the query or
>>>>>> the hit), and is something that can't be automatically  
>>>>>> discerned. I ended
>>>>>> up rolling my own with SeqFeature::Store (just mapped the  
>>>>>> relevant data to
>>>>>> Bio::DB::SeqFeatures), but I have long wanted to fix up the  
>>>>>> relevant scripts
>>>>>> to integrate my changes in, just haven't had the time
>>>>>
>>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>




More information about the Bioperl-l mailing list