[Bioperl-l] results problem with StandAloneBlast

Jason Stajich jason.stajich at duke.edu
Sun Jun 4 14:08:29 UTC 2006


Right - you don't need rewind if you aren't going to use the iterator  
(next_XXX). We provide two different ways to get access to the data.
You can do
for my $hit ( $result->hits ) {

}
or
while( my $hit = $result->next_hit ) {
}
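
A rough sketch of the two access styles side by side. The helper sub names and the report file given on the command line are made up for illustration; hits, next_hit, rewind and name are the methods discussed in this thread.

```perl
use strict;
use warnings;

# List access: hits() returns every hit and does not touch the iterator.
sub hit_names_via_list {
    my ($result) = @_;
    return map { $_->name } $result->hits;
}

# Iterator access: next_hit() advances an internal counter, so reset it
# with rewind() when done if anything else will iterate over this result.
sub hit_names_via_iterator {
    my ($result) = @_;
    my @names;
    while ( my $hit = $result->next_hit ) {
        push @names, $hit->name;
    }
    $result->rewind;
    return @names;
}

# Hypothetical driver: perl this_script.pl report.bls
if (@ARGV) {
    require Bio::SearchIO;    # loaded only when a report file is given
    my $searchio = Bio::SearchIO->new( -format => 'blast', -file => $ARGV[0] );
    while ( my $result = $searchio->next_result ) {
        print join( ", ", hit_names_via_list($result) ),     "\n";
        print join( ", ", hit_names_via_iterator($result) ), "\n";
    }
}
```

Both subs should give the same names in the same order; the only difference is that the iterator version leaves state behind unless you rewind.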


If you want to rewind the parser then (assuming you are using a  
filestream and not a data stream from the web or zcat or something)  
just reset the filehandle to the start of the file (Perl's seek takes  
a position and a whence argument)
seek($searchio->_fh, 0, 0);

but then you'll have to re-parse everything and pay that cost twice -  
it makes more sense to me to just save the results and put them in  
a list if you are going to deliberately make two passes over all the  
results.    You either pay the cost of memory (keeping all the  
objects) or time (reparsing the results).
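
Something along these lines is what I mean by saving the results: parse once, keep the result objects in a list, then hand that list to each pass. This is only a sketch. The sub names are invented stand-ins for the sort and print steps in the original question, and the driver assumes a BLAST report file given on the command line; the SearchIO object returned by StandAloneBlast's blastall() could be passed to slurp_results the same way.

```perl
use strict;
use warnings;

# Drain the parser once and keep every result object in memory.
sub slurp_results {
    my ($searchio) = @_;
    my @results;
    while ( my $result = $searchio->next_result ) {
        push @results, $result;
    }
    return @results;
}

# First pass (stand-in for a sorting/grouping step): hits per query.
sub hits_per_query {
    my (@results) = @_;
    my %count;
    for my $result (@results) {
        my @hits = $result->hits;    # list access, iterator untouched
        $count{ $result->query_name } = scalar @hits;
    }
    return %count;
}

# Second pass (stand-in for printing a hit table): one line per hit.
sub hit_table_lines {
    my (@results) = @_;
    my @lines;
    for my $result (@results) {
        for my $hit ( $result->hits ) {
            push @lines,
              join( "\t", $result->query_name, $hit->name, $hit->significance );
        }
    }
    return @lines;
}

# Hypothetical driver: perl this_script.pl report.bls
if (@ARGV) {
    require Bio::SearchIO;    # loaded only when a report file is given
    my $searchio = Bio::SearchIO->new( -format => 'blast', -file => $ARGV[0] );
    my @results  = slurp_results($searchio);
    my %count    = hits_per_query(@results);
    print "$_: $count{$_} hit(s)\n" for sort keys %count;
    print "$_\n" for hit_table_lines(@results);
}
```

The trade-off is the one above: every result object stays in memory, but the report is parsed only once and both passes see all the data.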


-jason
On Jun 4, 2006, at 1:17 AM, Chris Fields wrote:

> There's an interesting addition to this I found while checking this  
> out; looks like if you use:
>
> my @hits =  $result->hits;
>
> to get all the hits, you don't need to use '$result->rewind'.  The  
> rewind method resets the iterator for the hit list back to the  
> beginning, but using the hits method to grab all the hits doesn't  
> use the iterator at all.  This works either pre- or post-iteration  
> through the Hit::BlastHit objects.
>
> Another thing; Genevieve was passing the SearchIO report object  
> (i.e. the parser object which was returned from StandAloneBlast,  
> $blast_report) to the methods, not the  
> Bio::Search::Result::BlastResult object; looks like there was some  
> confusion between the two object types since she refers to the  
> report as the result object when it's actually the SearchIO parser  
> object.  So, once the parser was passed into the first method, a  
> result object was generated, then destroyed.  When entering the  
> second method, the parser had already read and parsed the report and  
> generated the objects, so it ended with no output.
>
> Though passing the BlastResult object is better since one should  
> only have to parse the report once and use the objects, for  
> curiosity's sake, is there a method to rewind the parser itself (in  
> other words, read through the report again)?
>
> Chris
>
>
> On Jun 3, 2006, at 2:31 PM, Jason Stajich wrote:
>
>> In the HOWTO hits() and hsps() were there, I just added rewind in the
>> table of methods.
>> If someone wanted to write a little section in the HOWTO about
>> resetting the iterator that would be great.
>>
>> -jason
>> On Jun 3, 2006, at 3:13 PM, Chris Fields wrote:
>>
>>> Nice!  Didn't know I could do that.  Maybe we should add some of  
>>> this
>>> to the HOWTO (or is it already in there?).
>>>
>>> Chris
>>>
>>> On Jun 3, 2006, at 10:29 AM, Jason Stajich wrote:
>>>
>>>> you can get all the Hits or hsps with the following method:
>>>> my @hits = $result->hits;
>>>> my @hsps = $hit->hsps;
>>>>
>>>>
>>>> You can also reset the counter since these implementations are in-
>>>> memory and already parsed (and not a stream processor per se).
>>>> next_XX just iterates through the list stored in the parent object.
>>>>
>>>> $result->rewind;
>>>>
>>>>    and
>>>>
>>>> $hit->rewind;
>>>>
>>>>
>>>> For example, the rewind needs to be called if you want to use a
>>>> ResultWriter object and filter some of the values for the final
>>>> writing after first inspecting them.
>>>>
>>>> -jason
>>>>
>>>>
>>>> On May 30, 2006, at 12:57 PM, Genevieve DeClerck wrote:
>>>>
>>>>> Thanks for your comment Sendu, it was very helpful. I think this
>>>>> must be
>>>>> what's going on.. I am using $blast_report->next_result in both
>>>>> subroutines. It appears that analyzing the blast results first  
>>>>> w/ my
>>>>> sort subroutine empties (?) the $blast_result object so that  
>>>>> when I
>>>>> try
>>>>> to print, there is nothing left to print. (and vice versa when I
>>>>> print
>>>>> first then try to sort).
>>>>> So, from the looks of things, using next_result has the effect of
>>>>> popping the Bio::Search::Result::ResultI objects off of the  
>>>>> SearchIO
>>>>> blast report object??
>>>>>
>>>>> It seems I could get around this by making a copy of the blast
>>>>> report by
>>>>> setting it to another new variable...(not the most elegant
>>>>> solution) but
>>>>> I'm having trouble with this...
>>>>>
>>>>> If I do:
>>>>>
>>>>> 	my $blast_report_copy = $blast_report;
>>>>>
>>>>> I'm just copying the reference to the SearchIO blast result, so it
>>>>> doesn't help me. How can I make another physical copy of this  
>>>>> blast
>>>>> result object? Seems like a simple thing but how to do it is
>>>>> escaping me.
>>>>>
>>>>> But better yet, the way to go is to 'reset the counter,' or to
>>>>> find a
>>>>> way to look at/print/sort the results without removing data  
>>>>> from the
>>>>> blast result object. How is this done though??
>>>>>
>>>>> Sendu and Brian, I didn't post the sort_results subroutine because
>>>>> it is
>>>>> sprawling, as is a lot of my code. The code I provided was more
>>>>> like an
>>>>> aid for my explanation of the problem.. it doesn't actually run -
>>>>> sorry
>>>>> for the confusion, I should have been more clear on that.  The  
>>>>> important
>>>>> thing to know perhaps is that both sort_results and
>>>>> print_blast_results
>>>>> contain a foreach loop where I am using the 'next_result'  
>>>>> method to
>>>>> view blast results. (And to clarify for Torsten, the blastall() is
>>>>> working just fine - the analysis/viewing of the results object is
>>>>> where
>>>>> I am encountering the problem.)
>>>>>
>>>>>
>>>>> Any other ideas would be greatly appreciated...
>>>>>
>>>>> Thank you,
>>>>> Genevieve
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Sendu Bala wrote:
>>>>>
>>>>>> Genevieve DeClerck wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>
>>>>>> [snip]
>>>>>>
>>>>>>> If I've sorted the results the sorted-results will print to
>>>>>>> screen,
>>>>>>> however when I try to print the Hit Table results nothing is
>>>>>>> returned,
>>>>>>> as if the blast results have evaporated.... and vice versa, if I
>>>>>>> comment out the part where i point my sorting subroutine to the
>>>>>>> blast
>>>>>>> results reference,  my hit table results suddenly prints to
>>>>>>> screen.
>>>>>>
>>>>>> [snip]
>>>>>>
>>>>>>> Here's an abbreviated version of my code:
>>>>>>
>>>>>> [snip]
>>>>>>
>>>>>>> #######
>>>>>>> ### the following 2 actions seem to be mutually exclusive.
>>>>>>> # 1) sort results into 1-hitter, 2-hitter, etc. groups of
>>>>>>> # SeqFeature objs stored in arrays. arrays are then printed
>>>>>>> # to stdout
>>>>>>> &sort_results($blast_report);
>>>>>>>
>>>>>>> # 2) print blast results
>>>>>>> &print_blast_results($blast_report);
>>>>>>
>>>>>>
>>>>>>> sub print_blast_results{
>>>>>>>    my $report = shift;
>>>>>>>    while(my $result = $report->next_result()){
>>>>>>
>>>>>> [snip]
>>>>>>
>>>>>> You didn't give us your sort_results subroutine, but is it as
>>>>>> simple as
>>>>>> they both use $report->next_result (and/or $result->next_hit),  
>>>>>> but
>>>>>> you
>>>>>> don't reset the internal counter back to the start, so the second
>>>>>> subroutine tries to get the next_result and finds the first
>>>>>> subroutine
>>>>>> has already looked at the last result and so next_result returns
>>>>>> false?
>>>>>>
>>>>>>  From a quick look it wasn't obvious how to reset the counter.
>>>>>> Hopefully
>>>>>> this can be done and someone else knows how.
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>> --
>>>> Jason Stajich
>>>> Duke University
>>>> http://www.duke.edu/~jes12
>>>>
>>>>
>>>
>>> Christopher Fields
>>> Postdoctoral Researcher
>>> Lab of Dr. Robert Switzer
>>> Dept of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>
>>>
>>>
>>
>> --
>> Jason Stajich
>> Duke University
>> http://www.duke.edu/~jes12
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12/




