[Bioperl-l] Generic Blast parsing issue

Steve Chervitz sac@bioperl.org
Fri, 10 May 2002 12:30:30 -0700 (PDT)


John,

The old functionality that you are looking for is still in place in the 1.0
release. Just specify -format => 'psiblast' when you create the SearchIO object
and you'll get Hit and HSP objects that are the same as those created by the
old Blast.pm module. For example:

    $in = Bio::SearchIO->new( -format => 'psiblast', %other_params );

This will give you a parser that is completely different from, but roughly
equivalent to, the parser that handles -format => 'blast'. In addition to
PSI-BLAST format, the 'psiblast' parser also handles regular BLAST 1 and 2
reports (primarily NCBI, but it will handle some WU-BLAST reports).

Eventually, there will be just a single blast parser and you will be able to
get different result, hit, and hsp objects by plugging in different factories
to the SearchIO object. But we're not quite there yet. 

We're also working toward a state where there is just one Blast parser (or a
family of closely related parsers) and not three completely different parsers
like we have now, which is crazy.

Steve


--- Jason Stajich <jason@cgt.mc.duke.edu> wrote:
> This exact functionality has not been ported over from the old system
> because I personally don't use it so it wasn't on my list.  If you would
> like it added feel free to implement and post a patch or port it from the
> old system.
> 
> Someone else asked about this function previously and I had a reasonable
> way to do it, but can't remember what it was off the top of my head.
> Is there any reason the following won't work - you probably would need to
> do a little more work to collapse overlapping regions?
> 
> # get a hit
> ...
> my (@q_offsets, @s_offsets);
> while( my $hsp = $hit->next_hsp ) {
> 
>  # throw some logic in to handle reverse strand if
>  # we're dealing with DNA
>  push @q_offsets, $hsp->query->start ."-". $hsp->query->end;
>  push @s_offsets, $hsp->hit->start ."-". $hsp->hit->end;
> }
> 
> print "HSP query locations are ", join(",", @q_offsets), "\n";
> print "HSP subject locations are ", join(",", @s_offsets), "\n";
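
As a rough illustration of the "collapse overlapping regions" step mentioned
in the quoted suggestion, here is a minimal sketch in plain Perl. It is not
part of BioPerl: collapse_ranges is a hypothetical helper, the coordinates
are made up, and it assumes each HSP has already been reduced to a
[start, end] pair (for example from $hsp->query->start and
$hsp->query->end). Pairs given with start > end, as can happen on the
reverse strand, are simply flipped, and only truly overlapping ranges are
merged.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): merge overlapping
# [start, end] pairs and return them sorted by start position.
sub collapse_ranges {
    my @ranges;
    for my $r (@_) {
        my ( $s, $e ) = @$r;
        ( $s, $e ) = ( $e, $s ) if $s > $e;    # flip reverse-strand pairs
        push @ranges, [ $s, $e ];
    }
    @ranges = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @ranges;

    my @merged;
    for my $r (@ranges) {
        if ( @merged && $r->[0] <= $merged[-1][1] ) {
            # Overlaps the previous range: extend its end if needed
            $merged[-1][1] = $r->[1] if $r->[1] > $merged[-1][1];
        }
        else {
            push @merged, [ $r->[0], $r->[1] ];
        }
    }
    return @merged;
}

# Made-up coordinates standing in for per-HSP query start/end values
my @q_ranges = ( [ 10, 50 ], [ 40, 80 ], [ 120, 100 ], [ 200, 250 ] );
my @collapsed = map { $_->[0] . "-" . $_->[1] } collapse_ranges(@q_ranges);
print "Collapsed query locations are ", join( ",", @collapsed ), "\n";
```

With the sample pairs above, 10-50 and 40-80 merge into 10-80, while the
other two ranges stay separate. Query and subject coordinates would need to
be collapsed independently.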
> 
> 
> -jason
> 
> On Fri, 10 May 2002 CALLEY_JOHN_N@Lilly.com wrote:
> 
> > I'm not sure if this is a bug or a missing feature. I'm a long time user
> > of the old Blast.pm module. I recently showed a colleague how to set
> > things up and suggested that he use the new generic SearchIO feature
> > instead of doing things the old way. Unfortunately he needs the offset of
> > the HSP in the query and hit sequences. In the Blast.pm module this is
> > accessible via the $hsp->range or $hsp->start methods. In the GenericHSP
> > object there are no direct equivalents, but there is the get_aln call.
> > Since SimpleAlign uses LocatableSeqs internally, I hoped that the start and
> > end maintained there might have what I needed. Unfortunately, they are
> > always set to 1 and the length of the sequence. Is this a bug? Or a
> > misunderstanding on my part? Do I need to give up on using the generic
> > modules?
> >
> > Thanks,
> >   John Calley
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@bioperl.org
> > http://bioperl.org/mailman/listinfo/bioperl-l
> >
> 
> -- 
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
> 
> 


=====
Steve Chervitz
sac@bioperl.org
