[Bioperl-l] Getting 'features' from SearchIO?
Jason Stajich
jason at bioperl.org
Mon May 11 15:38:14 UTC 2009
Dan -
There is nice documentation on the gmod website covering the gbrowse
tutorial on the expected format of alignment features. That is what
you should probably be generating and loading with the
bp_seqfeature_load script -- otherwise you need to be converting the
HSPs into seqfeatures with the right associated information (i.e. the
tag/value pairs that are in the 9th column) in order to have well
structured data in the database.
There are boilerplate examples of how to visualize alignments on the
Gbrowse tutorial website as well so I commend that as great starting
place for GFF, data, conf files, and what kind of visualization you
can obtain with the browser.
There is also some helper scripts, that do this for you like
bp_search2gff. Just dumping the feature will take the query ( i
believe) of the feature pair that is the HSP by default, so you will
need to make some choices about what information you want. You can
get the individual features from the feature pair with $hsp->query or
$hsp->hit which can also be passed to a GFF writer (or call $hsp->hit-
>gff_string). Note that since the data storage is not structured in
a GFF3 like-way this won't immediately produce well formed GFF3 for
the 9th column.
Here's a script I use for some DNA to genome alignments, from FASTA
output for example - it assumes 1 HSP per Hit as per what you get from
SSEARCH but is a reasonable jumping off place.
http://bit.ly/fasta2gff
There is also a wublast to gff converting script in that repository as
well.
-jason
On May 11, 2009, at 8:07 AM, Dan Bolser wrote:
> 2009/5/11 Dan Bolser <dan.bolser at gmail.com>:
>> Hi,
>>
>> I am parsing a blasttable and extracting Bio::Search::HSP::GenericHSP
>> objects as a result. I read somewhere that HSP objects inherit
>> Feature
>> objects... How can I get a 'standard' representation of the HSP as a
>> feature? Basically I'd like to simply load the blast results into a
>> feature database...
>>
>> When I call feature methods on the HSP objects I just get blank or
>> undef results... I think this is because I'm trying to get at the
>> sequences existing (non existent) features, rather than get the HSP
>> object as a feature... If that makes sense... How can I confirm
>> that I
>> have a feature object containing the details of the HSP?
>>
>> I thought of trying to just pass the HSP object to the
>> Bio::DB::SeqFeature::Store, but I need to get that up and running
>> first (I'm looking into it). In the mean time I thought I'd ask if
>> this sounds like the right thing to do.
>
> Well it works... I am seeing things fill into the database as I call
>
> $db->store($p)
> or die "Couldn't store!";
>
> (I needed to upgrade bioperl to get Bio::DB::SeqFeature working).
>
>
> Here is my code;
>
> while(my $r = $s->next_result ){
> print $r->query_name, "\n";
> while(my $h = $r->next_hit){
> print "\t", $h->name, "\n";
> while(my $p = $h->next_hsp){
> $db->store($p)
> or die "Couldn't store!";
> }
> }
> }
>
>
> How can I visualize the resulting set of HSPs? i.e. If I point
> gbrowse at this location, will it automatically pick up the entry
> points and their features from the database? Or how much manual
> configuration will I need? Is there some boilerplate config I can use
> to visualize this?
>
> Cheers,
> Dan.
>
>
>> More generally I want to have features attached to sequences that are
>> themselves annotations of larger sequences (but with unknown
>> position). Is Bio::DB::SeqFeature::Store a way to go? I need to
>> manage
>> various different bits of information coming from a sequencing
>> project, and I need a solution to the whole 'assembly life cycle
>> management' problem.
>>
>> Thanks for any help,
>> Dan.
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
Jason Stajich
jason at bioperl.org
More information about the Bioperl-l
mailing list