[Bioperl-l] SearchIO: Features in/flanking this part of a subject sequence
Mark A. Jensen
maj at fortinbras.us
Wed Apr 29 23:48:10 UTC 2009
also check out http://www.bioperl.org/wiki/Parsing_BLAST_HSPs
MAJ
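
One way to check whether the first "Features in/flanking" block is really present in the saved report, independently of Bio::SearchIO, is a plain line scan. The following is only a minimal sketch in core Perl: scan_feature_blocks is a hypothetical helper, not part of BioPerl, and it assumes that each block ends at the first blank line or "Score =" line, as in the report quoted below.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): scan the lines of a plain-text
# BLAST report and return one hash ref per "Features in/flanking this part
# of subject sequence:" block, in order of appearance.
sub scan_feature_blocks {
    my @lines = @_;
    my (@blocks, $current);
    for my $line (@lines) {
        $line =~ s/\s+\z//;    # drop trailing newline and whitespace
        if ($line =~ /^\s*Features (in|flanking) this part of subject sequence:/) {
            $current = { kind => $1, features => [] };
            push @blocks, $current;
        }
        elsif ($current && $line =~ /\S/ && $line !~ /^\s*Score\s*=/) {
            (my $text = $line) =~ s/^\s+//;
            my %feature = (text => $text);
            # Flanking entries look like "16338 bp at 5' side: <name>".
            if ($text =~ /^(\d+) bp at ([35])' side: (.+)$/) {
                @feature{qw(distance side name)} = ($1, $2, $3);
            }
            push @{ $current->{features} }, \%feature;
        }
        else {
            undef $current;    # a blank line or "Score =" line ends the block
        }
    }
    return @blocks;
}

# Sample input taken from the report excerpt quoted in this thread.
my @report = (
    "Features flanking this part of subject sequence:",
    "   16338 bp at 5' side: PRAME family member 8",
    "   11926 bp at 3' side: PRAME family member 9",
    "",
    " Score = 7286 bits (3945), Expect = 0.0",
);

my @blocks = scan_feature_blocks(@report);
printf "%d feature block(s) found\n", scalar @blocks;
for my $block (@blocks) {
    print "  $_->{text}\n" for @{ $block->{features} };
}
```

To run it against a real report, read the lines of blast_result.txt into an array and pass them in. If the block count is higher than the number of HSPs for which $hsp->hit_features returns a value, the difference comes from the parser rather than from the report.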
----- Original Message -----
From: "Chris Fields" <cjfields at illinois.edu>
To: "Razi Khaja" <razi.khaja at gmail.com>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, April 29, 2009 3:41 PM
Subject: Re: [Bioperl-l] SearchIO: Features in/flanking this part of a subject sequence
> I'm assuming this is from an older bioperl; this data should be accessible
> via $hsp->hit_features in the latest code from svn (and, I believe, in bioperl
> 1.6.0 on CPAN).
>
> chris
>
> On Apr 29, 2009, at 2:08 PM, Razi Khaja wrote:
>
>> Hello,
>>
>> I am generating BLAST alignments using the BLAST URL API from NCBI.
>>
>> I want to parse details from BLAST reports whenever there are
>> "Features in/flanking this part of subject sequence". A portion of
>> the BLAST report showing "Features flanking ..." is pasted below.
>>
>> I am using Bio::SearchIO to parse details. The relevant part of the
>> script is below.
>>
>> The problem I am having is that for some reason the first occurrence
>> of a "Feature flanking this part of a subject sequence" is skipped.
>> I am only able to parse/print all occurrences of a "Feature
>> in/flanking this part of a subject sequence" from the second
>> occurrence to the last occurrence.
>>
>> I believe the code responsible for parsing this information is in
>> Bio/SearchIO/blast.pm, starting on line 760.
>> I have tried fixing the code in Bio/SearchIO/blast.pm myself but was
>> not able to correct the problem.
>> Would it be possible for someone to fix the code in the
>> Bio/SearchIO/blast.pm module, or help me fix the code so that the
>> first occurrence is not skipped?
>>
>> Thanks,
>> Razi
>
>
>
>> ===== The part of the script that is relevant to parsing "Features in/flanking..." =====
>> my $bio_searchio_in = Bio::SearchIO->new(
>>     -file   => 'blast_result.txt',
>>     -format => 'blast'
>> );
>>
>> my $i = 1;
>> while( my $result = $bio_searchio_in->next_result() ){
>>     while( my $hit = $result->next_hit() ){
>>         while( my $hsp = $hit->next_hsp() ){
>>             my $hsp_features = $hsp->hit_features();
>>             if( $hsp_features ) {
>>                 print "HSP FEATURE $i\t$hsp_features\n";
>>                 $i++;
>>             }
>>         }
>>     }
>> }
>>
>> ===== A portion of a BLAST report with "Features flanking ..." =====
>> ...
>> ...
>> Score = 54.7 bits (29), Expect = 0.003
>> Identities = 29/29 (100%), Gaps = 0/29 (0%)
>> Strand=Plus/Minus
>>
>> Query  6556     CCTGGGTGACAGAGTGAGACTCCATCTCA  6584
>>                 |||||||||||||||||||||||||||||
>> Sbjct  6953042  CCTGGGTGACAGAGTGAGACTCCATCTCA  6953014
>>
>>
>> >gi|51459264|ref|NT_077382.3|Hs1_77431 Homo sapiens chromosome 1 genomic contig
>> Length=237250
>>
>> Features flanking this part of subject sequence:
>> 16338 bp at 5' side: PRAME family member 8
>> 11926 bp at 3' side: PRAME family member 9
>>
>> Score = 7286 bits (3945), Expect = 0.0
>> Identities = 5437/6145 (88%), Gaps = 152/6145 (2%)
>> Strand=Plus/Plus
>>
>> Query  23225  GGTTGGTTAATATTGATAATTAAATGACTTGGTACTGAGAAGAAGCTATAGGTGCAAATG  23284
>>               |||||||||||||||||||||||||||||||| |||||| ||||||||||| ||||||||
>> Sbjct  86128  GGTTGGTTAATATTGATAATTAAATGACTTGGCACTGAGCAGAAGCTATAGATGCAAATG  86187
>>
>> Query  23285  GGTGGCCTATGACTATTATTGATTTCATTACTGGTAATTTATCTCTATGCCTAGAAAACA  23344
>>               ||||||||||||||||| |||||||||||||| |||| ||||||| |||| ||| |||||
>> Sbjct  86188  GGTGGCCTATGACTATTGTTGATTTCATTACTTGTAACTTATCTCCATGCATAGGAAACA  86247
>> ...
>> ...
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>