[BioPython] HSPs in Blast parser

Bzy Bee nomy2020 at yahoo.com
Fri Apr 30 00:26:00 EDT 2004


Hi

 

I am stuck on parsing BlastN output and would appreciate some help. I am working with multiple HSPs for a single hit. For example, if two HSPs are found for one hit, I need to find where the query and subject end in one HSP and compare that with where the query and subject start in the next HSP, as in the following example:

 

>test_seq1
          Length = 424

 Score =  841 bits (424), Expect = 0.0
 Identities = 424/424 (100%)
 Strand = Plus / Plus

                                                                       
Query: 1   ggactggttcgtcgtttacaagctgccggcccacacagggtcgggagatgcgacgcagaa 60
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1   ggactggttcgtcgtttacaagctgccggcccacacagggtcgggagatgcgacgcagaa 60

                                                                       
Query: 61  cggcctgcggtacaagtactttgacgaacactcagaagactggagcgacggcgtggggtt 120
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 61  cggcctgcggtacaagtactttgacgaacactcagaagactggagcgacggcgtggggtt 120

                                                                       

 Score =  226 bits (114), Expect = 2e-58
 Identities = 141/150 (94%)
 Strand = Plus / Plus

                                                                       
Query: 275 ccagctcgcctttgtgctctacaatgaccaaccgcctaaatgcagcgagtgtaaggactc 334
           ||||||||||||||||||||||||||||||||||||||||| |||||||| |||||||||
Sbjct: 513 ccagctcgcctttgtgctctacaatgaccaaccgcctaaatccagcgagtctaaggactc 572

                                                                       
Query: 335 ttgcagtcgtgggcacacgaagggtgtgctgctcctggaccaagaagggggcttgtggtt 394
           || ||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Sbjct: 573 ttccagtcgtgggcacacgaagggtgtgctgctcctggaccaagaagggggcttctggtt 632


 

I am interested in where the query and subject end in the first HSP (i.e. 120, 120) and where they start in the second HSP (i.e. 275, 513).

 

I have noticed that the BLAST parser lets you iterate through each HSP of every hit, but I am not sure how to treat two HSPs of a single hit as related, i.e. how to walk through them pairwise to get the query (and subject) end of one and the query (and subject) start of the next.
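
To make the goal concrete, here is a minimal sketch of the pairwise comparison I have in mind. It does not use the BioPython parser at all: the HSP tuple and its field names (query_start, query_end, sbjct_start, sbjct_end) are hypothetical stand-ins for whatever the parser provides, the helper names are made up for illustration, and the coordinates are typed in by hand from the example above. It also assumes Plus / Plus alignments, so that coordinates increase along both sequences.

```python
from collections import namedtuple

# Hypothetical stand-in for one parsed HSP (field names are an assumption,
# not taken from the BioPython parser).
HSP = namedtuple("HSP", ["query_start", "query_end", "sbjct_start", "sbjct_end"])


def consecutive_hsp_pairs(hsps):
    """Return (previous, following) HSP pairs, ordered by query start."""
    ordered = sorted(hsps, key=lambda h: h.query_start)
    return list(zip(ordered, ordered[1:]))


def junctions(hsps):
    """For each neighbouring pair, report where one HSP ends and the next starts.

    Assumes Plus / Plus strands; a negative gap would mean the HSPs overlap.
    """
    result = []
    for prev, nxt in consecutive_hsp_pairs(hsps):
        result.append({
            "query_end": prev.query_end,
            "sbjct_end": prev.sbjct_end,
            "next_query_start": nxt.query_start,
            "next_sbjct_start": nxt.sbjct_start,
            # Number of unaligned positions between the two HSPs.
            "query_gap": nxt.query_start - prev.query_end - 1,
            "sbjct_gap": nxt.sbjct_start - prev.sbjct_end - 1,
        })
    return result


# The two HSPs from the example, deliberately given out of order.
hit_hsps = [HSP(275, 394, 513, 632), HSP(1, 120, 1, 120)]

for junction in junctions(hit_hsps):
    print(junction)
```

Sorting by query start before pairing is a design choice on my part, since I am not certain the HSPs of a hit always come back in positional order rather than by score.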

 

Any help would be highly appreciated.

 

Thanks

 

Jawad Ali


		

