[Bioperl-l] Difference between

Alan Li immunoguest at hotmail.com
Sat Jan 31 18:25:45 EST 2004


I would like to thank everyone for their responses.

And yes, Mat is right: this is an issue with the XML output of 
stand-alone blast. I compared the results of stand-alone blast alone, 
using different -F flags.  The results below show that with "-F F" the 
XML (-m 7) and text (-m 0) outputs agree, but with "-F T" they differ: 
the text output masks 9 query bases and reports 315/324 identities, while 
the XML output shows the unmasked sequence and reports 324.

So is there anything I could do to make the XML results the same when the 
filtering option is set to true?  Perhaps either through another blast 
parameter or by doing it programmatically?

-----------------------------------------------------------------------

blastall -p blastn -m 7 -F T -d ecoli/ecoli.nt -i test.txt

<Hit>
          <Hit_num>1</Hit_num>
          <Hit_id>gi|1786181|gb|AE000111.1|AE000111</Hit_id>
          <Hit_def>Escherichia coli K-12 MG1655 section 1 of 400 of the complete genome</Hit_def>
          <Hit_accession>AE000111</Hit_accession>
          <Hit_len>10596</Hit_len>
          <Hit_hsps>
            <Hsp>
              <Hsp_num>1</Hsp_num>
              <Hsp_bit-score>589.253</Hsp_bit-score>
              <Hsp_score>297</Hsp_score>
              <Hsp_evalue>1.04898e-168</Hsp_evalue>
              <Hsp_query-from>237</Hsp_query-from>
              <Hsp_query-to>560</Hsp_query-to>
              <Hsp_hit-from>237</Hsp_hit-from>
              <Hsp_hit-to>560</Hsp_hit-to>
              <Hsp_query-frame>1</Hsp_query-frame>
              <Hsp_hit-frame>1</Hsp_hit-frame>
              <Hsp_identity>324</Hsp_identity>
              <Hsp_positive>324</Hsp_positive>
              <Hsp_align-len>324</Hsp_align-len>
              
<Hsp_qseq>AGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT</Hsp_qseq>
              
<Hsp_hseq>AGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT</Hsp_hseq>
              
<Hsp_midline>||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||</Hsp_midline>
            </Hsp>

-----------------------------------------------------------------------

blastall -p blastn -m 0 -F T -d ecoli/ecoli.nt -i test.txt

>gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section 1 of 400 of the complete
           genome
          Length = 10596

Score =  589 bits (297), Expect = e-168
Identities = 315/324 (97%)
Strand = Plus / Plus


Query: 237 aggtaacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtg 296
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 237 aggtaacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtg 296


Query: 297 cgggcnnnnnnnnncgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg 356
           |||||         ||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 297 cgggctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg 356


Query: 357 cggtacatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaa 416
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 357 cggtacatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaa 416


Query: 417 tgccaggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacct 476
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 417 tgccaggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacct 476


Query: 477 ggtggcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgc 536
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 477 ggtggcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgc 536


Query: 537 cgaacgtatttttgccgaactttt 560
           ||||||||||||||||||||||||
Sbjct: 537 cgaacgtatttttgccgaactttt 560

-----------------------------------------------------------------------

blastall -p blastn -m 7 -F F -d ecoli/ecoli.nt -i test.txt

<Hit>
          <Hit_num>1</Hit_num>
          <Hit_id>gi|1786181|gb|AE000111.1|AE000111</Hit_id>
          <Hit_def>Escherichia coli K-12 MG1655 section 1 of 400 of the complete genome</Hit_def>
          <Hit_accession>AE000111</Hit_accession>
          <Hit_len>10596</Hit_len>
          <Hit_hsps>
            <Hsp>
              <Hsp_num>1</Hsp_num>
              <Hsp_bit-score>1110.61</Hsp_bit-score>
              <Hsp_score>560</Hsp_score>
              <Hsp_evalue>0</Hsp_evalue>
              <Hsp_query-from>1</Hsp_query-from>
              <Hsp_query-to>560</Hsp_query-to>
              <Hsp_hit-from>1</Hsp_hit-from>
              <Hsp_hit-to>560</Hsp_hit-to>
              <Hsp_query-frame>1</Hsp_query-frame>
              <Hsp_hit-frame>1</Hsp_hit-frame>
              <Hsp_identity>560</Hsp_identity>
              <Hsp_positive>560</Hsp_positive>
              <Hsp_align-len>560</Hsp_align-len>
              
<Hsp_qseq>AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT</Hsp_qseq>
              
<Hsp_hseq>AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT</Hsp_hseq>
              
<Hsp_midline>||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||</Hsp_midline>
            </Hsp>

-----------------------------------------------------------------------

blastall -p blastn -m 0 -F F -d ecoli/ecoli.nt -i test.txt

>gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section 1 of 400 of the complete
           genome
          Length = 10596

Score = 1110 bits (560), Expect = 0.0
Identities = 560/560 (100%)
Strand = Plus / Plus


Query: 1   agcttttcattctgactgcaacgggcaatatgtctctgtgtggattaaaaaaagagtgtc 60
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1   agcttttcattctgactgcaacgggcaatatgtctctgtgtggattaaaaaaagagtgtc 60


Query: 61  tgatagcagcttctgaactggttacctgccgtgagtaaattaaaattttattgacttagg 120
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 61  tgatagcagcttctgaactggttacctgccgtgagtaaattaaaattttattgacttagg 120


Query: 121 tcactaaatactttaaccaatataggcatagcgcacagacagataaaaattacagagtac 180
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 121 tcactaaatactttaaccaatataggcatagcgcacagacagataaaaattacagagtac 180


Query: 181 acaacatccatgaaacgcattagcaccaccattaccaccaccatcaccattaccacaggt 240
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 181 acaacatccatgaaacgcattagcaccaccattaccaccaccatcaccattaccacaggt 240


Query: 241 aacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtgcggg 300
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 241 aacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtgcggg 300


Query: 301 ctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcggcggt 360
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 301 ctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcggcggt 360


Query: 361 acatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaatgcc 420
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 361 acatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaatgcc 420


Query: 421 aggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacctggtg 480
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 421 aggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacctggtg 480


Query: 481 gcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgccgaa 540
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 481 gcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgccgaa 540


Query: 541 cgtatttttgccgaactttt 560
           ||||||||||||||||||||
Sbjct: 541 cgtatttttgccgaactttt 560
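
-----------------------------------------------------------------------

On the "programmatically" option, here is a rough sketch (not a tested fix, 
and not something blastall or BioPerl provides): re-apply the query mask to 
the XML HSP and recount the identities. It assumes a plus/plus HSP, and it 
assumes the masked query ranges are already known from somewhere else, for 
example read off the runs of "n" in the -m 0 report. The function name 
apply_query_mask and its parameters are made up for illustration.

```python
def apply_query_mask(qseq, hseq, query_from, masked_ranges, mask_char="N"):
    """Re-mask one HSP and recompute its midline and identity count.

    qseq, hseq    -- aligned strings from <Hsp_qseq> and <Hsp_hseq>
    query_from    -- value of <Hsp_query-from> (1-based, plus strand assumed)
    masked_ranges -- list of (start, end) query coordinates, inclusive;
                     assumed to come from the -m 0 report or another source
    """
    masked_q = []
    midline = []
    identities = 0
    pos = query_from  # query coordinate of the next non-gap column

    for q, h in zip(qseq, hseq):
        if q != "-":
            # Only real query bases consume a query coordinate.
            if any(start <= pos <= end for start, end in masked_ranges):
                q = mask_char
            pos += 1
        masked_q.append(q)

        # A column is an identity only if it is unmasked, ungapped and equal.
        if q not in ("-", mask_char) and q.upper() == h.upper():
            midline.append("|")
            identities += 1
        else:
            midline.append(" ")

    return "".join(masked_q), "".join(midline), identities


# Small made-up example: query bases 14-17 are masked.
masked, mid, ident = apply_query_mask(
    "ACGTTTTTACG", "ACGTTTTTACG", query_from=10, masked_ranges=[(14, 17)]
)
print(masked)
print(mid)
print(ident)
```

For the "-F T" HSP above, the masked run in the -m 0 report covers query 
positions 302-310, so calling this with query_from=237 and 
masked_ranges=[(302, 310)] should bring the XML count down from 324 to the 
315 identities that the text report shows.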


>From: "Wiepert, Mathieu" <Wiepert.Mathieu at mayo.edu>
>To: 'tai kwan do' <immunoguest at hotmail.com>,    bioperl-l at bioperl.org
>Subject: RE: [Bioperl-l] Difference between
>Date: Fri, 30 Jan 2004 11:13:05 -0600
>
>Hi,
>
>I have a vague recollection of this problem, so this answer is likely 
>wrong, but I think it has something to do with the filtered sequence?  You 
>have 9 masked NT's, so it is probably a difference in the defaults, and 
>something to do with the XML output not masked?
>
>Sorry I can't find the emails I had with NCBI on this, but I am maybe 70% 
>sure that it is a problem like that, with defaults on the local server 
>versus NCBI, and the XML not using masked data?
>
>Someone else chime in if I am way off there...
>
>HTH,
>
>-mat
>
