[Bioperl-l] Difference between

Joseph Bedell jbedell at oriongenomics.com
Fri Jan 30 12:50:00 EST 2004


I think Mat is right. The default for NCBI-BLAST is to filter
low-complexity sequence, which is what happened to your string of 9 T's
in the web query.

You can either uncheck the "Low Complexity" box or check the "Mask for
lookup table only" box. The latter applies a soft mask: the T's are
ignored when seeding hits but can still be aligned during the extension
stage.
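
To make the arithmetic concrete, here is a minimal Python sketch. It is not
BLAST's own code, and the function and variable names are made up for
illustration. It rebuilds the Query/Sbjct row for positions 297-356 from the
pairwise output quoted below and counts matching columns with and without the
9 masked bases.

```python
# Illustrative sketch only: count_identities and the variable names are
# hypothetical, not part of BLAST or BioPerl.

def count_identities(query: str, subject: str) -> int:
    """Count aligned columns where both sequences carry the same base."""
    return sum(q == s for q, s in zip(query, subject))

# Row 297-356 of the quoted alignment, built from its three parts.
flank = "cgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg"
masked_query = "cgggc" + "n" * 9 + flank    # low-complexity filter applied
unmasked_query = "cgggc" + "t" * 9 + flank  # filter off (as in the XML)
subject = "cgggc" + "t" * 9 + flank

masked_row = count_identities(masked_query, subject)
unmasked_row = count_identities(unmasked_query, subject)
lost = unmasked_row - masked_row  # columns that stop counting once masked

align_len = 324  # Hsp_align-len in the XML report
print(masked_row, unmasked_row, lost)     # 51 60 9
print(f"{align_len - lost}/{align_len}")  # 315/324
```

The 9 masked columns account for the whole gap between 324/324 in the XML
and 315/324 in the pairwise report.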

Joey

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Joseph A Bedell, Ph.D.
Director, Bioinformatics
Orion Genomics, LLC
4041 Forest Park Ave.
St. Louis, MO 63108
Office:(314)615-6979
Fax:(314)615-6975
Mobile:(314)518-1343
http://www.oriongenomics.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 

>-----Original Message-----
>From: bioperl-l-bounces at portal.open-bio.org
>[mailto:bioperl-l-bounces at portal.open-bio.org] On Behalf Of Wiepert, Mathieu
>Sent: Friday, January 30, 2004 11:13 AM
>To: 'tai kwan do'; bioperl-l at bioperl.org
>Subject: RE: [Bioperl-l] Difference between
>
>Hi,
>
>I have a vague recollection of this problem, so this answer is likely
>wrong, but I think it has something to do with the filtered sequence? You
>have 9 masked NT's, so it is probably a difference in the defaults, and
>something to do with the XML output not masked?
>
>Sorry I can't find the emails I had with NCBI on this, but I am maybe 70%
>sure that it is a problem like that, with defaults on the local server
>versus NCBI, and the XML not using masked data?
>
>Someone else chime in if I am way off there...
>
>HTH,
>
>-mat
>
>> -----Original Message-----
>> From: bioperl-l-bounces at portal.open-bio.org
>> [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of tai kwan do
>> Sent: Friday, January 30, 2004 2:06 AM
>> To: bioperl-l at bioperl.org
>> Subject: [Bioperl-l] Difference between
>>
>>
>> Hello,
>>
>> Maybe someone could help me out with this problem. I've tried to email NCBI
>> about this but they didn't bother answering.  Basically, I'm seeing a
>> difference in the data being output by stand-alone blast and online blast.
>> The identities values are different between the XML output and the pairwise
>> alignment output, even though I'm using the exact same input values.  The
>> other difference I see is in the query and hit sequences.  I've included
>> below the outputs using the same input parameters. Is this normal behavior?
>>
>>    gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section 1 of 400 of
>>    the complete genome
>>         Length = 10596
>>
>> Score =  589 bits (297), Expect = e-168
>> Identities = 315/324 (97%)
>> Strand = Plus / Plus
>>
>>
>> Query: 237 aggtaacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtg 296
>>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> Sbjct: 237 aggtaacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtg 296
>>
>> Query: 297 cgggcnnnnnnnnncgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg 356
>>            |||||         ||||||||||||||||||||||||||||||||||||||||||||||
>> Sbjct: 297 cgggctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg 356
>>
>> Query: 357 cggtacatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaa 416
>>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> Sbjct: 357 cggtacatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaa 416
>>
>> Query: 417 tgccaggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacct 476
>>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> Sbjct: 417 tgccaggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacct 476
>>
>> Query: 477 ggtggcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgc 536
>>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> Sbjct: 477 ggtggcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgc 536
>>
>> Query: 537 cgaacgtatttttgccgaactttt 560
>>            ||||||||||||||||||||||||
>> Sbjct: 537 cgaacgtatttttgccgaactttt 560
>>
>>
>>       <Hit>
>>         <Hit_num>1</Hit_num>
>>         <Hit_id>gi|1786181|gb|AE000111.1|AE000111</Hit_id>
>>         <Hit_def>Escherichia coli K-12 MG1655 section 1 of 400 of the
>> complete genome</Hit_def>
>>         <Hit_accession>AE000111</Hit_accession>
>>         <Hit_len>10596</Hit_len>
>>         <Hit_hsps>
>>           <Hsp>
>>             <Hsp_num>1</Hsp_num>
>>             <Hsp_bit-score>589.253</Hsp_bit-score>
>>             <Hsp_score>297</Hsp_score>
>>             <Hsp_evalue>1.04898e-168</Hsp_evalue>
>>             <Hsp_query-from>237</Hsp_query-from>
>>             <Hsp_query-to>560</Hsp_query-to>
>>             <Hsp_hit-from>237</Hsp_hit-from>
>>             <Hsp_hit-to>560</Hsp_hit-to>
>>             <Hsp_query-frame>1</Hsp_query-frame>
>>             <Hsp_hit-frame>1</Hsp_hit-frame>
>>             <Hsp_identity>324</Hsp_identity>
>>             <Hsp_positive>324</Hsp_positive>
>>             <Hsp_align-len>324</Hsp_align-len>
>>
>> <Hsp_qseq>AGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACC
>> TGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA
>> GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAA
>> GCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCAC
>> CTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGC
>> CGAACGTATTTTTGCCGAACTTTT</Hsp_qseq>
>>
>> <Hsp_hseq>AGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACC
>> TGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA
>> GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAA
>> GCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCAC
>> CTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGC
>> CGAACGTATTTTTGCCGAACTTTT</Hsp_hseq>
>>
>> <Hsp_midline>|||||||||||||||||||||||||||||||||||||||||||||||||
>> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> |||||||||||||||||||||||||||</Hsp_midline>
>>           </Hsp>
>>           <Hsp>
>>
>> Thanks in advance
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>