[Bioperl-l] blasting two identical seq yields only 88% identity

William Hsiao wlhsiao at yahoo.ca
Sun Dec 25 11:15:15 EST 2005


Hi Anders,
   This is due to BLAST's low-complexity filter
(http://www.ncbi.nlm.nih.gov/blast/blast_FAQs.shtml#LCR),
which masks low-complexity regions of the query as X's.
These X's are counted as mismatches when calculating %
identity, resulting in less than 100% identity for two
identical sequences.  If you turn the filter off, you
should see 100% identity.
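
To make the arithmetic concrete, here is a minimal Python sketch of the effect. It is not BLAST's actual filtering algorithm: the sequence, the masked coordinates, and the helper names (percent_identity, mask_region) are all made up for illustration, chosen so that 6 of 50 residues are masked. With legacy blastall the filter is switched off with -F F.

```python
def percent_identity(query, subject):
    """Percent of aligned positions where the two residues match (ungapped)."""
    if len(query) != len(subject):
        raise ValueError("sequences must be the same length")
    matches = sum(1 for q, s in zip(query, subject) if q == s)
    return 100.0 * matches / len(query)


def mask_region(seq, start, end):
    """Replace seq[start:end] with X's, a stand-in for a low-complexity filter."""
    return seq[:start] + "X" * (end - start) + seq[end:]


# Hypothetical 50-residue protein containing a 6-residue serine run.
protein = "MKTAYIAKQRQISFVKSHFSRQ" + "SSSSSS" + "LEERLGLIEVQAPILSRVGDGT"

# Assumption: the filter masks exactly the serine run (positions 22..27).
masked_query = mask_region(protein, 22, 28)

unfiltered = percent_identity(protein, protein)       # filter off
filtered = percent_identity(masked_query, protein)    # filter on

print(f"filter off: {unfiltered:.0f}% identity")  # 50 of 50 positions match
print(f"filter on:  {filtered:.0f}% identity")    # 44 of 50 positions match
```

The masked positions line up X against the real residue, so each one counts against the identity even though the underlying sequences are the same.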

Cheers,

Will

--- Anders Stegmann <anst at kvl.dk> wrote:

> Merry christmas BioPerl!
> 
> I obtained some odd results blasting a protein
> sequence against
> a chromosome I knew encoded the protein, using
> tblastn. 
> So I tested the problem by blasting the protein
> against a database only containing the exact same
> protein sequence using blastp (both files were FASTA
> formatted).
> I obtained an identity of only 88% instead of 100%.
> A lot of X's were incorporated in the query
> sequence.
> 
> I figured that it had something to do with the
> database formatting so I tried several possibilities
> with no luck
> (First I tried: formatdb -i SSD1pDB.txt -p T -o F).
> 
> I have had this problem before when blasting nucleotides.
> What can I do about it?
> 
> Regards Anders.
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> 



	

	
		