[Bioperl-l] blasting two identical seq yields only 88% identity

Marc Logghe Marc.Logghe at DEVGEN.com
Mon Dec 26 05:40:32 EST 2005

Hi Anders,
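This is caused by the filtering of low-complexity regions (see http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/more3.html#Filtering). The filter is on by default and replaces low-complexity stretches of the query with X's, which no longer count as identical to the database sequence. You have to turn it off explicitly with the blastall option -F F.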
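
To make the effect concrete, here is a minimal Python sketch (not BLAST's actual filtering algorithm) of how masking lowers the reported identity of a sequence against itself. The 50-residue sequence, the masked interval, the helper names, and the file name query.fasta are all made up for illustration; only the database name SSD1pDB.txt and the blastall options -p, -d, -i and -F come from the thread or from blastall itself.

```python
def mask_region(seq, start, end):
    """Replace seq[start:end] with 'X', roughly what a low-complexity filter does."""
    return seq[:start] + "X" * (end - start) + seq[end:]


def percent_identity(query, subject):
    """Percent of identical positions in an ungapped, equal-length alignment."""
    if len(query) != len(subject):
        raise ValueError("sequences must have the same length")
    matches = sum(q == s for q, s in zip(query, subject))
    return 100.0 * matches / len(query)


def blastall_args(database, query_file, filter_on=False):
    """Build a blastall blastp command line; -F F switches the filter off."""
    return ["blastall", "-p", "blastp", "-d", database,
            "-i", query_file, "-F", "T" if filter_on else "F"]


# Hypothetical 50-residue protein with a serine run at positions 10..15.
protein = "MKTAYIAKQR" + "SSSSSS" + "QISFVKSHFSRQLEERLGLIEVQAPILSRVGDGT"

# Assume the filter masks exactly the serine run (an assumption for this toy case).
masked_query = mask_region(protein, 10, 16)

print(percent_identity(protein, protein))       # unmasked query vs. itself
print(percent_identity(masked_query, protein))  # masked query vs. the original
print(" ".join(blastall_args("SSD1pDB.txt", "query.fasta")))
```

With 6 of 50 residues masked, the second value drops to 88%, even though the underlying sequences are identical. The last line prints the kind of command you would run to get the unfiltered result.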

-----Original Message-----
From: bioperl-l-bounces at portal.open-bio.org on behalf of Anders Stegmann
Sent: Sun 12/25/2005 9:58 AM
To: bioperl-l at bioperl.org
Subject: [Bioperl-l] blasting two identical seq yields only 88% identity
Merry Christmas, BioPerl!

I obtained an odd result when blasting a protein sequence against
a chromosome that I knew encoded the protein, using tblastn.
So I tested the problem by blasting the protein against a database containing only the exact same protein sequence, using blastp (both files were FASTA formatted).
I obtained an identity of only 88% instead of 100%. A lot of X's had been inserted into the query sequence.

I figured that it had something to do with the database formatting, so I tried several possibilities with no luck
(first I tried: formatdb -i SSD1pDB.txt -p T -o F).

I have had this problem before when blasting nucleotides.
What can I do about it?

Regards Anders.
