[Biopython-dev] Re: cleaning features

Michel Kerszberg mkersz at pasteur.fr
Mon Oct 15 11:59:16 EDT 2001


Dear Brad,

The Bio.GenBank FeatureValueCleaner utility is what the doctor prescribed!
Personally, I would vote for applying it by default.

Meanwhile, I discovered that the GenBank BLAST parser stalls at the 
interactive map when this is included in the HTML file. I guess the parser 
should ignore anything enclosed in <PRE> and </PRE> tags that includes a 
"#graphical-overview" string. Mind you, it is no big deal to take this out by 
hand, or to check the right options when running the BLAST!
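
As a stopgap, the by-hand removal could be scripted before the HTML reaches 
the parser. What follows is only a rough sketch: strip_graphical_overview is 
a hypothetical helper, not part of Biopython, and it assumes the map sits 
entirely inside one <PRE>...</PRE> block that contains the 
"#graphical-overview" string.

```python
import re

# Matches one <PRE>...</PRE> block, case-insensitively and across lines.
_PRE_BLOCK = re.compile(r"<PRE>.*?</PRE>", re.IGNORECASE | re.DOTALL)


def strip_graphical_overview(html):
    """Drop every <PRE> block that mentions "#graphical-overview".

    Hypothetical helper, not a Biopython function. Other <PRE> blocks
    are returned unchanged.
    """
    def _drop_if_overview(match):
        block = match.group(0)
        return "" if "#graphical-overview" in block else block

    return _PRE_BLOCK.sub(_drop_if_overview, html)
```

The cleaned text can then be handed to the parser in place of the original 
file contents.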

Also, parsing of TBLASTX records stalls due to the unusual format of the 
final information:

Matrix: BLOSUM62
Number of Hits to DB: 10,447,982,379
Number of Sequences: 988209
Number of extensions: 209322370
Number of successful extensions: 17383937
Number of sequences better than 1.0e-50: 110
length of database: 1,426,479,391
effective HSP length: 60
effective length of database: 1,367,186,851
effective search space used: 869530837236
frameshift window, decay const: 50,  0.5
T: 13
A: 40
X1: 16 ( 7.3 bits)
X2: 0 ( 0.0 bits)
S1: 41 (21.7 bits)
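
I have not pinned down which of these lines trips the parser. As an 
illustration only, here is a tolerant way of reading such "key: value" lines, 
coping with the thousands separators and leaving anything unusual as text. 
parse_stat_line is a hypothetical helper, not the Biopython implementation.

```python
import re

# An integer written with thousands separators, e.g. 10,447,982,379
_THOUSANDS = re.compile(r"\d{1,3}(?:,\d{3})+")


def parse_stat_line(line):
    """Split a 'key: value' line at the first colon.

    Hypothetical helper, not a Biopython function. Returns (key, value),
    where value is an int when it is a plain or comma-grouped integer
    and the stripped string otherwise. Returns None if there is no colon.
    """
    key, sep, value = line.partition(":")
    if not sep:
        return None
    key, value = key.strip(), value.strip()
    if _THOUSANDS.fullmatch(value):
        return key, int(value.replace(",", ""))
    if value.isdigit():
        return key, int(value)
    # Anything else (e.g. "50,  0.5" or "16 ( 7.3 bits)") stays as text.
    return key, value
```

Lines such as the "frameshift window, decay const" one come back as plain 
strings, so nothing is lost and a caller can decide how to split them.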

Thanks for the good work! I appreciate Biopython more and more (not to 
mention Python).

Best regards,

Michel






