[BioPython] Bug in Bio.SeqUtils ?

Bin Hu hubin.keio at gmail.com
Thu Feb 2 09:35:50 EST 2006


Hi,

When I use Bio.SeqUtils to estimate the isoelectric point for PDB entry 1a8y,
the function isoelectric_point() never seems to return, although it has
worked well for every other entry I have tested. Could this be a bug in
Bio.SeqUtils?

If anyone wants to test it, below is the sequence of 1a8y:

eegldfpeydgvdrvinvnaknyknvfkkyevlallyheppeddkasqrqfemeelilel
aaqvledkgvgfglvdsekdaavakklglteedsiyvfkedevieydgefsadtlvefll
dvledpveliegerelqafeniedeikligyfknkdsehykafkeaaeefhpyipffatf
dskvakkltlklneidfyeafmeepvtipdkpnseeeivnfveehrrstlrklkpesmye
tweddmdgihivafaeeadpdgyefleilksvaqdntdnpdlsiiwidpddfpllvpywe
ktfdidlsapqigvvnvtdadsvwmemddeedlpsaeeledwledvlegeintedddded
ddddddd
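
For comparison, here is a minimal sketch of a pI estimate that is guaranteed
to terminate, because it bisects a fixed pH bracket until the bracket is
narrower than a tolerance. It is not the Bio.SeqUtils implementation: the
function names and the pK values are assumptions chosen for illustration.

```python
# Illustrative pK values (assumptions, not the table Bio.SeqUtils uses).
POSITIVE_PK = {"Nterm": 9.0, "K": 10.0, "R": 12.0, "H": 5.98}
NEGATIVE_PK = {"Cterm": 2.0, "D": 4.05, "E": 4.45, "C": 9.0, "Y": 10.0}


def net_charge(sequence, ph):
    """Henderson-Hasselbalch net charge of a peptide at a given pH."""
    seq = sequence.upper()
    charge = 0.0
    for group, pk in POSITIVE_PK.items():
        count = 1 if group == "Nterm" else seq.count(group)
        charge += count / (1.0 + 10 ** (ph - pk))   # protonated bases
    for group, pk in NEGATIVE_PK.items():
        count = 1 if group == "Cterm" else seq.count(group)
        charge -= count / (1.0 + 10 ** (pk - ph))   # deprotonated acids
    return charge


def estimate_pi(sequence, low=0.0, high=14.0, tolerance=0.001):
    """Bisect on pH until the bracket is narrower than `tolerance`."""
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid    # still positive: the pI lies at a higher pH
        else:
            high = mid   # zero or negative: the pI lies at a lower pH
    return (low + high) / 2.0
```

Calling estimate_pi() on the 1a8y sequence above always returns, since the
loop halves the bracket on every pass and never waits for the charge to hit
an exact value.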

For PDB entry 1rb9, the hydrophilicity cannot be estimated because the
sequence starts with "X", which is not a key in the scale dictionary
(ProtParamData.hw) passed to protein_scale(). This raises the following
error:

Traceback (most recent call last):
  File "./dataGen.py", line 62, in ?
    aHydrophilicityList = aSeqObj.protein_scale(ProtParamData.hw, 5)
  File "/usr/lib/python2.4/site-packages/Bio/SeqUtils/ProtParam.py", line 206, in protein_scale
    score += weight[j] * ParamDict[subsequence[j]] + weight[j] * ParamDict[subsequence[Window-j-1]]
KeyError: 'X'

Although I can delete the "X" from this protein myself, could the author add
a warning message and work around the unknown residue instead of stopping
with an error? Thank you.
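
To illustrate the behaviour I am asking for, here is a rough sketch of a
sliding-window scale that warns about residues missing from the scale and
skips them instead of raising KeyError. The function window_scale() is
hypothetical and is not part of Bio.SeqUtils. It uses a plain unweighted
average rather than the weighted sum visible in the traceback, and it assumes
the scale dictionary is keyed by uppercase one-letter codes.

```python
import warnings


def window_scale(sequence, scale, window):
    """Average `scale` values over each window, skipping unknown residues.

    Hypothetical helper: residues missing from `scale` produce one warning
    and are left out of the average. A window with no known residue at all
    yields None.
    """
    seq = sequence.upper()
    unknown = sorted(set(seq) - set(scale))
    if unknown:
        warnings.warn("residues not in scale: %s" % ", ".join(unknown))
    scores = []
    for start in range(len(seq) - window + 1):
        values = [scale[res] for res in seq[start:start + window]
                  if res in scale]
        scores.append(sum(values) / len(values) if values else None)
    return scores
```

With something like this, a sequence starting with "X" would still produce a
profile, and the warning would tell the user which residues were ignored.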

Bin


