[BioPython] Bug in Bio.SeqUtils ?
Iddo Friedberg
idoerg at burnham.org
Thu Feb 2 13:35:17 EST 2006
Which version are you using? I tried both the 1a8y sequence you gave and a
sequence containing an 'X', and both worked fine for me with the CVS version.
# seq is a Record object. seq.sequence is a string with the protein sequence
>>> from Bio.SeqUtils import ProtParam
>>> ps = ProtParam.ProteinAnalysis(seq.sequence)
>>> ps.isoelectric_point()
3.9298931884765151
# and for a sequence with an 'x'
>>> ps2 = ProtParam.ProteinAnalysis('xsdfgvcrtyip')
>>> ps2.isoelectric_point()
5.8285980224609375
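
The "cannot reach an end" symptom in the quoted message below would be consistent with a search loop whose stopping condition is never met. The following is a minimal sketch, not the ProtParam code: it assumes the isoelectric point is found by bisecting on pH until the net charge is near zero, and it adds an iteration cap so the loop always ends. The pK values and the names net_charge and estimate_pi are illustrative assumptions.

```python
# Illustrative pK values (assumed for this sketch; not taken from Bio.SeqUtils).
POSITIVE_PK = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PK = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PK = 8.6
C_TERM_PK = 3.6


def net_charge(sequence, ph):
    """Approximate net charge of a peptide at the given pH."""
    seq = sequence.upper()
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PK))   # free N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PK - ph))  # free C-terminus
    for residue, pk in POSITIVE_PK.items():
        charge += seq.count(residue) / (1.0 + 10 ** (ph - pk))
    for residue, pk in NEGATIVE_PK.items():
        charge -= seq.count(residue) / (1.0 + 10 ** (pk - ph))
    return charge


def estimate_pi(sequence, tolerance=1e-4, max_iterations=100):
    """Bisect on pH in [0, 14] until the net charge is close to zero.

    The iteration cap makes the loop end even if the tolerance is never met.
    """
    low, high = 0.0, 14.0
    mid = 7.0
    for _ in range(max_iterations):
        mid = (low + high) / 2.0
        charge = net_charge(sequence, mid)
        if abs(charge) < tolerance:
            break
        if charge > 0:
            low = mid   # still positively charged: pI lies at a higher pH
        else:
            high = mid  # negatively charged: pI lies at a lower pH
    return mid
```

Because the net charge only decreases as pH rises, the bisection narrows the interval on every pass, and the cap bounds the worst case at max_iterations steps.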
Bin Hu wrote:
>Hi,
>
>When using Bio.SeqUtils to estimate the isoelectric point for PDB entry 1a8y,
>the function isoelectric_point() never seems to finish, although it worked
>well for all the other entries that I've tested. Could this be a bug in
>Bio.SeqUtils?
>
>If anyone wants to test it, below is the sequence of 1a8y:
>
>eegldfpeydgvdrvinvnaknyknvfkkyevlallyheppeddkasqrqfemeelilel
>aaqvledkgvgfglvdsekdaavakklglteedsiyvfkedevieydgefsadtlvefll
>dvledpveliegerelqafeniedeikligyfknkdsehykafkeaaeefhpyipffatf
>dskvakkltlklneidfyeafmeepvtipdkpnseeeivnfveehrrstlrklkpesmye
>tweddmdgihivafaeeadpdgyefleilksvaqdntdnpdlsiiwidpddfpllvpywe
>ktfdidlsapqigvvnvtdadsvwmemddeedlpsaeeledwledvlegeintedddded
>ddddddd
>
>For PDB entry 1rb9, the hydrophilicity of this protein cannot be estimated
>because its sequence starts with "X", which is not in the key list used by
>SeqUtils. This produces the following error message:
>
>Traceback (most recent call last):
>  File "./dataGen.py", line 62, in ?
>    aHydrophilicityList = aSeqObj.protein_scale(ProtParamData.hw, 5)
>  File "/usr/lib/python2.4/site-packages/Bio/SeqUtils/ProtParam.py", line 206, in protein_scale
>    score += weight[j] * ParamDict[subsequence[j]] + weight[j] * ParamDict[subsequence[Window-j-1]]
>KeyError: 'X'
>
>Although I can delete the "X" from this protein, could the author emit a
>warning message and work around the error instead of stopping? Thank you.
>
>Bin
>
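
For the KeyError on 'X' in the traceback above, one possible workaround on the caller's side is to filter the sequence before it reaches protein_scale and to warn about what was dropped. This is only a sketch under assumptions: strip_unknown_residues and STANDARD_RESIDUES are hypothetical names, not part of Bio.SeqUtils, and the set of 20 standard one-letter codes is assumed to match the keys of the scale being used.

```python
import warnings

# Assumed key set: the 20 standard amino acid one-letter codes.
STANDARD_RESIDUES = frozenset("ACDEFGHIKLMNPQRSTVWY")


def strip_unknown_residues(sequence, known=STANDARD_RESIDUES):
    """Return the upper-cased sequence without residues missing from `known`.

    Emits one warning listing the dropped codes instead of failing later
    with a KeyError during the scale lookup.
    """
    seq = sequence.upper()
    unknown = sorted(set(seq) - set(known))
    if unknown:
        warnings.warn(
            "Dropping residues not in the scale: %s" % ", ".join(unknown)
        )
    return "".join(residue for residue in seq if residue in known)
```

The cleaned string could then be passed to ProtParam.ProteinAnalysis before calling protein_scale. Note that dropping residues shifts window positions relative to the original sequence, so the resulting profile no longer lines up index-for-index with the PDB entry.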
--
Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9949
http://iddo-friedberg.org
http://BioFunctionPrediction.org