[BioPython] Bug in Bio.SeqUtils ?

Iddo Friedberg idoerg at burnham.org
Thu Feb 2 13:35:17 EST 2006


Which version are you using? I tried the 1a8y sequence you gave, and also
a sequence with an 'X', and both worked fine for me with the CVS version.

# seq is a Record object. seq.sequence is a string with the protein sequence

 >>> from Bio.SeqUtils import ProtParam
 >>> ps = ProtParam.ProteinAnalysis(seq.sequence)
 >>> ps.isoelectric_point()
3.9298931884765151


# and for a sequence with an 'x'
 >>> ps2 = ProtParam.ProteinAnalysis('xsdfgvcrtyip')
 >>> ps2.isoelectric_point()
5.8285980224609375
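
On the KeyError for 'X' in protein_scale() reported below: until the module
handles this itself, one possible workaround is to filter the sequence before
analysing it. A minimal sketch, assuming a helper named strip_unknown_residues
(hypothetical, not part of Bio.SeqUtils) and a small toy scale dictionary
standing in for ProtParamData.hw:

```python
import warnings


# Hypothetical helper for illustration; it is NOT part of Bio.SeqUtils.
def strip_unknown_residues(sequence, param_dict):
    """Drop residues that have no entry in param_dict, warning once if any are dropped."""
    sequence = sequence.upper()
    unknown = sorted(set(sequence) - set(param_dict))
    if unknown:
        warnings.warn("Dropping residues not in scale: %s" % ", ".join(unknown))
    return "".join(residue for residue in sequence if residue in param_dict)


# Toy scale with illustrative values only; a real scale covers all 20 amino acids.
toy_scale = {"A": -0.5, "S": 0.3, "D": 3.0, "F": -2.5, "G": 0.0}

cleaned = strip_unknown_residues("xsdfg", toy_scale)
print(cleaned)
```

The cleaned string could then be passed to ProtParam.ProteinAnalysis() and
protein_scale(ProtParamData.hw, 5) as in the traceback below. Note that
dropping a residue shifts the window positions relative to the original
sequence, so this is only reasonable when the unknown residues are few.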

Bin Hu wrote:

>Hi,
>
>When using Bio.SeqUtils to estimate the isoelectric point for PDB entry 1a8y,
>the function isoelectric_point() never seems to finish, although it worked
>well for all the other entries that I've tested. Could this be a bug in
>Bio.SeqUtils?
>
>If anyone wants to test it, below is the sequence of 1a8y:
>
>eegldfpeydgvdrvinvnaknyknvfkkyevlallyheppeddkasqrqfemeelilel
>aaqvledkgvgfglvdsekdaavakklglteedsiyvfkedevieydgefsadtlvefll
>dvledpveliegerelqafeniedeikligyfknkdsehykafkeaaeefhpyipffatf
>dskvakkltlklneidfyeafmeepvtipdkpnseeeivnfveehrrstlrklkpesmye
>tweddmdgihivafaeeadpdgyefleilksvaqdntdnpdlsiiwidpddfpllvpywe
>ktfdidlsapqigvvnvtdadsvwmemddeedlpsaeeledwledvlegeintedddded
>ddddddd
>
>For PDB entry 1rb9, the hydrophilicity of the protein cannot be estimated
>because its sequence starts with "X", which is not in the key list used by
>SeqUtils. This raises the following error:
>
>Traceback (most recent call last):
>  File "./dataGen.py", line 62, in ?
>    aHydrophilicityList = aSeqObj.protein_scale(ProtParamData.hw, 5)
>  File "/usr/lib/python2.4/site-packages/Bio/SeqUtils/ProtParam.py", line 206, in protein_scale
>    score += weight[j] * ParamDict[subsequence[j]] + weight[j] * ParamDict[subsequence[Window-j-1]]
>KeyError: 'X'
>
>Although I can delete the "X" in this protein, could the author implement a
>warning message and a workaround so that this error does not stop the run?
>Thank you.
>
>Bin
>


-- 
Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9949
http://iddo-friedberg.org
http://BioFunctionPrediction.org


