[BioPython] Bug in Bio.SeqUtils ?
Iddo Friedberg
idoerg at burnham.org
Thu Feb 2 16:14:57 EST 2006
Oh, sorry.
Your second problem was with protein_scale, which does indeed break on
any letter other than the 20 standard amino acids.
I have wrapped the lookup in a try/except clause, so it now writes a
warning to stderr instead of raising an exception. The change is in CVS.
Yair, is that OK, or would we rather leave the exception raising bit
there? There are arguments either way...
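
For anyone following along, the two behaviours can be sketched roughly as
below. This is a simplified illustration and not the code in ProtParam.py:
the function name windowed_scale, the strict flag, and the toy scale values
are all made up for the example, and it uses a plain unweighted window
average rather than the weighted sum in protein_scale.

```python
import sys


def windowed_scale(sequence, param_dict, window, strict=False):
    """Average a per-residue scale over a sliding window.

    Hypothetical helper for illustration only. Letters missing from
    param_dict raise KeyError when strict is true; otherwise a warning
    goes to stderr and the letter is left out of that window's average.
    """
    scores = []
    for start in range(len(sequence) - window + 1):
        subsequence = sequence[start:start + window]
        total = 0.0
        counted = 0
        for residue in subsequence:
            try:
                total += param_dict[residue]
                counted += 1
            except KeyError:
                if strict:
                    raise
                sys.stderr.write(
                    "warning: %r is not a standard amino acid\n" % residue
                )
        # A window made up entirely of unknown letters scores 0.0.
        scores.append(total / counted if counted else 0.0)
    return scores


# Toy values, not a real hydrophilicity scale.
TOY_SCALE = {"A": 1.0, "G": 3.0}

print(windowed_scale("AXG", TOY_SCALE, 3))  # prints [2.0], warns about 'X'
```

With strict=True the same call raises KeyError: 'X', which is the old
behaviour. Warning keeps batch runs over many PDB entries alive, while
raising makes sure nobody silently gets a score computed from fewer
residues than they asked for.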
./I
Iddo Friedberg wrote:
> Which version are you using? I tried the 1a8y sequence which you gave,
> and also a sequence with an 'X', and they worked fine for me. CVS
> version.
>
> # seq is a Record object. seq.sequence is a string with the protein sequence
>
> >>> from Bio.SeqUtils import ProtParam
> >>> ps = ProtParam.ProteinAnalysis(seq.sequence)
> >>> ps.isoelectric_point()
> 3.9298931884765151
>
>
> # and for a sequence with an 'x'
> >>> ps2 = ProtParam.ProteinAnalysis('xsdfgvcrtyip')
> >>> ps2.isoelectric_point()
> 5.8285980224609375
>
> Bin Hu wrote:
>
>> Hi,
>>
>> When using Bio.SeqUtils to estimate the isoelectric point for PDB entry
>> 1a8y, the function isoelectric_point() never seems to return, although it
>> worked well for all the other entries that I've tested. Could this be
>> a bug in Bio.SeqUtils?
>>
>> If anyone wants to test it, below is the sequence of 1a8y:
>>
>> eegldfpeydgvdrvinvnaknyknvfkkyevlallyheppeddkasqrqfemeelilel
>> aaqvledkgvgfglvdsekdaavakklglteedsiyvfkedevieydgefsadtlvefll
>> dvledpveliegerelqafeniedeikligyfknkdsehykafkeaaeefhpyipffatf
>> dskvakkltlklneidfyeafmeepvtipdkpnseeeivnfveehrrstlrklkpesmye
>> tweddmdgihivafaeeadpdgyefleilksvaqdntdnpdlsiiwidpddfpllvpywe
>> ktfdidlsapqigvvnvtdadsvwmemddeedlpsaeeledwledvlegeintedddded
>> ddddddd
>>
>> For PDB entry 1rb9, the hydrophilicity of the protein cannot be
>> estimated because its sequence starts with "X", which is not in the
>> key list used by SeqUtils. It produces the following error message:
>>
>> Traceback (most recent call last):
>>   File "./dataGen.py", line 62, in ?
>>     aHydrophilicityList = aSeqObj.protein_scale(ProtParamData.hw, 5)
>>   File "/usr/lib/python2.4/site-packages/Bio/SeqUtils/ProtParam.py", line 206, in protein_scale
>>     score += weight[j] * ParamDict[subsequence[j]] + weight[j] * ParamDict[subsequence[Window-j-1]]
>> KeyError: 'X'
>>
>> Although I can delete the "X" from this protein, could the author
>> emit a warning message and work around the error instead of
>> stopping? Thank you.
>>
>> Bin
>>
>> _______________________________________________
>> BioPython mailing list - BioPython at biopython.org
>> http://biopython.org/mailman/listinfo/biopython
--
Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9949
http://iddo-friedberg.org
http://BioFunctionPrediction.org