[Biopython-dev] Re: [BioPython] Bug in Bio.SeqUtils ?
Yair Benita
yair.benita at gmail.com
Fri Feb 3 03:39:38 EST 2006
Hi,
Sorry I failed to follow up on that bug.
I need to revise the isoelectric point anyway since in some rare cases it
gets stuck in an endless while loop. I will also look into adding code to
handle the X in the amino acid sequence. For now I think it's OK to produce a
warning instead of an exception.
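
To make the two points above concrete, here is a rough sketch, not the code in
CVS. It assumes a plain bisection search for the isoelectric point with a hard
cap on the number of steps (so it cannot loop forever), and an unweighted
sliding-window mean that warns about and skips residues missing from the scale.
The pK values and the names net_charge, bounded_isoelectric_point and
windowed_scale are made up for illustration; they are not part of
Bio.SeqUtils.

```python
import warnings

# Illustrative pK values (assumptions for this sketch, not Biopython's tables).
POSITIVE_PK = {"K": 10.0, "R": 12.0, "H": 5.98, "Nterm": 9.0}
NEGATIVE_PK = {"D": 4.05, "E": 4.45, "C": 9.0, "Y": 10.0, "Cterm": 2.0}


def net_charge(sequence, ph):
    """Henderson-Hasselbalch net charge; residues without a pK are ignored."""
    charge = 1.0 / (1.0 + 10 ** (ph - POSITIVE_PK["Nterm"]))
    charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PK["Cterm"] - ph))
    for aa in sequence.upper():
        if aa in POSITIVE_PK:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PK[aa]))
        elif aa in NEGATIVE_PK:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PK[aa] - ph))
    return charge


def bounded_isoelectric_point(sequence, tolerance=1e-4, max_steps=100):
    """Bisect pH in [0, 14]; the for loop guarantees termination."""
    low, high = 0.0, 14.0
    mid = 7.0
    for _ in range(max_steps):
        mid = (low + high) / 2.0
        charge = net_charge(sequence, mid)
        if abs(charge) < tolerance:
            break
        if charge > 0:
            low = mid   # still positively charged: pI is at a higher pH
        else:
            high = mid  # negatively charged: pI is at a lower pH
    return mid


def windowed_scale(sequence, scale, window):
    """Unweighted mean per window; unknown residues warn and are skipped."""
    seq = sequence.upper()
    scores = []
    for start in range(len(seq) - window + 1):
        values = []
        for aa in seq[start:start + window]:
            try:
                values.append(scale[aa])
            except KeyError:
                warnings.warn("residue %r not in scale; skipped" % aa)
        # A window made up only of unknown residues has no score.
        scores.append(sum(values) / len(values) if values else None)
    return scores
```

Skipping the unknown residue changes the effective window size, so a real fix
might prefer to assign X a neutral value instead; that is a design choice I
have not settled on.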
Yair
on 2/2/06 10:14 PM, Iddo Friedberg at idoerg at burnham.org wrote:
> Oh, sorry.
>
> Your second problem was with protein_scale, which does indeed break on
> any letter not of the 20 regular amino acids.
>
> I inserted this into a try/except clause which produces a warning to
> stderr, instead of raising an exception. It is now in CVS.
>
> Yair, is that OK, or would we rather leave the exception raising bit
> there? There are arguments either way...
>
>
> ./I
>
>
> Iddo Friedberg wrote:
>
>> Which version are you using? I tried the 1a8y sequence which you gave,
>> and also a sequence with an 'X', and they worked fine for me. CVS
>> version.
>>
>> # seq is a Record object. seq.sequence is a string with the protein
>> # sequence
>>
>> >>> from Bio.SeqUtils import ProtParam
>> >>> ps = ProtParam.ProteinAnalysis(seq.sequence)
>> >>> ps.isoelectric_point()
>> 3.9298931884765151
>>
>>
>> # and for a sequence with an 'x'
>> >>> ps2 = ProtParam.ProteinAnalysis('xsdfgvcrtyip')
>> >>> ps2.isoelectric_point()
>> 5.8285980224609375
>>
>> Bin Hu wrote:
>>
>>> Hi,
>>>
>>> When using Bio.SeqUtils to estimate the isoelectric point for PDB entry
>>> 1a8y, the function isoelectric_point() never seems to terminate, although
>>> it worked well for all the other entries that I've tested. Could this be
>>> a bug in Bio.SeqUtils?
>>>
>>> If anyone wants to test it, below is the sequence of 1a8y:
>>>
>>> eegldfpeydgvdrvinvnaknyknvfkkyevlallyheppeddkasqrqfemeelilel
>>> aaqvledkgvgfglvdsekdaavakklglteedsiyvfkedevieydgefsadtlvefll
>>> dvledpveliegerelqafeniedeikligyfknkdsehykafkeaaeefhpyipffatf
>>> dskvakkltlklneidfyeafmeepvtipdkpnseeeivnfveehrrstlrklkpesmye
>>> tweddmdgihivafaeeadpdgyefleilksvaqdntdnpdlsiiwidpddfpllvpywe
>>> ktfdidlsapqigvvnvtdadsvwmemddeedlpsaeeledwledvlegeintedddded
>>> ddddddd
>>>
>>> For PDB entry 1rb9, the hydrophilicity of this protein cannot be estimated
>>> because its sequence starts with "X", which is not in the key list used by
>>> SeqUtils. It produces the following error message:
>>>
>>> Traceback (most recent call last):
>>>   File "./dataGen.py", line 62, in ?
>>>     aHydrophilicityList = aSeqObj.protein_scale(ProtParamData.hw, 5)
>>>   File "/usr/lib/python2.4/site-packages/Bio/SeqUtils/ProtParam.py", line 206, in protein_scale
>>>     score += weight[j] * ParamDict[subsequence[j]] + weight[j] * ParamDict[subsequence[Window-j-1]]
>>> KeyError: 'X'
>>>
>>> Although I can delete the "X" from this protein, could the author
>>> implement a warning message instead, so that this error does not stop
>>> the run? Thank you.
>>>
>>> Bin