[BioPython] Bug in Bio.SeqUtils ?

Bin Hu hubin.keio at gmail.com
Fri Feb 3 09:27:10 EST 2006


Thank you for your reply. I am using Python 2.4 and BioPython 1.41
(installed from source). I will check the CVS version when I get some time.

Regards,
Bin

On 2/3/06, Iddo Friedberg <idoerg at burnham.org> wrote:
> Oh, sorry.
>
> Your second problem was with protein_scale, which does indeed break on
> any letter not of the 20 regular amino acids.
>
> I inserted this into a try/except clause which produces a warning to
> stderr, instead of raising an exception. It is now in CVS.
>
> Yair, is that OK, or would we rather leave the exception raising bit
> there? There are arguments either way...
>
>
> ./I
>
>
> Iddo Friedberg wrote:
>
> > Which version are you using? I tried the 1a8y sequence which you gave,
> > and also a sequence with an 'X', and both worked fine for me with the
> > CVS version.
> >
> > # seq is a Record object; seq.sequence is a string with the protein
> > # sequence
> >
> > >>> from Bio.SeqUtils import ProtParam
> > >>> ps = ProtParam.ProteinAnalysis(seq.sequence)
> > >>> ps.isoelectric_point()
> > 3.9298931884765151
> >
> >
> > # and for a sequence with an 'x'
> > >>> ps2 = ProtParam.ProteinAnalysis('xsdfgvcrtyip')
> > >>> ps2.isoelectric_point()
> > 5.8285980224609375
> >
> > Bin Hu wrote:
> >
> >> Hi,
> >>
> >> When using Bio.SeqUtils to estimate the isoelectric point for PDB entry
> >> 1a8y, the function isoelectric_point() never seems to return, although it
> >> worked well for all the other entries that I've tested. Could this be
> >> a bug in Bio.SeqUtils?
> >>
> >> If anyone wants to test it, below is the sequence of 1a8y:
> >>
> >> eegldfpeydgvdrvinvnaknyknvfkkyevlallyheppeddkasqrqfemeelilel
> >> aaqvledkgvgfglvdsekdaavakklglteedsiyvfkedevieydgefsadtlvefll
> >> dvledpveliegerelqafeniedeikligyfknkdsehykafkeaaeefhpyipffatf
> >> dskvakkltlklneidfyeafmeepvtipdkpnseeeivnfveehrrstlrklkpesmye
> >> tweddmdgihivafaeeadpdgyefleilksvaqdntdnpdlsiiwidpddfpllvpywe
> >> ktfdidlsapqigvvnvtdadsvwmemddeedlpsaeeledwledvlegeintedddded
> >> ddddddd
> >>
> >> For PDB entry 1rb9, the hydrophilicity of the protein cannot be estimated
> >> because its sequence starts with "X", which is not in the key list used by
> >> SeqUtils. It produces the following error message:
> >>
> >> Traceback (most recent call last):
> >>  File "./dataGen.py", line 62, in ?
> >>    aHydrophilicityList = aSeqObj.protein_scale(ProtParamData.hw, 5)
> >>  File "/usr/lib/python2.4/site-packages/Bio/SeqUtils/ProtParam.py", line 206, in protein_scale
> >>    score += weight[j] * ParamDict[subsequence[j]] + weight[j] * ParamDict[subsequence[Window-j-1]]
> >> KeyError: 'X'
> >>
> >> Although I can delete the "X" from this protein, could the author
> >> emit a warning message and work around the error instead of stopping?
> >> Thank you.
> >>
> >> Bin
> >>
> >> _______________________________________________
> >> BioPython mailing list  -  BioPython at biopython.org
> >> http://biopython.org/mailman/listinfo/biopython
> >>
> >>
> >>
> >>
> >
> >
>
>
> --
> Iddo Friedberg, Ph.D.
> Burnham Institute for Medical Research
> 10901 N. Torrey Pines Rd.
> La Jolla, CA 92037 USA
> Tel: +1 (858) 646 3100 x3516
> Fax: +1 (858) 713 9949
> http://iddo-friedberg.org
> http://BioFunctionPrediction.org
>
>
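
For readers of the archive, the try/except approach described in the quoted
reply (warn on stderr instead of raising KeyError) can be sketched roughly as
follows. This is not the code that went into CVS: windowed_scale and TOY_SCALE
are hypothetical names, the scale values are made up, and the window score is
a plain mean over the recognised letters rather than the edge-weighted sum
that protein_scale computes.

```python
import sys


def windowed_scale(sequence, scale, window):
    """Mean scale value for each sliding window of the sequence.

    Letters missing from `scale` produce a warning on stderr and are
    left out of the mean instead of raising KeyError.
    """
    sequence = sequence.upper()
    scores = []
    for start in range(len(sequence) - window + 1):
        total = 0.0
        counted = 0
        for letter in sequence[start:start + window]:
            try:
                total += scale[letter]
                counted += 1
            except KeyError:
                sys.stderr.write(
                    "warning: %r is not in the scale; skipping it\n" % letter
                )
        # A window made only of unknown letters has no meaningful score.
        scores.append(total / counted if counted else None)
    return scores


# Made-up values for illustration only; this is not a real hydrophilicity scale.
TOY_SCALE = {"A": 1.0, "C": 2.0, "D": 3.0}
```

Calling windowed_scale("XACD", TOY_SCALE, 2) warns once about 'X' and still
returns a score for every window. Whether to warn or to raise is the open
question in the reply above; skipping the letter silently changes the window
average, which is one argument for keeping the exception.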

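The thread does not say why isoelectric_point() failed to return for 1a8y.
One general way to make such a search terminate is to bisect the pH range for
a fixed number of steps, as in the sketch below. It is an illustration under
stated assumptions, not Biopython's algorithm: the function names are
hypothetical, and the pK values are rough, illustrative numbers rather than
the table that ProtParam uses, so the results will differ from those quoted
above.

```python
# Rough, illustrative pK values (assumptions, not Biopython's table).
POSITIVE_PK = {"Nterm": 9.0, "K": 10.0, "R": 12.0, "H": 6.0}
NEGATIVE_PK = {"Cterm": 2.0, "D": 4.0, "E": 4.5, "C": 8.5, "Y": 10.0}


def net_charge(sequence, ph):
    """Henderson-Hasselbalch estimate of the net charge at a given pH."""
    sequence = sequence.upper()
    # The two chain termini are always present.
    charge = 1.0 / (1.0 + 10 ** (ph - POSITIVE_PK["Nterm"]))
    charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PK["Cterm"] - ph))
    for letter in sequence:
        if letter in POSITIVE_PK:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PK[letter]))
        elif letter in NEGATIVE_PK:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PK[letter] - ph))
        # Any other letter, including 'X', carries no charge here.
    return charge


def estimate_pi(sequence, iterations=50):
    """Bisect pH 0..14 for the point where the net charge crosses zero.

    The loop runs a fixed number of times, so it always terminates.
    """
    low, high = 0.0, 14.0
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive: the crossing is at higher pH
        else:
            high = mid  # zero or negative: the crossing is at lower pH
    return (low + high) / 2.0
```

Because the net charge only decreases as the pH rises, fifty halvings of the
0 to 14 interval narrow the estimate far below any useful precision, and a
very acidic sequence such as 1a8y simply ends up near the low end of the
range instead of looping.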

