[BioPython] Protparam using BioPython

Shameer Khadar skhadar at gmail.com
Mon Apr 30 13:01:56 UTC 2007


Dear Peter,

Thanks a lot for your detailed reply and splendid help!!!
It worked !!
Cheers,
Shameer

On 4/27/07, Peter <biopython at maubp.freeserve.co.uk> wrote:
>
> Shameer Khadar wrote:
> > Dear Peter,
> >
> > Thanks for your reply.
>
> Sorry for the delay - I was away on a course this week.
>
> > I was looking for a script based on Bio.SeqUtils.
> > I got the following script from a website, and it works perfectly for me.
> > The problem is that I have around 1000 sequences (in raw format, without
> > headers), and I thought I would process them using a foreach equivalent
> > in Python (I am a Python newbie). But only a couple of minutes ago I
> > learned that there is no foreach in Python, though some better
> > alternative is available!
>
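> There is a "for each" equivalent in Python: the ordinary "for" statement,
> which loops directly over the items of a list or the lines of a file.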
> http://docs.python.org/tut/node6.html
>
> If you don't have a good introductory python book, that online tutorial
> is an excellent starting point.
>
> > It will be great if you can help to process my file using this
> > program.
> >
> > program :
> > from Bio.SeqUtils import ProtParam, ProtParamData
> > def PrintDictionary(MyDict):
> >         for i in MyDict.keys():
> >                 print "%s\t%.2f" %(i, MyDict[i])
> >         print "MAEGEITTFTALTEKFNLPPGNYKKPKLLYCSNGGHFL"
> > X = ProtParam.ProteinAnalysis("")
> > print "Instability index of test protein: %.2f" % X.instability_index()
>
> It seems like you have only given bits of a program, so I have tried to
> guess what you meant.
>
> > first few lines of my file :
> > AEGEFAHLYGTFRED
> > AEGEFAHLZGTFRED
> > AEGEFGATYGVYTSD
> > AEGEFGATZGVYTSD
> > AEGEFGATYGVZTSD
> > AEGEFGATZGVZTSD
> > AEGEFLYGEIQGTQD
>
> In the following example, I am assuming your sequences are in a plain
> text file, called protparam.txt, which contains each sequence on a
> single line.
>
> Try something like this first of all, and make sure that it prints out
> your sequences correctly:
>
> for line in open("protparam.txt") :
>      #Remove any trailing new lines or white space
>      seq_string = line.rstrip()
>      print "Sequence <%s>" % seq_string
>
> Then try doing the ProtParam.ProteinAnalysis of each sequence string:
>
> from Bio.SeqUtils import ProtParam, ProtParamData
> for line in open("protparam.txt") :
>      #Remove any trailing new lines or white space
>      seq_string = line.rstrip()
>      print "Sequence <%s>" % seq_string
>      X = ProtParam.ProteinAnalysis(seq_string)
>      print "Instability index: %.2f" % X.instability_index()
>
> You'll find it fails with a KeyError on the "Z" (presumably this is Glx,
> i.e. glutamic acid or glutamine, E or Q) present in many of your
> sequences, so this next version uses error handling to note this and
> then carry on to the next sequence:
>
> from Bio.SeqUtils import ProtParam, ProtParamData
> for line in open("protparam.txt") :
>      #Remove any trailing new lines or white space
>      seq_string = line.rstrip()
>
>      print #blank line
>      print "Sequence <%s>" % seq_string
>      X = ProtParam.ProteinAnalysis(seq_string)
>      try :
>          print "Instability index: %.2f" % X.instability_index()
>      except KeyError, e :
>          print "Problem with the letter %s in the sequence?" % str(e)
>
> The output is:
>
> Sequence <AEGEFAHLYGTFRED>
> Instability index: 8.39
>
> Sequence <AEGEFAHLZGTFRED>
> Problem with the letter 'Z' in the sequence?
>
> Sequence <AEGEFGATYGVYTSD>
> Instability index: -17.70
>
> Sequence <AEGEFGATZGVYTSD>
> Problem with the letter 'Z' in the sequence?
>
> Sequence <AEGEFGATYGVZTSD>
> Problem with the letter 'Z' in the sequence?
>
> Sequence <AEGEFGATZGVZTSD>
> Problem with the letter 'Z' in the sequence?
>
> Sequence <AEGEFLYGEIQGTQD>
> Instability index: 8.61
>
> You'll have to check yourself to see if these numbers are sensible.  I
> don't know what to suggest for your "Z" entries - the stability will be
> different if you try using E or Q instead.
>
> Peter
>
>
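For anyone running the quoted loop under Python 3, where print is a
function and the exception clause is spelled "except KeyError as e", here
is a minimal sketch of the same idea. The names report_instability,
toy_analyse and TOY_SCORES are made up for illustration and are not part
of Biopython; the toy analysis is only a stand-in so the example runs
without Biopython installed.

```python
def report_instability(lines, analyse):
    """Return one report line per non-blank sequence.

    `analyse` is any callable that takes a sequence string and returns a
    number, raising KeyError for a letter it has no data for.
    """
    report = []
    for line in lines:
        # Remove any trailing new lines or white space
        seq_string = line.rstrip()
        if not seq_string:
            continue  # skip blank lines
        try:
            report.append("%s\t%.2f" % (seq_string, analyse(seq_string)))
        except KeyError as err:  # Python 3 spelling of "except KeyError, e"
            report.append("%s\tproblem with letter %s" % (seq_string, err))
    return report


# Hypothetical stand-in table: every standard amino acid scores 1.0.
TOY_SCORES = dict.fromkeys("ACDEFGHIKLMNPQRSTVWY", 1.0)


def toy_analyse(seq):
    """Toy analysis: raises KeyError for any letter missing from TOY_SCORES."""
    return sum(TOY_SCORES[letter] for letter in seq)


for entry in report_instability(
    ["AEGEFAHLYGTFRED\n", "AEGEFAHLZGTFRED\n"], toy_analyse
):
    print(entry)
```

With Biopython installed, the second argument could instead be something
like lambda s: ProtParam.ProteinAnalysis(s).instability_index(), assuming
it still raises KeyError for letters such as Z as described in the thread.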

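An alternative to catching the KeyError is to check each line for
unexpected letters before analysing it. A rough sketch, assuming the file
should contain only the 20 standard one-letter amino acid codes;
read_sequences and unexpected_letters are hypothetical helper names, not
Biopython functions.

```python
# The 20 standard one-letter amino acid codes (assumed to be the only
# letters the analysis has data for).
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def read_sequences(lines):
    """Yield one sequence per non-blank line, with white space stripped."""
    for line in lines:
        seq = line.strip()
        if seq:
            yield seq


def unexpected_letters(seq):
    """Return the sorted letters in seq that are not standard amino acids."""
    return sorted(set(seq) - STANDARD_AMINO_ACIDS)


example_lines = ["AEGEFAHLYGTFRED\n", "AEGEFAHLZGTFRED\n", "\n"]
for seq in read_sequences(example_lines):
    bad = unexpected_letters(seq)
    if bad:
        print("Skipping <%s>: unexpected letters %s" % (seq, ", ".join(bad)))
    else:
        print("OK <%s>" % seq)
```

Sequences that pass the check can then be handed to
ProtParam.ProteinAnalysis as in the quoted loops.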

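Since Z could be either E or Q, one way to see how much the choice matters
is to generate every resolved variant of a sequence and analyse each one.
A small sketch, assuming Z is the only ambiguous letter of interest;
expand_ambiguous and the AMBIGUOUS table are illustrative names, not
Biopython API.

```python
from itertools import product

# Assumption: Z stands for Glx, i.e. either E (glutamic acid) or
# Q (glutamine). Other ambiguity codes could be added to this table.
AMBIGUOUS = {"Z": "EQ"}


def expand_ambiguous(seq, ambiguous=AMBIGUOUS):
    """Return every sequence obtained by resolving each ambiguous letter."""
    # Each position contributes either its own letter or its alternatives.
    choices = [ambiguous.get(letter, letter) for letter in seq]
    return ["".join(combo) for combo in product(*choices)]


for variant in expand_ambiguous("AEGEFGATZGVZTSD"):
    print(variant)
```

A sequence with n ambiguous letters yields 2**n variants, so this is only
practical when each sequence has a handful of Z residues.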
