[Biopython] Extract the similarity after using Water in Emboss

Islam Amin eng.islamamin at gmail.com
Wed Feb 22 01:16:41 UTC 2017


Many thanks, Brian, for your comments. I'm new to Biopython. Following your
guidance, I wrote a script to get the similarity value; its output is
"92.3":

from Bio.Emboss.Applications import WaterCommandline

water_cmd = WaterCommandline(gapopen=10, gapextend=0.5, stdout=True, auto=True)
water_cmd.asequence = "asis:ACCCGGGCGCGGT"
water_cmd.bsequence = "asis:ACCCGAGCGCGGT"
# Running the command returns a (stdout, stderr) tuple of strings.
stdout, stderr = water_cmd()
for line in stdout.splitlines():
    # The report line looks like "# Similarity:    12/13 (92.3%)".
    if "Similarity:" in line.split():
        # Take the number between "(" and "%)".
        similarity = float(line[line.index("(") + 1:line.rindex("%)")])
        print(similarity)
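
The same extraction could also be written with a regular expression, as Brian suggested. The following is only a sketch: parse_similarity and SIMILARITY_RE are names made up for illustration (they are not part of Biopython), and the sample text is an excerpt of the water report quoted below, so it runs without EMBOSS installed.

```python
import re

# Hypothetical helper, not part of Biopython: matches the "# Similarity:"
# header line of an EMBOSS srspair report, e.g.
# "# Similarity:    12/13 (92.3%)".
SIMILARITY_RE = re.compile(
    r"^#\s*Similarity:\s*(\d+)/(\d+)\s*\(\s*([\d.]+)%\)", re.MULTILINE
)


def parse_similarity(report):
    """Return (similar_positions, alignment_length, percent), or None."""
    match = SIMILARITY_RE.search(report)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


# Excerpt of the report header from this thread (assumed layout).
sample = (
    "# Length: 13\n"
    "# Identity:      12/13 (92.3%)\n"
    "# Similarity:    12/13 (92.3%)\n"
    "# Gaps:           0/13 ( 0.0%)\n"
    "# Score: 56.0\n"
)

print(parse_similarity(sample))  # prints (12, 13, 92.3)
```

With a real run, the first element of the tuple returned by water_cmd() would be passed in place of sample. Returning None when the line is missing lets the caller tell an unexpected report format apart from a genuine 0.0% similarity.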

On Wed, Feb 22, 2017 at 5:26 AM, Brian Osborne <bosborne11 at verizon.net>
wrote:

> Islam,
>
> I’m looking at the Bio.Emboss code and it seems that, generally speaking,
> this module runs EMBOSS applications but does not parse their output, which
> is what would be required to get the “similarity” value.
>
> Also note that in your code example you haven’t yet run water, which you
> would do by calling “water_cmd()”. But you can always parse out this
> similarity value yourself, first by capturing the output, something like this:
>
> >>> water_cmd = WaterCommandline(gapopen=10, gapextend=0.5, stdout=True, auto=True)
> >>> water_cmd.asequence = "asis:ACCCGGGCGCGGT"
> >>> water_cmd.bsequence = "asis:ACCCGAGCGCGGT"
> >>> output = water_cmd()
> >>> output
> ('########################################\n# Program: water\n# Rundate:
> Tue 21 Feb 2017 13:23:38\n# Commandline: water\n#    -auto\n#
> -stdout\n#    -asequence asis:ACCCGGGCGCGGT\n#    -bsequence
> asis:ACCCGAGCGCGGT\n#    -gapopen 10\n#    -gapextend 0.5\n# Align_format:
> srspair\n# Report_file: stdout\n######################
> ##################\n\n#=======================================\n#\n#
> Aligned_sequences: 2\n# 1: asis\n# 2: asis\n# Matrix: EDNAFULL\n#
> Gap_penalty: 10.0\n# Extend_penalty: 0.5\n#\n# Length: 13\n# Identity:
> 12/13 (92.3%)\n# Similarity:    12/13 (92.3%)\n# Gaps:           0/13 (
> 0.0%)\n# Score: 56.0\n# \n#\n#=======================================\n\nasis
>               1 ACCCGGGCGCGGT     13\n
> |||||.|||||||\nasis               1 ACCCGAGCGCGGT
> 13\n\n\n#---------------------------------------\n#---------
> ------------------------------\n', '')
>
> Now just use your regex.
>
> Brian O.
>
>
>
>
> On Feb 19, 2017, at 7:28 PM, Islam Amin <eng.islamamin at gmail.com> wrote:
>
> Dear All,
> I would like to get the similarity between two sequences. I have found
> that there is an attribute called "similarity", but I think it is a boolean
> value. Is there any way to get the similarity value between two sequences
> without writing a text file?
>
> from Bio.Emboss.Applications import WaterCommandline
> water_cmd = WaterCommandline(gapopen=10, gapextend=0.5)
> water_cmd.asequence = "asis:ACCCGGGCGCGGT"
> water_cmd.bsequence = "asis:ACCCGAGCGCGGT"
> print(water_cmd.similarity)
> # Output: None
>


-- 
Best Regards,
Islam Amin.

www.egyptscience.net
Scientific Research Group in EGYPT