[Biopython] MUSCLE gap extend penalty

Peter Cock p.j.a.cock at googlemail.com
Tue Mar 14 22:24:43 UTC 2017


Hi Joshua

You're exactly right - this isn't in the Biopython wrapper (yet), only
the gapopen option is there. Perhaps gapextend was new in Muscle 3.8?

Could you have a look at the code for the wrapper and see if you think
you could open a pull request to add this?
https://github.com/biopython/biopython/blob/master/Bio/Align/Applications/_Muscle.py

In any case, please file an issue on GitHub about this:
https://github.com/biopython/biopython/issues
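
Until the wrapper supports it, one possible workaround is to bypass
MuscleCommandline and build the argument list by hand. The sketch below is
only an illustration. It assumes the installed MUSCLE binary accepts
-gapextend as the 3.8 user guide describes, and that it reads FASTA from
standard input when no -in is given. muscle_command and run_muscle are
hypothetical helper names, not part of Biopython.

```python
import subprocess


def muscle_command(gapopen, gapextend, center=0.0, exe="muscle"):
    """Return a MUSCLE argument list that includes -gapextend.

    Assumption: the installed MUSCLE understands -gapextend. The flag is
    passed straight to the program, so Biopython never validates it.
    """
    return [
        exe,
        "-clwstrict",
        "-gapopen", str(gapopen),
        "-gapextend", str(gapextend),
        "-center", str(center),
    ]


def run_muscle(fasta_text, cmd):
    """Pipe FASTA text through MUSCLE and return its stdout as text."""
    child = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # communicate() writes stdin and drains both pipes, avoiding deadlock.
    out, err = child.communicate(fasta_text)
    if child.returncode != 0:
        raise RuntimeError(err)
    return out
```

The text returned by run_muscle(fasta_text, muscle_command(-11.0, -1.0))
could then be wrapped in StringIO and handed to AlignIO.read(..., "clustal"),
much as in the original script.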

Thanks,

Peter

On Tue, Mar 14, 2017 at 1:09 PM, Joshua Meyers <Joshua.Meyers at icr.ac.uk> wrote:
> Hi All,
>
> I am using the MUSCLE command line wrapper for pairwise sequence alignments
> (using StringIO but that isn’t the issue here…).
> It works nicely, until I try to specify a gapextend penalty. It seems that
> this Option was neglected in the command line wrapper?
> The Option does exist in the MUSCLE docs:
> http://www.drive5.com/muscle/muscle_userguide3.8.html
>
> import subprocess
> import sys
>
> from Bio import AlignIO, SeqIO
> from Bio.Align.Applications import MuscleCommandline
> from Bio.SeqRecord import SeqRecord
>
> records = [SeqRecord(full_ref_seq, id="ref"), SeqRecord(full_fit_seq, id="fit")]
> muscle_cline = MuscleCommandline(clwstrict=True, matrix='blosum62',
>                                  gapopen=-11.0, gapextend=-1.0, center=0.0)
> child = subprocess.Popen(str(muscle_cline),
>                          stdin=subprocess.PIPE,
>                          stdout=subprocess.PIPE,
>                          stderr=subprocess.PIPE,
>                          universal_newlines=True,
>                          shell=(sys.platform!="win32"))
> SeqIO.write(records, child.stdin, "fasta")
> child.stdin.close()
> align = AlignIO.read(child.stdout, format="clustal")
>
> ValueError: Option name gapextend was not found.
>
>
> Upon inspecting the muscle_cline object, this kwarg is indeed absent (it
> exists in the analogous clustal wrapper).
> print dir(muscle_cline)
>
> ['__call__', '__class__', '__delattr__', '__dict__', '__doc__',
> '__format__', '__getattribute__', '__hash__', '__init__', '__module__',
> '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
> '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_check_value',
> '_clear_parameter', '_get_parameter', '_validate', 'anchors',
> 'anchorspacing', 'center', 'cluster1', 'cluster2', 'clw', 'clwout',
> 'clwstrict', 'clwstrictout', 'core', 'diaglength', 'diagmargin', 'diags',
> 'distance1', 'distance2', 'fasta', 'fastaout', 'gapopen', 'group', 'html',
> 'htmlout', 'hydro', 'hydrofactor', 'in1', 'in2', 'input', 'le', 'log',
> 'loga', 'maxdiagbreak', 'maxhours', 'maxiters', 'maxtrees',
> 'minbestcolscore', 'minsmoothscore', 'msf', 'msfout', 'noanchors', 'nocore',
> 'objscore', 'out', 'parameters', 'phyi', 'phyiout', 'phys', 'physout',
> 'profile', 'program_name', 'quiet', 'refine', 'root1', 'root2', 'seqtype',
> 'set_parameter', 'smoothscoreceil', 'smoothwindow', 'sp', 'spn', 'stable',
> 'sueff', 'sv', 'tree1', 'tree2', 'verbose', 'version', 'weight1', 'weight2']
>
>
>
> Is there another way around this? Any help would be much appreciated.
>
> Thanks in advance,
>
> Josh