[Biopython] Maintaining partition order in concatenated nexus

David Winter djwinter at asu.edu
Sun Jun 14 22:09:04 UTC 2015


Hi Michael,

My apologies -- both for taking so long to get back to you and for having
sent you down the wrong track here.

The idea of using **kwargs in the snippet I sent was to get around
specifying all the arguments to the initializing function Nexus.Nexus. Any
argument collected by **kwargs has to be passed by name, so here you
would have had to use OrderedNexus(input="FILENAME"). (In fact, input is the
_only_ non-self argument to the Nexus initializer, so it would have been
easier just to name it, sorry.)
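
To illustrate the point about **kwargs, here is a minimal sketch that does not
need Biopython at all. Base, KwargsOnly and Named are hypothetical stand-in
classes invented for this example (Base only mimics an initializer with a
single optional "input" argument); they are not part of Bio.Nexus.

```python
class Base(object):
    """Hypothetical stand-in: an initializer with one optional argument."""

    def __init__(self, input=None):
        self.input = input


class KwargsOnly(Base):
    """Subclass whose initializer only accepts keyword arguments."""

    def __init__(self, **kwargs):
        Base.__init__(self, **kwargs)


class Named(Base):
    """Subclass that names the argument, so positional calls work too."""

    def __init__(self, input=None):
        Base.__init__(self, input)


by_keyword = KwargsOnly(input="FILENAME")  # accepted: passed by name
by_position = Named("FILENAME")            # accepted: argument is named

try:
    KwargsOnly("FILENAME")                 # positional argument is rejected
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Calling KwargsOnly("FILENAME") raises the same kind of TypeError as in your
traceback, which is why naming the argument is the simpler choice here.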

As it happens that wouldn't have got you much further, because the combine
function overwrites the character partitions anyway.

Instead, here's a hacky solution that seems to work. Do all the combining
as per usual then overwrite the partition dictionary with an OrderedDict
sorted to match your input order:


from collections import OrderedDict
from Bio.Nexus import Nexus

files = ["zero.nex", "one.nex"]
n = Nexus.Nexus("codonposset.nex")  # from biopython/Tests/Nexus/
combined = Nexus.combine([("zero.nex", n), ("one.nex", n)])
combined.charpartitions["combined"].keys()
# Not the order we want them!
# ['one.nex', 'zero.nex']

# Create a dictionary mapping each filename to its desired position
order_map = {fname: pos for pos, fname in enumerate(files)}
# Replace the 'inner' dictionary with an ordered one, using the map as the
# sorting key
combined.charpartitions["combined"] = OrderedDict(
    sorted(combined.charpartitions["combined"].items(),
           key=lambda x: order_map[x[0]]))

combined.charpartitions["combined"].keys()
# This is the order we wanted
# ['zero.nex', 'one.nex']
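
If you need this in more than one place, the same sorting trick can be wrapped
in a small helper. This is only a sketch: reorder_partitions is a hypothetical
name, the plain dict below is a made-up stand-in for
combined.charpartitions["combined"], and putting unlisted keys at the end in
alphabetical order is an assumption on my part rather than anything Biopython
does.

```python
from collections import OrderedDict


def reorder_partitions(partitions, desired_order):
    """Return an OrderedDict of `partitions` with keys in `desired_order`.

    Keys that are not listed in `desired_order` are placed at the end,
    sorted alphabetically (an assumption made for this sketch).
    """
    rank = {name: pos for pos, name in enumerate(desired_order)}
    end = len(rank)
    return OrderedDict(
        sorted(partitions.items(),
               key=lambda item: (rank.get(item[0], end), item[0]))
    )


# Stand-in for combined.charpartitions["combined"]; the values are made up.
inner = {"one.nex": [3, 4, 5], "zero.nex": [0, 1, 2]}
ordered = reorder_partitions(inner, ["zero.nex", "one.nex"])
print(list(ordered.keys()))
# ['zero.nex', 'one.nex']
```

With a real combined object you would assign the result back, e.g.
combined.charpartitions["combined"] = reorder_partitions(
combined.charpartitions["combined"], files).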

Hope that helps,

David





On Wed, Jun 10, 2015 at 6:14 AM, Michael Gruenstaeudl <
mi.gruenstaeudl at gmail.com> wrote:

> Hi David,
> thank you for your response. However, I am not fully certain that I
> understand your suggested way forward. Is the below example what you meant
> (and if so, how would you debug it)?
> Thank you, Michael
>
>
> >>> from Bio.Nexus import Nexus
> >>> import glob
> >>> file_list = glob.glob('*.nex')
> >>> file_list.sort() # Important step
> >>> nexi = [(fname, Nexus.Nexus(fname)) for fname in file_list]
> >>> from collections import OrderedDict
> >>> class OrderedNexus(Nexus.Nexus):
> ...   def __init__(self, **kwargs):
> ...     Nexus.Nexus.__init__(self, **kwargs)
> ...     self.charpartitions = OrderedDict()
> ...
> >>> nexi_ordered = [(n[0], OrderedNexus(n[1])) for n in nexi]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: __init__() takes exactly 1 argument (2 given)
> >>> # Would be followed by: combined = Nexus.combine(nexi_ordered)
>
>
>
> Michael Grünstäudl (Gruenstaeudl), PhD
> E-mail: mi.gruenstaeudl at gmail.com
> Website: http://blogs.fu-berlin.de/gruenstaeudl/
>
>
> On 06/09/2015 11:19 PM, David Winter wrote:
>
>> In that last message the class definition should be
>>
>> class OrderedNexus(Nexus.Nexus):
>>      """ Subclass of Bio.Nexus.Nexus used to maintain partition order """
>>
>>      def __init__(self, **kwargs):
>>          Nexus.Nexus.__init__(self, **kwargs)
>>          self.charpartitions = OrderedDict()
>>
>> (i.e. with the parentheses after OrderedDict..... you always see them
>> the second you hit "send" :)
>>
>> David
>>
>> On Tue, Jun 9, 2015 at 2:17 PM, David Winter <djwinter at asu.edu> wrote:
>>
>>     Hi Michael,
>>
>>     I think this is a result of "charpartitions" in the Nexus object
>>     being a dictionary. In Python, dictionaries are "unordered", so you
>>     don't usually get objects back out in the order they went in.
>>
>>     One possible workaround is to make your own class that inherits
>>     everything else from Nexus but instead uses an ordered dictionary
>>     (from collections), something like
>>
>>
>>     from collections import OrderedDict
>>     from Bio.Nexus import Nexus
>>
>>     class OrderedNexus(Nexus.Nexus):
>>          """ Subclass of Bio.Nexus.Nexus used to maintain partition order """
>>
>>          def __init__(self, **kwargs):
>>              Nexus.Nexus.__init__(self, **kwargs)
>>              self.charpartitions = OrderedDict
>>
>>
>>
>>     o = OrderedNexus()
>>     n = Nexus.Nexus()
>>     o.charpartitions
>>     #OrderedDict()
>>     n.charpartitions
>>     #{}
>>     dir(o) # (all the stuff you expect to see in a Nexus object)
>>
>>     I haven't had a chance to test this on the example, but hope it's
>>     some help to you.
>>
>>     David
>>
>>
>>     On Tue, Jun 9, 2015 at 6:37 AM, Michael Gruenstaeudl
>>     <mi.gruenstaeudl at gmail.com> wrote:
>>
>>         Hi all,
>>         like many others, I have been using the excellent example on the
>>         Biopython wiki to concatenate multiple alignments into one
>>         nexus-file using Biopython's Nexus.combine() function. However,
>>         what if I wish to maintain the order of the nexus-partitions
>>         specified in 'file_list'. While the tuple 'nexi' is still
>>         ordered according to 'file_list', 'combined.charsets.items()' is
>>         not. Moreover, sorting the charsets is not possible:
>>
>>          >>> combined.charsets.items()[0]
>>         ('partition0038_rps4_CDS.nex', [36567, 36568])
>>
>>          >>> combined.charsets.items()[1]
>>         ('partition0004_trnK_CDS.nex', [36569, 36573])
>>
>>          >>> for i in range(0,len(combined.charsets.items())):
>>         ...     combined.charsets.items()[i] = sorted_items[i]
>>         ...
>>          >>> combined.charsets.items()[0]
>>         ('partition0038_rps4_CDS.nex', [36567, 36568])
>>
>>         What procedure would you recommend to maintain the input order
>>         of 'file_list' in the output file 'combined.nex'?
>>
>>         Thank you, Michael
>>
>>         --
>>         Michael Gruenstaeudl (Grünstäudl), PhD
>>         E-mail: mi.gruenstaeudl at gmail.com
>>         Website: http://blogs.fu-berlin.de/gruenstaeudl/
>>         <http://u.osu.edu/gruenstaeudl/>
>>
>>
>>
>>
>>
>>     --
>>     David Winter
>>     Postdoctoral Research Associate
>>     Center for Evolutionary Medicine and Informatics
>>     The Biodesign Institute
>>     Arizona State University
>>
>>     ph: +1 480 519 5113
>>     w: www.david-winter.info <http://www.david-winter.info>
>>     lab: http://cartwrig.ht/lab/
>>     blog: sciblogs.co.nz/the-atavism <http://sciblogs.co.nz/the-atavism>
>>
>>
>>
>>
>>
>


-- 
David Winter
Postdoctoral Research Associate
Center for Evolutionary Medicine and Informatics
The Biodesign Institute
Arizona State University

ph: +1 480 519 5113
w: www.david-winter.info
lab: http://cartwrig.ht/lab/
blog: sciblogs.co.nz/the-atavism