[Biopython-dev] [Bug 2531] Nexus and fasta parsers have a problem with identical taxa names

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Mon Jun 30 14:36:06 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2531





------- Comment #3 from abetanco at staffmail.ed.ac.uk  2008-06-30 10:36 EST -------
Created an attachment (id=956)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=956&action=view)
nexus file

Sorry for the overly complicated nexus file, but I can't seem to reproduce the
bug with a simple example.  In this case, HI99.Line5 is entered twice, and the
two entries differ at just three sites (249, 417, and 452).  In the output, both
entries carry the first sequence's bases at those three sites.

Site            249     417     452

nexus file
HI99.Line5      T       T       A
HI99.Line5      C       C       G

fasta output
HI99.Line5      T       T       A
HI99.Line5      T       T       A
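
For anyone wanting to confirm which sites differ between two same-named entries, here is a minimal sketch. The helper name differing_sites and the toy sequences are made up for illustration; they are not part of Biopython or of the attached file, and positions are assumed to be counted from 1 as in the table above.

```python
def differing_sites(seq_a, seq_b):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [pos for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1) if a != b]


# Toy sequences only (hypothetical data, not taken from the attachment).
first = "ACGTTA"
second = "ACCTTG"
print(differing_sites(first, second))  # → [3, 6]
```

Running the same comparison on the two entries before and after conversion should show three differing sites in the nexus input and none in the fasta output if the bug is present.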


To do the conversion, I used this, which I think is just copied off the
Biopython documentation site:

#!/usr/bin/python
# Convert a NEXUS alignment to FASTA.  Usage: script.py input_file

import sys

from Bio import SeqIO


if __name__ == '__main__':

    input_handle = open(sys.argv[1], "r")
    # The output file is the input name with ".fas" appended.
    output_handle = open(sys.argv[1] + ".fas", "w")

    sequences = SeqIO.parse(input_handle, "nexus")
    SeqIO.write(sequences, output_handle, "fasta")

    output_handle.close()
    input_handle.close()
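
As a possible stopgap until the parser is fixed, repeated taxon names can be detected (and optionally renamed) before conversion. This is only a sketch under assumptions: duplicated_names and make_unique are hypothetical helpers, not Biopython functions, and the ".copyN" suffix is an arbitrary choice that could still collide with a name already present in the file.

```python
from collections import Counter


def duplicated_names(names):
    """Return the sorted list of names that occur more than once."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


def make_unique(names):
    """Return names in order, suffixing repeats with .copy2, .copy3, ..."""
    seen = Counter()
    unique = []
    for name in names:
        seen[name] += 1
        if seen[name] == 1:
            unique.append(name)
        else:
            # Arbitrary suffix scheme, chosen only for this illustration.
            unique.append("%s.copy%d" % (name, seen[name]))
    return unique


# Toy taxon list (hypothetical apart from the duplicated name in this report).
taxa = ["HI99.Line5", "other_taxon", "HI99.Line5"]
print(duplicated_names(taxa))  # → ['HI99.Line5']
print(make_unique(taxa))       # → ['HI99.Line5', 'other_taxon', 'HI99.Line5.copy2']
```

The names would have to be collected from the nexus file itself, since the records returned by SeqIO.parse may already reflect the problem described here.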

