[Biopython-dev] [Bug 2860] New: Writing GenBank files should output features in position order
    bugzilla-daemon at portal.open-bio.org 
    Fri Jun 19 12:48:39 UTC 2009
    
    
  
http://bugzilla.open-bio.org/show_bug.cgi?id=2860
           Summary: Writing GenBank files should output features in position
                    order
           Product: Biopython
           Version: 1.50b
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: minor
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: n.j.loman at bham.ac.uk
Adding features to a SeqRecord object does not automatically sort them by
position. Therefore if you do something like this:
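import sys
from copy import copy

from Bio import SeqIO

for rec in SeqIO.parse(sys.stdin, "genbank"):
    new_features = []
    for feature in rec.features:
        if feature.type == 'CDS':
            # Duplicate each CDS as a 'gene' feature at the same location
            gene_feature = copy(feature)
            gene_feature.type = 'gene'
            new_features.append(gene_feature)
    # extend() appends, so every new gene feature lands after every CDS
    rec.features.extend(new_features)
    SeqIO.write([rec], sys.stdout, "genbank")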
You will end up with an incorrectly sorted file: all the original features
(including every CDS) come first, followed by all the new gene features.
You can sort rec.features in-place before writing to correct this:
        from operator import attrgetter
        rec.features.sort(key=attrgetter('location'))
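The one-line sort relies on location objects being comparable, and because the sort is stable it leaves each copied gene feature after the CDS it was copied from. Below is a minimal sketch of a more explicit key, using a hypothetical stand-in Feature tuple rather than a real SeqFeature so that it runs without Biopython. The field names start and end, and the rule that 'gene' precedes other types at identical coordinates, are assumptions made for illustration and not something Biopython guarantees.

```python
from collections import namedtuple

# Hypothetical stand-in for a SeqFeature: only the fields the sort needs.
Feature = namedtuple("Feature", ["type", "start", "end"])


def position_key(feature):
    """Order by start, then end; at identical coordinates put 'gene' first."""
    type_rank = 0 if feature.type == "gene" else 1
    return (feature.start, feature.end, type_rank)


# Same shape as the problem above: gene copies appended after the CDS features.
features = [
    Feature("CDS", 100, 400),
    Feature("CDS", 500, 900),
    Feature("gene", 100, 400),
    Feature("gene", 500, 900),
]

features.sort(key=position_key)
ordered = [(f.type, f.start) for f in features]
```

With real records the same key would presumably read the coordinates from feature.location instead, and how those positions convert to integers may depend on the Biopython version in use.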
I am not sure what the correct fix is in terms of Biopython: whether it should
concentrate on changing the behaviour of SeqRecord.features, or on the GenBank
output code (which I am aware is a work in progress).
I guess the underlying question is: should Biopython guarantee that
SeqRecord.features is sorted?