[Biopython-dev] [Biopython - Bug #3297] (Rejected) newline added in quoted features

redmine at redmine.open-bio.org redmine at redmine.open-bio.org
Thu Nov 1 10:48:11 UTC 2012


Issue #3297 has been updated by Peter Cock.

Status changed from New to Rejected

Was this really filed a year ago, or is that an oddity in Redmine? All the discussion is from the last day...

To me this is a bug in the GenBank data. Rather than this:

<pre>
                     /product="Glutamate synthase [NADPH] small chain (EC 1.4.1
                     .13)"
</pre> 

the data should have been line-split in a more sensible place, e.g.

<pre>
                     /product="Glutamate synthase [NADPH] small chain (EC
                     1.4.1.13)"
</pre>

In any case, the suggested fix is inappropriate for two reasons. First, as noted by Paul, it would remove the whitespace between words (the typical case). Second, the GenBank parser uses a scanner/consumer design: the GenBank-specific consumer attempts to closely model the underlying data (and in this case keeps the newlines as given), while the SeqRecord consumer (used by SeqIO) converts the newlines into spaces. As noted by Paul, the translation value is a special case.
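
To make the trade-off concrete, here is a minimal standalone sketch. It is not the Biopython code: `join_value_lines` is a hypothetical helper, and the rule it applies (a single space between continuation lines, nothing for translation values) is an assumption based on the behaviour described above. The protein residues are invented for illustration.

```python
def join_value_lines(lines, key):
    """Join the physical lines of one quoted qualifier value.

    Assumed rule (illustration only): strip each line, then join
    "translation" values with nothing and all other values with a space.
    """
    sep = "" if key == "translation" else " "
    return sep.join(line.strip() for line in lines)


# Typical case: the line break falls between two words.
typical = ['"Glutamate synthase [NADPH] small chain (EC', '1.4.1.13)"']
# Case from this issue: the line break falls inside the EC number.
awkward = ['"Glutamate synthase [NADPH] small chain (EC 1.4.1', '.13)"']
# A protein translation (made-up residues), split mid-sequence.
protein = ['"MKTAYIAKQR', 'QISFVKSHFS"']

# Joining with a space keeps the words apart in the typical case...
print(join_value_lines(typical, "product"))
# ...but leaves a stray space when the data was split mid-token.
print(join_value_lines(awkward, "product"))
# Translations are the special case: no separator at all.
print(join_value_lines(protein, "translation"))
# The suggested fix (no separator for everything) glues words together.
print("".join(typical))
```

With a space separator the typical value comes out intact, while the value from this issue keeps the stray space in "1.4.1 .13". Dropping the separator everywhere repairs that one value but turns "(EC 1.4.1.13)" into "(EC1.4.1.13)" in the typical case.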

Closing issue.
----------------------------------------
Bug #3297: newline added in quoted features
https://redmine.open-bio.org/issues/3297

Author: Jesse van Dam
Status: Rejected
Priority: Normal
Assignee: Biopython Dev Mailing List
Category: 
Target version: 
URL: 


Note: sorry for the duplicate report; I did not notice how the bug reporting system was set up.

When I have a feature qualifier that spans multiple lines in a GenBank file, like this:

<pre>
                     /product="Glutamate synthase [NADPH] small chain (EC 1.4.1
                     .13)"

</pre>

Then a space/newline is added between 1.4.1 and .13 in the result, so when printing the feature with the following code
<pre>
  print(source[0].qualifiers["product"])
</pre>

it will print (with an unwanted space):
<pre>
Glutamate synthase [NADPH] small chain (EC 1.4.1 .13)
</pre>

I changed the following in scanner.py to fix this problem:
<pre>
                    elif value[0]=='"':
                        #Quoted...
                        if value[-1]!='"' or value!='"':
                            #No closing quote on the first line...
                            while value[-1] != '"':
-                               value += "\n" + iterator.next() 
+                               value += iterator.next() 
                        else:
                            #One single line (quoted)
                            assert value == '"'
                            if self.debug : print "Quoted line %s:%s" % (key, value)
                        #DO NOT remove the quotes...
                        qualifiers.append((key,value))

</pre>

