[Biopython] GenBank.Scanner use of os.linesep
Peter
biopython at maubp.freeserve.co.uk
Fri Apr 9 04:54:53 EDT 2010
On Fri, Apr 9, 2010 at 5:46 AM, Reece Hart <reece at berkeley.edu> wrote:
> Hi All-
>
> I recently discovered that the GenBank parser doesn't work on Google App
> Engine because os.linesep is undefined (GenBank/Scanner.py:746):
>
> 745 # if self.line[-1] == "\n" : self.line = self.line[:-1]
> 746 self.line = self.line.rstrip(os.linesep)
> 747 misc_lines.append(self.line)
>
> Defining os.linesep is sufficient to fix the problem (thanks to Brad
> Chapman).
>
> It seems to me that this use of os.linesep is probably mistaken here.
I agree.
> If the
> file comes from efetch, the line separator will be \n regardless of platform
> [1] and that is what should be used in rstrip. It's possible that the file
> might come from a dog-foresaken CRLF platform and therefore contain that
> line separator.
I think it would break in a more common setting - passing a file on
Windows with CRLF, since Python will turn that into just \n.
> So, I humbly propose that 746 be changed to either rstrip('\n') or, perhaps,
> rstrip('\n\r'). Although the need for the latter is probably rare, I don't
> see that it costs anything to cover that case by adding \r.
A plain rstrip() would also work and get rid of any trailing whitespace.
I've checked that in.
> I'm new to this community, so I don't know whether we now have ferocious
> debate about the merits of line terminators or, rather, I submit a lame
> one-liner patch against the git HEAD.
For something this trivial, your verbal patch is fine. Would you like
to be added to the NEWS and CONTRIB file?
Peter
More information about the Biopython
mailing list