[BioPython] Whitespace in sequences
Paul-Michael Agapow
biopython at agapow.net
Tue Feb 18 10:45:50 EST 2003
Possibly a known bug or even a behaviour that makes sense but ...
While recently writing a Biopython script to extract subsequences from
a FASTA file, I was surprised to find that whitespace was retained
within the sequence after it was read into a SeqRecord. Specifically,
carriage returns ('\r') were left embedded in the sequence, which then
made the sequence lengths inaccurate and meant I extracted the wrong
regions.
So, any ideas about this behaviour? I worked around it with a simple
regular expression (re) that removes whitespace, but I can't think of
any format in which whitespace is significant within a sequence, so
surely it should all be cleaned up.
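
For illustration, a minimal sketch of the kind of workaround described
above. It assumes the sequence text is available as a plain Python
string; the helper name strip_whitespace and the sample data are
hypothetical, and nothing here is Biopython's own API.

```python
import re

# Matches any run of whitespace, including '\r', '\n', tabs and spaces.
_WHITESPACE = re.compile(r"\s+")


def strip_whitespace(seq_text):
    """Return seq_text with every whitespace character removed.

    Hypothetical helper: apply it to the raw sequence string before
    measuring its length or slicing out subsequences.
    """
    return _WHITESPACE.sub("", seq_text)


# Made-up sequence with embedded carriage returns, as in the report above.
raw = "ACGT\rACGT\r\nTT GA\t"
clean = strip_whitespace(raw)

# The embedded whitespace inflates the length and shifts slice coordinates.
print(len(raw), len(clean))
print(repr(raw[4:8]), repr(clean[4:8]))
```

Slicing the cleaned string gives the intended region, whereas the same
coordinates on the raw string pick up the stray '\r'.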
--
Dr Paul-Michael Agapow (p.agapow at ucl.ac.uk)
Dept. Biology, University College London