[Biopython] Read sequence from file
Peter Cock
p.j.a.cock at googlemail.com
Wed Feb 25 17:39:08 UTC 2015
On Wed, Feb 25, 2015 at 4:03 PM, Horea Christian <h.chr at mail.ru> wrote:
> Hi guys, how can I read a sequence from a .txt file which contains only a
> string of letters (nucleotides)? I tried `SeqIO.read("my/file","...")` but
> if my second value is fasta or genbank, it complains about missing handles,
> and nothing like "plain", "string", or "str" worked... What can I do? It
> would be nice if I can do this via a one-liner rather than just read it
> explicitly with python and then explicitly parse it.
>
> Cheers,
Right now you'd just do something like this:
with open("my_example.txt") as handle:
    my_seq_as_string = handle.read().strip()
Or, if you want a Seq object with e.g. a DNA alphabet:
from Bio.Seq import Seq
from Bio.Alphabet import generic_dna
with open("my_example.txt") as handle:
    my_seq = Seq(handle.read().strip(), generic_dna)
I'm assuming there are no line breaks or other whitespace inside the
sequence; strip() only removes whitespace from the two ends.
What you are asking for sounds a bit like adding what EMBOSS calls
the "raw" file format to Biopython's SeqIO:
http://emboss.sourceforge.net/docs/themes/SequenceFormats.html
If this were added, what would you expect as the record's identifier?
Also, would you expect one sequence regardless of any line breaks in the
file, or one sequence per line?
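
To make the two readings of a "raw" file concrete, here is a rough sketch in
plain Python. It is not an existing Biopython API; the function names
raw_as_one_sequence and raw_one_per_line are made up for illustration, and
the question of what identifier each record should get is left open.

```python
import io


def raw_as_one_sequence(handle):
    """Hypothetical helper: treat the whole file as a single sequence.

    str.split() with no arguments splits on any run of whitespace
    (spaces, tabs, newlines), so joining the pieces removes all of it.
    """
    return "".join(handle.read().split())


def raw_one_per_line(handle):
    """Hypothetical helper: treat each non-blank line as its own sequence."""
    return [line.strip() for line in handle if line.strip()]


# A two-line example file, held in memory instead of on disk.
example = "ACGT\nTTGA\n"

print(raw_as_one_sequence(io.StringIO(example)))  # ACGTTTGA
print(raw_one_per_line(io.StringIO(example)))     # ['ACGT', 'TTGA']
```

Either result could then be wrapped in a Seq object as shown earlier. Which
of the two behaviours is the less surprising default is exactly the open
question.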
Peter