[BioPython] Re: question regarding writing SeqRecord objects in Fasta format
Iddo Friedberg
idoerg at burnham.org
Mon Jul 18 20:21:04 EDT 2005
Ann Loraine wrote:
> Hello,
>
> To answer your question - I read in the fasta records like so:
>
> import gzip
> from Bio import Fasta
> fh = gzip.GzipFile('seqs.fa.gz')
> parser = Fasta.RecordParser()
> iterator = Fasta.Iterator(fh,parser)
> curr_record = iterator.next()
>
> I was following the example in this tutorial Web page:
>
> http://www.biopython.org/docs/tutorial/Tutorial003.html#toc7
>
>
> "Let's make all of this talk more concrete by using the Iterator and
> Record interfaces to do what we did before -- extract a unique list of
> all species in our FASTA file. First we need to set up our parser and
> iterator:
> >>> from Bio import Fasta
> >>> parser = Fasta.RecordParser()
> >>> file = open("ls_orchid.fasta")
> >>> iterator = Fasta.Iterator(file, parser)"
>
> Should I be using the SeqIO method instead to read fasta records if I
> want to write some of them out to a fasta format file?
>
> -Ann
>
>
Yes, if you want to use SeqIO for output, use it for input as well. When
reading with Bio.Fasta.Iterator and a RecordParser, you get Bio.Fasta.Record
instances, which do not have an 'id' attribute. When reading with
Bio.SeqIO.FASTA.FastaReader, you get Bio.SeqRecord instances, which are
a different representation of a sequence. Bio.SeqIO.FASTA also has a
writer, so you may want to use that module for both reading and writing.
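To make the mismatch concrete without depending on either Biopython module, here is a minimal standard-library sketch. TitleRecord, IdRecord, read_title_records, to_id_record and write_fasta are hypothetical names made up for this example, not Biopython classes or functions. TitleRecord only mimics a record that keeps the whole header line as one title, and IdRecord mimics one that exposes an 'id'. Treating the first word of the header as the id is an assumption about how your headers are laid out.

```python
import io
from dataclasses import dataclass


@dataclass
class TitleRecord:
    """Hypothetical stand-in: keeps the whole FASTA header as one title string."""
    title: str
    sequence: str


@dataclass
class IdRecord:
    """Hypothetical stand-in: splits the header into an id and a description."""
    id: str
    description: str
    seq: str


def read_title_records(handle):
    """Yield TitleRecord objects from a text handle containing FASTA data."""
    title, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if title is not None:
                yield TitleRecord(title, "".join(chunks))
            title, chunks = line[1:], []
        elif title is not None and line:
            chunks.append(line)
    if title is not None:
        yield TitleRecord(title, "".join(chunks))


def to_id_record(record):
    """Assumption: the first whitespace-delimited word of the title is the id."""
    parts = record.title.split(None, 1)
    rec_id = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return IdRecord(rec_id, description, record.sequence)


def write_fasta(records, handle, width=60):
    """Write IdRecord objects as FASTA, wrapping sequence lines at `width`."""
    for rec in records:
        header = (rec.id + " " + rec.description).strip()
        handle.write(">" + header + "\n")
        for i in range(0, len(rec.seq), width):
            handle.write(rec.seq[i:i + width] + "\n")


# In-memory example data, so the sketch runs without any input file.
source = io.StringIO(">seq1 first example\nACGT\nTTGA\n>seq2\nGGCC\n")
converted = [to_id_record(r) for r in read_title_records(source)]
out = io.StringIO()
write_fasta(converted, out)
```

For a gzipped file you could pass gzip.open('seqs.fa.gz', 'rt') as the handle instead of the StringIO. The point is only that a writer expecting an 'id' needs records that carry one, so either read with the matching reader or convert explicitly as above.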
The reason that Biopython has two ways of representing sequences is
basically historical: both were committed to CVS and approved, and
code grew around both. Not exactly optimal, I know.
HTH,
./I
--
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9949
http://ffas.ljcrf.edu/~iddo