[Biopython] Write FASTA sequence on a single line

Martin Mokrejs mmokrejs at fold.natur.cuni.cz
Sun Mar 4 14:15:27 EST 2012


Hi Willis and Wibowo,
  yes, I also write the new FASTA files myself, bypassing Biopython's
writer. Sometimes I even parse them myself. But mainly I wanted to raise
this issue and get it implemented in Biopython. I also hope the argument
that parsing such files is faster will carry some weight, not to mention
that one can use grep(1) to search through the sequences, which misses any
match that spans a line break when the sequences are split over several lines.
I would even say that one-line sequences should be the default. ;)) Or at least
when len() is below e.g. 2000. ;)
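
To make the grep point concrete, here is a small sketch using only the
standard library. The sequence, the motif and the wrap width of 24 are made
up for illustration; they are chosen so that the motif straddles a line break
in the wrapped form.

```python
# Hypothetical data: the motif sits at positions 20-26, so it crosses
# the wrap boundary at column 24.
sequence = "ACGT" * 5 + "GATTACA" + "TTGG" * 5
motif = "GATTACA"
WRAP = 24  # illustrative line width, not a Biopython default

# The same entry written with wrapped sequence lines and as a single line.
wrapped_lines = [">demo"] + [
    sequence[i:i + WRAP] for i in range(0, len(sequence), WRAP)
]
single_lines = [">demo", sequence]


def grep_lines(lines, pattern):
    """Mimic a plain line-based grep: return the lines containing pattern."""
    return [line for line in lines if pattern in line]


wrapped_hits = grep_lines(wrapped_lines, motif)  # motif is split across two lines
single_hits = grep_lines(single_lines, motif)    # motif is intact on one line
```

With these inputs the wrapped form gives no hit at all, while the one-line
form returns the whole sequence line.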

  But thanks for the pointer to using FastaWriter directly. I had forgotten
about it and just had the feeling there was a way ... ;)
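
For anyone who wants the same behaviour without Biopython, a rough sketch of
what a wrap option amounts to. The function write_fasta and the use of plain
(identifier, sequence) tuples are my own assumptions for illustration; this
is not Biopython's implementation.

```python
import io


def write_fasta(records, handle, wrap=60):
    """Write (identifier, sequence) pairs as FASTA.

    Hypothetical helper, not part of Biopython. A wrap of 0 or None
    puts each sequence on a single line.
    """
    count = 0
    for identifier, sequence in records:
        handle.write(">%s\n" % identifier)
        if wrap:
            # Emit the sequence in chunks of at most `wrap` characters.
            for start in range(0, len(sequence), wrap):
                handle.write(sequence[start:start + wrap] + "\n")
        else:
            handle.write(sequence + "\n")
        count += 1
    return count


# Made-up records: one 80-residue sequence and one short one.
records = [("seq1", "ACGT" * 20), ("seq2", "TTGGCC")]

one_line = io.StringIO()
write_fasta(records, one_line, wrap=0)

wrapped = io.StringIO()
write_fasta(records, wrapped, wrap=60)
```

With wrap=0 every entry takes exactly two lines, so the output can be
processed pairwise (header line, sequence line).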
Martin

Wibowo Arindrarto wrote:
> Hi Martin,
> 
> A quick glance at Bio.SeqIO.FastaIO.FastaWriter shows that there is indeed an option to set the line wrapping length. However, the regular writing function that calls FastaWriter (SeqIO.write) only accepts three parameters (sequences, handle, and format), so if you really want to use Biopython's FASTA writer, you should call FastaWriter directly.
> 
> For example, as shown in the docs:
> 
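> from Bio.SeqIO.FastaIO import FastaWriter
> # wrap=0 disables line wrapping, so each sequence is written on one line
> with open(outfile, 'w') as handle:
>     writer = FastaWriter(handle, wrap=0)
>     writer.write_file(records)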
> 
> Alternatively, you can iterate over the records manually and write them to the output file like so:
> 
> with open(outfile, 'w') as target:
>     for rec in records:  # records is the list containing your SeqRecord objects
>         target.write('>%s\n' % rec.id)
>         target.write('%s\n' % str(rec.seq))
> 
> 
> Hope that helps!
> Bow
> 
> 
> On Sun, Mar 4, 2012 at 19:09, Martin Mokrejs <mmokrejs at fold.natur.cuni.cz> wrote:
> 
>     Hi,
>      is there an option to tell FASTA writer to write output with a
>     sequence on a single line (so that a FASTA entry would span just
>     two lines altogether)? I expect such files would also be faster to
>     parse later with SeqIO, because one would avoid a call for each
>     line of the FASTA input file.
> 
>     In my code I have
>     for _record in SeqIO.parse(fastah, 'fasta'):
> 
>     which boils down to biopython's:
>     append(line.rstrip().replace(" ","").replace("\r",""))
> 
>     for every line of _sequence_.
> 
>     Thank you for comments,
>     Martin
> 
> 
> 
>     _______________________________________________
>     Biopython mailing list  -  Biopython at lists.open-bio.org
>     http://lists.open-bio.org/mailman/listinfo/biopython
> 
> 

