[Biopython] Generator expression for SeqIO

Mic mictadlo at gmail.com
Wed Dec 7 13:11:24 UTC 2011


Thank you for the solution.

My input file is not exactly a FASTA file, but it contains the information
needed to build one. Each line is tab-separated and looks like this:
test1\t0001\ta1\tAATTCC

The output should look like:
>test1_a1
AATTCC
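
For reference, here is a minimal sketch of how a generator function could feed SeqIO.write for a file laid out like the line above. It assumes four tab-separated columns (name, number, tag, sequence). The function names iter_header_seq, iter_records and convert, and the default file paths, are made up for illustration and are not part of Biopython.

```python
import io


def iter_header_seq(handle):
    """Yield (header, sequence) pairs from tab-separated lines.

    Assumed layout per line: name, number, tag, sequence.
    """
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        yield fields[0] + "_" + fields[2], fields[3]


def iter_records(handle):
    """Wrap each (header, sequence) pair in a SeqRecord (needs Biopython)."""
    from Bio.Seq import Seq
    from Bio.SeqRecord import SeqRecord

    return (
        SeqRecord(Seq(seq), id=header, description="")
        for header, seq in iter_header_seq(handle)
    )


def convert(in_path="input.txt", out_path="output.fasta"):
    """Stream records from in_path to out_path; returns the record count."""
    from Bio import SeqIO

    with open(in_path) as in_handle, open(out_path, "w") as out_handle:
        return SeqIO.write(iter_records(in_handle), out_handle, "fasta")


# Quick check of the parsing step on an in-memory sample (no Biopython needed).
sample = io.StringIO("test1\t0001\ta1\tAATTCC\n")
pairs = list(iter_header_seq(sample))
print(pairs)  # [('test1_a1', 'AATTCC')]
```

Calling convert() would then read input.txt and write output.fasta one record at a time, so the whole file never has to be held in memory.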


On Wed, Dec 7, 2011 at 6:49 PM, Willis, Jordan R <
jordan.r.willis at vanderbilt.edu> wrote:

> Does the input.txt have fasta sequences in it and you want to write them
> to a file? This is what it looks like.
>
> from Bio import SeqIO as sio
>
> # parse() yields one SeqRecord per FASTA entry
> generator = sio.parse('input.txt', 'fasta')
>
> for i in generator:
>     print(i.id)
>     print(i.seq)
>
> will give you header and sequence. You could of course write this to a
> file, but you seem to be inputting a fasta file just to write out another
> one.
>
>
>
>
> Jordan Willis
> Ph.D Candidate, CPB
> Laboratory of Dr. James Crowe and Dr. Jens Meiler
> 11475 MRBIV
> 2213 Garland Ave.
> Nashville, TN 37232
> Cell: 816-674-5340
> Office: 615-343-8263
>
>
> On Dec 7, 2011, at 12:26 AM, Peter Cock wrote:
>
> On Wed, Dec 7, 2011 at 4:41 AM, Mic <mictadlo at gmail.com> wrote:
>
> No worries, it was perfect.
>
>
> I have the following code, and I do not know how to combine the *header* and
> *seq* variables from the '*with*' statement with a generator expression.
>
>
> from Bio import SeqIO
> from Bio.SeqRecord import SeqRecord
> from Bio.Seq import Seq
> from pprint import pprint
>
> if __name__ == '__main__':
>
>     with open('input.txt') as f:
>         for line in f:
>             try:
>                 splited_line = line.split('\t')
>
>                 header = splited_line[0] + '_' + splited_line[2]
>                 seq = splited_line[3]
>             except IndexError:
>                 continue
>
>     fasta_file = open('output.fasta', 'w')
>     records = (SeqRecord(???), id=????, description="") for i in ???)
>
>     SeqIO.write(records, fasta_file, "fasta")
>
>
> Thank you in advance.
>
>
> Are you trying to parse a tabular file, with three columns
> (ID, sequence, description)?
>
> I suggest you learn about generator functions in Python.
>
> Peter
>
> _______________________________________________
> Biopython mailing list  -  Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>
>
>


