[BioPython] Parsing BLAST
Alex Garbino
agarbino at gmail.com
Fri Aug 29 15:39:22 UTC 2008
>> I'm now almost done. My script is to take a fasta file, run blast, and
>> output a comma-separated-values list in the following format:
>> AccessionID, Source, Length, FASTA sequence.
>
> FASTA sequence format looks like this:
>
> >name and description
> CATACGACTACGTCAACGATCCGAACT
> GACTACGATCAGCATCGACTAGCTGTG
> GTGTGGT
> >name2 and second sequence description
> AGCGACAGCGACGAGCAGCGACGAG
> AGCGAGC
>
> It's not something you can squeeze into a comma-separated file. I
> think you might just mean getting the sequence itself - or have two
> files (one CSV, one FASTA).
>
> Peter
>
That's the problem I'm having... I want to keep the FASTA format (so I
can plug it into ClustalW, etc.), which is difficult to do because of
the newline after the FASTA title line.
Manually in Excel, I could fit the whole FASTA record into one cell; I
think it was stored as a quoted string (when I copy-pasted it into
ClustalW, it would be wrapped in " ").
Is there a way to ignore the newline between description and sequence?
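One possible way around this, sketched below, is to leave the newline in
place and let the CSV writer quote the field, which matches the " " seen
when pasting from Excel. This is only a sketch using Python's standard
csv module; the function names write_hits_csv and read_fasta_column and
the sample records are made up for illustration, and the BLAST step is
left out entirely.

```python
import csv
import io


def write_hits_csv(hits, handle):
    """Write (accession, source, sequence) tuples as CSV rows.

    csv.writer quotes any field that contains a newline, so a whole
    FASTA record (title line plus sequence) stays in a single cell.
    """
    writer = csv.writer(handle)
    writer.writerow(["AccessionID", "Source", "Length", "FASTA"])
    for accession, source, sequence in hits:
        # Hypothetical title layout: ">accession source"
        fasta = ">%s %s\n%s\n" % (accession, source, sequence)
        writer.writerow([accession, source, len(sequence), fasta])


def read_fasta_column(handle):
    """Join the FASTA cells back into plain multi-record FASTA text."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    return "".join(row[3] for row in reader)


# Made-up example data, not real BLAST output.
hits = [
    ("ABC123", "Homo sapiens", "CATACGACTACG"),
    ("XYZ789", "Mus musculus", "AGCGACAGC"),
]

# newline="" keeps the embedded newlines exactly as written.
buffer = io.StringIO(newline="")
write_hits_csv(hits, buffer)
buffer.seek(0)
fasta_text = read_fasta_column(buffer)
print(fasta_text)
```

When writing to a real file instead of a StringIO, opening it with
newline="" is what the csv module documentation recommends, so that
newlines inside quoted fields are not altered. The text returned by
read_fasta_column can then be saved as a .fasta file for ClustalW.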
Thanks,
Alex