[Bioperl-l] Check sequence format, question

Chris Fields cjfields at uiuc.edu
Thu Nov 2 04:28:11 UTC 2006


On Nov 1, 2006, at 6:15 PM, Eugene Bolotin wrote:

> Dear bioperl mailing list,
> I am trying to get sequences from a file using Bio::SeqIO. Before I do
> anything, I want to make sure that the file is in correct FASTA format,
> and I want it to print an error message if it is in any other format.
> What is the easiest way to do it?
> Thanks,
> Eugene Bolotin
> Sladek Lab.

There is no universally accepted formal definition of FASTA beyond the
first line starting with '>' (optionally followed by a description) and
the sequence on the subsequent lines.

http://www.bioperl.org/wiki/FASTA_sequence_format
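
As a rough illustration of that loose definition, here is a minimal
pure-Perl sketch. The helper name looks_like_fasta is hypothetical (it is
not part of BioPerl), and the allowed residue characters (letters, '*',
'-') are my own assumption, since no standard fixes the alphabet.

```perl
use strict;
use warnings;

# Hypothetical helper, not a BioPerl function: returns 1 if the text looks
# like FASTA under the loose definition above, 0 otherwise.
sub looks_like_fasta {
    my ($text) = @_;
    my $seen_header = 0;    # have we seen any '>' line yet?
    my $have_seq    = 0;    # does the current record have sequence data?

    for my $line ( split /\r?\n/, $text ) {
        next if $line =~ /^\s*$/;    # ignore blank lines
        if ( $line =~ /^>/ ) {
            # a new header is only valid if the previous record had sequence
            return 0 if $seen_header && !$have_seq;
            $seen_header = 1;
            $have_seq    = 0;
        }
        else {
            return 0 unless $seen_header;    # data before any header line
            # assumed alphabet: letters, '*', '-', and whitespace
            return 0 unless $line =~ /^[A-Za-z*\-\s]+$/;
            $have_seq = 1;
        }
    }
    return ( $seen_header && $have_seq ) ? 1 : 0;
}
```

You would call it on the slurped file contents and die with your own
error message when it returns 0. It is deliberately strict about empty
records, which some tools tolerate.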

Bio::SeqIO isn't currently set up to validate sequence formats  
directly, but you could try preparsing the data using  
Bio::Tools::GuessSeqFormat.
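
Something along these lines might work as a preparse step. This is only a
sketch: it assumes the file name is passed as the first command-line
argument, and keep in mind that the guesser is a heuristic, so a 'fasta'
answer is not a guarantee that every record is well formed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::Tools::GuessSeqFormat;

# Assumption: the file to check is given as the first argument.
my $file = shift @ARGV or die "Usage: $0 <sequence file>\n";

# guess() returns a format name such as 'fasta', or undef if it cannot decide.
my $format = Bio::Tools::GuessSeqFormat->new( -file => $file )->guess;

die "Could not determine the format of $file\n" unless defined $format;
die "$file looks like '$format', not FASTA\n"   unless $format eq 'fasta';

# Only hand the file to Bio::SeqIO once the guess says it is FASTA.
my $in = Bio::SeqIO->new( -file => $file, -format => 'fasta' );
while ( my $seq = $in->next_seq ) {
    print $seq->display_id, "\t", $seq->length, "\n";
}
```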

Chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign





