[Bioperl-l] Check sequence format, question

Brian Osborne bosborne11 at verizon.net
Thu Nov 2 12:49:01 UTC 2006


Chris et al.,

As you know, the question of whether SeqIO should validate or check the
given format is still an open one. In fact, some SeqIO modules already
validate to some extent. See:

http://bugzilla.open-bio.org/show_bug.cgi?id=1508

I can see that you've commented on this enhancement; I'm replying just to
bring it to the attention of others.

Brian O.


On 11/2/06 12:28 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> On Nov 1, 2006, at 6:15 PM, Eugene Bolotin wrote:
> 
>> Dear bioperl mailing list,
>> I am trying to read sequences from a file using Bio::SeqIO. Before I do
>> anything else, I want to make sure that the file is in correct FASTA
>> format, and I want it to print an error message if it is in any other
>> format.
>> What is the easiest way to do it?
>> Thanks,
>> Eugene Bolotin
>> Sladek Lab.
> 
> There is no formal, universally accepted FASTA definition beyond the
> first line starting with '>', followed by an optional description,
> with the sequence on the subsequent lines.
> 
> http://www.bioperl.org/wiki/FASTA_sequence_format
> 
> Bio::SeqIO isn't currently set up to validate sequence formats
> directly, but you could try preparsing the data using
> Bio::Tools::GuessSeqFormat.
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
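
To make the minimal definition quoted above concrete, here is a rough sketch of a pre-check that could run before handing a file to a parser. It is a standalone Python illustration, not BioPerl code, and it does not reproduce what Bio::SeqIO or Bio::Tools::GuessSeqFormat do internally. The names check_fasta and SEQ_LINE are hypothetical, and the set of allowed sequence characters (letters, '*' and '-') is an assumption that may need adjusting for your data.

```python
import re

# Assumption: sequence lines contain only letters, '*' (stop) and '-' (gap).
SEQ_LINE = re.compile(r"^[A-Za-z*\-]+$")


def check_fasta(text):
    """Return None if text passes the minimal checks, else an error message.

    Checks only the loose convention: each record starts with a '>' header
    line and is followed by at least one sequence line. Blank lines are
    ignored.
    """
    seen_header = False
    residues_in_record = 0
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, which must not be empty.
            if seen_header and residues_in_record == 0:
                return f"line {lineno}: previous record has no sequence"
            seen_header = True
            residues_in_record = 0
        elif not seen_header:
            return f"line {lineno}: expected a header line starting with '>'"
        elif not SEQ_LINE.match(line):
            return f"line {lineno}: unexpected characters in sequence line"
        else:
            residues_in_record += len(line)
    if not seen_header:
        return "no FASTA header found"
    if residues_in_record == 0:
        return "last record has no sequence"
    return None


if __name__ == "__main__":
    sample = ">seq1 example description\nACGTACGT\nACGT\n"
    problem = check_fasta(sample)
    print("looks like FASTA" if problem is None else f"not FASTA: {problem}")
```

A check like this only rejects input that is clearly not FASTA-shaped (for example, a GenBank file whose first line starts with LOCUS). It cannot confirm that a file is valid in any stricter sense, since, as noted above, there is no stricter definition that everyone agrees on.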




