[Bioperl-l] protein or dna ?

Andreas Kahari ak at ebi.ac.uk
Mon Apr 12 06:23:28 EDT 2004


On Sun, Apr 11, 2004 at 11:14:48PM -0400, Koen van der Drift wrote:
> Hi,
> 
> Does Bioperl have a function that can determine whether a sequence is a 
> protein or DNA ?

If your definition of a DNA sequence is a sequence of characters
made up from the set "ACGTN" (and possibly any whitespace
character), and if you're willing to accept any other sequence
of characters as a protein sequence, then this should work:

    my $seq = get_me_a_sequence();

    # \s allows whitespace; any other character outside ACGTN means "not DNA".
    if ($seq =~ /[^ACGTN\s]/i) {
        # It's possibly a protein sequence (or junk).
    } else {
        # It's possibly a DNA sequence.
    }
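
For reuse, the same test could be wrapped in a small subroutine. This
is only a minimal sketch under the definition given above: the name
guess_seq_type and the 'empty' return value are made up for
illustration and are not part of Bioperl. The character class uses \s
so that whitespace is allowed, as the description calls for.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Bioperl function): classify a raw sequence
# string using the simple rule described above.
sub guess_seq_type {
    my ($seq) = @_;

    # Nothing but whitespace (or undef) cannot be classified.
    return 'empty' unless defined $seq && $seq =~ /\S/;

    # Any character outside ACGTN and whitespace means "not DNA".
    return $seq =~ /[^ACGTN\s]/i ? 'protein' : 'dna';
}

for my $seq ("ACGT NNAC\nGGTA", "MKVLAAGIVG", "acgtn") {
    (my $shown = $seq) =~ s/\s+//g;    # strip whitespace for display only
    printf "%-12s %s\n", $shown, guess_seq_type($seq);
}
```

Note the limits of the rule: a short protein made up only of A, C, G,
T and N residues would be reported as DNA, and a nucleotide sequence
containing U or IUPAC ambiguity codes such as R or Y would be reported
as protein.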


Old discussion around the same subject:

    http://bioperl.org/pipermail/bioperl-l/1999-May/003171.html



Cheers,
Andreas

-- 
|<><>| Andreas Kähäri      EMBL, European Bioinformatics Institute
|><><|                     Wellcome Trust Genome Campus
|<><>| DAS Project Leader  Hinxton, Cambridgeshire, CB10 1SD
|><><| Ensembl Developer   United Kingdom

