[EMBOSS] Problem with protein characters
Peter Rice
pmr at ebi.ac.uk
Sat Jul 11 10:54:21 UTC 2009
Radwen ANIBA wrote:
> I'm trying to use some programs that come with the EMBOSS package to analyze
> some protein sequences, but I sometimes get this message:
>
> Error: ajSeqTypeCheckIn: Sequence must be protein sequence without BZ U X or
> *: found bad character 'X'
>
> Is there any way to force the program to accept these types of residues?
EMBOSS uses the type attribute of the input sequence (or seqset or
seqall) to identify the type of the input sequence (nucleotide, protein,
or any) and the characters that are allowed (gaps, stops, non-standard
residues and ambiguity characters).
Your application is expecting "pureprotein". This is only used by
applications unable to handle the ambiguity codes (it can be difficult
to define what an algorithm should do with them).
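To see which of your sequences would trip this check before running the program, something along these lines may help. It is only a sketch in Python, not EMBOSS code: the function name is made up for illustration, and the rejected set is taken straight from the error message quoted above (B, Z, U, X and *), so the real "pureprotein" check may well reject more than this.

```python
# Characters named in the ajSeqTypeCheckIn error message quoted above.
# Assumption: the real "pureprotein" check may reject more than these.
PUREPROTEIN_REJECTED = set("BZUX*")


def find_rejected(seq):
    """Return (position, character) pairs for residues named in the message.

    Positions are 1-based; lowercase input is compared as uppercase.
    This is a hypothetical helper, not part of EMBOSS.
    """
    return [
        (i, ch)
        for i, ch in enumerate(seq.upper(), start=1)
        if ch in PUREPROTEIN_REJECTED
    ]


if __name__ == "__main__":
    # Report the offending residues in a made-up example sequence.
    for pos, ch in find_rejected("MKTXAYIAKQR*"):
        print(f"position {pos}: {ch}")
```

An empty result means the sequence contains none of the characters listed in the message.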
The alternatives are:
protein - accepts all characters, converts stops to 'X'
proteinstandard - converts U, O and J to 'X'
stopproteinstandard - converts stops, U, O and J to 'X'
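To make the conversions in that list concrete, here is a rough Python sketch of what each type would do to a sequence, going only by the descriptions above. It is not how EMBOSS implements it: the table and function names are invented for illustration, and it assumes a stop is written as '*'.

```python
# Conversions as described in the list above; illustrative only, not EMBOSS code.
# Assumption: a stop is written as '*'.
CONVERT_TO_X = {
    "protein": "*",
    "proteinstandard": "UOJ",
    "stopproteinstandard": "*UOJ",
}


def convert(seq, seqtype):
    """Replace every character listed for seqtype with 'X'.

    Hypothetical helper: it mirrors the list above, nothing more.
    """
    table = str.maketrans({ch: "X" for ch in CONVERT_TO_X[seqtype]})
    return seq.upper().translate(table)


if __name__ == "__main__":
    # Show the effect of each type on a made-up example sequence.
    for name in CONVERT_TO_X:
        print(name, convert("MKU*O", name))
```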
"protein" is probably what you want. You need to be able to do something
with the ambiguity codes X, B, Z and J and with the non-standard amino
acids U (selenocysteine) and O (pyrrolysine).
Hope this helps
Peter Rice