[EMBOSS] getorf includes unspecified amino acids as part of the ORF sequence
Peter Rice
pmr at ebi.ac.uk
Tue Jan 12 09:15:28 EST 2010

Hi Avi,
> The input is a simple fasta file with only A,C,T,G letters and
> nothing else, so I wouldn't expect any Xs. In addition, even if there
> would be Ns (and there are no Ns) the program cannot know if such Ns
> do not include stopcodons so it should not consider them as part of an ORF.
>>> 00001_3 [803 - 1120]
>> LARLRFVVLGNSFIASAKGWSTPYGPTTFGPFRSCIYPRVFRSTRVRKAMATRIGSNRVN
>> ILIRCTXXXXXXXXXXXXXXXXXXXXXXXXXNPYLGWWCYIFCIFR
That run of Xs suggests they have all come from stop codons.
There are other possibilities, including a badly formatted input file
(for example, two sequences and their descriptions read as one).
We do need to see the input file to know where those Xs are from.
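
A quick way to check the input file for the two possibilities above is to
count its header lines and tally any characters other than A, C, G and T.
The following is only a minimal sketch in Python, not part of EMBOSS; the
function name scan_fasta is made up for illustration, and it assumes a
plain FASTA file in which description lines start with ">".

```python
from collections import Counter


def scan_fasta(lines):
    """Count FASTA header lines and tally sequence characters outside A/C/G/T.

    `lines` is any iterable of text lines (for example an open file).
    Returns (header_count, Counter of unexpected characters).
    """
    headers = 0
    unexpected = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            headers += 1  # description line, not sequence
            continue
        for ch in line.upper():
            if ch not in "ACGT":
                unexpected[ch] += 1
    return headers, unexpected


# Small in-memory example: two records, with two Ns and one gap character.
example = [">seq1", "ACGTNN", "acgt", ">seq2 some description", "AC-GT"]
header_count, odd_chars = scan_fasta(example)
print(header_count, dict(odd_chars))
```

To run it on a real file, pass an open file handle, for example
scan_fasta(open("input.fasta")), where input.fasta is a placeholder name.
More than one header, or a tally full of letters that are not bases, would
point to a formatting problem rather than to getorf itself.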
Peter Rice