[Bioperl-l] three letter codes for amino acids?

Heikki Lehvaslaiho heikki@ebi.ac.uk
Thu, 11 Jan 2001 10:09:48 +0000


Dear Adrian,

I guess I was not too clear here. I'll post the reply to the list as
others might have misunderstood, too. 

The translate method in PrimarySeqI defaults to '*' for stop and 'X'
for any (unknown) residue in its output, but it takes arguments that
let you change these. The resulting protein sequence object can
therefore have some other characters stored in its one letter code.
The same arguments are needed in the seq3 method so that it can
recognise those characters and the corresponding three letter codes
are always 'Ter' and 'Xaa' (IUPAC standard).
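
To make the intended behaviour concrete, here is a minimal sketch of the
mapping. It is written in Python purely for illustration and is not the
BioPerl code; the function name seq3 and its two optional arguments mirror
the proposal, while the lookup table name, the inclusion of the ambiguity
codes B and Z, and the choice to raise an error on unrecognised characters
are assumptions of this sketch.

```python
# Illustrative sketch only; not the BioPerl implementation.
# ONE_TO_THREE is a hypothetical helper table: the 20 standard amino
# acids plus the ambiguity codes B (Asx) and Z (Glx).
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "B": "Asx", "Z": "Glx",
}


def seq3(seq, stop="*", unknown="X"):
    """Return seq as a string of three letter codes.

    stop and unknown name the one letter characters that the stored
    sequence uses for terminator and unknown residue. Whatever they
    are, the output always uses 'Ter' and 'Xaa'.
    """
    codes = []
    for ch in seq:
        if ch == stop:
            codes.append("Ter")
        elif ch == unknown:
            codes.append("Xaa")
        elif ch.upper() in ONE_TO_THREE:
            codes.append(ONE_TO_THREE[ch.upper()])
        else:
            # Assumption of this sketch: fail loudly rather than guess.
            raise ValueError("unrecognised residue character: %r" % ch)
    return "".join(codes)


# Default characters, as produced by translate with no arguments.
print(seq3("MKX*"))                         # MetLysXaaTer
# Same protein translated with non-default stop/unknown characters.
print(seq3("MK?#", stop="#", unknown="?"))  # MetLysXaaTer
```

Both calls give the same three letter string, which is the point: the
arguments describe the input characters, not the output codes.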

	-Heikki

Adrian Goldman wrote:
> 
> Heikki,
> 
> I am not very good at listserv etiquette. Anyway, here is my 2c.. if you want to post it further on to the list server, it's OK by me. Or else you can just ignore what follows as my own personal opinion.
> 
> I don't think it makes much sense to use * as the default character for stop in 3-letter codes, nor X as the default for unknown, for the optional arguments you mention below. Ter (as you propose) for the termination codon and ?XXX for unknown make more sense to me.
> 
> Adrian
> 
> At 12:03 pm -0500 10/1/2001, bioperl-l-request@bioperl.org wrote:
> 
>      Message: 5
>      Date: Wed, 10 Jan 2001 12:26:53 +0000
>      From: Heikki Lehvaslaiho <heikki@ebi.ac.uk>
>      Organization: EMBL - EBI
>      To: bioperl-l <bioperl-l@bioperl.org>
>      Subject: [Bioperl-l] three letter codes for amino acids?
> 
>      I noticed that it is not possible to use three letter codes for amino
>      acids in any bioperl sequence objects. I think it should be possible at
>      least to output in three letter code. Mapping three letter code back
>      to one letter code is not too hard, either, but is it a good idea to
>      have?
> 
>      I propose to put method 'seq3' into PrimarySeq.pm which is called from
>      Seq.pm, too.
> 
>      =head2 seq3
> 
>       Title   : seq3
>       Usage   : $string = $obj->seq3()
>       Function: Read only method that returns the amino acid sequence
>                 as a string of three letter codes. moltype has to be
>                 'protein'. Output follows the IUPAC standard plus
>                 'Ter' for terminator.
>       Returns : A scalar
>       Args    : character used for stop, optional, defaults to '*'
>                 character used for unknown, optional, defaults to 'X'
> 
>      =cut
> 
>      Any opinions?
> 
>      -Heikki
> 
>      -- 
> 
> Professor Adrian Goldman, | Phone: 358-(0)9-191 58923
> Structural Biology Group, | FAX: 358-(0)9-191 58952
> Institute of Biotechnology | Sec: 358-(0)9-191 58921
> University of Helsinki, | Mobile: 358-(0)50-336 8960
> PL 56 | Home: 358-(0)9-728 7103
> 00014 Helsinki | email: Adrian.Goldman@Helsinki.fi
> 
> -- on sabbatical at Brookhaven National labs, June 2000-June 2001
> Adrian Goldman, Biology Department, Building 463 50 Bell Ave., Brookhaven National Lab., Upton NY 11973. Phone: 631-344-2671 (off) 631-344-3417 (lab), 631-344-3407 (FAX). email: agoldman@bnl.gov

-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho          heikki@ebi.ac.uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________