[Bioperl-l] Seq::IO/clustalw

Heikki Lehvaslaiho heikki at ebi.ac.uk
Mon Mar 29 05:33:04 EST 2004


Liam,

You do not say where you are getting your sequences from. If they are EMBL or 
GenBank files, you already have the translation as part of the annotation 
(the 'translation' tag on the coding features). You can use the following 
code to create protein sequence objects from them.

---------------------------------------------------------------------------
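use strict;
use warnings;
use Bio::SeqIO;
use Bio::PrimarySeq;

# read the first entry from the file named on the command line
my $in  = Bio::SeqIO->new(-file => shift);
my $seq = $in->next_seq;

# collect the value of every 'translation' tag in the feature table
my @translations;
foreach my $feat ( $seq->all_SeqFeatures ) {
    next unless $feat->has_tag('translation');
    push @translations, $feat->get_tag_values('translation');
}
# assuming that there is only one translation per sequence
die "ERROR..." unless @translations == 1;

# $translations[0] (scalar element), not the one-element slice @translations[0]
my $prot = Bio::PrimarySeq->new(-id  => $seq->id,
                                -seq => $translations[0]);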
---------------------------------------------------------------------------
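
If your file holds several entries, as a get_Stream_by_id result written to 
disk would, the same idea can be put in a loop that writes every translation 
straight to a FASTA file for clustalw. This is only a sketch under some 
assumptions: the two command-line arguments (input file, output file) are my 
own invention, only CDS features are considered, and the 'protein_id' tag is 
used as the ID when present, which gives you protein IDs instead of 
nucleotide accessions.

```perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::PrimarySeq;

# Assumed usage: script.pl <embl-or-genbank-file> <out.fasta>
my ($infile, $outfile) = @ARGV;
die "usage: $0 <embl-or-genbank-file> <out.fasta>\n" unless defined $outfile;

# Input format is left for Bio::SeqIO to guess, as in the code above
my $in  = Bio::SeqIO->new(-file => $infile);
my $out = Bio::SeqIO->new(-file => ">$outfile", -format => 'fasta');

while ( my $seq = $in->next_seq ) {
    foreach my $feat ( $seq->get_SeqFeatures ) {
        next unless $feat->primary_tag eq 'CDS'
                and $feat->has_tag('translation');

        my ($translation) = $feat->get_tag_values('translation');

        # Prefer the protein_id tag; fall back to the ID of the entry
        my ($id) = $feat->has_tag('protein_id')
                 ? $feat->get_tag_values('protein_id')
                 : ( $seq->display_id );

        $out->write_seq( Bio::PrimarySeq->new(-id       => $id,
                                              -seq      => $translation,
                                              -alphabet => 'protein') );
    }
}
```

Entries without an annotated translation are silently skipped here, so you 
may want to count what was written and compare it with what you asked for.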


The other possibility is to use EMBL sequences, fish out the corresponding 
Swiss-Prot ID from the feature table or dbxrefs, and do another database 
query using it.
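
A rough sketch of the fishing-out step follows. The helper name 
swissprot_accessions is made up for this example, and the pattern assumes 
that the db_xref values look like "SWISS-PROT:P12345" or 
"UniProtKB/Swiss-Prot:P12345"; check what your entries actually contain 
before relying on it. The accession used below is only a placeholder.

```perl
use strict;
use warnings;

# Hypothetical helper: return the Swiss-Prot accessions found in a list
# of db_xref values. The prefix formats matched here are an assumption.
sub swissprot_accessions {
    my @xrefs = @_;
    my @accessions;
    for my $xref (@xrefs) {
        push @accessions, $1
            if $xref =~ m{^(?:UniProtKB/)?SWISS-PROT:\s*(\S+)}i;
    }
    return @accessions;
}

# With a sequence object $seq read through Bio::SeqIO, the db_xref values
# of all features could be gathered like this:
#   my @xrefs = map  { $_->get_tag_values('db_xref') }
#               grep { $_->has_tag('db_xref') } $seq->get_SeqFeatures;
my @found = swissprot_accessions('GOA:P12345',
                                 'UniProtKB/Swiss-Prot:P12345');
print "$_\n" for @found;
```

The accessions you get back are what you would feed to the second database 
query.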

Yours,

	-Heikki
 
On Monday 29 Mar 2004 03:58, Liam Elbourne wrote:
> Hi All,
>
> Is there a quick way of substituting protein ids for accession codes.
> I've just done an clustalw alignment using DNA, and decided to use the
> protein sequences instead (not a very good alignment.......). Rather
> than substitute the protein ids manually, I wondered whether at the:
>
> my $seqobj = $db->get_Stream_by_id(['x?????','af?????', etc, etc, etc);
>
> step I could get the corresponding amino acid translation for the
> accession?. The sequences then get written to a file which is passed to
> clustalw.
>
>      my $seq_out = Bio::SeqIO->new( -format => 'fasta',
>                                   -file => "> $outfile");
>
>      while( $aseq = $seqobj->next_seq() )
>      {
>         $seq_out->write_seq($aseq);
>      }
>
>
> Regards,
> Liam Elbourne.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho    heikki_at_ebi ac uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________

