[Bioperl-l] retrieving coding sequences from swissprot protein accessions

Jason Stajich jason at cgt.duhs.duke.edu
Tue Jun 1 12:04:38 EDT 2004


Get the dblinks (database cross-references) from the Swiss-Prot entry which
point to EMBL accessions.

http://jason.open-bio.org/Bioperl_Tutorials/GenomeInformatics2003/Bioperl-2.pdf

The example starts on slide 31, the code to get the xrefs is on about
slide 37 or so.

Pick the accessions which are mRNA or genomic DNA, depending on which
annotation you want to use, then fetch those records and parse out the CDS
features.
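
A minimal sketch of those steps, not a tested recipe. It assumes BioPerl is
installed and the remote databases are reachable. The sub names
embl_accessions and print_cds_for_swissprot are made up for illustration.
An EMBL record can carry more than one CDS, so you will probably want to
check each one against the protein before trusting it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the unique primary IDs of the cross-references that point at EMBL.
# Works on anything with database() and primary_id() methods, such as
# Bio::Annotation::DBLink objects.
sub embl_accessions {
    my @dblinks = @_;
    my %seen;
    return grep { !$seen{$_}++ }
           map  { $_->primary_id }
           grep { defined $_->database && uc( $_->database ) eq 'EMBL' }
           @dblinks;
}

# Fetch a Swiss-Prot entry, follow its EMBL xrefs and print every CDS found.
# Needs BioPerl and network access; the modules are loaded lazily so that
# embl_accessions() can be used without them.
sub print_cds_for_swissprot {
    my ($sp_acc) = @_;
    require Bio::DB::SwissProt;
    require Bio::DB::EMBL;

    my $sp   = Bio::DB::SwissProt->new;
    my $embl = Bio::DB::EMBL->new;

    my $prot = $sp->get_Seq_by_acc($sp_acc)
        or die "Could not fetch $sp_acc\n";
    my @links = $prot->annotation->get_Annotations('dblink');

    for my $nuc_acc ( embl_accessions(@links) ) {
        # Skip xrefs that cannot be retrieved instead of aborting the run.
        my $nuc = eval { $embl->get_Seq_by_acc($nuc_acc) } or next;
        for my $feat ( $nuc->get_SeqFeatures ) {
            next unless $feat->primary_tag eq 'CDS';
            print ">$nuc_acc CDS\n", $feat->spliced_seq->seq, "\n";
        }
    }
}

# Usage: perl script.pl <swissprot_accession>
print_cds_for_swissprot( $ARGV[0] ) if @ARGV;
```

Keeping the xref filter separate from the network calls means you can try it
on a parsed local Swiss-Prot file before hitting the remote servers.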

-jason


On Tue, 1 Jun 2004, Michael Bradley wrote:

> Hello all,
>
> I would like to get at the coding sequence for a given protein with a
> swissprot accession. I have done this with GenBank file in the past
> using the following code. Does anyone know how to do this with swissprot ?
>
> use Bio::DB::GenPept;
> use Bio::DB::GenBank;
> use Bio::Factory::FTLocationFactory;
> use Bio::SeqFeature::Generic;
>
> my $gp = Bio::DB::GenPept->new;
> my $gb = Bio::DB::GenBank->new;
> my $loc_factory = Bio::Factory::FTLocationFactory->new;
>
> # $protein_gi holds the protein identifier and is set elsewhere
> my $prot_stream = $gp->get_Stream_by_acc($protein_gi);
> while ( my $prot_seq = $prot_stream->next_seq() ) {
>     my $found_cds = 0;
>     foreach my $feat ( $prot_seq->top_SeqFeatures ) {
>         next unless $feat->primary_tag eq 'CDS';
>         $found_cds = 1;
>         # example: 'coded_by="U05729.1:1..122"'
>         my @coded_by = $feat->each_tag_value('coded_by');
>         # split on the first colon only: accession, then location string
>         my ($nuc_acc, $loc_str) = split /:/, $coded_by[0], 2;
>         my $nuc_obj = $gb->get_Seq_by_acc($nuc_acc);
>         # create Bio::Location object from a string
>         my $loc_object = $loc_factory->from_string($loc_str);
>         # create a Feature object by using a Location
>         my $feat_obj = Bio::SeqFeature::Generic->new(-location => $loc_object);
>         # associate the Feature object with the nucleotide Seq object
>         $nuc_obj->add_SeqFeature($feat_obj);
>         my $cds_obj = $feat_obj->spliced_seq;
>         print "CDS sequence is ", $cds_obj->seq, "\n\n";
>     }
>     # report once per protein, not once per non-CDS feature
>     print "No CDS for ", $prot_seq->id, "\n\n" unless $found_cds;
> }
>
> Thanks,
>
> Michael Bradley
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

