[Bioperl-l] extracting coordinates from EMBL join annotation

Zayed Albertyn zayed@sanbi.ac.za
Mon, 28 Oct 2002 15:47:53 +0200


Hello All

I can't seem to get my head around this problem and would like to know
if anybody could please help me out. I need to do comparisons of
sensitivity, and to this end I need to extract the coordinates from an
EMBL file. I have code that extracts the sequence, but I would also like
the corresponding coordinates.

Here is the code I would like to alter:


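use strict;
use warnings;
use Bio::SeqIO;
use Bio::Seq;

# EMBL flat file to read, given as the first command-line argument
my $embl_file = $ARGV[0];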
my $in = Bio::SeqIO->new(-file => $embl_file,
                         -format => 'EMBL');

my $seq;
my $i;

while (defined($seq = $in->next_seq())) {
  foreach my $feature ($seq->top_SeqFeatures) {

    if ($feature->primary_tag eq 'mRNA_span') {
      my $new_seq_name = $seq->display_id . '_MRNA';
      my $new_seq = Bio::Seq->new(-seq => '',
                             -id => $new_seq_name,
                             -moltype => 'dna');

      my $namef = $seq->display_id;
      my @exons = $feature->sub_SeqFeature();
      # one FASTA file per entry, named after its display id
      open (FILE, '>', "$namef.fa") || die "Error: cannot write $namef.fa: $!";

      if ($exons[0]->strand() == -1) {
        @exons = sort { $b->start() <=> $a->start() } @exons;
      } else {
        @exons = sort { $a->start() <=> $b->start() } @exons;
      }
      $i = 0;
      foreach my $subfeature (@exons) {
        $i++;
        # append each span of mRNA to the new sequence
        $new_seq->seq($new_seq->seq() . $subfeature->seq->seq());
        print FILE ">$namef".".Exon$i\n";
        print FILE $subfeature->seq->seq(),"\n";
      }
      close (FILE) || die "Error: cannot close $namef.fa: $!";
    }
  }
}
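
To illustrate the output I am after: for a location such as
join(100..200,350..420,600..750) I would like one start/end pair per
exon. Below is a minimal plain-Perl sketch of that. It parses the
location string directly instead of going through the BioPerl feature
objects, which is not how I would want to do it in the script above. The
sample location and the helper name join_coordinates are made up for
illustration, and the sketch only handles plain start..end spans.

```perl
use strict;
use warnings;

# Hypothetical helper: pull (start, end) pairs out of an EMBL location string.
# Only plain "start..end" spans are handled; fuzzy positions such as <1 or
# >200, single-base locations and remote accessions are ignored.
sub join_coordinates {
    my ($location) = @_;
    my @pairs;
    while ($location =~ /(\d+)\.\.(\d+)/g) {
        push @pairs, [$1, $2];
    }
    return @pairs;
}

# Made-up example location, not taken from a real entry.
my $location = 'join(100..200,350..420,600..750)';
my $i = 0;
foreach my $pair (join_coordinates($location)) {
    $i++;
    # one tab-separated line per exon: label, start, end
    print "Exon$i\t$pair->[0]\t$pair->[1]\n";
}
```

What I am looking for is the equivalent of these pairs taken from each
$subfeature inside the loop above, so that they line up with the exon
sequences written to the .fa file.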


Thanks

Zayed