[Bioperl-l] Pulling exons out of a Genbank mRNA
Amir Karger
akarger at CGR.Harvard.edu
Mon Feb 13 20:57:08 UTC 2006
I'm trying to get the sequences of each exon in a gene. I have a genbank
file with mRNA and exon features (among others) that look like:
mRNA join(complement(22257..22386),complement(22067..22186),
complement(16753..17101),complement(13840..13962),
complement(10649..10820),complement(502..3028))
/gene="ENSG00000005812"
/note="transcript_id=ENST00000355619"
exon complement(13840..13962)
/note="exon_id=ENSE00000802462"
I want to make a FASTA file with 6 sequences corresponding to the 6 exons in
the mRNA above. I tried writing the below code, but it doesn't do what I
want. (You'll note that the code is stolen from the Bio::Seq and Feature
HOWTOs.)
my $inseq = Bio::SeqIO->new(-file => "<$file", -format => $format );
while (my $seq = $inseq->next_seq) {
my @features = $seq->get_SeqFeatures(); # just top level
foreach my $feat ( @features ) {
my $type = $feat->primary_tag;
if ($type eq "mRNA") {
print "Feature ",$feat->primary_tag,
" starts ",$feat->start," ends ", $feat->end,
" strand ",$feat->strand,"\n";
my @feats = $feat->get_SeqFeatures();
print "Found ", scalar @feats, " sub-features\n";
} elsif ($type eq "exon") {
print "Feature ",$feat->primary_tag,
" starts ",$feat->start," ends ", $feat->end,
" strand ",$feat->strand,"\n";
}
}
}
When I run the above, it says that the mRNA features have no sub-features.
So how do I pull out the 6 sequences?
Thanks,
- Amir Karger
Computational Biology Group
Bauer Center for Genomics Research
Harvard University
617-496-0626
More information about the Bioperl-l
mailing list