[Bioperl-l] UTR regions in LiveSeq

htafer htafer at gmail.com
Thu Jun 23 12:20:54 UTC 2011


Hi
I am currently working on SNPs and would like to use BioPerl.
I am using LiveSeq objects to store the sequence and I am looking at
mutations by using the Mutation/Mutator objects.

More explicitly:
Given a set of annotations, I extract the boundaries of all exons and
introns, of the transcript, and of the coding sequence. From these
boundaries and the corresponding genomic sequence I construct the
Bio::LiveSeq::{DNA,Exon,Transcript,Translation} objects, which in turn
allow me to construct a Bio::LiveSeq::Gene object.

Then, based on a list of SNPs, I generate a set of
Bio::LiveSeq::Mutation objects and a Bio::LiveSeq::Mutator, which I
then use to mutate my gene. My main problem is that I don't know how
to handle transcripts that have UTRs, or ncRNAs, i.e. transcripts or
exons that do not code for protein. According to the documentation of
Bio::LiveSeq::Transcript, this class is aimed at storing information
about coding sequences (CDS) only.

The following code for a transcript with UTRs mostly works: the
mutated DNA, RNA and protein sequences come out as expected. However,
the alignment function of Bio::LiveSeq::Mutator does not work (see
the output at the end). Is there a plan to introduce a
Bio::LiveSeq::UTR class, similar to Bio::SeqFeature::Gene::UTR, and to
better support ncRNAs/UTRs in the Mutator object? Or is it possible to
do this with the current BioPerl implementation?

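use Bio::LiveSeq::DNA;
use Bio::LiveSeq::Exon;
use Bio::LiveSeq::Intron;
use Bio::LiveSeq::Transcript;
use Bio::LiveSeq::Translation;
use Bio::LiveSeq::Gene;
use Bio::LiveSeq::Mutation;
use Bio::LiveSeq::Mutator;

my $DNAsequence = Bio::LiveSeq::DNA->new(
    -seq => "GGGGGGGGGGATGAAAAAAATTTTTTTTTTAAAAAAAATAGCCCCCCCCCC");
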
my $utr5 = Bio::LiveSeq::Exon->new(-seq => $DNAsequence,
                                   -start => 1,
                                   -end => 10,
                                   -strand => 1);
my $exon1 = Bio::LiveSeq::Exon->new(-seq => $DNAsequence,
                                    -start => 11,
                                    -end => 20,
                                    -strand => 1);
my $intron1 = Bio::LiveSeq::Intron->new(-seq => $DNAsequence,
                                        -start => 21,
                                        -end => 30,
                                        -strand => 1);
my $exon2 = Bio::LiveSeq::Exon->new(-seq => $DNAsequence,
                                    -start => 31,
                                    -end => 41,
                                    -strand => 1);
my $utr3 = Bio::LiveSeq::Exon->new(-seq => $DNAsequence,
                                   -start => 42,
                                   -end => 51,
                                   -strand => 1);

my @tarray = ($exon1, $exon2);                # coding exons only
my @uarray = ($utr5, $exon1, $exon2, $utr3);  # full transcript, UTRs included
my @iarray = ($intron1);
my $Transcript = Bio::LiveSeq::Transcript->new(-exons => \@uarray);
my $translationTranscript = Bio::LiveSeq::Transcript->new(-exons => \@tarray);
my $Translation = Bio::LiveSeq::Translation->new(
    -transcript => $translationTranscript);
# need to do this to avoid change_error()
$Transcript->{'translation'} = $Translation;
my $features;
$features->{DNA} = $DNAsequence;
$features->{Transcripts} = [$Transcript];
$features->{Translations} = [$Translation];
$features->{Exons} = \@uarray;
$features->{Introns} = \@iarray;


my $gene=Bio::LiveSeq::Gene->new(-name => "bla",
                                 -features => $features);

# Empty -seq with -len => 3: delete three bases starting at position 32
my $mutation = Bio::LiveSeq::Mutation->new(-seq => '',
                                           -pos => 32,
                                           -len => 3
                                          );

my $mutate = Bio::LiveSeq::Mutator->new(-gene => $gene,
                                        -numbering => 'entry'
                                       );

$mutate->add_Mutation($mutation);

# Apply the mutation; $results is the object queried below
my $results = $mutate->change_gene();

dna_mut: GGGGGGGGGGATGAAAAAAATTTTTTTTTTAAAAATAGCCCCCCCCCC
dna_ori: GGGGGGGGGGATGAAAAAAATTTTTTTTTTAAAAAAAATAGCCCCCCCCCC
rna_mut: GGGGGGGGGGATGAAAAAAAAAAAATAGCCCCCCCCCC
rna_ori: GGGGGGGGGGATGAAAAAAAAAAAAAAATAGCCCCCCCCCC
aa_mut:  MKKKK*
aa_ori:  MKKKKK*

print $results->alignment();
Variant  :     GAT GAA AAA AAA     AAA ATA GCC CCC
Reference:     GAT GAA AAA AAA Bio AAA ATA GCC CCC
                    E   K   K   X   K   I   A
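
The dna/rna/aa strings above can be cross-checked without LiveSeq,
using plain Perl string operations on the same coordinates. This is
only a sketch under stated assumptions: the helper names
(splice_exons, delete_bases, translate_cds) are made up for
illustration and are not BioPerl functions, the codon table covers
only the three codons that occur in this toy gene, and the code
assumes the forward strand and a deletion that lies inside one exon.

```perl
use strict;
use warnings;

# Toy gene from the post: 1-based, inclusive, forward-strand coordinates.
my $dna = "GGGGGGGGGGATGAAAAAAATTTTTTTTTTAAAAAAAATAGCCCCCCCCCC";
my @transcript_exons = ([1, 10], [11, 20], [31, 41], [42, 51]);  # utr5, exon1, exon2, utr3
my @cds_exons        = ([11, 20], [31, 41]);                      # exon1, exon2

# Deliberately partial codon table (assumption: only these codons occur).
my %codon = (ATG => 'M', AAA => 'K', TAG => '*');

# Concatenate the exon segments of $seq into one spliced sequence.
sub splice_exons {
    my ($seq, @exons) = @_;
    return join '', map { substr($seq, $_->[0] - 1, $_->[1] - $_->[0] + 1) } @exons;
}

# Delete $len bases starting at 1-based $pos. Returns the mutated sequence
# followed by the exon coordinates shifted to match it. Assumes the deleted
# stretch lies entirely inside one exon.
sub delete_bases {
    my ($seq, $pos, $len, @exons) = @_;
    substr($seq, $pos - 1, $len, '');
    my @shifted = map {
        my ($s, $e) = @$_;
        $s -= $len if $s > $pos;
        $e -= $len if $e >= $pos;
        [$s, $e];
    } @exons;
    return ($seq, @shifted);
}

# Translate a CDS codon by codon; codons missing from the table become 'X'.
sub translate_cds {
    my ($cds) = @_;
    my $protein = '';
    while ($cds =~ /(...)/g) {
        $protein .= exists $codon{$1} ? $codon{$1} : 'X';
    }
    return $protein;
}

# Same mutation as in the post: delete 3 bases at position 32.
my ($dna_mut, @tx_mut)  = delete_bases($dna, 32, 3, @transcript_exons);
my (undef,    @cds_mut) = delete_bases($dna, 32, 3, @cds_exons);

my $rna_ori = splice_exons($dna,     @transcript_exons);
my $rna_mut = splice_exons($dna_mut, @tx_mut);
my $aa_ori  = translate_cds(splice_exons($dna,     @cds_exons));
my $aa_mut  = translate_cds(splice_exons($dna_mut, @cds_mut));

print "dna_mut: $dna_mut\n";
print "dna_ori: $dna\n";
print "rna_mut: $rna_mut\n";
print "rna_ori: $rna_ori\n";
print "aa_mut:  $aa_mut\n";   # MKKKK*
print "aa_ori:  $aa_ori\n";   # MKKKKK*
```

Note that the CDS starts at position 11 of the spliced transcript
here, because the 5' UTR contributes 10 bases. That offset is what a
UTR-aware alignment would have to take into account.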


