[Bioperl-l] Bio::RangeI intersection and Bio::DB::GFF

Heikki Lehvaslaiho heikki at sanbi.ac.za
Wed May 3 06:51:12 UTC 2006


On Wednesday 03 May 2006 00:31, Marco Blanchette wrote:
> Brian--
>
> I checked out last week version from the CVS.
>
> Silly question: How do I get the version of BioPerl I am using... Never had
> to check a module/bundle version number before...

It is not that silly. The syntax in not too easy:

	perl -MBio::Perl -le 'print Bio::Perl->VERSION;'

You can use any module in bioperl, of course.

     -Heikki

> Marco
>
> On 5/2/06 14:49, "Brian Osborne" <osborne1 at optonline.net> wrote:
> > Marco,
> >
> > Odd, because the intersection() code is quite simple and it's clear how
> > it should behave. What version of Bioperl are you using? I'm looking at
> > the latest, in bioperl-live...
> >
> > Brian O.
> >
> > On 5/2/06 4:32 PM, "Marco Blanchette" <mblanche at berkeley.edu> wrote:
> >> Brian--
> >>
> >> Even when both elements of intersection() are from the negative strand,
> >> the return object is from the positive strand and $overlap is actually
> >> the revervese complement of the intersection between the 2 exons. Here
> >> is part of the output from the script below:
> >>
> >> ===
> >> ex1     Strand: -1
> >> CTTTTTTCCACATACGTCGTCAACGTGATTCGACCTTTTCCGGTTTATTAGTTGAACATGGCAGTCGGCAAA
> >>AATA AAGGTCTTTCCAAGGGTGGTAAGAAGGGCGG
> >> TAAGAAGAAGGTGGTGGACCCGTTTTCTCGCAAGGACTG
> >> ex2     Strand: -1
> >> CTTTTTTCCACATACGTCGTCAACGTGATTCGACCTTTTCCGGTTTATTAGTTGAACATGGCAGTCGGCAAA
> >>AATA AAGGTCTTTCCAAGGGTGGTAAGAAGGGCGG
> >> TAAGAAGAAGGTGGTGGACCCGTTTTCTCGCAAGGACTGGTACGATGTCAAAGCTCCGAATATGTTTCAAAC
> >>CCGT CAAATCG
> >> overlap Strand: 1
> >> CAGTCCTTGCGAGAAAACGGGTCCACCACCTTCTTCTTACCGCCCTTCTTACCACCCTTGGAAAGACCTTTA
> >>TTTT TGCCGACTGCCATGTTCAACTAATAAACCGG
> >> AAAAGGTCGAATCACGTTGACGACGTATGTGGAAAAAAG
> >> ...
> >>
> >> If both are from the positive strand, the return object is positive as
> >> in:
> >>
> >> ===
> >> ex1     Strand: 1
> >> CAACGCAGACGTGGTACGGCGTTTTAAATCTGATAACATTTTGAACCGGGAATTATTTTAGAGTACCATTCT
> >>TTGT TTTGTGCCTGTTTCAGTATAAATTAATTATG
> >> CGCCTGATTTAAAGTACAAAATGTGTAAATATATCACCTTACCGTCGCGGGTGCACCCAATTGTGCTTTGAT
> >>GAAT AAATATACATATATGCAACATATATAACTTC
> >> CTGTGTTAGTATAAGTGTATGTCAGCCAAAAACAAATATATATATGAGTGTTTATCGGCATTCGTGTGCTGG
> >>CAGA GCAGCGATCAAAGCTGCGTTCGGTACTCGTT
> >> GACTGGCCCAAGAATGAATTCTCGTGCAAGTGTGTTGATAAAAAGTATACGTATGTAT
> >> ex2     Strand: 1
> >> ATCGACAGTTGCCATCGTCGTTATTCCAGCACTAATTTAAAAAAAATTCGATCAACGCAGACGTG
> >> overlap Strand: 1
> >> CAACGCAGACGTG
> >>
> >> Is there something I am missing? Here is the script generating the
> >> output
> >>
> >> Many thanks all...
> >>
> >> Marco
> >>
> >>
> >> use strict;
> >> use warnings;
> >> use Bio::DB::GFF;
> >>
> >> MAIN:{
> >>
> >>     my $db = Bio::DB::GFF->new( -adaptor => 'dbi::mysql',
> >>                                 -dsn =>
> >> 'dbi:mysql:database=dmel_43_LS;host=riolab.net',
> >>                                 -user => 'guest');
> >>     my $test_db = $db->segment('4');
> >>
> >>     # Load up the exons into $exons_p
> >>     for my $gene ($test_db->features(-types => 'gene')){
> >>
> >>         my $exons_p = extractExons($gene);
> >>
> >>         cluster($exons_p) unless ($#{$exons_p} == -1);
> >>
> >>     }
> >> }
> >>
> >> sub extractExons {
> >>     my $gene = shift;
> >>     my %ex_list;
> >>     my @tcs = $gene->features(    -type =>'processed_transcript',
> >>                                     -attributes =>{Gene =>
> >> $gene->group});
> >>
> >>     for my $tc (@tcs){
> >>         my @exons = $tc->features (-type => 'exon',
> >>                                      -attributes => {Parent =>
> >> $tc->group} );
> >>
> >>         for (@exons){
> >>             my $ex_id    = $_->id;
> >>             $ex_list{$ex_id} = $_ unless (exists $ex_list{$ex_id});
> >>
> >>         }
> >>
> >>     }
> >>     my @values = values %ex_list;
> >>     return(\@values);
> >> }
> >>
> >> sub cluster {
> >>     my $exons_p = shift;
> >>
> >>     for (my $s = 0; $s <= $#{$exons_p}; $s++){
> >>         for (my $t = $s+1; $t <= $#{$exons_p}; $t++){
> >>             my $exon1 = $exons_p->[$s];
> >>             my $exon2 = $exons_p->[$t];
> >>
> >>             if (!($exon1->equals($exon2)) && $exon1->overlaps($exon2)){
> >>
> >>                 my $overlap = $exon1->intersection($exon2);
> >>
> >>                 print "===\n";;
> >>                 print     "ex1\tStrand: ", $exon1->strand, "\n",
> >>                         $exon1->seq, "\n";
> >>                 print     "ex2\tStrand: ", $exon2->strand, "\n",
> >>                         $exon2->seq, "\n";
> >>                 print "overlap\tStrand: ", $overlap->strand, "\n",
> >>                         $overlap->seq, "\n";
> >>             }
> >>         }
> >>     }
> >> }
> >>
> >> On 5/2/06 13:17, "Brian Osborne" <osborne1 at optonline.net> wrote:
> >>> Marco,
> >>>
> >>> Yes, this is how intersection() is supposed to work. If both of the
> >>> Range objects have the same strand then the strand information is
> >>> returned as part of the result but if they aren't on the same strand
> >>> then no strand information is returned.
> >>>
> >>> Brian O.
> >>>
> >>> On 5/2/06 3:30 PM, "Marco Blanchette" <mblanche at berkeley.edu> wrote:
> >>>> Dear all--
> >>>>
> >>>> I have been trying to use the intersection function to extract
> >>>> overlapping region from alternatively spliced exons as in the
> >>>> following script. The returned object from the 'my $overlap =
> >>>> $exon1->intersection($exon2);' is actually loosing the strand of
> >>>> $exon1 if $exon1 is from the negative strand.
> >>>> Is this behavior expected? Should I check the strand of $exon1 before
> >>>> working on the object return by any Bio::RangeI function?
> >>>>
> >>>> Many thanks
> >>>>
> >>>> #!/usr/bin/perl
> >>>> use strict;
> >>>> use warnings;
> >>>> use Bio::DB::GFF;
> >>>>
> >>>> MAIN:{
> >>>>
> >>>>     my $db = Bio::DB::GFF->new( -adaptor => 'dbi::mysql',
> >>>>                                 -dsn =>
> >>>> 'dbi:mysql:database=dmel_43_LS;host=riolab.net',
> >>>>                                 -user => 'guest');
> >>>>     my $test_db = $db->segment('4');
> >>>>
> >>>>     # Load up the exons into $exons_p
> >>>>     for my $gene ($test_db->features(-types => 'gene')){
> >>>>
> >>>>         my $exons_p = extractExons($gene);
> >>>>
> >>>>         cluster($exons_p) unless ($#{$exons_p} == -1);
> >>>>
> >>>>     }
> >>>> }
> >>>>
> >>>> sub extractExons {
> >>>>     my $gene = shift;
> >>>>     my %ex_list;
> >>>>     my @tcs = $gene->features(    -type =>'processed_transcript',
> >>>>                                     -attributes =>{Gene =>
> >>>> $gene->group});
> >>>>
> >>>>     for my $tc (@tcs){
> >>>>         my @exons = $tc->features (-type => 'exon',
> >>>>                                      -attributes => {Parent =>
> >>>> $tc->group} );
> >>>>
> >>>>         for (@exons){
> >>>>             my $ex_id    = $_->id;
> >>>>             $ex_list{$ex_id} = $_ unless (exists $ex_list{$ex_id});
> >>>>
> >>>>         }
> >>>>
> >>>>     }
> >>>>     my @values = values %ex_list;
> >>>>     return(\@values);
> >>>> }
> >>>>
> >>>> sub cluster {
> >>>>     my $exons_p = shift;
> >>>>
> >>>>     for (my $s = 0; $s <= $#{$exons_p}; $s++){
> >>>>         for (my $t = $s+1; $t <= $#{$exons_p}; $t++){
> >>>>             my $exon1 = $exons_p->[$s];
> >>>>             my $exon2 = $exons_p->[$t];
> >>>>
> >>>>             if (!($exon1->equals($exon2)) &&
> >>>> $exon1->overlaps($exon2)){
> >>>>
> >>>>                 my $overlap = $exon1->intersection($exon2);
> >>>>
> >>>>                 print "===\n";;
> >>>>                 print "ex1\n", $exon1->seq, "\n";
> >>>>                 print "ex2\n", $exon2->seq, "\n";
> >>>>                 print "overlap\n", $overlap->seq, "\n";
> >>>>             }
> >>>>         }
> >>>>     }
> >>>> }
> >>>> ______________________________
> >>>> Marco Blanchette, Ph.D.
> >>>>
> >>>> mblanche at uclink.berkeley.edu
> >>>>
> >>>> Donald C. Rio's lab
> >>>> Department of Molecular and Cell Biology
> >>>> 16 Barker Hall
> >>>> University of California
> >>>> Berkeley, CA 94720-3204
> >>>>
> >>>> Tel: (510) 642-1084
> >>>> Cell: (510) 847-0996
> >>>> Fax: (510) 642-6062
> >>
> >> ______________________________
> >> Marco Blanchette, Ph.D.
> >>
> >> mblanche at uclink.berkeley.edu
> >>
> >> Donald C. Rio's lab
> >> Department of Molecular and Cell Biology
> >> 16 Barker Hall
> >> University of California
> >> Berkeley, CA 94720-3204
> >>
> >> Tel: (510) 642-1084
> >> Cell: (510) 847-0996
> >> Fax: (510) 642-6062
>
> ______________________________
> Marco Blanchette, Ph.D.
>
> mblanche at uclink.berkeley.edu
>
> Donald C. Rio's lab
> Department of Molecular and Cell Biology
> 16 Barker Hall
> University of California
> Berkeley, CA 94720-3204
>
> Tel: (510) 642-1084
> Cell: (510) 847-0996
> Fax: (510) 642-6062

-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Associate Professor    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________



More information about the Bioperl-l mailing list