[Bioperl-l] Bio::Coordinate::GeneMapper cds to peptide bug

Heikki Lehvaslaiho heikki.lehvaslaiho at gmail.com
Thu May 13 06:20:51 UTC 2010


Just a thumbs up.

Aaron's fix works. It problem seems to be limited to where he spotted it. I
am working on refreshing my memory how the code work - it has been quite a
few years since I wrote it - and will commit better tests.

As of getting values outseide the defined region, that is a feature rather
than a bug. The idea was to be able to ask what would the new coordinate be
if the feature extended beyond the known limits. The is the capability of
Bio::Coordinate::ExtrapolatingPair that is used here. That class also has a
method strict that can be used to prevent extrapolating, but the code to
access that has not been written into GeneMapper. I'll see if I can get it
to work.

    -Heikki

Heikki Lehvaslaiho - skype:heikki_lehvaslaiho
cell: +966 545 595 849  office: +966 2 808 2429

Computational Bioscience Research Centre (CBRC), Building #2, Office #4216
4700 King Abdullah University of Science and Technology (KAUST)
Thuwal 23955-6900, Kingdom of Saudi Arabia



On 12 May 2010 13:23, Heikki Lehvaslaiho <heikki.lehvaslaiho at gmail.com>wrote:

> Outch. I'll definitely have a look.
>
> Strange that none of the tests have picked this up...
>
>     -Heikki
>
> Heikki Lehvaslaiho - skype:heikki_lehvaslaiho
> cell: +966 545 595 849  office: +966 2 808 2429
>
> Computational Bioscience Research Centre (CBRC), Building #2, Office #4216
> 4700 King Abdullah University of Science and Technology (KAUST)
> Thuwal 23955-6900, Kingdom of Saudi Arabia
>
>
>
>
> On 12 May 2010 01:40, Aaron Mackey <amackey at virginia.edu> wrote:
>
>> Hi Chris,
>>
>> I was hoping Heikki might take up the cause and investigate further --
>> let's
>> give him a chance to respond.  But it's a frightening bug if it's really
>> been that way for all this time ...
>>
>> -Aaron
>>
>> On Tue, May 11, 2010 at 6:31 PM, Chris Fields <cjfields at illinois.edu>
>> wrote:
>>
>> > Aaron,
>> >
>> > Do we want to write this up as a set of tests to add to the bioperl test
>> > suite?  We can probably add it after the github migration tomorrow.
>> >
>> > chris
>> >
>> > On May 11, 2010, at 4:26 PM, Aaron Mackey wrote:
>> >
>> > > Hi Zerui (and others),
>> > >
>> > > I've confirmed there seems to be a bug in Bio::Coordinate::GeneMapper,
>> > > specifically in this code:
>> > >
>> > > lines:
>> > > 1170:               (-start   => int ($loc->start / 3 ) +1,
>> > > 1171:            -end     => int ($loc->end / 3 ) +1,
>> > >
>> > > both of those lines should look like: int (($loc->start - 1) / 3) + 1
>> > >
>> > > otherwise for CDS coordinates 1, 2, 3, 4, 5, 6, you get incorrect
>> peptide
>> > > positions 1, 1, 2, 2, 2, 3 (when you instead want 1, 1, 1, 2, 2, 2)
>> > >
>> > > There is also a problem when mapping exon coordinates that are
>> > outside/after
>> > > the CDS region (instead of getting undefined locations, you continue
>> to
>> > get
>> > > peptide coordinates, but they are invalid, larger than the protein
>> > length).
>> > >
>> > > Dennis and fringy -- this may affect the SNPtab.pl script I wrote for
>> > you,
>> > > as it uses this module to calculate codons for SNPs.
>> > >
>> > > -Aaron
>> > >
>> > > P.S. a script the demonstrates the problem:
>> > >
>> > > use Bio::Coordinate::GeneMapper;
>> > >
>> > > my $mapper =
>> > >  Bio::Coordinate::GeneMapper
>> > >  ->new( -in => "chr",
>> > >     -out => "propeptide",
>> > >     -exons => [ Bio::Location::Simple
>> > >             ->new( -start => 101,
>> > >                -end   => 109 ),
>> > >             Bio::Location::Simple
>> > >             ->new( -start => 201,
>> > >                -end   => 221 ),
>> > >           ],
>> > >     -cds => Bio::Location::Simple
>> > >             ->new(-start => 101, -end => 209),
>> > >       );
>> > >
>> > >
>> > > print join("\t", "chr", "aa"), "\n";
>> > > for my $pos (99..111,199..211) {
>> > >  my $res = $mapper->map(
>> > >    Bio::Location::Simple->new(-start => $pos, -end => $pos, -seq_id =>
>> > 1));
>> > >  my $start = $res->start; $start = "NA" unless defined $start;
>> > >  my $end = $res->end; $end = "NA" unless defined $end;
>> > >  print join("\t", $pos, $start), "\n";
>> > > }
>> > > _______________________________________________
>> > > Bioperl-l mailing list
>> > > Bioperl-l at lists.open-bio.org
>> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >
>> >
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
>



More information about the Bioperl-l mailing list