[Bioperl-l] Bio::Tools::Glimmer and genes wrapping around the origin

Francisco J. Ossandón fossandonc at hotmail.com
Sat Mar 10 14:26:08 UTC 2012


I can provide a couple of real-world examples, one on each strand. I can
search for more if needed:

From NC_000911.1, Synechocystis sp. PCC 6803 chromosome
(http://www.ncbi.nlm.nih.gov/nuccore/16329170):
* NP_439899.1, solanesyl diphosphate synthase =
join(3573271..3573470,1..772)

From NC_000868.1, Pyrococcus abyssi GE5 chromosome
(http://www.ncbi.nlm.nih.gov/nuccore/14518450):
* NP_125692.1, 50S ribosomal protein L1P =
complement(join(1764520..1765118,1..61))
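
To make the ordering issue concrete, here is a minimal sketch in plain Perl
(no BioPerl calls). The helpers parse_location and format_segments are
hypothetical names made up for this illustration, not part of any BioPerl
API, and the parser only handles simple join(...) and
complement(join(...)) strings like the two above. It shows that sorting the
segments by start coordinate reverses the written order of a join that
spans the origin:

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl API): parse join(a..b,c..d) or
# complement(join(a..b,c..d)). Returns the strand (1 or -1) followed by
# the segments as [start, end] pairs, in the order they were written.
sub parse_location {
    my ($string) = @_;
    my $strand = 1;
    if ($string =~ /^complement\((.+)\)$/) {
        $strand = -1;
        $string = $1;
    }
    my ($inner) = $string =~ /^join\((.+)\)$/
        or die "Unsupported location: $string\n";
    my @segments = map {
        my ($start, $end) = /^(\d+)\.\.(\d+)$/
            or die "Unsupported segment: $_\n";
        [ $start, $end ];
    } split /,/, $inner;
    return ($strand, @segments);
}

# Hypothetical helper: turn [start, end] pairs back into "a..b,c..d".
sub format_segments {
    return join ',', map { sprintf '%d..%d', @$_ } @_;
}

my ($strand, @segments) = parse_location('join(3573271..3573470,1..772)');

# Sorting by start coordinate is what breaks an origin-spanning join:
# the segment that should come first ends up last.
my @sorted = sort { $a->[0] <=> $b->[0] } @segments;

# Total feature length is the same either way; only the order differs.
my $length = 0;
$length += $_->[1] - $_->[0] + 1 for @segments;

print "As written: ", format_segments(@segments), "\n";
print "Sorted:     ", format_segments(@sorted), "\n";
print "Length:     $length nt\n";
```

Any splicing code that walks the sorted list would concatenate the 1..772
piece before the 3573271..3573470 piece, which is the wrong sequence for
this gene.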

Cheers,

Francisco J. Ossandon


-----Original message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On behalf of Fields,
Christopher J
Sent: Friday, 09 March 2012 17:43
To: Adam Witney
CC: <bioperl-l at bioperl.org>; Francisco J. Ossandón
Subject: Re: [Bioperl-l] Bio::Tools::Glimmer and genes wrapping around the
origin

The best way to address this is to propose a set of tests that demonstrates
the problem, but I believe you have something that should work:

    my $location_string = 'join(117..1,135690..135187)';

We should probably cover both cases, a join that spans the origin on the
forward strand and one on the reverse strand, something like:

   join(117..1,135690..135187)
   complement(join(117..1,135690..135187))

chris

On Mar 9, 2012, at 5:02 AM, Adam Witney wrote:

> 
> Thanks for your replies.
> 
> After a little more digging, there seem to be two issues. In the test code
> I posted, the extra strand attribute in Bio::SeqFeature::Generic (and the
> subsequent call to location_object->strand) changes the internal
> guide_strand, so that this line in Bio::Location::Split->to_FTstring
> changes the order of the sub-locations:
> 
> 	my @locs = ($stype eq 'join' && (!$guide && $strand == -1)) ?
> 	           reverse $self->sub_Location() : $self->sub_Location() ;
> 
> The second problem is probably due to the sorting order within
> Bio::SeqFeature::Generic: when 'start' and 'end' are called, it doesn't do
> the right thing in this case. But I am not quite sure how to fix this.
> 
> Thanks again
> 
> Adam
> 
> On 8 Mar 2012, at 21:38, Fields, Christopher J wrote:
> 
>> I agree, no sorting should be implied (a 'join' order should be based on
>> order of addition alone, and sorting should be optional).  IIRC there were
>> backwards-compatibility problems with switching this due to reliance on the
>> old behavior, but it might be worth trying to see if anything breaks
>> test-wise (and why it breaks).
>> 
>> chris
>> 
>> On Mar 8, 2012, at 2:37 PM, Francisco J. Ossandón wrote:
>> 
>>> Long ago, Bio::SeqIO had an issue with genes that were split at the
>>> origin. This was because BioPerl, when reading a GenBank file,
>>> automatically sorted the segment coordinates of the genes it was
>>> reading (not considering the possibility, in a circular genome, that
>>> the sequence could go from the 1st nucleotide directly to the last
>>> one), so an extra "-nosort" argument was necessary every time to keep
>>> BioPerl from giving the wrong sequence:
>>> 
>>> my $nt_seq_obj = $feat->spliced_seq(-nosort => 1);
>>> 
>>> This is the same bug. In that case the code was changed so that "-nosort"
>>> is applied based on the "is_circular" status of the genome; see:
>>> https://redmine.open-bio.org/issues/2579
>>> 
>>> Since your code doesn't have the "is_circular" information (because it
>>> doesn't come from a file), I guess that the autosorting is kicking in.
>>> I think it would be better if all the autosorting of the sub-locations
>>> array inside Bio::Location::Split were optional instead of automatic,
>>> because of these cases.
>>> 
>>> Cheers,
>>> 
>>> Francisco J. Ossandon
>>> 
>>> -----Original message-----
>>> From: bioperl-l-bounces at lists.open-bio.org
>>> [mailto:bioperl-l-bounces at lists.open-bio.org] On behalf of Adam Witney
>>> Sent: Thursday, 08 March 2012 13:40
>>> To: bioperl-l at bioperl.org
>>> Subject: [Bioperl-l] Bio::Tools::Glimmer and genes wrapping around
>>> the origin
>>> 
>>> Hi,
>>> 
>>> I have been using Bio::Tools::Glimmer and have come across a problem:
>>> it does not handle genes that wrap around the origin. I think I have
>>> boiled it down to this test case, which mimics what happens internally
>>> in Bio::Tools::Glimmer:
>>> 
>>> ##############################################################
>>> #! /usr/local/bin/perl -w
>>> 
>>> use strict;
>>> use warnings;
>>> 
>>> use Bio::Factory::FTLocationFactory;
>>> use Bio::SeqFeature::Generic;
>>> 
>>> my $location_string = 'join(117..1,135690..135187)';
>>> 
>>> my $location_factory = Bio::Factory::FTLocationFactory->new();
>>> my $location_object = 
>>> $location_factory->from_string($location_string);
>>> 
>>> print "Location: ".$location_object->to_FTstring."\n";
>>> 
>>> my $gene = Bio::SeqFeature::Generic->new(
>>>               '-seq_id'      => 'Testing',
>>>               '-location'   => $location_object,
>>>               '-strand'     => -1
>>>           );
>>> 
>>> print "Location: ".$location_object->to_FTstring."\n";
>>> 
>>> ##############################################################
>>> 
>>> $ perl ../FTLocationTest.pl
>>> Location: complement(join(135187..135690,1..117))
>>> Location: complement(join(1..117,135187..135690))
>>> 
>>> This happens because setting '-strand' in Bio::SeqFeature::Generic
>>> calls the strand method on $location_object (a Bio::Location::Split),
>>> which then causes the problem (although I can't quite work out where!).
>>> 
>>> Is this intended behaviour?
>>> 
>>> Thanks
>>> 
>>> Adam
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>> 
>> 
> 

