[Bioperl-l] HOWTO: take a slice of a split location

Jason Stajich jason.stajich at duke.edu
Sat Dec 10 13:11:57 EST 2005


Hi Malcolm -
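I haven't had a chance to look at your code, but my approach to this
problem would be to first splice the CDS sequence out of the genome:
  my $feature = Bio::SeqFeature::Generic->new(-location => $splitlocation);
  # spliced_seq reads from the sequence attached to the feature;
  # $genomeseq is a placeholder for your genomic sequence object
  $feature->attach_seq($genomeseq);
  my $cdsseq = $feature->spliced_seq;
then just retrieve the last 1000 bases of this sequence:
  my $threeprime = $cdsseq->subseq($cdsseq->length - 999, $cdsseq->length);
(subseq is 1-based and inclusive, so starting at length - 1000 would
return 1001 bases; this also assumes the CDS is at least 1000 bases long.)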
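
To make the coordinate arithmetic concrete, here is a small sketch in
plain Perl with no BioPerl dependency. The helper names last_n_range and
subseq_1based are made up for illustration; subseq_1based only imitates
1-based, inclusive subseq($start, $end) semantics on an ordinary string.
It also clamps the start to 1 for a CDS shorter than the requested length.

```perl
use strict;
use warnings;

# Hypothetical helper: 1-based, inclusive coordinates of the last $n
# positions of a sequence of length $len. The start is clamped to 1 so
# that a sequence shorter than $n is returned whole.
sub last_n_range {
    my ($len, $n) = @_;
    my $start = $len - $n + 1;
    $start = 1 if $start < 1;
    return ($start, $len);
}

# Plain-string stand-in for a 1-based, inclusive subseq($start, $end).
sub subseq_1based {
    my ($str, $start, $end) = @_;
    return substr($str, $start - 1, $end - $start + 1);
}

my $cds = 'ATGAAACCCGGGTTTTAA';              # 18 bases
my ($s, $e) = last_n_range(length $cds, 5);  # (14, 18)
print subseq_1based($cds, $s, $e), "\n";     # prints TTTAA
```

With a real sequence object the same pair of numbers would go straight
into $cdsseq->subseq($s, $e).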

There is also a module to map between coordinates -  
Bio::Coordinate::GeneMapper if you need to go from transcript to  
genomic coordinates.
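
If all you need is the genomic spans that cover the last N spliced bases,
you can also work on the exon spans directly, without listing every base
position. Below is a rough sketch in plain Perl under some assumptions:
the function three_prime_spans is hypothetical (it is not part of BioPerl
or of GeneMapper), exons are given as [start, end] pairs that are 1-based,
inclusive, non-overlapping and sorted by start, and the 3' end is taken to
be the highest coordinate on the plus strand and the lowest on the minus
strand.

```perl
use strict;
use warnings;

# Hypothetical helper, not a BioPerl method.
# $exons  : array ref of [start, end] pairs (1-based, inclusive,
#           non-overlapping, sorted ascending by start)
# $n      : number of spliced bases to keep from the 3' end
# $strand : 1 (default) or -1
# Returns the kept [start, end] spans in ascending genomic order.
sub three_prime_spans {
    my ($exons, $n, $strand) = @_;
    $strand = 1 unless defined $strand;

    # Visit exons starting from the 3' end of the transcript.
    my @ordered = $strand < 0 ? @$exons : reverse @$exons;

    my @kept;
    for my $exon (@ordered) {
        last if $n <= 0;
        my ($start, $end) = @$exon;
        my $len = $end - $start + 1;
        if ($len <= $n) {
            push @kept, [$start, $end];           # whole exon fits
            $n -= $len;
        } elsif ($strand < 0) {
            push @kept, [$start, $start + $n - 1];  # trim at the 5' side
            $n = 0;
        } else {
            push @kept, [$end - $n + 1, $end];      # trim at the 5' side
            $n = 0;
        }
    }
    return sort { $a->[0] <=> $b->[0] } @kept;
}

my @exons = ([100, 199], [300, 349], [500, 529]);
my @spans = three_prime_spans(\@exons, 60, 1);
print join(',', map { "$_->[0]..$_->[1]" } @spans), "\n";  # 320..349,500..529
```

Each returned pair could then be turned into a Bio::Location::Simple and
collected into a Bio::Location::Split, much as the slice code quoted
below does with its spans.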

-jason
On Dec 10, 2005, at 2:06 AM, Cook, Malcolm wrote:

> Fellow Bioperlers,
>
> I needed to extract the 3'-most 1000 bp from multiple
> genomic CDS regions (designing 70mer u-array probes).
>
> I looked in vain for Bio::Location->splice($from,$to);
>
> So I wrote one, which works but suffers from actually materializing
> the list of integer indices into the sequence for every base.
>
> Has anyone a better approach they'd care to share?
>
> Malcolm Cook - mec at stowers-institute.org
> Stowers Institute for Medical Research - Kansas City, MO  USA
>
> P.S. Here's what I wrote:
>
> package Bio::LocationI;		# Code in the interface so it works
>                                 # with both ::Split and ::Simple
>                                 # Bio::Locations
>
> sub _intspans {
>   # Purpose: for a (presumably) monotonically increasing list of
>   # integers, return list of arrays each holding min and max of
>   # the list's internal contiguous spans.
>   #
>   # Example: 1..5,10..20,30 => ([1,5],[10,20],[30,30])
>   my @i = @_;
>   die "nothing passed to intspans" unless @i;
>   my @s = ([$i[0],shift(@i)]);
>   foreach (@i) {
>     if ($_ == 1 + $s[0][1]) {
>       $s[0][1] = $_;
>     } else {
>       unshift @s, [$_, $_]
>     }}
>   reverse @s;
> }
>
> sub slice {
>   # Purpose: compute a slice of the Location, using perl's normal slice
>   # semantics, except that it trims out-of-range values.
>   my ($self, $from, $to) = @_;
>   # Build a perl expression using the range (..) and list (,) operators.
>   my @int = eval (join ',', map {$_->start . '..' . $_->end} $self->each_Location);
>   @int = @int[$from..$to];
>   @int = grep {$_} @int;	# Remove undefs (in case $from/$to out of bounds).
>   my @intspans = _intspans(@int);
>   Bio::Location::Split->new(
>     -strand    => $self->strand,
>     -locations => [map { Bio::Location::Simple->new(-start  => $_->[0],
>                                                     -end    => $_->[1],
>                                                     -strand => $self->strand)
>                        } @intspans],
>   );
> }

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12



