[Bioperl-l] Clear range from Bio::Seq::Quality?

Dan Bolser dan.bolser at gmail.com
Fri Apr 24 12:20:23 EDT 2009


Its a bit rough and ready, but it does what I need...




=head2 get_clear_range

 Title    : get_clear_range

 Title    : subqual
 Usage    : $subobj = $obj->get_clear_range();
            $subobj = $obj->get_clear_range(20);
 Function : Get the clear range using the given quality score as a
            cutoff or a default value of 13.

 Returns  : a new Bio::Seq::Quality object
 Args     : a minimum quality value, optional, devault = 13

=cut

sub get_clear_range
{
    my $self = shift;
    my $qual = $self->qual;
    my $minQual = shift || 13;

    my (@ranges, $rangeFlag);

    for(my $i=0; $i<@$qual; $i++){
	## Are we currently within a clear range or not?
	if(defined($rangeFlag)){
	    ## Did we just leave the clear range?
	    if($qual->[$i]<$minQual){
		## Log the range
		push @ranges, [$rangeFlag, $i-1, $i-$rangeFlag];
		## and reset the range flag.
		$rangeFlag = undef;
	    }
	    ## else nothing changes
	}
	else{
	    ## Did we just enter a clear range?
	    if($qual->[$i]>=$minQual){
		## Better set the range flag!
		$rangeFlag = $i;
	    }
	    ## else nothing changes
	}
    }
    ## Did we exit the last clear range?
    if(defined($rangeFlag)){
	my $i = scalar(@$qual);
	## Log the range
	push @ranges, [$rangeFlag, $i-1, $i-$rangeFlag];
    }

    unless(@ranges){
	die "There is no clear range... I don't know what to do here!\n";
    }

    print "there are ", scalar(@ranges), " clear ranges\n";

    my $sum; map {$sum += $_->[2]} @ranges;

    print "of ", scalar(@$qual), " bases, there are $sum with ".
	"quality scores above the given threshold\n";

    for (sort {$b->[2] <=> $a->[2]} @ranges){
	if($_->[2]/$sum < 0.5){
	    warn "not so much a clear range as a clear chunk...\n";
	}
	print $_->[2], "\t", $_->[2]/$sum, "\n";
	
	return Bio::Seq::QualityDB->new( -seq => $self->subseq(  $_->[0]+1, $_->[1]+1),
					 -qual => $self->subqual($_->[0]+1, $_->[1]+1)
					 );
    }
}




Note, for testing I made a package called Bio/Seq/QualityDB.pm (which
is a copy of Bio/Seq/Quality.pm that just has the above method added).
That is why the 'new Bio::Seq::Quality object' is actually a
Bio::Seq::QualityDB object, but other than that it should slot right
in (apart from all the debugging output that I spit out).


Cheers,
Dan.


2009/4/24 Dan Bolser <dan.bolser at gmail.com>:
> Hi all,
>
> I couldn't find out how to get the 'clear range' from a
> Bio::Seq::Quality object... Am I looking in the wrong place, or should
> this method be a part of the Bio::Seq::Quality class?
>
> In the latter case I'm on my way to an implementation, but I am not
> good at navigating the bioperl docs, so I thought I should ask before
> I take the time to finish that off.
>
>
> Cheers,
> Dan.
>



More information about the Bioperl-l mailing list