From cjfields at dev.open-bio.org Tue May 1 13:43:38 2007
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 01 May 2007 17:43:38 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/SearchIO infernal.pm,1.5,1.6
Message-ID: <200705011743.l41HhcKR006069@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/SearchIO
In directory dev.open-bio.org:/tmp/cvs-serv6044
Modified Files:
infernal.pm
Log Message:
Update on v.0.81 output; will try to add parsing support soon.
Index: infernal.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SearchIO/infernal.pm,v
retrieving revision 1.5
retrieving revision 1.6
diff -C2 -d -r1.5 -r1.6
*** infernal.pm 30 Mar 2007 14:05:56 -0000 1.5
--- infernal.pm 1 May 2007 17:43:35 -0000 1.6
***************
*** 21,29 ****
This is a highly experimental SearchIO-based parser for Infernal output from
the cmsearch program. It currently parses cmsearch output for Infernal
! versions 0.7 and above; older versions may work but will not be supported.
As output format for cmsearch is continually changing, support for this parser
! will only be for the latest available Infernal version (v 0.72 at this time)
! or the latest stable version.
=head1 FEEDBACK
--- 21,30 ----
This is a highly experimental SearchIO-based parser for Infernal output from
the cmsearch program. It currently parses cmsearch output for Infernal
! versions 0.7-0.72; older versions may work but will not be supported.
As output format for cmsearch is continually changing, support for this parser
! will only be for the latest Infernal version (v 0.81 at this time) or the latest
! stable version. Output for v. 0.81 is currently not parsed due to significant
! changes in the output style; I am hoping to rectify this soon.
=head1 FEEDBACK
From cjfields at dev.open-bio.org Tue May 1 13:45:45 2007
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 01 May 2007 17:45:45 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/SearchIO erpin.pm, 1.4,
1.5 rnamotif.pm, 1.5, 1.6
Message-ID: <200705011745.l41HjjNt006165@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/SearchIO
In directory dev.open-bio.org:/tmp/cvs-serv6140
Modified Files:
erpin.pm rnamotif.pm
Log Message:
small bug fixes
Index: rnamotif.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SearchIO/rnamotif.pm,v
retrieving revision 1.5
retrieving revision 1.6
diff -C2 -d -r1.5 -r1.6
*** rnamotif.pm 26 Mar 2007 19:38:09 -0000 1.5
--- rnamotif.pm 1 May 2007 17:45:43 -0000 1.6
***************
*** 258,261 ****
--- 258,265 ----
if ($self->within_element('hit') && ($hitid ne $lastid)) {
+ $self->element(
+ {'Name' => 'Hit_score',
+ 'Data' => $lastscore}
+ ) if $lastscore;
$self->end_element({'Name' => 'Hit'});
$self->start_element({'Name' => 'Hit'});
***************
*** 284,288 ****
if (!defined($sprintf)) {
if ($score =~ m{[^0-9.-]+}gxms) {
! if ($hsp_min || $hsp_max ) {
$self->warn("HSP data likely contains custom score; ".
"ignoring min/maxscore");
--- 288,292 ----
if (!defined($sprintf)) {
if ($score =~ m{[^0-9.-]+}gxms) {
! if (defined $hsp_min || defined $hsp_max ) {
$self->warn("HSP data likely contains custom score; ".
"ignoring min/maxscore");
***************
*** 303,309 ****
# store best hit score based on the hsp min/maxscore only
! if ($hsp_min && $score > $hsp_min) {
$lastscore = $score if !$lastscore || $score > $lastscore;
! } elsif ($hsp_max && $score < $hsp_max) {
$lastscore = $score if !$lastscore || $score < $lastscore;
}
--- 307,313 ----
# store best hit score based on the hsp min/maxscore only
! if (defined $hsp_min && $score > $hsp_min) {
$lastscore = $score if !$lastscore || $score > $lastscore;
! } elsif (defined $hsp_max && $score < $hsp_max) {
$lastscore = $score if !$lastscore || $score < $lastscore;
}
***************
*** 466,470 ****
$self->throw("Must provide data hash ref") if !$data || !ref($data);
for my $nm (sort keys %{$data}) {
! next if !$data->{$nm} || $data->{$nm} =~ m{^\s*$}o;
if ( $MAPPING{$nm} ) {
if ( ref( $MAPPING{$nm} ) =~ /hash/i ) {
--- 470,474 ----
$self->throw("Must provide data hash ref") if !$data || !ref($data);
for my $nm (sort keys %{$data}) {
! next if $data->{$nm} && $data->{$nm} =~ m{^\s*$}o;
if ( $MAPPING{$nm} ) {
if ( ref( $MAPPING{$nm} ) =~ /hash/i ) {
Index: erpin.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SearchIO/erpin.pm,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -d -r1.4 -r1.5
*** erpin.pm 26 Mar 2007 19:38:09 -0000 1.4
--- erpin.pm 1 May 2007 17:45:43 -0000 1.5
***************
*** 436,440 ****
$self->throw("Must provide data hash ref") if !$data || !ref($data);
for my $nm (sort keys %{$data}) {
! next if $data->{$nm} =~ m{^\s*$}o;
if ( $MAPPING{$nm} ) {
if ( ref( $MAPPING{$nm} ) =~ /hash/i ) {
--- 436,440 ----
$self->throw("Must provide data hash ref") if !$data || !ref($data);
for my $nm (sort keys %{$data}) {
! next if $data->{$nm} && $data->{$nm} =~ m{^\s*$}o;
if ( $MAPPING{$nm} ) {
if ( ref( $MAPPING{$nm} ) =~ /hash/i ) {
From bugzilla-daemon at portal.open-bio.org Wed May 2 10:32:10 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 2 May 2007 10:32:10 -0400
Subject: [Bioperl-guts-l] [Bug 2286] New: spliced_seq throws exception on
Bio::DB::GFF::Feature object
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2286
Summary: spliced_seq throws exception on Bio::DB::GFF::Feature
object
Product: BioPerl
Version: 1.5 branch
Platform: PC
OS/Version: Linux
Status: NEW
Severity: major
Priority: P2
Component: Bio::DB::GFF
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: agathman at semo.edu
Pull a Bio::DB::GFF::Feature object with multiple parts (using an aggregator
such as transcript or processed_transcript) from a MySQL db.
Put the object into a variable like $gene.
Then try $foo=$gene->spliced_seq.
It throws an exception indicating that it's finding objects when it's looking
for the sequence strings to put together.
Sample script:
**************************************************************************
#!/usr/bin/perl -w
use strict;
use Bio::DB::GFF;
my $db = Bio::DB::GFF-> new ( - adaptor => 'dbi::mysql',
-dsn =>
'dbi:mysql:database=cc;host=localhost',
-fasta => '/gbrowse/databases/cc'
);
my $seg=$db->segment('ccin_Contig120');
my @genes=$seg->features(-types=>('processed_transcript:GLEAN_alt'));
for my $gene (@genes) {
my $gid = $gene->display_id;
print STDERR "Gene is $gid\n";
print STDERR $gene->seq()->seq() . "\n";
my $splgene = $gene->spliced_seq();
print STDERR $splgene->seq()->seq() . "\n";
}
*****************************************************************************
STDERR output from sample script (printed out gene ID and sequence just to show
that it is getting a feature from the db):
Gene is Jan06m400_GLEAN_11487
ATGAGCTTGACTGAAGAGGAGAATTGGGGTGCGTCACACAGTAGAGGCAAACCACCAATTGAGACGTCCAGAAAAGAACAAAAACATACCTCGAGTCGTGAACCATTCGATTCGAAGGCGAGAAATCGCCCCAAACCTGCCCCTAGCATTCCCATTGTAGTCGAAGACTCTGCCGACATTGAGAGTATTCGAATGAAGGGCATTGAACCTGTGGAGTTCGTCCACGAGTTCACCCGCACCTGGTTTCCGGATAAGCGGATTCGAGATGAGCTGAAACGGAAACACGTACCAGTGCAAAAAATCAATAAAGCGACACGAGTGTTTCGCAAAGTCGCTGAAAAACATATTGAGGACGTAACGAGGATGGGGAAACTCACAGAGGAGGACATCCCAAGGCTACGCGAGAACTGGATCGAGAACTGCAAGGCTATGATGGAGGGTCCGCCTCCCAAGTTACCTCCCCTTCGTGACGTCAACCATCGCATTCCCCTTAAAGACGAAAGGATGCAATACAATTACCACATGCCACGGTGTCCAGATTCGCTGAAACCGCAGCTACTGGAAAAGATCAACCGGTACACGGATGCCGGTTGGTGGCAGCAGAAACAAGTGGATCAAGCTGCTCCAATGCTCTGCATCCCCAAGAAGGACGGGAGATTACGAACCGTCGTAGACTGTAGGAAACGGAACCAGAACACAGTCAAAGATGTTACCCCCTTCCCGGATCAAGACCAGATTCGCTTAGAGGTTGCGCGGGCGAAATATCGTTCGAAAATCGACCTGTCAGACGCCTATGAGCAGGTTAGAGTCGAACCAGAGGACGTCTGGAAGACTGCATTTGCCACAGTGTACGGAACCTTCGTCAGTGCTGTGATGCAGCAGGGAGATTGCAACGCACCTGCGACGTTTCAACGACTCATGAACATGATATTCCGGGAGTACATTGGGATATTTATCCATGTGTATCTTGATGATGTCTTTGTGTACAGTGACTCGATCGAAGAGCATGAAGAGCACTTACGGATCGTGTTCCAGAAGCTGCAGGAGGCATGCCTCTACCTTAAGGCTGAGAAATGTGATCTCTACGCTGAAAAGGTTGATTGTTTAGGTCATATAATCGACGAGAAAGGTATACATGCCGACGCCGATAAGATGGCACGCATCCGAGATTGGAGGACACCAAGGAACTACAACGACATTGAAAGGTTCCTTGGTCTGGTGCAGTACCTGGCTCCTTTCCTCCCGGACATTGCTGCCTACACATCGCCATTGTCCGCCATTACGAAGAATGGTCAAGCATTCCGTTGGGACCCCATCCATGAGACCTGTTTCCAAAGAATCAAGCAGATCTGCTGCTCGACTCCAGTGCTTCGCCCAATCGACCCTCGTAAAACTGACGAGCCGATTTGGGTAATTTGCGACGCCTCAGTCAGTGGTGTTGGTGCCATGTATGGCCAAGGCCCCACCTGGAAAACATGCAGACCCGCTGGTTTCATGTCCAAGAAGTTCACGGACGCCCAGATGAACTACCGAGTCTTCGAGCAGGAGACCCTGGCAATACTCGAAGCCTTGTTAAAGTGGGAAGACAAGCTGATTGGATATCGTATCCATGTGGTAACGGACCACGAAGCTTTGAAGTTCTTCGACAGACAAAAGCGACTGTCAGCTCGTCAAACTCGATGGATGGAGTACCTATCTCGTTTTGACTTCGATATCCGATATGTCGAGGGAAAACTGAACAAAGTAGCCGATGCATTGTCCAGATACTTTGAAAATGACGTCTGGGACGATGTACACGATATCTCTGAGTACTCGAATGCCGACTACAGGCTCGACAAAGCCATGGATGATCTGCCACCCCAGCGGGTTGCGGAAATCGTCAATAACGACGTCGAAATTCGGGCAATGACGCTTGAAACGCCACGTCGATCGGCCAGGCTACAGAGGAGACTCCAAGAGAATGTTGAACTTCGAGATATTGAGGCCCAGAGACTTGCAGAAGCGAACCGACGCGATAAAGGCAAGGAGACCGAGCCACA!
AGCGCCAA
CAAAGTCTCAGGACAACGAGGAGGACGATCCTACCGTGTTCGAGTCGAGAACTCGAGGGAATAGCATTGGGGAAACCATTCCCAATGCAGACGAGTTCATAGAAGCGATCAGGGCATCCTATAAGACCGATGCACTGTTCAGCAAGGTCCTGGAGTCGCCCAAGATGCATACCCAGTTCGAAGTCGAAACAGGGCTGGTGTATACGCGAAATCGAGGTGGAGAGCGCGTACTGTGCGTCCCACATGGATCGTTTAACGACAAATCGCTCCGAGGAATTGTCCTAGAGCAAGCCCACGAGATTCTCGGTCATTTCGGACCACAACGAACGTCAGACTACGTCAGACGATGGTATTGGTGGCCTAGAATCTTTGTTGACACCCAGAAGTTCTGTAGGACGTGTCAGACTTGCATGATGTCGAAAGGAGAGAATAAACGACCCCAGGGACTGCTCCATACGTTACCCGTACCAACCAAGCCATGGCAATCGATTGGTATGGACTTCATCGGTCCATTCCCAGAGGTAAACGGCAAGAACTACCTCTGGGTGATAATTTGTCGCCTCACCTCAATGGTGCACCTAATTCCTGTCCACACGACGAACAAAGCGTCAGAATTATCCGAAATCTATGTCCGAGAGATAGTGAGACTCCATGGCTTACCAGAGTCGATTGTATCGGACAGAGACTCGAAGTTTACGTCGAAATGGTGGCGAGAGATCCATCGGTTATTGGGAACGAAACTCCTAATGTCTACAATCGTTCCACCCTCAGACAGACGGAGTCACAGAGCGTGTAAATCGAAGTATCACGCAGATCCTTCGAGGTGCAGTACAACCGGACCAGAAGGACTGGATCTCGCGTTGCCCTCTGGTAGAGTTTGCTATCAACTCGAGTGTTAG
-------------------- WARNING ---------------------
MSG: Calling spliced_seq with a Bio::Das::SegmentI which does have absolute set
to 1 -- be warned you may not be getting things on the correct strand
---------------------------------------------------
-------------------- WARNING ---------------------
MSG: seq doesn't validate, mismatch is
::,(0,882,510),::,(0,8826348),::,(0,882,564),::,(0,8813,44),::,(0,882,33,),::,(0,882,360),::,(0,8813,6,),::,(0,882633,),::,(0,8826474)
---------------------------------------------------
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Attempting to set the sequence to
[Bio::PrimarySeq=HASH(0x882c510)Bio::PrimarySeq=HASH(0x8826348)Bio::PrimarySeq=HASH(0x882c564)Bio::PrimarySeq=HASH(0x8813e44)Bio::PrimarySeq=HASH(0x882c33c)Bio::PrimarySeq=HASH(0x882c360)Bio::PrimarySeq=HASH(0x8813d6c)Bio::PrimarySeq=HASH(0x882633c)Bio::PrimarySeq=HASH(0x8826474)]
which does not look healthy
STACK: Error::throw
STACK: Bio::Root::Root::throw
/usr/lib/perl5/site_perl/5.8.6/Bio/Root/Root.pm:359
STACK: Bio::PrimarySeq::seq
/usr/lib/perl5/site_perl/5.8.6/Bio/PrimarySeq.pm:258
STACK: Bio::PrimarySeq::new
/usr/lib/perl5/site_perl/5.8.6/Bio/PrimarySeq.pm:210
STACK: Bio::Seq::new /usr/lib/perl5/site_perl/5.8.6/Bio/Seq.pm:484
STACK: Bio::SeqFeatureI::spliced_seq
/usr/lib/perl5/site_perl/5.8.6/Bio/SeqFeatureI.pm:498
STACK: /transfer/testsplice.pl:21
-----------------------------------------------------------
******************************************************************************
GFFs that went into the database to construct the feature
Jan06m400_GLEAN_11487, on which the script choked above:
##gff-version 3
ccin_Contig271 GLEAN_alt mRNA 158276 161560 0.999825 +
. ID=Gene:Jan06m300_GLEAN_11487;probFraction=0.999825
ccin_Contig271 GLEAN_alt CDS 158276 158377 0.999825 +
0 Parent=Gene:Jan06m300_GLEAN_11487
ccin_Contig271 GLEAN_alt CDS 158431 158450 0.999825 +
0 Parent=Gene:Jan06m300_GLEAN_11487
ccin_Contig271 GLEAN_alt CDS 158502 159075 0.999825 +
1 Parent=Gene:Jan06m300_GLEAN_11487
ccin_Contig271 GLEAN_alt CDS 159232 159582 0.999825 +
0 Parent=Gene:Jan06m300_GLEAN_11487
ccin_Contig271 GLEAN_alt CDS 159641 159861 0.999825 +
0 Parent=Gene:Jan06m300_GLEAN_11487
ccin_Contig271 GLEAN_alt CDS 159918 160077 0.999825 +
1 Parent=Gene:Jan06m300_GLEAN_11487
ccin_Contig271 GLEAN_alt CDS 160134 160251 0.999825 +
0 Parent=Gene:Jan06m300_GLEAN_11487
ccin_Contig271 GLEAN_alt CDS 160303 160526 0.999825 +
2 Parent=Gene:Jan06m300_GLEAN_11487
ccin_Contig271 GLEAN_alt CDS 160587 160747 0.999825 +
0 Parent=Gene:Jan06m300_GLEAN_11487
ccin_Contig271 GLEAN_alt CDS 160804 161406 0.999825 +
1 Parent=Gene:Jan06m300_GLEAN_11487
ccin_Contig271 GLEAN_alt CDS 161467 161560 0.999825 +
1 Parent=Gene:Jan06m300_GLEAN_11487
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Wed May 2 10:43:14 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 2 May 2007 10:43:14 -0400
Subject: [Bioperl-guts-l] [Bug 2286] spliced_seq throws exception on
Bio::DB::GFF::Feature object
In-Reply-To:
Message-ID: <200705021443.l42EhEWi014298@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2286
------- Comment #1 from cjfields at uiuc.edu 2007-05-02 10:43 EST -------
Not a big deal, but please remember to attach scripts and data using the
'Create a New Attachment' on the bug report page after generating the report,
otherwise the text wraps.
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From mcook at dev.open-bio.org Thu May 3 16:01:00 2007
From: mcook at dev.open-bio.org (Malcom Cook)
Date: Thu, 03 May 2007 20:01:00 +0000
Subject: [Bioperl-guts-l] bioperl-live/scripts/Bio-SeqFeature-Store
bp_seqfeature_gff3.PLS, NONE, 1.1
Message-ID: <200705032001.l43K10Cs010372@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/scripts/Bio-SeqFeature-Store
In directory dev.open-bio.org:/tmp/cvs-serv10346
Added Files:
bp_seqfeature_gff3.PLS
Log Message:
initial checkin
--- NEW FILE: bp_seqfeature_gff3.PLS ---
#!/usr/bin/env perl
# $Id $
# AUTHOR: malcolm.cook at stowers-institute.org
use strict;
use Getopt::Long;
use File::Spec;
use Bio::DB::SeqFeature::Store;
#use Carp::Always;
my $DSN;
my $ADAPTOR;
my $VERBOSE = 1;
my $USER = '';
my $PASS = '';
my @gff3opt;
GetOptions(
'dsn=s' => \$DSN,
'adaptor=s' => \$ADAPTOR,
'user=s' => \$USER,
'password=s' => \$PASS,
'gff3opt=i{,}' => \@gff3opt,
) || die <features().
END
$ADAPTOR ||= 'DBI::mysql';
$DSN ||= $ADAPTOR eq 'DBI::mysql' ? "mysql_read_default_file=$ENV{HOME}/.my.cnf" : '';
my $store = Bio::DB::SeqFeature::Store->new(
-dsn => $DSN,
-adaptor => $ADAPTOR,
-user => $USER,
-pass => $PASS,
)
or die "Couldn't create connection to the database";
# on signals, give objects a chance to call their DESTROY methods
$SIG{TERM} = $SIG{INT} = sub { undef $store; die "Aborted..."; };
my $seq_stream = $store->get_seq_stream(@ARGV) or die "failed to get_seq_stream(@ARGV)";
while (my $seq = $seq_stream->next_seq) {
print $seq->gff3_string(@gff3opt) . "\n";
}
exit 0;
From mcook at dev.open-bio.org Thu May 3 16:04:02 2007
From: mcook at dev.open-bio.org (Malcom Cook)
Date: Thu, 03 May 2007 20:04:02 +0000
Subject: [Bioperl-guts-l] bioperl-live/scripts/Bio-SeqFeature-Store
bp_seqfeature_gff3.PLS, 1.1, 1.2
Message-ID: <200705032004.l43K4229010451@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/scripts/Bio-SeqFeature-Store
In directory dev.open-bio.org:/tmp/cvs-serv10425
Modified Files:
bp_seqfeature_gff3.PLS
Log Message:
fixing up the RCS ID line
Index: bp_seqfeature_gff3.PLS
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/scripts/Bio-SeqFeature-Store/bp_seqfeature_gff3.PLS,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** bp_seqfeature_gff3.PLS 3 May 2007 20:00:58 -0000 1.1
--- bp_seqfeature_gff3.PLS 3 May 2007 20:04:00 -0000 1.2
***************
*** 1,5 ****
#!/usr/bin/env perl
! # $Id $
!
# AUTHOR: malcolm.cook at stowers-institute.org
--- 1,4 ----
#!/usr/bin/env perl
! # $Id$
# AUTHOR: malcolm.cook at stowers-institute.org
From mcook at dev.open-bio.org Thu May 3 17:14:15 2007
From: mcook at dev.open-bio.org (Malcom Cook)
Date: Thu, 03 May 2007 21:14:15 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/Graphics FeatureBase.pm, 1.29,
1.30
Message-ID: <200705032114.l43LEFBP010592@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/Graphics
In directory dev.open-bio.org:/tmp/cvs-serv10566
Modified Files:
FeatureBase.pm
Log Message:
redefine gff3_string to:
# Return GFF3 format for the feature $self. Optionally
# $recurse to include GFF for any subfeatures of the feature. If
# recursing, provide special handling to "remove an extraneous level
# of parentage" (unless $preserveHomegenousParent) for features
# which have at least one subfeature with the same type as the
# feature itself (thus redefining Lincoln's "homogenous
# parent/child" case, which previously required all children to have
# the same type as parent). This usage is a convention for
# representing discontiguous features; they may be created by using
# the -segment directive without specifying a distinct -subtype to
# Bio::Graphics::FeatureBase->new (or to Bio::DB::SeqFeature,
# Bio::Graphics::Feature). Such homogenous subfeatures created in
# this fashion TYPICALLY do not have the parent (GFF column 9)
# attributes propogated to them; but, since they are all part of the
# same parent, the ONLY difference relevant to GFF production SHOULD
# be the $start and $end coordinates for their segment, and ALL
# THIER OTHER ATTRIBUTES should be taken from the parent (including:
# score, Name, ID, Parent, etc), which happens UNLESS
# $dontPropogateParentAttrs is passed.
Index: FeatureBase.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/Graphics/FeatureBase.pm,v
retrieving revision 1.29
retrieving revision 1.30
diff -C2 -d -r1.29 -r1.30
*** FeatureBase.pm 16 Apr 2007 19:55:33 -0000 1.29
--- FeatureBase.pm 3 May 2007 21:14:13 -0000 1.30
***************
*** 546,592 ****
sub gff3_string {
! my $self = shift;
! my ($recurse,$parent) = @_;
!
! my $name = $self->name;
! my $class = $self->class;
! my $group = $self->format_attributes($parent);
! my $strand = ('-','.','+')[$self->strand+1];
! my $p = join("\t",
! $self->ref||'.',$self->source||'.',$self->method||'.',
! $self->start||'.',$self->stop||'.',
! defined($self->score) ? $self->score : '.',
! $strand||'.',
! defined($self->phase) ? $self->phase : '.',
! $group||'');
! # the "homogeneous" flag will be true if the parent and children are all of the same type,
! # meaning that they can be collapsed into a set of children with all the same ID
! my ($parent_type,$homogeneous);
! $homogeneous = 1;
! my @children;
! if ($recurse) {
! foreach ($self->sub_SeqFeature) {
! push @children,$_->gff3_string(1,$self);
! $parent_type ||= $self->type;
! $homogeneous &&= $_->type eq $parent_type && !defined $_->primary_id;
! }
! }
- # if we get here we're dealing with a homogeneous set of Parent[child,child...]
- # where parent and child all have the same type. In this case, we omit the Parent
- # and give the children the same ID. This removes an extraneous level of parentage.
! if (@children && $homogeneous) {
! foreach (@children) {
! s/Parent=/ID=/g;
! } # replace Parent tag with ID
! return join "\n", at children;
! }
! return join("\n",$p, at children);
}
-
sub db { return }
--- 546,608 ----
sub gff3_string {
! my ($self, $recurse, $preserveHomegenousParent, $dontPropogateParentAttrs,
! # Note: the following parameters, whose name begins with '$_',
! # are intended for recursive call only.
! $_parent,
! $_parentGroup, # if so, what is the group (GFF column 9) of the parent
! ) = @_;
! # PURPOSE: Return GFF3 format for the feature $self. Optionally
! # $recurse to include GFF for any subfeatures of the feature. If
! # recursing, provide special handling to "remove an extraneous level
! # of parentage" (unless $preserveHomegenousParent) for features
! # which have at least one subfeature with the same type as the
! # feature itself (thus redefining Lincoln's "homogenous
! # parent/child" case, which previously required all children to have
! # the same type as parent). This usage is a convention for
! # representing discontiguous features; they may be created by using
! # the -segment directive without specifying a distinct -subtype to
! # Bio::Graphics::FeatureBase->new (or to Bio::DB::SeqFeature,
! # Bio::Graphics::Feature). Such homogenous subfeatures created in
! # this fashion TYPICALLY do not have the parent (GFF column 9)
! # attributes propogated to them; but, since they are all part of the
! # same parent, the ONLY difference relevant to GFF production SHOULD
! # be the $start and $end coordinates for their segment, and ALL
! # THIER OTHER ATTRIBUTES should be taken from the parent (including:
! # score, Name, ID, Parent, etc), which happens UNLESS
! # $dontPropogateParentAttrs is passed.
! my @rsf = $recurse ? $self->sub_SeqFeature : ();
! my $recurseSubfeatureWithSameType =
! # will be TRUE if we're going to recurse and at least 1 subfeature
! # has same type as $self.
! sub {($_->type eq $self->type) && return 1 for @rsf ; 0 }->();
! my $typeIsSameAsParent = $_parent && ($_parent->type eq $self->type);
! my $hparentOrSelf = ($typeIsSameAsParent && ! $dontPropogateParentAttrs) ? $_parent : $self;
! my $group = ($typeIsSameAsParent && ! $dontPropogateParentAttrs) ? $_parentGroup : $self->format_attributes($_parent);
! my @gff3 = $recurseSubfeatureWithSameType && ! $preserveHomegenousParent ? () :
! do {
! my $name = $hparentOrSelf->name;
! my $class = $hparentOrSelf->class;
! my $strand = ('-','.','+')[$hparentOrSelf->strand+1];
! # TODO: understand conditions under which $self->strand could be other than
! # $hparentOrSelf->strand. In particular, why does add_segment flip
! # the strand when start > stop? I thought this was not allowed!
! # Lincoln - any ideas?
! my $p = join("\t",
! $hparentOrSelf->ref||'.',$hparentOrSelf->source||'.',$hparentOrSelf->method||'.',
! $self->start||'.',$self->stop||'.',
! defined($hparentOrSelf->score) ? $hparentOrSelf->score : '.',
! $strand||'.',
! defined($hparentOrSelf->phase) ? $hparentOrSelf->phase : '.',
! $group||'');
! $p;
! };
! join("\n", @gff3, map {$_->gff3_string($recurse,$preserveHomegenousParent,
! $dontPropogateParentAttrs,$hparentOrSelf,$group)} @rsf);
}
sub db { return }
***************
*** 657,662 ****
push @result,join '=',$self->escape($t),join(',', map {$self->escape($_)} @values) if @values;
}
! my $id = $self->primary_id;
! my $name = $self->display_name;
unshift @result,"ID=".$self->escape($id) if defined $id;
unshift @result,"Parent=".$self->escape($parent->primary_id) if defined $parent;
--- 673,678 ----
push @result,join '=',$self->escape($t),join(',', map {$self->escape($_)} @values) if @values;
}
! my $id = ($parent || $self)->primary_id;
! my $name = ($parent || $self)->display_name;
unshift @result,"ID=".$self->escape($id) if defined $id;
unshift @result,"Parent=".$self->escape($parent->primary_id) if defined $parent;
From bugzilla-daemon at portal.open-bio.org Thu May 3 17:47:34 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 3 May 2007 17:47:34 -0400
Subject: [Bioperl-guts-l] [Bug 2287] New: performance problem with MySQL
(SpeciesAdaptorDriver::get_classification)
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2287
Summary: performance problem with MySQL
(SpeciesAdaptorDriver::get_classification)
Product: BioPerl
Version: 1.5 branch
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: bioperl-db
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: samborsky_d at yahoo.com
Hi All,
I've found a serious performance problem with MySQL v.4.1.16
caused by BETWEEN operator in DB/BioSQL/mysql/SpeciesAdaptorDriver.pm
get_classification() method. When it's changed into a pair of '<=' & '>=',
the same search works much faster.
Actually it's an issue of MySQL server, but it can be easily avoided
by SELECT statement adaptation (look at suggested patch code in the attached
file).
Best wishes,
Dmitry Samborskiy
P.S. My MySQL version is:
$ mysql -V
mysql Ver 14.7 Distrib 4.1.16, for redhat-linux-gnu (i386) using readline 4.3
Problem: BETWEEN is much slower than >= AND <=
Below is MySql SELECT explanation log (look at rows=348651 vs. rows=245):
mysql> explain SELECT name.name, node.node_rank FROM taxon node, taxon taxon,
ta
xon_name name WHERE name.taxon_id = node.taxon_id AND taxon.left_value BETWEEN
n
ode.left_value AND node.right_value AND taxon.taxon_id = '426890' ORDER BY
node.
left_value;
+----+-------------+-------+-------+-------------------------+----------------+-
--------+--------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key |
key_len | ref | rows | Extra |
+----+-------------+-------+-------+-------------------------+----------------+-
--------+--------------------------+--------+-------------+
| 1 | SIMPLE | taxon | const | PRIMARY,left_value | PRIMARY |
4 | const | 1 | |
| 1 | SIMPLE | node | index | PRIMARY | left_value |
5 | NULL | 348651 | Using where |
| 1 | SIMPLE | name | ref | taxon_id,taxnametaxonid | taxnametaxonid |
4 | bio_db_new.node.taxon_id | 1 | Using index |
+----+-------------+-------+-------+-------------------------+----------------+-
--------+--------------------------+--------+-------------+
3 rows in set (0.00 sec)
mysql> explain SELECT name.name, node.node_rank FROM taxon node, taxon taxon,
ta
xon_name name WHERE name.taxon_id = node.taxon_id AND taxon.left_value >=
node.l
eft_value AND taxon.left_value <= node.right_value AND taxon.taxon_id =
'426890'
ORDER BY node.left_value;
+----+-------------+-------+-------+--------------------------------+-----------
-----+---------+--------------------------+------+----------------+
| id | select_type | table | type | possible_keys | key
| key_len | ref | rows | Extra |
+----+-------------+-------+-------+--------------------------------+-----------
-----+---------+--------------------------+------+----------------+
| 1 | SIMPLE | taxon | const | PRIMARY,left_value | PRIMARY
| 4 | const | 1 | Using filesort |
| 1 | SIMPLE | node | range | PRIMARY,left_value,right_value |
right_valu
e | 5 | NULL | 245 | Using where |
| 1 | SIMPLE | name | ref | taxon_id,taxnametaxonid |
taxnametax
onid | 4 | bio_db_new.node.taxon_id | 1 | Using index |
+----+-------------+-------+-------+--------------------------------+-----------
-----+---------+--------------------------+------+----------------+
3 rows in set (0.00 sec)
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Thu May 3 17:48:19 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 3 May 2007 17:48:19 -0400
Subject: [Bioperl-guts-l] [Bug 2287] performance problem with MySQL
(SpeciesAdaptorDriver::get_classification)
In-Reply-To:
Message-ID: <200705032148.l43LmJRO017382@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2287
------- Comment #1 from samborsky_d at yahoo.com 2007-05-03 17:48 EST -------
Created an attachment (id=644)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=644&action=view)
suggested patch
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Mon May 7 07:41:14 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 7 May 2007 07:41:14 -0400
Subject: [Bioperl-guts-l] [Bug 2288] New: Bio::SeqIO::swiss throws when
classification includes internal period
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2288
Summary: Bio::SeqIO::swiss throws when classification includes
internal period
Product: BioPerl
Version: main-trunk
Platform: PC
OS/Version: Windows
Status: NEW
Severity: normal
Priority: P2
Component: Bio::SeqIO
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: roy at colibase.bham.ac.uk
Bio::SeqIO throws with Uniprot entries such as Q8GBD3, where one of the
taxonomy nodes includes an internal period (in this case "Acetobacter subgen.
Acetobacter"). The regex to split the classification splits on [;\.], meaning
that the node is incorrectly split into "Acetobacter subgen" and "Acetobacter".
This results in the throw:
"The lineage 'Bacteria, Proteobacteria, Alphaproteobacteria, Rhodospirillales,
Acetobacteraceae, Acetobacter, Acetobacter subgen, Acetobacter, Acetobacter
aceti' had two non-consecutive nodes with the same name. Can't cope!"
The attached patch fixes the problem, but I'm not sure if it breaks anything
else (I assume there was a reason for the regex being as it is).
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Mon May 7 07:42:05 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 7 May 2007 07:42:05 -0400
Subject: [Bioperl-guts-l] [Bug 2288] Bio::SeqIO::swiss throws when
classification includes internal period
In-Reply-To:
Message-ID: <200705071142.l47Bg5Y9013749@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2288
------- Comment #1 from roy at colibase.bham.ac.uk 2007-05-07 07:42 EST -------
Created an attachment (id=645)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=645&action=view)
Possible patch
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From cjfields at dev.open-bio.org Tue May 8 19:00:23 2007
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 08 May 2007 23:00:23 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio ParameterBaseI.pm,NONE,1.1
Message-ID: <200705082300.l48N0N1q028334@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio
In directory dev.open-bio.org:/tmp/cvs-serv28309
Added Files:
ParameterBaseI.pm
Log Message:
New generic interface encapsulating parameters and helper methods
--- NEW FILE: ParameterBaseI.pm ---
# $Id: ParameterBaseI.pm,v 1.1 2007/05/08 23:00:21 cjfields Exp $
#
# BioPerl module for Bio::ParameterBaseI
#
# Cared for by Chris Fields
#
# Copyright Chris Fields
#
# You may distribute this module under the same terms as perl itself
#
# POD documentation - main docs before the code
=head1 NAME
Bio::ParameterBaseI - Simple interface class for any parameter-related data such
as IDs, database name, program arguments, and other odds and ends.
=head1 SYNOPSIS
# Bio::DB::MyParams implements Bio::ParameterBaseI
@params = (-db => 'protein',
-id => \@ids,
-retmax => 10);
$pobj->Bio::DB::MyDBParams->new();
# sets only parameters passed; results in a state change if any parameter
# passed is new or differs from previously set value
$pobj->set_params(@params);
# reset all parameters (sets to undef); results in a state change
$pobj->reset_params();
# resets parameters to those in %param (sets all others to undef); resets the
# object state to indicate change.
$pobj->reset_params(@params);
# direct get/set; results in a state change if any parameter passed is new or
# differs from previously set value
$pobj->db('nucleotide');
@ids = $pobj->id();
# retrieve list containing set defined parameters
%myparams = $pobj->get_parameters();
# checks whether the state of the object has changed (i.e. parameter has
# changed, so on)
if ($pobj->parameters_changed) {
# run new search
} else {
# return cached search
}
# available parameters
@params = $pobj->available_parameters();
# retrieve string (URI, query, etc); calling to* methods changes object state
# to indicate data hasn't changed (so future calls to parameters_changed()
# will return FALSE)
$query = $pobj->to_string(); # returns raw string
$uri = $pobj->to_uri(); # returns URI-based object
$uri = $pobj->to_my_data_struct(); # returns implemenation-specific data structure
...
=head1 DESCRIPTION
This is an Observable class interface which focuses on common parameter-related
tasks such as building simple database queries, URI-related requests, program
arguments, etc.
Implementing classes use the following ways to set parameters:
1) Create a new instance of a ParameterBaseI-implementing object.
$pobj->Bio::DB::MyParamClass->new(-db => 'local', -id => \@ids);
2) Pass the parameters as a hash or array to set_parameters(), which sets the
parameters listed in the hash but leaves all others as is.
$pobj->set_parameters(-retmax => 100, -retstart => 20);
3) Pass the parameters as a hash or array to reset_parameters(), which sets the
parameters listed in the hash and resets everything else.
$pobj->reset_parameters(-term => 'pyrimidine'); # sets db and id to undef
4) Pass values using specific getter/setters.
$pobj->id(\@ids); # sets IDs
There is no restriction on what one uses to set up individual parameter
getter/setters, though there are some other options implemented in BioPerl (for
instance, Bio::Root::RootI::_set_from_args()).
A key requirement is there be a way to detect changes in the state of the
ParameterBaseI object so that any Observer object can decide whether to submit a
new request or return cached data. State changes are revealed by the returned
values of the parameters_changed() method, which is a simple boolean set to TRUE
when the object is first instantiated or parameters have changed.
When retrieving anything using the implementation-specific to_* methods (such as
to_query, to_string, to_uri, etc), the ParameterBaseI object state is set to
FALSE to indicate the data has been accessed and indicate reaccessing will
retrieve the same value. The observing object can then independently decide
whether to rerun the cached query or return a previously cached result.
One can also use indiviual getter/setters to retrieve single parameter values as
well as use parameter_hash() to retrieve all of the parameters in one go as a
hash. To check which parameters are available use available_parameters().
=head1 FEEDBACK
=head2 Mailing Lists
User feedback is an integral part of the
evolution of this and other Bioperl modules. Send
your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation
is much appreciated.
bioperl-l at lists.open-bio.org - General discussion
http://www.bioperl.org/wiki/Mailing_lists - About the mailing lists
=head2 Reporting Bugs
Report bugs to the Bioperl bug tracking system to
help us keep track the bugs and their resolution.
Bug reports can be submitted via the web.
http://bugzilla.open-bio.org/
=head1 AUTHOR
Email cjfields at uiuc dot edu
=head1 APPENDIX
The rest of the documentation details each of the
object methods. Internal methods are usually
preceded with a _
=cut
# Let the code begin...
package Bio::ParameterBaseI;
use strict;
use warnings;
use base qw(Bio::Root::RootI);
=head2
Title : set_parameters
Usage : $pobj->set_parameters(%params);
Function: sets the parameters listed in the hash or array
Returns : None
Args : [optional] hash or array of parameter/values.
=cut
sub set_parameters {
shift->throw_not_implemented;
}
=head2
Title : reset_parameters
Usage : resets values
Function: resets parameters to either undef or value in passed hash
Returns : none
Args : [optional] hash of parameter-value pairs
=cut
sub reset_parameters {
shift->throw_not_implemented;
}
=head2
Title : parameters_changed
Usage : if ($pobj->parameters_changed) {...}
Function: Returns boolean true (1) if parameters have changed
Returns : Boolean (0 or 1)
Args : [optional] Boolean
=cut
sub parameters_changed {
shift->throw_not_implemented;
}
=head2
Title : available_parameters
Usage : @params = $pobj->available_parameters()
Function: Returns a list of the available parameters
Returns : Array of parameters
Args : [optional, implementation-dependent] string for returning subset of
parameters
=cut
sub available_parameters {
shift->throw_not_implemented;
}
=head2
Title : get_parameters
Usage : %params = $pobj->get_parameters;
Function: Returns list of key-value pairs of parameter => value
Returns : List of key-value pairs
Args : [optional] A string is allowed if subsets are wanted or (if a
parameter subset is default) 'all' to return all parameters
=cut
sub get_parameters {
shift->throw_not_implemented;
}
=head1 to* methods
All to_* methods are implementation-specific
=cut
1;
From cjfields at dev.open-bio.org Tue May 8 19:01:52 2007
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 08 May 2007 23:01:52 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/SearchIO
IteratedSearchResultEventBuilder.pm, 1.12, 1.13
Message-ID: <200705082301.l48N1qh0028432@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/SearchIO
In directory dev.open-bio.org:/tmp/cvs-serv28407
Modified Files:
IteratedSearchResultEventBuilder.pm
Log Message:
small comment for tracking bug 1986
Index: IteratedSearchResultEventBuilder.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SearchIO/IteratedSearchResultEventBuilder.pm,v
retrieving revision 1.12
retrieving revision 1.13
diff -C2 -d -r1.12 -r1.13
*** IteratedSearchResultEventBuilder.pm 26 Sep 2006 22:03:13 -0000 1.12
--- IteratedSearchResultEventBuilder.pm 8 May 2007 23:01:50 -0000 1.13
***************
*** 304,307 ****
--- 304,308 ----
}
+ # This is the problem leading to Bug 1986...
# Title : _store_hit (private function for internal use only)
From bugzilla-daemon at portal.open-bio.org Fri May 11 14:21:29 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 11 May 2007 14:21:29 -0400
Subject: [Bioperl-guts-l] [Bug 2203] Signalp extension or new module
In-Reply-To:
Message-ID: <200705111821.l4BILTqi011009@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2203
tuco at pasteur.fr changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #591 is|0 |1
obsolete| |
Attachment #592 is|0 |1
obsolete| |
Attachment #595 is|0 |1
obsolete| |
------- Comment #17 from tuco at pasteur.fr 2007-05-11 14:21 EST -------
Created an attachment (id=649)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=649&action=view)
Last implementation of Bio::Tools::Signalp::ExtendedSiganlp
This new module do:
- parse summary(hmm and/or nn) output
- parse short (hmm and/or nn) output
- implements the Bio::SequenceAnalysisResultI by inheriting from
Bio::Tools::AnalysisResult (Chris)
- keeps default behavior from Bio::Tools::Signalp, to know, with nn
(unfortunately) only, returns features that have max Y and mean S factors set
to YES unless -factors options is used and overwrite factors.
I joined a test script and some data with it.
Regards
Emmanuel
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Fri May 11 14:22:29 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 11 May 2007 14:22:29 -0400
Subject: [Bioperl-guts-l] [Bug 2203] Signalp extension or new module
In-Reply-To:
Message-ID: <200705111822.l4BIMTlu011126@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2203
------- Comment #18 from tuco at pasteur.fr 2007-05-11 14:22 EST -------
Created an attachment (id=650)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=650&action=view)
Test script
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Fri May 11 14:24:28 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 11 May 2007 14:24:28 -0400
Subject: [Bioperl-guts-l] [Bug 2203] Signalp extension or new module
In-Reply-To:
Message-ID: <200705111824.l4BIOSMD011208@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2203
------- Comment #19 from tuco at pasteur.fr 2007-05-11 14:24 EST -------
Created an attachment (id=651)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=651&action=view)
Data files (tgz)
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From dave_messina at dev.open-bio.org Sat May 12 00:20:29 2007
From: dave_messina at dev.open-bio.org (Dave Messina)
Date: Sat, 12 May 2007 04:20:29 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/Search/Result CrossMatchResult.pm,
NONE, 1.1
Message-ID: <200705120420.l4C4KTsb002719@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/Search/Result
In directory dev.open-bio.org:/tmp/cvs-serv2687/Bio/Search/Result
Added Files:
CrossMatchResult.pm
Log Message:
NEW: initial import of Bio::Search::Result::CrossMatchResult and
Bio::SearchIO::cross_match. Many thanks to Shin Leong for these.
--- NEW FILE: CrossMatchResult.pm ---
package Bio::Search::Result::CrossMatchResult;
# $Id: CrossMatchResult.pm,v 1.1 2007/05/12 04:20:26 dave_messina Exp $
#
# BioPerl module for Bio::Search::Result::CrossMatchResult
#
# Cared for by Shin Leong
#
# Copyright Shin Leong
#
# You may distribute this module under the same terms as perl itself
# POD documentation - main docs before the code
=head1 NAME
Bio::Search::Result::CrossMatchResult - CrossMatch-specific subclass of Bio::Search::Result::GenericResult
=head1 SYNOPSIS
# Working with iterations (CrossMatch results)
$result->next_iteration();
$result->num_iterations();
$result->iteration();
$result->iterations();
# See Bio::Search::Result::GenericResult for information about working with Results.
# See L
# for details about working with iterations.
# TODO:
# * Show how to configure a SearchIO stream so that it generates
# CrossMatchResult objects.
=head1 DESCRIPTION
This object is a subclass of Bio::Search::Result::GenericResult
and provides some operations that facilitate working with CrossMatch
and CrossMatch results.
For general information about working with Results, see
Bio::Search::Result::GenericResult.
=head1 FEEDBACK
=head2 Mailing Lists
User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to
the Bioperl mailing list. Your participation is much appreciated.
bioperl-l at bioperl.org - General discussion
http://bioperl.org/MailList.shtml - About the mailing lists
=head2 Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via
email or the web:
bioperl-bugs at bioperl.org
http://bugzilla.bioperl.org/
=head1 AUTHOR - Shin Leong
Email sleong at watson.wustl.edu
=head1 CONTRIBUTORS
Additional contributors names and emails here
=head1 APPENDIX
The rest of the documentation details each of the object methods.
Internal methods are usually preceded with a _
=cut
# Let the code begin...
package Bio::Search::Result::CrossMatchResult;
use vars qw(@ISA);
use strict;
use Bio::Search::Result::GenericResult;
@ISA = qw( Bio::Search::Result::GenericResult );
=head2 new
Title : new
Usage : my $obj = new Bio::Search::Result::CrossMatchResult();
Function: Builds a new Bio::Search::Result::CrossMatchResult object
Returns : Bio::Search::Result::CrossMatchResult
Args : See Bio::Search::Result::GenericResult();
The following parameters are specific to CrossMatchResult:
-iterations => array ref of Bio::Search::Iteration::IterationI objects
-inclusion_threshold => e-value threshold for inclusion in the
CrossMatch score matrix model (blastpgp)
=cut
sub new {
my($class, at args) = @_;
my $self = $class->SUPER::new(@args);
$self->{'_iterations'} = [];
$self->{'_iteration_index'} = 0;
$self->{'_iteration_count'} = 0;
my( $iters, $ithresh ) = $self->_rearrange([qw(ITERATIONS
INCLUSION_THRESHOLD)], at args);
$self->{'_inclusion_threshold'} = $ithresh; # This is a read-only variable
if( defined $iters ) {
$self->throw("Must define arrayref of Iterations when initializing a $class\n") unless ref($iters) =~ /array/i;
foreach my $i ( @{$iters} ) {
$self->add_iteration($i);
}
}
else {
# This shouldn't get called with the new SearchIO::blast.
#print STDERR "CrossMatchResult::new(): Not adding iterations.\n";
$self->{'_no_iterations'} = 1;
}
#$self->SUPER::algorithm('cross_match');
return $self;
}
=head2 hits
This method overrides L to take
into account the possibility of multiple iterations, as occurs in CrossMatch reports.
If there are multiple iterations, all 'new' hits for all iterations are returned.
These are the hits that did not occur in a previous iteration.
See Also: L
=cut
sub hits {
my ($self) = shift;
if ($self->{'_no_iterations'}) {
return $self->SUPER::hits;
}
my @hits = ();
foreach my $it ($self->iterations) {
push @hits, $it->hits;
}
return @hits;
}
=head2 next_hit
This method overrides L to take
into account the possibility of multiple iterations, as occurs in CrossMatch reports.
If there are multiple iterations, calling next_hit() traverses the
all of the hits, old and new, for each iteration, calling next_hit() on each iteration.
See Also: L
=cut
sub next_hit {
my ($self, at args) = @_;
if ($self->{'_no_iterations'}) {
return $self->SUPER::next_hit(@args);
}
my $iter_index;
if (not defined $self->{'_last_hit'}) {
$iter_index = $self->{'_iter_index'} = $self->_next_iteration_index;
} else {
$iter_index = $self->{'_iter_index'};
}
return undef if $iter_index >= scalar @{$self->{'_iterations'}};
my $it = $self->{'_iterations'}->[$iter_index];
my $hit = $self->{'_last_hit'} = $it->next_hit;
return defined($hit) ? $hit : $self->next_hit;
}
=head2 num_hits
This method overrides L to take
into account the possibility of multiple iterations, as occurs in CrossMatch reports.
If there are multiple iterations, calling num_hits() returns the number of
'new' hits for each iteration. These are the hits that did not occur
in a previous iteration.
See Also: L
=cut
sub num_hits{
my ($self) = shift;
if ($self->{'_no_iterations'}) {
return $self->SUPER::num_hits;
}
if (not defined $self->{'_iterations'}) {
$self->throw("Can't get Hits: data not collected.");
}
return scalar( $self->hits );
}
=head2 add_iteration
Title : add_iteration
Usage : $report->add_iteration($iteration)
Function: Adds a IterationI to the stored list of iterations
Returns : Number of IterationI currently stored
Args : Bio::Search::Iteration::IterationI
=cut
sub add_iteration {
my ($self,$i) = @_;
if( $i->isa('Bio::Search::Iteration::IterationI') ) {
push @{$self->{'_iterations'}}, $i;
$self->{'_iteration_count'}++;
} else {
$self->throw("Passed in a " .ref($i).
" as a Iteration which is not a Bio::Search::IterationI.");
}
return scalar @{$self->{'_iterations'}};
}
=head2 next_iteration
Title : next_iteration
Usage : while( $it = $result->next_iteration()) { ... }
Function: Returns the next Iteration object, representing all hits
found within a given CrossMatch iteration.
Returns : a Bio::Search::Iteration::IterationI object or undef if there are no more.
Args : none
=cut
sub next_iteration {
my ($self) = @_;
unless($self->{'_iter_queue_started'}) {
$self->{'_iter_queue'} = [$self->iterations()];
$self->{'_iter_queue_started'} = 1;
}
return shift @{$self->{'_iter_queue'}};
}
=head2 iteration
Usage : $iteration = $blast->iteration( $number );
Purpose : Get an IterationI object for the specified iteration
in the search result (CrossMatch).
Returns : Bio::Search::Iteration::IterationI object
Throws : Bio::Root::NoSuchThing exception if $number is not within
range of the number of iterations in this report.
Argument : integer (optional, if not specified get the last iteration)
First iteration = 1
=cut
sub iteration {
my ($self,$num) = @_;
$num = scalar @{$self->{'_iterations'}} unless defined $num;
unless ($num >= 1 and $num <= scalar $self->{'_iteration_count'}) {
$self->throw(-class=>'Bio::Root::NoSuchThing',
-text=>"No such iteration number: $num. Valid range=1-$self->{'_iteration_count'}",
-value=>$num);
}
return $self->{'_iterations'}->[$num-1];
}
=head2 num_iterations
Usage : $num_iterations = $blast->num_iterations;
Purpose : Get the number of iterations in the search result (CrossMatch).
Returns : Total number of iterations in the report
Argument : none (read-only)
=cut
sub num_iterations { shift->{'_iteration_count'} }
# Methods provided for consistency with BPpsilite.pm
=head2 number_of_iterations
Same as L.
=cut
sub number_of_iterations { shift->num_iterations }
=head2 round
Same as L.
=cut
sub round { shift->iteration(@_) }
=head2 iterations
Title : iterations
Usage : my @iterations = $result->iterations
Function: Returns the IterationI objects contained within this Result
Returns : Array of L objects
Args : none
=cut
sub iterations {
my $self = shift;
my @its = ();
if( ref($self->{'_iterations'}) =~ /ARRAY/i ) {
@its = @{$self->{'_iterations'}};
}
return @its;
}
=head2 no_hits_found
Usage : $nohits = $blast->no_hits_found( $iteration_number );
Purpose : Get boolean indicator indicating whether or not any hits
were present in the report.
This is NOT the same as determining the number of hits via
the hits() method, which will return zero hits if there were no
hits in the report or if all hits were filtered out during the parse.
Thus, this method can be used to distinguish these possibilities
for hitless reports generated when filtering.
Returns : Boolean
Argument : (optional) integer indicating the iteration number (CrossMatch)
If iteration number is not specified and this is a CrossMatch result,
then this method will return true only if all iterations had
no hits found.
=cut
sub no_hits_found {
my ($self, $round) = @_;
my $result = 0; # final return value of this method.
# Watch the double negative!
# result = 0 means "yes hits were found"
# result = 1 means "no hits were found" (for the indicated iteration or all iterations)
# If a iteration was not specified and there were multiple iterations,
# this method should return true only if all iterations had no hits found.
if( not defined $round ) {
if( $self->{'_iterations'} > 1) {
$result = 1;
foreach my $i( 1..$self->{'_iterations'} ) {
if( not defined $self->{"_iteration_$i"}->{'_no_hits_found'} ) {
$result = 0;
last;
}
}
}
else {
$result = $self->{"_iteration_1"}->{'_no_hits_found'};
}
}
else {
$result = $self->{"_iteration_$round"}->{'_no_hits_found'};
}
return $result;
}
=head2 set_no_hits_found
Usage : $blast->set_no_hits_found( $iteration_number );
Purpose : Set boolean indicator indicating whether or not any hits
were present in the report.
Returns : n/a
Argument : (optional) integer indicating the iteration number (CrossMatch)
=cut
sub set_no_hits_found {
my ($self, $round) = @_;
$round ||= 1;
$self->{"_iteration_$round"}->{'_no_hits_found'} = 1;
}
=head2 _next_iteration_index
Title : _next_iteration_index
Usage : private
=cut
sub _next_iteration_index{
my ($self, at args) = @_;
return $self->{'_iteration_index'}++;
}
=head2 rewind
Title : rewind
Usage : $result->rewind;
Function: Allow one to reset the Iteration iterator to the beginning
Since this is an in-memory implementation
Returns : none
Args : none
=cut
sub rewind {
my $self = shift;
$self->SUPER::rewind(@_);
$self->{'_iteration_index'} = 0;
foreach ($self->iterations) {
$_->rewind;
}
}
=head2 inclusion_threshold
Title : inclusion_threshold
Usage : my $incl_thresh = $result->inclusion_threshold; (read-only)
Function: Gets the e-value threshold for inclusion in the CrossMatch
score matrix model (blastpgp) that was used for generating the report
being parsed.
Returns : number (real) or undef if not a CrossMatch report.
Args : none
=cut
sub inclusion_threshold {
my $self = shift;
return $self->{'_inclusion_threshold'};
}
sub algorithm_old {
my $self = shift;
my $value = shift;
if($value) {
print STDERR "Cannot set the algorightm on this class!\n";
return $self->SUPER::algorithm;
} else {
return $self->SUPER::algorithm;
}
}
1;
#$Header: /home/repository/bioperl/bioperl-live/Bio/Search/Result/CrossMatchResult.pm,v 1.1 2007/05/12 04:20:26 dave_messina Exp $
From dave_messina at dev.open-bio.org Sat May 12 00:20:29 2007
From: dave_messina at dev.open-bio.org (Dave Messina)
Date: Sat, 12 May 2007 04:20:29 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/SearchIO cross_match.pm, NONE, 1.1
Message-ID: <200705120420.l4C4KTpq002722@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/SearchIO
In directory dev.open-bio.org:/tmp/cvs-serv2687/Bio/SearchIO
Added Files:
cross_match.pm
Log Message:
NEW: initial import of Bio::Search::Result::CrossMatchResult and
Bio::SearchIO::cross_match. Many thanks to Shin Leong for these.
--- NEW FILE: cross_match.pm ---
# $Id: cross_match.pm,v 1.1 2007/05/12 04:20:27 dave_messina Exp $
#
# BioPerl module for Bio::SearchIO::cross_match
#
# Cared for by Shin Leong
#
# Copyright Shin Leong
#
# You may distribute this module under the same terms as perl itself
# POD documentation - main docs before the code
=head1 NAME
Bio::SearchIO::cross_match - CrossMatch-specific subclass of Bio::SearchIO
=head1 SYNOPSIS
# Working with iterations (CrossMatch results)
my $searchIO = new Bio::SearchIO( -format => 'cross_match',
-file => "$file.screen.out" )
while(my $r = $searchIO->next_result) {
while(my $hit = $r->next_hit) {
while(my $hsp = $hit->next_hsp) {
#Do the processing here.
}
}
}
# See Bio::SearchIO for information about working with Results.
# See L
# for details about working with Bio::SearchIO.
=head1 DESCRIPTION
This object is a subclass of Bio::SearchIO
and provides some operations that facilitate working with CrossMatch
and CrossMatch results.
For general information about working with Results, see
L.
=head1 FEEDBACK
=head2 Mailing Lists
User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to
the Bioperl mailing list. Your participation is much appreciated.
bioperl-l at bioperl.org - General discussion
http://bioperl.org/MailList.shtml - About the mailing lists
=head2 Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via
email or the web:
bioperl-bugs at bioperl.org
http://bugzilla.bioperl.org/
=head1 AUTHOR - Shin Leong
Email sleong at watson.wustl.edu
=head1 CONTRIBUTORS
Additional contributors names and emails here
=head1 APPENDIX
The rest of the documentation details each of the object methods.
Internal methods are usually preceded with a _
=cut
# Let the code begin...
package Bio::SearchIO::cross_match;
use Bio::Search::Result::CrossMatchResult;
use Bio::SearchIO;
use Bio::Search::Hit::GenericHit;
use Bio::Search::HSP::GenericHSP;
our @ISA = qw(Bio::SearchIO);
=head2 next_result
Title : next_result
Usage : $result = stream->next_result
Function: Reads the next ResultI object from the stream and returns it.
Certain driver modules may encounter entries in the stream that
are either misformatted or that use syntax not yet understood
by the driver. If such an incident is recoverable, e.g., by
dismissing a feature of a feature table or some other non-mandatory
part of an entry, the driver will issue a warning. In the case
of a non-recoverable situation an exception will be thrown.
Do not assume that you can resume parsing the same stream after
catching the exception. Note that you can always turn recoverable
errors into exceptions by calling $stream->verbose(2) (see
Bio::Root::RootI POD page).
Returns : A Bio::Search::Result::ResultI object
Args : n/a
See L
=cut
sub next_result {
my ($self) = @_;
my $start = 0;
while( defined ($_ = $self->_readline )) {
return if($self->{'_end_document'});
if(/^cross_match version\s+(.*?)$/) {
$self->{_algorithm_version} = $1;
} elsif(/^Maximal single base matches/) {
$start = 1;
} elsif(/^(\d+) matching entries/) {
$self->{'_end_document'} = 1;
return;
} elsif(($start || $self->{'_result_count'}) && /^ (\d+)/) {
$self->{'_result_count'} ++;
return $self->_parse($_);
} elsif(! $self->{_parameters}) {
if(/.*?\s+(\-.*?)$/) {
my $p = $1;
my @pp = split /\s+/, $p;
for(my $i = 0; $i < @pp; $i ++) {
if($pp[$i] =~ /^\-/) {
if($pp[$i + 1] && $pp[$i + 1] !~ /^\-/) {
$self->{_parameters}->{$pp[$i]} = $pp[$i + 1];
$i ++;
} else {
$self->{_parameters}->{$pp[$i]} = "";
}
}
}
}
} elsif(/^Query file(s):\s+(.*?)$/) {
$self->{_query_name} = $1;
} elsif(/^Subject file(s):\s+(.*?)$/) {
$self->{_subject_name} = $2;
}
}
}
=head2 _alignment
Title : _alignment
Usage : private
=cut
sub _alignment {
my $self = shift;
# C H_EO-aaa01PCR02 243 CCTCTGAATGGCTGAAGACCCCTCTGCCGAGGGAGGTTGGGGATTGTGGG 194
#
# 0284119_008.c1- 1 CCTCTGAATGGCTGAAGACCCCTCTGCCGAGGGAGGTTGGGGATTGTGGG 50
#
# C H_EO-aaa01PCR02 193 ACAAGGTCCCTTGGTGCTGATGGCCTGAAGGGGCCTGAGCTGTGGGCAGA 144
#
# 0284119_008.c1- 51 ACAAGGTCCCTTGGTGCTGATGGCCTGAAGGGGCCTGAGCTGTGGGCAGA 100
#
# C H_EO-aaa01PCR02 143 TGCAGTTTTCTGTGGGCTTGGGGAACCTCTCACGTTGCTGTGTCCTGGTG 94
#
# 0284119_008.c1- 101 TGCAGTTTTCTGTGGGCTTGGGGAACCTCTCACGTTGCTGTGTCCTGGTG 150
#
# C H_EO-aaa01PCR02 93 AGCAGCCCGACCAATAAACCTGCTTTTCTAAAAGGATCTGTGTTTGATTG 44
#
# 0284119_008.c1- 151 AGCAGCCCGACCAATAAACCTGCTTTTCTAAAAGGATCTGTGTTTGATTG 200
#
# C H_EO-aaa01PCR02 43 TATTCTCTGAAGGCAGTTACATAGGGTTACAGAGG 9
#
# 0284119_008.c1- 201 TATTCTCTGAAGGCAGTTACATAGGGTTACAGAGG 235
#LSF: Should be the blank line. Otherwise error.
my $blank = $self->_readline;
unless($blank =~ /^\s*$/) {
return;
}
my @data;
my @pad;
$count = 0;
while( defined ($_ = $self->_readline )) {
$count = 0 if($count >= 3);
next if(/^$/);
if(/^(C \S+.*?\d+ )(\S+) \d+$|^( \S+.*?\d+ )(\S+) \d+$$|^\s+$/) {
$count ++;
if($1 || $3) {
$pad[$count] = $1 ? $1 : $3;
push @{$data[$count]}, ($2 ? $2 : $4);
} else {
if(/\s{$pad[0],$pad[0]}(.*?)$/) {
push @{$data[$count]}, $1;
} else {
$self->throw("Format error for the homology line [$_].");
}
}
} else {
last;
}
}
return @data;
}
=head2 _parse
Title : _parse
Usage : private
=cut
sub _parse {
my $self = shift;
my $line = shift;
my $is_alignment = 0;
my($hit_seq, $homology_seq, $query_seq);
# 32 5.13 0.00 0.00 H_DO-0065PCR0005792_034a.b1-1 327 365 (165) C 1111547847_forward (0) 39 1
#OR
#ALIGNMENT 32 5.13 0.00 0.00 H_DO-0065PCR0005792_034a.b1-1 327 365 (165) C 1111547847_forward (0) 39 1
$line =~ s/^\s+|\s+$//g;
my @r = split /\s+/, $line;
if($r[0] eq "ALIGNMENT") {
$is_alignment = 1;
shift @r;
($hit_seq, $homology_seq, $query_seq) = $self->_alignment();
}
my $subject_seq_id;
my $query_seq_id = $r[4];
my $query_start = $r[5];
my $query_end = $r[6];
my $is_complement = 0;
my $subject_start;
my $subject_end;
if($r[8] eq "C" && $r[9] !~ /^\(\d+\)$/) {
$subject_seq_id = $r[9];
$is_complement = 1;
$subject_start = $r[11];
$subject_end = $r[12];
} else {
$subject_seq_id = $r[8];
$subject_start = $r[9];
$subject_end = $r[10];
}
my $hit = new Bio::Search::Hit::GenericHit(-name => $subject_seq_id,
-hsps => [new Bio::Search::HSP::GenericHSP(-query_name => $query_seq_id,
-query_start => $query_start,
-query_end => $query_end,
-hit_name => $subject_seq_id,
-hit_start => $subject_start,
-hit_end => $subject_end,
-query_length => 0,
-hit_length => 0,
-identical => $r[0],
-conserved => $r[0],
-query_seq => $query_seq ? (join "", @$query_seq) : "", #query sequence portion of the HSP
-hit_seq => $hit_seq ? (join "", @$hit_seq) : "", #hit sequence portion of the HSP
-homology_seq=> $homology_seq ? (join "", @$homology_seq) : "", #homology sequence for the HSP
#LSF: Need the direction, just to fool the GenericHSP module.
-algorithm => 'SW',)],
);
my $result = new Bio::Search::Result::CrossMatchResult( -query_name => $self->{_query_name},
-query_accession => '',
-query_description => '',
-query_length => 0,
-database_name => $self->{_subject_name},
-database_letters => 0,
-database_entries => 0,
-parameters => $self->{_parameters},
-statistics => { },
-algorithm => 'cross_match',
-algorithm_version => $self->{_algorithm_version},
);
$result->add_hit($hit);
return $result;
}
=head2 result_count
Title : result_count
Usage : $num = $stream->result_count;
Function: Gets the number of CrossMatch results that have been parsed.
Returns : integer
Args : none
Throws : none
=cut
sub result_count {
my $self = shift;
return $self->{'_result_count'};
}
1;
#$Header: /home/repository/bioperl/bioperl-live/Bio/SearchIO/cross_match.pm,v 1.1 2007/05/12 04:20:27 dave_messina Exp $
From cjfields at dev.open-bio.org Sat May 12 08:05:02 2007
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Sat, 12 May 2007 12:05:02 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/DB EUtilParameters.pm,NONE,1.1
Message-ID: <200705121205.l4CC52Ug003619@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/DB
In directory dev.open-bio.org:/tmp/cvs-serv3594
Added Files:
EUtilParameters.pm
Log Message:
initial commit; part of EUtilities overhaul
--- NEW FILE: EUtilParameters.pm ---
# $Id: EUtilParameters.pm,v 1.1 2007/05/12 12:04:59 cjfields Exp $
#
# BioPerl module for Bio::DB::EUtilParameters
#
# Cared for by Chris Fields
#
# Copyright Chris Fields
#
# You may distribute this module under the same terms as perl itself
#
# POD documentation - main docs before the code
=head1 NAME
Bio::DB::EUtilParameters - Manipulation of NCBI eutil-based parameters for
remote database requests.
=head1 SYNOPSIS
# Bio::DB::EUtilParameters implements Bio::ParameterBaseI
my @params = (-eutil => 'efetch',
db => 'nucleotide',
id => \@ids,
email => 'me at foo.bar',
retmode => 'xml');
my $p = Bio::DB::EUtilParameters->new(@params);
if ($p->parameters_changed) {...} # state information
$p->set_parameters(@extra_params); # set new NCBI parameters, leaves others preset
$p->reset_parameters(@new_params); # reset NCBI parameters to original state
$p->to_string(); # get a URI-encoded string representation of the URL address
$p->to_request(); # get an HTTP::Request object (to pass on to LWP::UserAgent)
=head1 DESCRIPTION
Bio::DB::EUtilParameters is-a Bio::ParameterBaseI implementation that allows
simple manipulation of NCBI eutil parameters for CGI-based queries. SOAP-based
methods may be added in the future.
For simplicity parameters do not require dashes when passed and do not need URI
encoding (spaces are converted to '+', symbols encoded, etc). Also, the
following extra parameters can be passed to the new() constructor or via
set_parameters() or reset_parameters():
eutil - the eutil to be used. The default is 'efetch' if not set.
correspondence - Flag for how IDs are treated. Default is undef (none).
cookie - a Bio::DB::EUtilities::Cookie object. Default is undef (none).
At this point minimal checking is done for potential errors in parameter
passing, though these should be easily added in the future when necessary.
=head1 TODO
Possibly integrate SOAP-compliant methods. SOAP::Lite may be undergoing an
complete rewrite so I'm hesitant about adding this in immediately.
=head1 FEEDBACK
=head2 Mailing Lists
User feedback is an integral part of the
evolution of this and other Bioperl modules. Send
your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation
is much appreciated.
bioperl-l at lists.open-bio.org - General discussion
http://www.bioperl.org/wiki/Mailing_lists - About the mailing lists
=head2 Reporting Bugs
Report bugs to the Bioperl bug tracking system to
help us keep track the bugs and their resolution.
Bug reports can be submitted via the web.
http://bugzilla.open-bio.org/
=head1 AUTHOR
Email cjfields at uiuc dot edu
=head1 APPENDIX
The rest of the documentation details each of the
object methods. Internal methods are usually
preceded with a _
=cut
# Let the code begin...
package Bio::DB::EUtilParameters;
use strict;
use warnings;
use base qw(Bio::Root::Root Bio::ParameterBaseI);
use URI;
use HTTP::Request;
# eutils only has one hostbase URL
my $HOSTBASE = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/';
# mode : GET or POST (HTTP::Request)
# location : CGI location
# params : allowed parameters for that eutil
my %MODE = (
'einfo' => {
'mode' => 'get',
'location' => 'einfo.fcgi',
'params' => [qw(db retmode tool email)],
},
'epost' => {
'mode' => 'post',
'location' => 'epost.fcgi',
'params' => [qw(db retmode id tool email)],
},
'efetch' => {
'mode' => 'get',
'location' => 'efetch.fcgi',
'params' => [qw(db retmode id retmax retstart rettype strand seq_start
seq_stop complexity report tool email )],
},
'esearch' => {
'mode' => 'get',
'location' => 'esearch.fcgi',
'params' => [qw(db retmode usehistory term field reldate mindate
maxdate datetype retmax retstart rettype sort tool email)],
},
'esummary' => {
'mode' => 'get',
'location' => 'esummary.fcgi',
'params' => [qw(db retmode id retmax retstart rettype tool email )],
},
'elink' => {
'mode' => 'get',
'location' => 'elink.fcgi',
'params' => [qw(db retmode id reldate mindate maxdate datetype term
dbfrom holding cmd version tool email)],
},
'egquery' => {
'mode' => 'get',
'location' => 'egquery.fcgi',
'params' => [qw(term retmode tool email)],
},
'espell' => {
'mode' => 'get',
'location' => 'espell.fcgi',
'params' => [qw(db retmode term tool email )],
}
);
# used only if cookie is present
my @COOKIE_PARAMS = qw(db sort seq_start seq_stop strand complexity rettype
retstart retmax cmd linkname retmode WebEnv query_key);
# default retmode if one is not supplied
my %NCBI_DATABASE = (
'pubmed' => 'xml',
'protein' => 'text',
'nucleotide' => 'text',
'nuccore' => 'text',
'nucgss' => 'text',
'nucest' => 'text',
'structure' => 'text',
'genome' => 'text',
'books' => 'xml',
'cancerchromosomes'=> 'xml',
'cdd' => 'xml',
'domains' => 'xml',
'gene' => 'asn1',
'genomeprj' => 'xml',
'gensat' => 'xml',
'geo' => 'xml',
'gds' => 'xml',
'homologene' => 'xml',
'journals' => 'text',
'mesh' => 'xml',
'ncbisearch' => 'xml',
'nlmcatalog' => 'xml',
'omia' => 'xml',
'omim' => 'xml',
'pmc' => 'xml',
'popset' => 'xml',
'probe' => 'xml',
'pcassay' => 'xml',
'pccompound' => 'xml',
'pcsubstance' => 'xml',
'snp' => 'xml',
'taxonomy' => 'xml',
'unigene' => 'xml',
'unists' => 'xml',
);
my @PARAMS;
# generate getter/setters (will move this into individual ones at some point)
BEGIN {
@PARAMS = qw(db id email retmode rettype usehistory term field tool
reldate mindate maxdate datetype retstart retmax sort seq_start seq_stop
strand complexity report dbfrom cmd holding version linkname WebEnv
query_key);
for my $method (@PARAMS) {
eval <{'_statechange'} = 1 if (!defined \$self->{'_$method'}) ||
(defined \$self->{'_$method'} && \$self->{'_$method'} ne \$val);
\$self->{'_$method'} = \$val;
}
return \$self->{'_$method'};
}
END
}
}
sub new {
my ($class, @args) = @_;
my $self = $class->SUPER::new(@args);
$self->_set_from_args(\@args,
-methods => [@PARAMS, qw(eutil cookie correspondence)]);
# set default retmode if not explicitly set
$self->eutil() || $self->eutil('efetch');
$self->_set_default_retmode if (!$self->retmode);
$self->{'_statechange'} = 1;
return $self;
}
=head1 Bio::ParameterBaseI implemented methods
=head2
Title : set_parameters
Usage : $pobj->set_parameters(%params);
Function: sets the NCBI parameters listed in the hash or array
Returns : None
Args : [optional] hash or array of parameter/values.
Note : This sets any parameter (i.e. doesn't screen them using $MODE or via
set cookies).
=cut
sub set_parameters {
my ($self, @args) = @_;
$self->_set_from_args(\@args, -methods => [@PARAMS]);
}
=head2
Title : reset_parameters
Usage : resets values
Function: resets parameters to either undef or value in passed hash
Returns : none
Args : [optional] hash of parameter-value pairs
Note : this also resets eutil(), correspondence(), and the cookie and request
cache
=cut
sub reset_parameters {
my ($self, @args) = @_;
# is there a better way of doing this? probably, but this works
for my $param (@PARAMS, qw(eutil correspondence cookie_cache request_cache)) {
defined $self->{"_$param"} && undef $self->{"_$param"};
}
$self->_set_from_args(\@args, -methods => [@PARAMS, qw(eutil correspondence cookie)]);
$self->eutil() || $self->eutil('efetch');
$self->_set_default_retmode if (!$self->retmode);
$self->{'_statechange'} = 1;
}
=head2
Title : parameters_changed
Usage : if ($pobj->parameters_changed) {...}
Function: Returns TRUE if parameters have changed
Returns : Boolean (0 or 1)
Args : [optional] Boolean
=cut
sub parameters_changed {
my ($self) = @_;
$self->{'_statechange'};
}
=head2
Title : available_parameters
Usage : @params = $pobj->available_parameters()
Function: Returns a list of the available parameters
Returns : Array of available parameters (no values)
Args : [optional] A string; either eutil name (for returning eutil-specific
parameters) or 'cookie' (for those parameters allowed when retrieving
data stored on the remote server using a 'Cookie').
=cut
sub available_parameters {
my ($self, $type) = @_;
$type ||= 'all';
if ($type eq 'all') {
return @PARAMS;
} elsif ($type eq 'cookie') {
return @COOKIE_PARAMS;
} else {
$self->throw("$type parameters not supported") if !exists $MODE{$type};
return @{$MODE{$type}->{params}};
}
}
=head2
Title : get_parameters
Usage : @params = $pobj->get_parameters;
%params = $pobj->get_parameters;
Function: Returns list of key/value pairs, parameter => value
Returns : Flattened list of key-value pairs. IDs are returned based on the
correspondence value (a string joined by commas or as an array ref).
Args : -type : the eutil name or 'cookie', for returning a subset of
parameters (Default: returns all)
-join_ids : Boolean; join IDs based on correspondence (Default: no join)
=cut
sub get_parameters {
my ($self, @args) = @_;
my ($type, $join) = $self->_rearrange([qw(TYPE JOIN_IDS)], @args);
$type ||= '';
my @final = $self->available_parameters($type);
my @p;
for my $param (@final) {
if ($param eq 'id' && $join) {
if ($self->correspondence && $self->eutil eq 'elink') {
for my $id_group (@{ $self->id }) {
if (ref($id_group) eq 'ARRAY') {
push @p, ('id' => join(q(,), @{ $id_group }));
}
elsif (!ref($id_group)) {
push @p, ('id' => $id_group);
}
else {
$self->throw("Unknown ID type: $id_group");
}
}
} else {
push @p, ($param => join(',', @{ $self->id }));
}
} elsif ($param eq 'retmode' && !$self->retmode) {
} else {
push @p, ($param => $self->{"_$param"}) if defined $self->{"_$param"};
}
}
return @p;
}
=head2
Title : to_string
Usage : $string = $pobj->to_string;
Function: Returns string (URL only in this case)
Returns : String (URL only for now)
Args : [optional] 'all'; build URI::http using all parameters
Default : Builds based on allowed parameters (presence of cookie data
or eutil type in %MODE).
Note : Changes state of object. Absolute string
=cut
sub to_string {
my ($self, @args) = @_;
# calling to_uri changes the state
if ($self->parameters_changed || !defined $self->{'_string_cache'}) {
my $string = $self->to_request(@args)->uri->as_string;
$self->{'_statechange'} = 0;
$self->{'_string_cache'} = $string;
}
return $self->{'_string_cache'};
}
=head2
Title : to_request
Usage : $uri = $pobj->to_request;
Function: Returns HTTP::Request object
Returns : HTTP::Request
Args : [optional] 'all'; builds request using all parameters
Default : Builds based on allowed parameters (presence of cookie data
or eutil type in %MODE).
Note : Changes state of object. Used for CGI-based GET/POST
=cut
sub to_request {
my ($self, $type) = @_;
if ($self->parameters_changed || !defined $self->{'_uri_cache'}) {
my $eutil = $self->eutil;
$self->throw("No eutil set") if !$eutil;
#set default retmode
my $cookie = ($self->cookie) ? 1 : 0;
$type ||= ($cookie) ? 'cookie' : $eutil;
my $uri = URI->new($HOSTBASE . $MODE{$eutil}->{location});
$uri->query_form($self->get_parameters(-type => $type, -join_ids => 1) );
my $method = ($eutil eq 'epost') ? 'POST' : 'GET';
my $request = HTTP::Request->new($method => $uri);
$self->{'_statechange'} = 0;
$self->{'_request_cache'} = $request;
}
return $self->{'_request_cache'};
}
=head1 Implementation specific-methods
=head2
Title : eutil
Usage : $p->eutil('efetch')
Function: gets/sets the eutil for this set of parameters
Returns : string (eutil)
Args : [optional] string (eutil)
Throws : '$eutil not supported' if eutil not present
=cut
sub eutil {
my ($self, $eutil) = @_;
if ($eutil) {
$self->throw("$eutil not supported") if !exists $MODE{$eutil};
$self->{'_eutil'} = $eutil;
$self->{'_statechange'} = 1;
}
return $self->{'_eutil'};
}
=head2
Title : cookie
Usage : $p->cookie($cookie);
Function: gets/sets the cookie (history) to be used for these parameters
Returns : Bio::DB::EUtilities::Cookie (if set)
Args : [optional] Bio::DB::EUtilities::Cookie
Throws : Passed something other than a Bio::DB::EUtilities::Cookie
Note : This overrides WebEnv() and query_key() if set
=cut
# cookie not changed over to ParameterBaseI yet...
sub cookie {
my ($self, $cookie) = @_;
if ($cookie) {
$self->throw('Not a Bio::DB::EUtilities::Cookie object!') if
!$cookie->isa('Bio::DB::EUtilities::Cookie');
my ($webenv, $qkey) = @{$cookie->cookie};
$webenv && $self->WebEnv($webenv);
$qkey && $self->query_key($qkey);
#TODO: set db(), dbfrom() based on eutil
$self->{'_statechange'} = 1;
$self->{'_cookie_cache'} = $cookie;
}
return $self->{'_cookie_cache'};
}
=head2
Title : correspondence
Usage : $p->correspondence(1);
Function: Sets flag for posting IDs for one-to-one correspondence
Returns : Boolean
Args : [optional] boolean value
=cut
sub correspondence {
my ($self, $corr) = @_;
if (defined $corr) {
$self->{'_correspondence'} = $corr;
$self->{'_statechange'} = 1;
}
return $self->{'_correspondence'};
}
# Title : _set_default_retmode
# Usage : $p->_set_default_retmode();
# Function: sets retmode to default value if called
# Returns : none
# Args : none
sub _set_default_retmode {
my $self = shift;
if ($self->eutil eq 'efetch') {
my $db = $self->db || $self->throw('No database defined for efetch!');
$self->throw('Database $db not recognized') if !exists $NCBI_DATABASE{$db};
# set efetch-based retmode
$self->retmode($NCBI_DATABASE{$db});
} else {
$self->retmode('xml');
}
}
1;
From jason at dev.open-bio.org Sat May 12 18:58:09 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Sat, 12 May 2007 22:58:09 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/Tree TreeFunctionsI.pm, 1.35, 1.36
Message-ID: <200705122258.l4CMw9RB004357@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/Tree
In directory dev.open-bio.org:/tmp/cvs-serv4331/Bio/Tree
Modified Files:
TreeFunctionsI.pm
Log Message:
method to move bootstrap values from id values for internal nodes automatically
Index: TreeFunctionsI.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/Tree/TreeFunctionsI.pm,v
retrieving revision 1.35
retrieving revision 1.36
diff -C2 -d -r1.35 -r1.36
*** TreeFunctionsI.pm 13 Apr 2007 05:43:17 -0000 1.35
--- TreeFunctionsI.pm 12 May 2007 22:58:07 -0000 1.36
***************
*** 992,994 ****
--- 992,1013 ----
}
+ =head2 move_id_to_bootstrap
+
+ Title : move_id_to_bootstrap
+ Usage : $tree->move_id_to_bootstrap
+ Function: Move internal IDs to bootstrap slot
+ Returns : undef
+ Args : undef
+
+
+ =cut
+
+ sub move_id_to_bootstrap{
+ my ($tree) = shift;
+ for my $node ( grep { ! $_->is_Leaf } $tree->get_nodes ) {
+ $node->bootstrap($node->id);
+ $node->id('');
+ }
+ }
+
1;
From jason at dev.open-bio.org Sat May 12 18:58:28 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Sat, 12 May 2007 22:58:28 +0000
Subject: [Bioperl-guts-l] bioperl-live/t Tree.t,1.13,1.14
Message-ID: <200705122258.l4CMwSOP004402@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/t
In directory dev.open-bio.org:/tmp/cvs-serv4365/t
Modified Files:
Tree.t
Log Message:
test for method to move bootstrap values from id values for internal nodes automatically
Index: Tree.t
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/t/Tree.t,v
retrieving revision 1.13
retrieving revision 1.14
diff -C2 -d -r1.13 -r1.14
*** Tree.t 12 Aug 2006 10:51:39 -0000 1.13
--- Tree.t 12 May 2007 22:58:25 -0000 1.14
***************
*** 117,121 ****
# test for rerooting the tree
! my $out = Bio::TreeIO->new(-format => 'newick', -fh => \*STDERR, -noclose => 1);
$tree = $in->next_tree;
$tree->verbose( -1 ) unless $DEBUG;
--- 117,123 ----
# test for rerooting the tree
! my $out = Bio::TreeIO->new(-format => 'newick',
! -fh => \*STDERR,
! -noclose => 1);
$tree = $in->next_tree;
$tree->verbose( -1 ) unless $DEBUG;
***************
*** 212,215 ****
--- 214,247 ----
#get_lca, merge_lineage, contract_linear_paths tested in in Taxonomy.t
+
+ # try out the id to bootstrap copy method
+ $treeio = Bio::TreeIO->new(-format => 'newick',
+ -file => Bio::Root::IO->catfile('t','data',
+ 'bootstrap.tre'));
+ $tree = $treeio->next_tree;
+ my ($test_node) = $tree->find_node(-id => 'A');
+ ok($test_node->ancestor->id, '90');
+ ok($test_node->ancestor->ancestor->id, '25');
+ $tree->move_id_to_bootstrap;
+ ok($test_node->ancestor->id, '');
+ ok($test_node->ancestor->bootstrap, '90');
+ ok($test_node->ancestor->ancestor->id, '');
+ ok($test_node->ancestor->ancestor->bootstrap, '25');
+
+ # change TreeIO to parse
+ $treeio = Bio::TreeIO->new(-format => 'newick',
+ -file => Bio::Root::IO->catfile('t','data',
+ 'bootstrap.tre'),
+ -internal_node_id => 'bootstrap');
+ $tree = $treeio->next_tree;
+ ($test_node) = $tree->find_node(-id => 'A');
+ ok($test_node->ancestor->id, '');
+ ok($test_node->ancestor->ancestor->id, '');
+ ok($test_node->ancestor->bootstrap, '90');
+ ok($test_node->ancestor->ancestor->bootstrap, '25');
+
+
+
+
__DATA__
(D,(C,(A,B)));
From jason at dev.open-bio.org Sat May 12 18:58:27 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Sat, 12 May 2007 22:58:27 +0000
Subject: [Bioperl-guts-l] bioperl-live/t/data bootstrap.tre,NONE,1.1
Message-ID: <200705122258.l4CMwRZZ004398@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/t/data
In directory dev.open-bio.org:/tmp/cvs-serv4365/t/data
Added Files:
bootstrap.tre
Log Message:
test for method to move bootstrap values from id values for internal nodes automatically
--- NEW FILE: bootstrap.tre ---
(((A:5,B:5)90:2,C:4)25:3,D:10);
From jason at dev.open-bio.org Sat May 12 18:59:36 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Sat, 12 May 2007 22:59:36 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio TreeIO.pm,1.21,1.22
Message-ID: <200705122259.l4CMxau3004443@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio
In directory dev.open-bio.org:/tmp/cvs-serv4410/Bio
Modified Files:
TreeIO.pm
Log Message:
TreeIO magic to now automatically move bootstrap values over if the user requests this - made it a global Bio::TreeIO flag but is really only used in Bio::TreeIO::newick and nexus
Index: TreeIO.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/TreeIO.pm,v
retrieving revision 1.21
retrieving revision 1.22
diff -C2 -d -r1.21 -r1.22
*** TreeIO.pm 26 Sep 2006 22:03:05 -0000 1.21
--- TreeIO.pm 12 May 2007 22:59:34 -0000 1.22
***************
*** 79,82 ****
--- 79,84 ----
use base qw(Bio::Root::Root Bio::Root::IO Bio::Event::EventGeneratorI Bio::Factory::TreeFactoryI);
+ use constant INTERNAL_NODE_ID => 'id'; # id or bootstrap, default is 'id'
+
=head2 new
***************
*** 94,98 ****
tabtree ASCII text representation of tree
lintree lintree output format
!
=cut
--- 96,102 ----
tabtree ASCII text representation of tree
lintree lintree output format
! -internal_node_id : what is stored in the internal node ids,
! bootstrap values or ids, coded as
! 'bootstrap' or 'id'
=cut
***************
*** 194,199 ****
my($self, @args) = @_;
$self->{'_handler'} = undef;
! ($self->{'newline_each_node'}) = $self->_rearrange
! ([qw(NEWLINE_EACH_NODE)], at args);
# initialize the IO part
--- 198,205 ----
my($self, @args) = @_;
$self->{'_handler'} = undef;
! my $internal_node_id;
! $self->{'internal_node_id'} = INTERNAL_NODE_ID;
! ($self->{'newline_each_node'},$internal_node_id) = $self->_rearrange
! ([qw(NEWLINE_EACH_NODE INTERNAL_NODE_ID)], at args);
# initialize the IO part
***************
*** 201,204 ****
--- 207,211 ----
$self->attach_EventHandler(Bio::TreeIO::TreeEventBuilder->new
(-verbose => $self->verbose(), @args));
+ $self->internal_node_id($internal_node_id) if defined $internal_node_id;
}
***************
*** 253,256 ****
--- 260,291 ----
}
+ =head2 internal_node_id
+
+ Title : internal_node_id
+ Usage : $obj->internal_node_id($newval)
+ Function: Internal Node Id type, coded as 'bootstrap' or 'id'
+ Default is 'id'
+ Returns : value of internal_node_id (a scalar)
+ Args : on set, new value (a scalar or undef, optional)
+
+
+ =cut
+
+ sub internal_node_id{
+ my $self = shift;
+ my $val = shift;
+ if( defined $val ) {
+ if( $val =~ /^b/i ) {
+ $val = 'bootstrap';
+ } elsif( $val =~ /^i/ ) {
+ $val = 'id';
+ } else {
+ $self->warn("Unknown value $val for internal_node_id not resetting value\n");
+ }
+ return $self->{'internal_node_id'} = $val;
+ }
+ return $self->{'internal_node_id'};
+ }
+
=head2 _guess_format
From jason at dev.open-bio.org Sat May 12 18:59:36 2007
From: jason at dev.open-bio.org (Jason Stajich)
Date: Sat, 12 May 2007 22:59:36 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/TreeIO newick.pm,1.40,1.41
Message-ID: <200705122259.l4CMxa85004448@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/TreeIO
In directory dev.open-bio.org:/tmp/cvs-serv4410/Bio/TreeIO
Modified Files:
newick.pm
Log Message:
TreeIO magic to now automatically move bootstrap values over if the user requests this - made it a global Bio::TreeIO flag but is really only used in Bio::TreeIO::newick and nexus
Index: newick.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/TreeIO/newick.pm,v
retrieving revision 1.40
retrieving revision 1.41
diff -C2 -d -r1.40 -r1.41
*** newick.pm 19 Dec 2006 16:32:08 -0000 1.40
--- newick.pm 12 May 2007 22:59:34 -0000 1.41
***************
*** 145,148 ****
--- 145,151 ----
my $tree = $self->_eventHandler->end_document($chars);
$tree->score($score) if defined $score;
+ if( $self->internal_node_id eq 'bootstrap' ) {
+ $tree->move_id_to_bootstrap;
+ }
return $tree;
} elsif( $ch eq '(' ) {
From sendu at dev.open-bio.org Mon May 14 08:23:44 2007
From: sendu at dev.open-bio.org (Senduran Balasubramaniam)
Date: Mon, 14 May 2007 12:23:44 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/Root RootI.pm,1.75,1.76
Message-ID: <200705141223.l4ECNhTu000715@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/Root
In directory dev.open-bio.org:/tmp/cvs-serv690/Bio/Root
Modified Files:
RootI.pm
Log Message:
added -code and -case_sensitve options to set_from_args()
Index: RootI.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/Root/RootI.pm,v
retrieving revision 1.75
retrieving revision 1.76
diff -C2 -d -r1.75 -r1.76
*** RootI.pm 11 Jan 2007 14:37:59 -0000 1.75
--- RootI.pm 14 May 2007 12:23:41 -0000 1.76
***************
*** 379,383 ****
Argument : \%args | \@args : a hash ref or associative array ref of arguments
: where keys are any-case strings corresponding to
! : method names but optionally prefixed with
: hyphens, and values are the values the method
: should be supplied. If keys contain internal
--- 379,383 ----
Argument : \%args | \@args : a hash ref or associative array ref of arguments
: where keys are any-case strings corresponding to
! : method names but optionally prefixed with
: hyphens, and values are the values the method
: should be supplied. If keys contain internal
***************
*** 400,403 ****
--- 400,416 ----
: supplied methods that didn't exist, even if not
: mentioned in the supplied %args)
+ : -code => '' | {}: (optional) when creating methods use the supplied
+ : code (a string which will be evaulated as a sub).
+ : The default code is a simple get/setter.
+ : Alternatively you can supply a hash ref where
+ : the keys are method names and the values are
+ : code strings. The variable '$method' will be
+ : available at evaluation time, so can be used in
+ : your code strings. Beware that the strict pragma
+ : will be in effect.
+ : -case_sensitive => bool : require case sensitivity on the part of
+ : user (ie. a() and A() are two different
+ : methods and the user must be careful
+ : which they use).
Comments :
: The \%args argument will usually be the args received during new()
***************
*** 431,441 ****
$self->throw("a hash/array ref of arguments must be supplied") unless ref($args);
! my ($methods, $force, $create);
if (@own_args) {
! ($methods, $force, $create) =
$self->_rearrange([qw(METHODS
FORCE
! CREATE)], @own_args);
}
my %method_names = ();
--- 444,459 ----
$self->throw("a hash/array ref of arguments must be supplied") unless ref($args);
! my ($methods, $force, $create, $code, $case);
if (@own_args) {
! ($methods, $force, $create, $code, $case) =
$self->_rearrange([qw(METHODS
FORCE
! CREATE
! CODE
! CASE_SENSITIVE)], @own_args);
}
+ my $default_code = 'my $self = shift;
+ if (@_) { $self->{\'_\'.$method} = shift }
+ return $self->{\'_\'.$method} || return;';
my %method_names = ();
***************
*** 451,455 ****
%syns = map { $_ => $_ } @names;
}
! %method_names = map { lc($_) => $_ } @names;
}
--- 469,473 ----
%syns = map { $_ => $_ } @names;
}
! %method_names = map { $case ? $_ : lc($_) => $_ } @names;
}
***************
*** 466,479 ****
if ($create) {
unless ($methods) {
! %syns = map { $_ => lc($_) } keys %args;
}
foreach my $method (keys %syns) {
$self->can($method) && next;
! # create get/setter method
no strict 'refs';
! *{ref($self).'::'.$method} = sub { my $self = shift;
! if (@_) { $self->{'_'.$method} = shift }
! return $self->{'_'.$method} || return; };
}
}
--- 484,503 ----
if ($create) {
unless ($methods) {
! %syns = map { $_ => $case ? $_ : lc($_) } keys %args;
}
foreach my $method (keys %syns) {
$self->can($method) && next;
!
! my $string = $code || $default_code;
! if (ref($code) && ref($code) eq 'HASH') {
! $string = $code->{$method} || $default_code;
! }
!
! my $sub = eval "sub { $string }";
! $self->throw("Compilation error for $method : $@") if $@;
!
no strict 'refs';
! *{ref($self).'::'.$method} = $sub;
}
}
***************
*** 485,489 ****
foreach my $syn (@{ ref($syn_ref) ? $syn_ref : [$syn_ref] }) {
next if $syn eq $method;
! $method_names{lc($syn)} = $syn;
next if $self->can($syn);
no strict 'refs';
--- 509,513 ----
foreach my $syn (@{ ref($syn_ref) ? $syn_ref : [$syn_ref] }) {
next if $syn eq $method;
! $method_names{$case ? $syn : lc($syn)} = $syn;
next if $self->can($syn);
no strict 'refs';
***************
*** 494,498 ****
# set values for methods
while (my ($method, $value) = each %args) {
! $method = $method_names{lc($method)} || ($methods ? next : $method);
$self->can($method) || next unless $force;
$self->$method($value);
--- 518,522 ----
# set values for methods
while (my ($method, $value) = each %args) {
! $method = $method_names{$case ? $method : lc($method)} || ($methods ? next : $method);
$self->can($method) || next unless $force;
$self->$method($value);
From sendu at dev.open-bio.org Mon May 14 08:37:22 2007
From: sendu at dev.open-bio.org (Senduran Balasubramaniam)
Date: Mon, 14 May 2007 12:37:22 +0000
Subject: [Bioperl-guts-l] bioperl-run/Bio/Tools/Run/Phylo/Forester SDI.pm,
1.2, NONE
Message-ID: <200705141237.l4ECbMO0000823@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-run/Bio/Tools/Run/Phylo/Forester
In directory dev.open-bio.org:/tmp/cvs-serv787/Bio/Tools/Run/Phylo/Forester
Removed Files:
SDI.pm
Log Message:
removed incomplete, non-functional, untested modules
--- SDI.pm DELETED ---
From sendu at dev.open-bio.org Mon May 14 08:37:22 2007
From: sendu at dev.open-bio.org (Senduran Balasubramaniam)
Date: Mon, 14 May 2007 12:37:22 +0000
Subject: [Bioperl-guts-l] bioperl-run/Bio/Tools/Run AbstractRunner.pm, 1.3,
NONE JavaRunner.pm, 1.2, NONE
Message-ID: <200705141237.l4ECbMFV000819@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-run/Bio/Tools/Run
In directory dev.open-bio.org:/tmp/cvs-serv787/Bio/Tools/Run
Removed Files:
AbstractRunner.pm JavaRunner.pm
Log Message:
removed incomplete, non-functional, untested modules
--- AbstractRunner.pm DELETED ---
--- JavaRunner.pm DELETED ---
From sendu at dev.open-bio.org Mon May 14 08:46:03 2007
From: sendu at dev.open-bio.org (Senduran Balasubramaniam)
Date: Mon, 14 May 2007 12:46:03 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio PullParserI.pm,1.10,1.11
Message-ID: <200705141246.l4ECk3dB000877@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio
In directory dev.open-bio.org:/tmp/cvs-serv852/Bio
Modified Files:
PullParserI.pm
Log Message:
bug fix: get chunk methods no longer travel outside chunk bounds
Index: PullParserI.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/PullParserI.pm,v
retrieving revision 1.10
retrieving revision 1.11
diff -C2 -d -r1.10 -r1.11
*** PullParserI.pm 28 Sep 2006 18:42:21 -0000 1.10
--- PullParserI.pm 14 May 2007 12:46:01 -0000 1.11
***************
*** 196,201 ****
my $dependency = $self->_dependencies($desired);
if ($dependency && ! defined $self->_fields->{$dependency}) {
! my $dep_method = '_discover_'.$dependency;
! $self->$dep_method;
}
--- 196,202 ----
my $dependency = $self->_dependencies($desired);
if ($dependency && ! defined $self->_fields->{$dependency}) {
! #my $dep_method = '_discover_'.$dependency;
! #$self->$dep_method;
! $self->get_field($dependency);
}
***************
*** 344,352 ****
# determine our line ending
! if ($first_line =~ /\015\012/) {
! $self->_line_ending("\015\012");
}
! elsif ($first_line =~ /\015/) {
! $self->_line_ending("\015");
}
else {
--- 345,353 ----
# determine our line ending
! if ($first_line =~ /\r\n/) {
! $self->_line_ending("\r\n");
}
! elsif ($first_line =~ /\r/) {
! $self->_line_ending("\r");
}
else {
***************
*** 533,537 ****
}
! my $end = $self->_chunk_tell;
if ($self->_chunk_true_end ? $end <= $self->_chunk_true_end : 1) {
return $line;
--- 534,538 ----
}
! my $end = $self->_chunk_tell + $self->_chunk_true_start;
if ($self->_chunk_true_end ? $end <= $self->_chunk_true_end : 1) {
return $line;
***************
*** 554,557 ****
--- 555,561 ----
sub _get_chunk_by_end {
my ($self, $chunk_ending) = @_;
+
+ my $start = $self->_chunk_tell;
+
my $line_ending = $self->_line_ending;
$chunk_ending =~ s/\n/$line_ending/g;
***************
*** 559,566 ****
my $line = $self->chunk->_readline;
! my $end = $self->_chunk_tell;
if ($self->_chunk_true_end ? $end <= $self->_chunk_true_end : 1) {
return $line;
}
return;
}
--- 563,572 ----
my $line = $self->chunk->_readline;
! my $end = $self->_chunk_tell + $self->_chunk_true_start;
if ($self->_chunk_true_end ? $end <= $self->_chunk_true_end : 1) {
return $line;
}
+
+ $self->_chunk_seek($start);
return;
}
***************
*** 594,600 ****
my $end = $self->_chunk_tell;
! if ($self->_chunk_true_end ? $end <= $self->_chunk_true_end : 1) {
return ($start, $end);
}
return;
}
--- 600,609 ----
my $end = $self->_chunk_tell;
! my $comp_end = $end + $self->_chunk_true_start;
! if ($self->_chunk_true_end ? $comp_end <= $self->_chunk_true_end : 1) {
return ($start, $end);
}
+
+ $self->_chunk_seek($start);
return;
}
From sendu at dev.open-bio.org Mon May 14 08:49:54 2007
From: sendu at dev.open-bio.org (Senduran Balasubramaniam)
Date: Mon, 14 May 2007 12:49:54 +0000
Subject: [Bioperl-guts-l] bioperl-live/t/data frac_problems.blast, 1.1,
1.2 frac_problems3.blast, 1.1, 1.2
Message-ID: <200705141249.l4ECns91000945@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/t/data
In directory dev.open-bio.org:/tmp/cvs-serv885/t/data
Modified Files:
frac_problems.blast frac_problems3.blast
Log Message:
initial commit of blast pull modules and tests
Index: frac_problems.blast
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/t/data/frac_problems.blast,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** frac_problems.blast 15 Sep 2006 11:25:11 -0000 1.1
--- frac_problems.blast 14 May 2007 12:49:52 -0000 1.2
***************
*** 351,485 ****
X3: 25 (49.6 bits)
S1: 15 (30.2 bits)
! S2: 24 (48.1 bits)
!
! Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
! Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
! "Gapped BLAST and PSI-BLAST: a new generation of protein database search
! programs", Nucleic Acids Res. 25:3389-3402.
! Database: LTR_STRUC_8x_na
! 791 sequences; 6,808,925 total letters
!
! Searching
!
! Query= jgi|Phypa1|96131|fgenesh1_pg.scaffold_296000015
! (456 letters)
!
!
!
! Score E
! Sequences producing significant alignments: (bits) Value
!
! scaffold_5_6_Scaffolds_seq_8392_PT_B8_L9_282_rprt_txt 276 2e-74
!
! >scaffold_5_6_Scaffolds_seq_8392_PT_B8_L9_282_rprt_txt
! Length = 8149
!
! Score = 276 bits (139), Expect = 2e-74
! Identities = 157/163 (96%)
! Strand = Plus / Minus
!
!
! Query: 1 atgaatatgaaagaccttaatgaaggtacagagattccacagcaaaaagatgaaagtgat 60
! ||||||||| |||||||||||||||||||||||||||| |||||| ||||||||||||||
! Sbjct: 1524 atgaatatggaagaccttaatgaaggtacagagattccgcagcaagaagatgaaagtgat 1465
!
!
! Query: 61 atggataatgaaggtgacgaagacgaactggacaaagacgtaggggatatattcagaatc 120
! |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
! Sbjct: 1464 atggataatgaaggtgacgaagacgaactggacaaagacgtaggggatatattctgaatc 1405
!
!
! Query: 121 aaacagcaattgggtaaattggtgcataccggaggatgcacag 163
! ||| ||||||||| |||||||||||||||||||||||||||||
! Sbjct: 1404 aaatagcaattggataaattggtgcataccggaggatgcacag 1362
!
!
!
! Score = 151 bits (76), Expect = 9e-37
! Identities = 100/108 (92%)
! Strand = Plus / Minus
!
!
! Query: 164 tcgtttatattccagcagtcaaaactgatagactgcttgactatcgacgaatctccatac 223
! ||||||||||||||||||| |||||||||||||||||||||||||| || ||||||||||
! Sbjct: 6919 tcgtttatattccagcagtaaaaactgatagactgcttgactatcggcggatctccatac 6860
!
!
! Query: 224 taccttctacagctagacgcccgacgcgttctcgagcaggaaagcacg 271
! ||||||||| ||| ||| ||||| ||||||||||||||||| ||||||
! Sbjct: 6859 taccttctagagccagaggcccggcgcgttctcgagcaggagagcacg 6812
!
!
!
! Score = 143 bits (72), Expect = 2e-34
! Identities = 96/104 (92%)
! Strand = Plus / Minus
!
!
! Query: 161 cagtcgtttatattccagcagtcaaaactgatagactgcttgactatcgacgaatctcca 220
! |||||||||||||| ||||||| |||||||||||||||||||||||||| || |||||||
! Sbjct: 120 cagtcgtttatatttcagcagtaaaaactgatagactgcttgactatcggcggatctcca 61
!
!
! Query: 221 tactaccttctacagctagacgcccgacgcgttctcgagcagga 264
! |||||||||||| ||| ||||||||| | |||||||||||||||
! Sbjct: 60 tactaccttctagagccagacgcccggctcgttctcgagcagga 17
!
!
!
! Score = 60.0 bits (30), Expect = 3e-09
! Identities = 42/46 (91%)
! Strand = Plus / Minus
!
!
! Query: 312 ggagccaattctagatccaagacggaagattgaagaaattcagagg 357
! ||||||||||| |||| |||| |||||||||||||||||||||||
! Sbjct: 739 ggagccaattccggatctaagatggaagattgaagaaattcagagg 694
!
!
!
! Score = 52.0 bits (26), Expect = 6e-07
! Identities = 41/46 (89%)
! Strand = Plus / Minus
!
!
! Query: 312 ggagccaattctagatccaagacggaagattgaagaaattcagagg 357
! |||||||||| |||| |||| |||||||||||||||||||||||
! Sbjct: 7543 ggagccaatttcggatctaagatggaagattgaagaaattcagagg 7498
!
!
! Database: LTR_STRUC_8x_na
! Posted date: Sep 4, 2006 3:29 PM
! Number of letters in database: 6,808,925
! Number of sequences in database: 791
!
! Lambda K H
! 1.37 0.711 1.31
!
! Gapped
! Lambda K H
! 1.37 0.711 1.31
!
!
! Matrix: blastn matrix:1 -3
! Gap Penalties: Existence: 5, Extension: 2
! Number of Sequences: 791
! Number of Hits to DB: 50,062
! Number of extensions: 3603
! Number of successful extensions: 1243
! Number of sequences better than 1.0e-04: 157
! Number of HSP's gapped: 985
! Number of HSP's successfully gapped: 621
! Length of query: 456
! Length of database: 6,808,925
! Length adjustment: 16
! Effective length of query: 440
! Effective length of database: 6,796,269
! Effective search space: 2990358360
! Effective search space used: 2990358360
! X1: 11 (21.8 bits)
! X2: 15 (29.7 bits)
! X3: 25 (49.6 bits)
! S1: 14 (28.2 bits)
! S2: 23 (46.1 bits)
--- 351,485 ----
X3: 25 (49.6 bits)
S1: 15 (30.2 bits)
! S2: 24 (48.1 bits)
!
! Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
! Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
! "Gapped BLAST and PSI-BLAST: a new generation of protein database search
! programs", Nucleic Acids Res. 25:3389-3402.
! Database: LTR_STRUC_8x_na
! 791 sequences; 6,808,925 total letters
!
! Searching
!
! Query= jgi|Phypa1|96131|fgenesh1_pg.scaffold_296000015
! (456 letters)
!
!
!
! Score E
! Sequences producing significant alignments: (bits) Value
!
! scaffold_5_6_Scaffolds_seq_8392_PT_B8_L9_282_rprt_txt 276 2e-74
!
! >scaffold_5_6_Scaffolds_seq_8392_PT_B8_L9_282_rprt_txt
! Length = 8149
!
! Score = 276 bits (139), Expect = 2e-74
! Identities = 157/163 (96%)
! Strand = Plus / Minus
!
!
! Query: 1 atgaatatgaaagaccttaatgaaggtacagagattccacagcaaaaagatgaaagtgat 60
! ||||||||| |||||||||||||||||||||||||||| |||||| ||||||||||||||
! Sbjct: 1524 atgaatatggaagaccttaatgaaggtacagagattccgcagcaagaagatgaaagtgat 1465
!
!
! Query: 61 atggataatgaaggtgacgaagacgaactggacaaagacgtaggggatatattcagaatc 120
! |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
! Sbjct: 1464 atggataatgaaggtgacgaagacgaactggacaaagacgtaggggatatattctgaatc 1405
!
!
! Query: 121 aaacagcaattgggtaaattggtgcataccggaggatgcacag 163
! ||| ||||||||| |||||||||||||||||||||||||||||
! Sbjct: 1404 aaatagcaattggataaattggtgcataccggaggatgcacag 1362
!
!
!
! Score = 151 bits (76), Expect = 9e-37
! Identities = 100/108 (92%)
! Strand = Plus / Minus
!
!
! Query: 164 tcgtttatattccagcagtcaaaactgatagactgcttgactatcgacgaatctccatac 223
! ||||||||||||||||||| |||||||||||||||||||||||||| || ||||||||||
! Sbjct: 6919 tcgtttatattccagcagtaaaaactgatagactgcttgactatcggcggatctccatac 6860
!
!
! Query: 224 taccttctacagctagacgcccgacgcgttctcgagcaggaaagcacg 271
! ||||||||| ||| ||| ||||| ||||||||||||||||| ||||||
! Sbjct: 6859 taccttctagagccagaggcccggcgcgttctcgagcaggagagcacg 6812
!
!
!
! Score = 143 bits (72), Expect = 2e-34
! Identities = 96/104 (92%)
! Strand = Plus / Minus
!
!
! Query: 161 cagtcgtttatattccagcagtcaaaactgatagactgcttgactatcgacgaatctcca 220
! |||||||||||||| ||||||| |||||||||||||||||||||||||| || |||||||
! Sbjct: 120 cagtcgtttatatttcagcagtaaaaactgatagactgcttgactatcggcggatctcca 61
!
!
! Query: 221 tactaccttctacagctagacgcccgacgcgttctcgagcagga 264
! |||||||||||| ||| ||||||||| | |||||||||||||||
! Sbjct: 60 tactaccttctagagccagacgcccggctcgttctcgagcagga 17
!
!
!
! Score = 60.0 bits (30), Expect = 3e-09
! Identities = 42/46 (91%)
! Strand = Plus / Minus
!
!
! Query: 312 ggagccaattctagatccaagacggaagattgaagaaattcagagg 357
! ||||||||||| |||| |||| |||||||||||||||||||||||
! Sbjct: 739 ggagccaattccggatctaagatggaagattgaagaaattcagagg 694
!
!
!
! Score = 52.0 bits (26), Expect = 6e-07
! Identities = 41/46 (89%)
! Strand = Plus / Minus
!
!
! Query: 312 ggagccaattctagatccaagacggaagattgaagaaattcagagg 357
! |||||||||| |||| |||| |||||||||||||||||||||||
! Sbjct: 7543 ggagccaatttcggatctaagatggaagattgaagaaattcagagg 7498
!
!
! Database: LTR_STRUC_8x_na
! Posted date: Sep 4, 2006 3:29 PM
! Number of letters in database: 6,808,925
! Number of sequences in database: 791
!
! Lambda K H
! 1.37 0.711 1.31
!
! Gapped
! Lambda K H
! 1.37 0.711 1.31
!
!
! Matrix: blastn matrix:1 -3
! Gap Penalties: Existence: 5, Extension: 2
! Number of Sequences: 791
! Number of Hits to DB: 50,062
! Number of extensions: 3603
! Number of successful extensions: 1243
! Number of sequences better than 1.0e-04: 157
! Number of HSP's gapped: 985
! Number of HSP's successfully gapped: 621
! Length of query: 456
! Length of database: 6,808,925
! Length adjustment: 16
! Effective length of query: 440
! Effective length of database: 6,796,269
! Effective search space: 2990358360
! Effective search space used: 2990358360
! X1: 11 (21.8 bits)
! X2: 15 (29.7 bits)
! X3: 25 (49.6 bits)
! S1: 14 (28.2 bits)
! S2: 23 (46.1 bits)
Index: frac_problems3.blast
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/t/data/frac_problems3.blast,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** frac_problems3.blast 7 Mar 2007 14:35:27 -0000 1.1
--- frac_problems3.blast 14 May 2007 12:49:52 -0000 1.2
***************
*** 147,149 ****
Query: 223 EYETQVSLKGAEGERFQPDVLIRLPGDKQVVVDAKVSLTAYQQYIAADDDLLRQQALKQH 282
+ Q L+ AE L +Q + + + QQ + D LR +H
! Sbjct: 729 TQQLQQELRAAE---------TILQQRQQALTEQRQRYEHLQQQVEEDSQQLRPLLSNEH 779
\ No newline at end of file
--- 147,150 ----
Query: 223 EYETQVSLKGAEGERFQPDVLIRLPGDKQVVVDAKVSLTAYQQYIAADDDLLRQQALKQH 282
+ Q L+ AE L +Q + + + QQ + D LR +H
! Sbjct: 729 TQQLQQELRAAE---------TILQQRQQALTEQRQRYEHLQQQVEEDSQQLRPLLSNEH 779
!
From sendu at dev.open-bio.org Mon May 14 08:49:54 2007
From: sendu at dev.open-bio.org (Senduran Balasubramaniam)
Date: Mon, 14 May 2007 12:49:54 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/Search/Hit BlastPullHit.pm, NONE,
1.1 PullHitI.pm, 1.2, 1.3
Message-ID: <200705141249.l4ECns3S000957@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/Search/Hit
In directory dev.open-bio.org:/tmp/cvs-serv885/Bio/Search/Hit
Modified Files:
PullHitI.pm
Added Files:
BlastPullHit.pm
Log Message:
initial commit of blast pull modules and tests
--- NEW FILE: BlastPullHit.pm ---
# $Id: BlastPullHit.pm,v 1.1 2007/05/14 12:49:52 sendu Exp $
#
# BioPerl module for Bio::Search::Hit::BlastNHit
#
# Cared for by Sendu Bala
#
# Copyright Sendu Bala
#
# You may distribute this module under the same terms as perl itself
# POD documentation - main docs before the code
=head1 NAME
Bio::Search::Hit::BlastNHit - A parser and hit object for BLASTN hits
=head1 SYNOPSIS
# generally we use Bio::SearchIO to build these objects
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast_pull',
-file => 'result.blast');
while (my $result = $in->next_result) {
while (my $hit = $result->next_hit) {
print $hit->name, "\n";
print $hit->score, "\n";
print $hit->significance, "\n";
while (my $hsp = $hit->next_hsp) {
# process HSPI objects
}
}
}
=head1 DESCRIPTION
This object implements a parser for BLASTN hit output.
=head1 FEEDBACK
=head2 Mailing Lists
User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to
the Bioperl mailing list. Your participation is much appreciated.
bioperl-l at bioperl.org - General discussion
http://bioperl.org/wiki/Mailing_lists - About the mailing lists
=head2 Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via the
web:
http://bugzilla.open-bio.org/
=head1 AUTHOR - Sendu Bala
Email bix at sendu.me.uk
=head1 CONTRIBUTORS
Additional contributors names and emails here
=head1 APPENDIX
The rest of the documentation details each of the object methods.
Internal methods are usually preceded with a _
=cut
# Let the code begin...
package Bio::Search::Hit::BlastPullHit;
use strict;
use Bio::Search::HSP::BlastPullHSP;
use base qw(Bio::Root::Root Bio::Search::Hit::PullHitI);
=head2 new
Title : new
Usage : my $obj = new Bio::Search::Hit::BlastNHit();
Function: Builds a new Bio::Search::Hit::BlastNHit object.
Returns : Bio::Search::Hit::BlastNHit
Args : -chunk => [Bio::Root::IO, $start, $end] (required if no -parent)
-parent => Bio::PullParserI object (required if no -chunk)
-hit_data => array ref with [name description score significance]
where the array ref provided to -chunk contains an IO object
for a filehandle to something representing the raw data of the
hit, and $start and $end define the tell() position within the
filehandle that the hit data starts and ends (optional; defaults
to start and end of the entire thing described by the filehandle)
=cut
sub new {
my ($class, @args) = @_;
my $self = $class->SUPER::new(@args);
$self->_setup(@args);
my $fields = $self->_fields;
foreach my $field (qw( header start_end )) {
$fields->{$field} = undef;
}
my $hit_data = $self->_raw_hit_data;
if ($hit_data && ref($hit_data) eq 'ARRAY') {
foreach my $field (qw(name description score significance)) {
$fields->{$field} = shift(@{$hit_data});
}
}
$self->_dependencies( { ( name => 'header',
length => 'header',
description => 'header',
accession => 'header',
next_hsp => 'header',
query_start => 'start_end',
query_end => 'start_end',
hit_start => 'start_end',
hit_end => 'start_end' ) } );
return $self;
}
#
# PullParserI discovery methods so we can answer all HitI questions
#
sub _discover_header {
my $self = shift;
$self->_chunk_seek(0);
my $header = $self->_get_chunk_by_end("\n Score = ");
unless ($header) {
# no alignment or other data; all information was in the hit table of
# the result
$self->_calculate_accession_from_name;
$self->_fields->{header} = 1;
return;
}
$self->{_after_header} = $self->_chunk_tell;
($self->_fields->{name}, $self->_fields->{description}, $self->_fields->{length}) = $header =~ /^(\S+)\s(.+?)\s+Length\s*=\s*(\d+)/sm;
$self->_fields->{description} =~ s/\n//g;
$self->_calculate_accession_from_name;
$self->_fields->{header} = 1;
}
sub _calculate_accession_from_name {
my $self = shift;
my $name = $self->get_field('name');
if ($name =~ /.+?\|.+?\|.+?\|(\w+)/) {
$self->_fields->{accession} = $1;
}
elsif ($self->_fields->{name} =~ /.+?\|(\w+)?\./) {
# old form?
$self->_fields->{accession} = $1;
}
else {
$self->_fields->{accession} = $name;
}
}
sub _discover_start_end {
my $self = shift;
my ($q_start, $q_end, $h_start, $h_end);
foreach my $hsp ($self->hsps) {
my ($this_q_start, $this_h_start) = $hsp->start;
my ($this_q_end, $this_h_end) = $hsp->end;
if (! defined $q_start || $this_q_start < $q_start) {
$q_start = $this_q_start;
}
if (! defined $h_start || $this_h_start < $h_start) {
$h_start = $this_h_start;
}
if (! defined $q_end || $this_q_end > $q_end) {
$q_end = $this_q_end;
}
if (! defined $h_end || $this_h_end > $h_end) {
$h_end = $this_h_end;
}
}
$self->_fields->{query_start} = $q_start;
$self->_fields->{query_end} = $q_end;
$self->_fields->{hit_start} = $h_start;
$self->_fields->{hit_end} = $h_end;
}
sub _discover_next_hsp {
my $self = shift;
$self->_chunk_seek($self->{_end_of_previous_hsp} || $self->{_after_header});
my ($start, $end) = $self->_find_chunk_by_end("\n Score = ");
if ((defined $end && ($end + $self->_chunk_true_start) > $self->_chunk_true_end) || ! $end) {
$start = $self->{_end_of_previous_hsp} || $self->{_after_header};
$end = $self->_chunk_true_end;
}
else {
$end += $self->_chunk_true_start;
}
$start += $self->_chunk_true_start;
return if $start >= $self->_chunk_true_end;
$self->{_end_of_previous_hsp} = $end - $self->_chunk_true_start;
#*** needs to inherit piped_behaviour, and we need to deal with _sequential
# ourselves
$self->_fields->{next_hsp} = new Bio::Search::HSP::BlastPullHSP(-parent => $self,
-chunk => [$self->chunk, $start, $end]);
}
sub _discover_num_hsps {
my $self = shift;
$self->_fields->{num_hsps} = $self->hsps;
}
=head2 next_hsp
Title : next_hsp
Usage : while( $hsp = $obj->next_hsp()) { ... }
Function : Returns the next available High Scoring Pair
Example :
Returns : L object or null if finished
Args : none
=cut
sub next_hsp {
my $self = shift;
my $hsp = $self->get_field('next_hsp');
undef $self->_fields->{next_hsp};
return $hsp;
}
=head2 hsps
Usage : $hit_object->hsps();
Purpose : Get a list containing all HSP objects.
Example : @hsps = $hit_object->hsps();
Returns : list of L objects.
Argument : none
=cut
sub hsps {
my $self = shift;
my $old = $self->{_end_of_previous_hsp};
$self->rewind;
my @hsps;
while (defined(my $hsp = $self->next_hsp)) {
push(@hsps, $hsp);
}
$self->{_end_of_previous_hsp} = $old;
return @hsps;
}
=head2 hsp
Usage : $hit_object->hsp( [string] );
Purpose : Get a single HSPI object for the present HitI object.
Example : $hspObj = $hit_object->hsp; # same as 'best'
: $hspObj = $hit_object->hsp('best');
: $hspObj = $hit_object->hsp('worst');
Returns : Object reference for a L object.
Argument : String (or no argument).
: No argument (default) = highest scoring HSP (same as 'best').
: 'best' = highest scoring HSP.
: 'worst' = lowest scoring HSP.
Throws : Exception if an unrecognized argument is used.
See Also : L, L()
=cut
sub hsp {
my ($self, $type) = @_;
$type ||= 'best';
$self->throw_not_implemented;
}
=head2 rewind
Title : rewind
Usage : $result->rewind;
Function: Allow one to reset the HSP iterator to the beginning, so that
next_hsp() will subsequently return the first hsp and so on.
Returns : n/a
Args : none
=cut
sub rewind {
my $self = shift;
delete $self->{_end_of_previous_hsp};
}
# have p() a synonym of significance()
sub p {
return shift->significance;
}
1;
Index: PullHitI.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/Search/Hit/PullHitI.pm,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** PullHitI.pm 26 Sep 2006 22:03:13 -0000 1.2
--- PullHitI.pm 14 May 2007 12:49:52 -0000 1.3
***************
*** 421,425 ****
foreach my $hsp ($self->hsps) {
# This will merge data for all HSPs together.
! push @inds, $hsp->seq_inds($seqType, $class);
}
--- 421,426 ----
foreach my $hsp ($self->hsps) {
# This will merge data for all HSPs together.
! my @these_inds = $hsp->seq_inds($seqType, $class);
! push @inds, @these_inds;
}
***************
*** 434,438 ****
my @inds = @{$self->{$storage_name}};
-
$collapse ? &Bio::Search::SearchUtils::collapse_nums(@inds) : @inds;
}
--- 435,438 ----
***************
*** 1017,1021 ****
my $ratio_rounded = sprintf( "%.3f", $ratio);
! # Round down iff normal rounding yields 1 (just like blast)
$ratio_rounded = 0.999 if (($ratio_rounded == 1) && ($ratio < 1));
return $ratio_rounded;
--- 1017,1021 ----
my $ratio_rounded = sprintf( "%.3f", $ratio);
! # Round down if normal rounding yields 1 (just like blast)
$ratio_rounded = 0.999 if (($ratio_rounded == 1) && ($ratio < 1));
return $ratio_rounded;
From sendu at dev.open-bio.org Mon May 14 08:49:54 2007
From: sendu at dev.open-bio.org (Senduran Balasubramaniam)
Date: Mon, 14 May 2007 12:49:54 +0000
Subject: [Bioperl-guts-l] bioperl-live/Bio/SearchIO blast_pull.pm,NONE,1.1
Message-ID: <200705141249.l4ECns8h000974@dev.open-bio.org>
Update of /home/repository/bioperl/bioperl-live/Bio/SearchIO
In directory dev.open-bio.org:/tmp/cvs-serv885/Bio/SearchIO
Added Files:
blast_pull.pm
Log Message:
initial commit of blast pull modules and tests
--- NEW FILE: blast_pull.pm ---
# $Id: blast_pull.pm,v 1.1 2007/05/14 12:49:52 sendu Exp $
#
# BioPerl module for Bio::SearchIO::blast_pull
#
# Cared for by Sendu Bala
#
# Copyright Sendu Bala
#
# You may distribute this module under the same terms as perl itself
# POD documentation - main docs before the code
=head1 NAME
Bio::SearchIO::blast_pull - A parser for BLAST output
=head1 SYNOPSIS
# do not use this class directly it is available through Bio::SearchIO
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast_pull',
-file => 't/data/new_blastn.txt');
while (my $result = $in->next_result) {
# this is a Bio::Search::Result::BlastNResult object
print "Results for ", $result->query_name(), "\n";
while (my $hit = $result->next_hit) {
print $hit->name(), "\n";
while (my $hsp = $hit->next_hsp) {
print "length is ", $hsp->length(), "\n";
}
}
}
=head1 DESCRIPTION
This object implements a pull-parser for BLAST output. It is fast since it
only does work on request (hence 'pull').
Currently only NCBI BLASTN and BLASTP are supported.
=head1 FEEDBACK
=head2 Mailing Lists
User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to
the Bioperl mailing list. Your participation is much appreciated.
bioperl-l at bioperl.org - General discussion
http://bioperl.org/wiki/Mailing_lists - About the mailing lists
=head2 Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via the
web:
http://bugzilla.open-bio.org/
=head1 AUTHOR - Sendu Bala
Email bix at sendu.me.uk
=head1 APPENDIX
The re