From hartzell at dev.open-bio.org Wed Apr 1 13:25:19 2009 From: hartzell at dev.open-bio.org (George Hartzell) Date: Wed, 1 Apr 2009 13:25:19 -0400 Subject: [Bioperl-guts-l] [15621] bioperl-live/trunk/Bio/LocatableSeq.pm: Message-ID: <200904011725.n31HPJnx008652@dev.open-bio.org> Revision: 15621 Author: hartzell Date: 2009-04-01 13:25:18 -0400 (Wed, 01 Apr 2009) Log Message: ----------- * small documentation fix, make the description of the type of the return value match what's actually returned. Modified Paths: -------------- bioperl-live/trunk/Bio/LocatableSeq.pm Modified: bioperl-live/trunk/Bio/LocatableSeq.pm =================================================================== --- bioperl-live/trunk/Bio/LocatableSeq.pm 2009-03-30 17:49:50 UTC (rev 15620) +++ bioperl-live/trunk/Bio/LocatableSeq.pm 2009-04-01 17:25:18 UTC (rev 15621) @@ -459,8 +459,8 @@ This function gives the residue number for a given position in the alignment (i.e. column number) of the given. Gaps complicate this process and force the output to be a - L where values can be undefined. For example, - for the sequence: + L where values can be undefined. + For example, for the sequence: Seq/91-96 .AC..DEF.G. From maj at dev.open-bio.org Thu Apr 2 00:20:36 2009 From: maj at dev.open-bio.org (Mark Allen Jensen) Date: Thu, 2 Apr 2009 00:20:36 -0400 Subject: [Bioperl-guts-l] [15622] bioperl-live/trunk/Bio/SimpleAlign.pm: uniq_seq: removed commented print STDERR lines Message-ID: <200904020420.n324KaUq017119@dev.open-bio.org> Revision: 15622 Author: maj Date: 2009-04-02 00:20:36 -0400 (Thu, 02 Apr 2009) Log Message: ----------- uniq_seq: removed commented print STDERR lines changed POD to reflect the correct way to obtain ST output. See the thread: http://lists.open-bio.org/pipermail/bioperl-l/2009-April/029691.html Modified Paths: -------------- bioperl-live/trunk/Bio/SimpleAlign.pm Modified: bioperl-live/trunk/Bio/SimpleAlign.pm =================================================================== --- bioperl-live/trunk/Bio/SimpleAlign.pm 2009-04-01 17:25:18 UTC (rev 15621) +++ bioperl-live/trunk/Bio/SimpleAlign.pm 2009-04-02 04:20:36 UTC (rev 15622) @@ -620,7 +620,8 @@ differences. Function : Make a new alignment of unique sequence types (STs) Returns : 1. a new Bio::SimpleAlign object (all sequences renamed as "ST") - 2. ST of each sequence in STDERR + 2. if $aln->verbose > 0, ST of each sequence is sent to + STDERR Argument : None =cut @@ -684,12 +685,10 @@ -end =>$end ); $aln->add_seq($new); -# print STDERR "ST".$order{$str}, "\t=>"; foreach (@{$member{$str}}) { $self->debug($_->id(), "\t", "ST", $order{$str}, "\n"); + } } -# print STDERR "\n"; - } return $aln; } From bugzilla-daemon at portal.open-bio.org Thu Apr 2 10:02:21 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 2 Apr 2009 10:02:21 -0400 Subject: [Bioperl-guts-l] [Bug 2805] New: Bio::SimpleAlign, uniq_seq, and ST composition Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2805 Summary: Bio::SimpleAlign, uniq_seq, and ST composition Product: BioPerl Version: 1.6 branch Platform: PC OS/Version: Linux Status: NEW Severity: enhancement Priority: P2 Component: Core Components AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: tristan.lefebure at gmail.com Hello, The uniq_seq function of Bio::SimpleAlign, after collapsing identical sequences into sequence types (ST), will return a new aln object, and will print to STDERR the ST composition. The proposed enhancement would keep the ST composition information internally so that it can be re-used elsewhere in the script. Here is a simple, may be not Bioperl compliant hack to do so: diff /usr/local/share/perl/5.10.0/Bio/SimpleAlign.pm /usr/local/share/perl/5.10.0/Bio/SimpleAlignMod.pm 590a591,592 > #modified to also returned an array of the ST composition > my %st; 651a654 > push @{$st{$order{$str}}}, $_->id(); 655c658 < return $aln; --- > return ($aln, %st); See the mailing list archive for a more detailed description: http://bioperl.org/pipermail/bioperl-l/2009-April/029691.html --Tristan -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Apr 2 10:04:43 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 2 Apr 2009 10:04:43 -0400 Subject: [Bioperl-guts-l] [Bug 2805] Bio::SimpleAlign, uniq_seq, and ST composition In-Reply-To: Message-ID: <200904021404.n32E4hDO029837@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2805 maj at fortinbras.us changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |ASSIGNED ------- Comment #1 from maj at fortinbras.us 2009-04-02 10:04 EST ------- Thanks for your input, Tristan- MAJ -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From jason at dev.open-bio.org Thu Apr 2 11:51:08 2009 From: jason at dev.open-bio.org (Jason Stajich) Date: Thu, 2 Apr 2009 11:51:08 -0400 Subject: [Bioperl-guts-l] [15623] bioperl-live/trunk/scripts/searchio/fastam9_to_table.PLS: MPI output slightly different Message-ID: <200904021551.n32Fp8Gx000817@dev.open-bio.org> Revision: 15623 Author: jason Date: 2009-04-02 11:51:07 -0400 (Thu, 02 Apr 2009) Log Message: ----------- MPI output slightly different Modified Paths: -------------- bioperl-live/trunk/scripts/searchio/fastam9_to_table.PLS Modified: bioperl-live/trunk/scripts/searchio/fastam9_to_table.PLS =================================================================== --- bioperl-live/trunk/scripts/searchio/fastam9_to_table.PLS 2009-04-02 04:20:36 UTC (rev 15622) +++ bioperl-live/trunk/scripts/searchio/fastam9_to_table.PLS 2009-04-02 15:51:07 UTC (rev 15623) @@ -85,10 +85,10 @@ my $linestr = $_; if( /^\s*\d+>>>(\S+).+/ ) { $data{'qname'} = $1; - if( /\-\s+(\d+)\s+(aa|nt)\s+$/ ){ + if( /\-?\s+(\d+)\s+(aa|nt)\s+$/ ){ $data{'qlen'} = $1; } - } elsif( $hitsection && /^>>>\Q$data{'qname'}/ ) { + } elsif( $hitsection && (/^>>>\Q$data{'qname'}/ || /^>>>/) ) { $hitsection = 0; } elsif( /^The best scores are:/ ) { $hitsection = 1; From lstein at dev.open-bio.org Sat Apr 4 04:57:41 2009 From: lstein at dev.open-bio.org (Lincoln Stein) Date: Sat, 4 Apr 2009 04:57:41 -0400 Subject: [Bioperl-guts-l] [15624] bioperl-live/trunk: fixed round-tripping of gff3 format when a feature has multiple parentage. Message-ID: <200904040857.n348vfRs014990@dev.open-bio.org> Revision: 15624 Author: lstein Date: 2009-04-04 04:57:40 -0400 (Sat, 04 Apr 2009) Log Message: ----------- fixed round-tripping of gff3 format when a feature has multiple parentage. Modified Paths: -------------- bioperl-live/trunk/Bio/DB/SeqFeature/NormalizedFeature.pm bioperl-live/trunk/Bio/DB/SeqFeature/Store/GFF3Loader.pm bioperl-live/trunk/Bio/SeqFeature/Lite.pm bioperl-live/trunk/t/LocalDB/SeqFeature.t bioperl-live/trunk/t/data/seqfeaturedb/test.gff3 Modified: bioperl-live/trunk/Bio/DB/SeqFeature/NormalizedFeature.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/NormalizedFeature.pm 2009-04-02 15:51:07 UTC (rev 15623) +++ bioperl-live/trunk/Bio/DB/SeqFeature/NormalizedFeature.pm 2009-04-04 08:57:40 UTC (rev 15624) @@ -471,8 +471,11 @@ # undo the load_id and Target hacks on the way out sub format_attributes { my $self = shift; - my $parent = shift; + my $parent = shift; + my $fallback_id = shift; + my $load_id = $self->load_id || ''; + my $targobj = ($self->attributes('Target'))[0]; # was getting an 'Use of uninitialized value with split' here, changed to cooperate -cjf 7/10/07 my ($target) = $targobj ? split /\s+/,($self->attributes('Target'))[0] : (''); @@ -491,10 +494,14 @@ foreach (@values) { s/\s+$// } # get rid of trailing whitespace push @result,join '=',$self->escape($t),join(',', map {$self->escape($_)} @values) if @values; } - my $id = $self->primary_id; + my $id = $self->primary_id || $fallback_id; + my $parent_id; + if (@$parent) { + $parent_id = join (',',map {$self->escape($_)} @$parent); + } my $name = $self->display_name; - unshift @result,"ID=".$self->escape($id) if defined $id; - unshift @result,"Parent=".$self->escape($parent->primary_id) if defined $parent; + unshift @result,"ID=".$self->escape($id) if defined $id; + unshift @result,"Parent=".$parent_id if defined $parent_id; unshift @result,"Name=".$self->escape($name) if defined $name; return join ';', at result; } Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store/GFF3Loader.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/Store/GFF3Loader.pm 2009-04-02 15:51:07 UTC (rev 15623) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Store/GFF3Loader.pm 2009-04-04 08:57:40 UTC (rev 15624) @@ -570,16 +570,20 @@ # contiguous feature, so add a segment warn $old_feat if defined $old_feat and !ref $old_feat; - if (defined $old_feat && - ( - $old_feat->seq_id ne $refname || - $old_feat->start != $start || - $old_feat->end != $end # make sure endpoints are distinct - ) - ) - { - $self->add_segment($old_feat,$self->sfclass->new(@args)); - return; + if (defined $old_feat) { + # set this to 1 to disable split-location behavior + if (0 && @parent_ids) { # If multiple features are held together by the same ID + $feature_id = $ld->{TemporaryID}++; # AND they have a Parent attribute, this causes an undesirable + } # additional layer of aggregation. Changing the ID fixes this. + elsif ( + $old_feat->seq_id ne $refname || + $old_feat->start != $start || + $old_feat->end != $end # make sure endpoints are distinct + ) + { + $self->add_segment($old_feat,$self->sfclass->new(@args)); + return; + } } # we get here if this is a new feature @@ -597,10 +601,6 @@ my $has_id = defined $reserved->{ID}[0]; $index_it ||= $top_level; -# $ld->{IndexIt}{$feature_id}++ if $index_it; -# $ld->{TopLevel}{$feature_id}++ if !$self->{fast} -# && $top_level; # need to track top level features - my $helper = $ld->{Helper}; $helper->indexit($feature_id=>1) if $index_it; $helper->toplevel($feature_id=>1) if !$self->{fast} Modified: bioperl-live/trunk/Bio/SeqFeature/Lite.pm =================================================================== --- bioperl-live/trunk/Bio/SeqFeature/Lite.pm 2009-04-02 15:51:07 UTC (rev 15623) +++ bioperl-live/trunk/Bio/SeqFeature/Lite.pm 2009-04-04 08:57:40 UTC (rev 15624) @@ -266,7 +266,10 @@ -type => $type, -name => $name, -class => $class, + -phase => $self->{phase}, + -score => $self->{score}, -source_tag => $source_tag, + -attributes => $self->{attributes}, ); $min_start = $start if $start < $min_start; $max_stop = $stop if $stop > $max_stop; @@ -280,9 +283,6 @@ } if (@segments) { local $^W = 0; # some warning of an uninitialized variable... - # this was killing performance! - # $self->{segments} = [ sort {$a->start <=> $b->start } @segments ]; - # this seems much faster and seems to work still $self->{segments} = \@segments; $self->{ref} ||= $self->{segments}[0]->seq_id; $self->{start} = $min_start; @@ -514,7 +514,8 @@ my $self = shift; my $d = $self->{primary_id}; $self->{primary_id} = shift if @_; - $d; + return $d if defined $d; + return (overload::StrVal($self) =~ /0x([a-f0-9]+)/)[0]; } sub notes { @@ -677,67 +678,66 @@ $string; } +# Suggested strategy for dealing with the multiple parentage issue. +# First recurse through object tree and record parent tree. +# Then recurse again, skipping objects we've seen before. sub gff3_string { - my ($self, $recurse, $preserveHomegenousParent, $dontPropogateParentAttrs, - # Note: the following parameters, whose name begins with '$_', - # are intended for recursive call only. - $_parent, - $_parentGroup, # if so, what is the group (GFF column 9) of the parent - ) = @_; + my ($self,$recurse,$parent_tree,$seenit,$force_id) = @_; + $parent_tree ||= {}; + $seenit ||= {}; + my @rsf = (); + my @parent_ids; - # PURPOSE: Return GFF3 format for the feature $self. Optionally - # $recurse to include GFF for any subfeatures of the feature. If - # recursing, provide special handling to "remove an extraneous level - # of parentage" (unless $preserveHomegenousParent) for features - # which have at least one subfeature with the same type as the - # feature itself (thus redefining Lincoln's "homogenous - # parent/child" case, which previously required all children to have - # the same type as parent). This usage is a convention for - # representing discontiguous features; they may be created by using - # the -segment directive without specifying a distinct -subtype to - # Bio::SeqFeature::LiteBase->new (or to Bio::DB::SeqFeature, - # Bio::SeqFeature::Lite). Such homogenous subfeatures created in - # this fashion TYPICALLY do not have the parent (GFF column 9) - # attributes propogated to them; but, since they are all part of the - # same parent, the ONLY difference relevant to GFF production SHOULD - # be the $start and $end coordinates for their segment, and ALL - # THEIR OTHER ATTRIBUTES should be taken from the parent (including: - # score, Name, ID, Parent, etc), which happens UNLESS - # $dontPropogateParentAttrs is passed. + if ($recurse) { + $self->_traverse($parent_tree) unless %$parent_tree; # this will record parents of all children + my $primary_id = defined $force_id ? $force_id : $self->primary_id; - my @rsf = $recurse ? $self->sub_SeqFeature : (); - my $recurseSubfeatureWithSameType = - # will be TRUE if we're going to recurse and at least 1 subfeature - # has same type as $self. - sub {($_->type eq $self->type) && return 1 for @rsf ; 0 }->(); - my $typeIsSameAsParent = $_parent && ($_parent->type eq $self->type); - my $hparentOrSelf = ($typeIsSameAsParent && ! $dontPropogateParentAttrs) ? $_parent : $self; - my $group = ($typeIsSameAsParent && ! $dontPropogateParentAttrs) - ? $_parentGroup - : $self->format_attributes($_parent); + return if $seenit->{$primary_id}++; - my @gff3 = $recurseSubfeatureWithSameType && ! $preserveHomegenousParent ? () : - do { - my $name = $hparentOrSelf->name; - my $class = $hparentOrSelf->class; - my $strand = ('-','.','+')[$hparentOrSelf->strand+1]; - # TODO: understand conditions under which $self->strand could be other than - # $hparentOrSelf->strand. In particular, why does add_segment flip - # the strand when start > stop? I thought this was not allowed! - # Lincoln - any ideas? - my $p = join("\t", - $hparentOrSelf->ref||'.',$hparentOrSelf->source||'.',$hparentOrSelf->method||'.', - $self->start||'.',$self->stop||'.', - defined($hparentOrSelf->score) ? $hparentOrSelf->score : '.', - $strand||'.', - defined($hparentOrSelf->phase) ? $hparentOrSelf->phase : '.', - $group||''); - $p; - }; - join("\n", @gff3, map {$_->gff3_string($recurse,$preserveHomegenousParent, - $dontPropogateParentAttrs,$hparentOrSelf,$group)} @rsf); + @rsf = $self->get_SeqFeatures; + if (@rsf) { + # Detect case in which we have a split location feature. In this case we + # skip to the grandchildren and trick them into thinking that our parent is theirs. + my %types = map {$_->primary_tag=>1} @rsf; + my @types = keys %types; + if (@types == 1 && $types[0] eq $self->primary_tag) { + return join ("\n",map {$_->gff3_string(1,$parent_tree,{},$primary_id)} @rsf); + } + } + + @parent_ids = keys %{$parent_tree->{$primary_id}}; + } + + my $group = $self->format_attributes(\@parent_ids,$force_id); + my $name = $self->name; + + my $class = $self->class; + my $strand = ('-','.','+')[$self->strand+1]; + my $p = join("\t", + $self->seq_id||'.', + $self->source||'.', + $self->method||'.', + $self->start||'.', + $self->stop||'.', + defined($self->score) ? $self->score : '.', + $strand||'.', + defined($self->phase) ? $self->phase : '.', + $group||''); + return join("\n", + $p, + map {$_->gff3_string(1,$parent_tree,$seenit)} @rsf); } +sub _traverse { + my $self = shift; + my $tree = shift; # tree => {$child}{$parent} = 1 + my $parent = shift; + my $id = $self->primary_id; + defined $id or return; + $tree->{$id}{$parent->primary_id}++ if $parent; + $_->_traverse($tree,$self) foreach $self->get_SeqFeatures; +} + sub db { return } sub source_tag { @@ -798,18 +798,26 @@ } sub format_attributes { - my $self = shift; - my $parent = shift; + my $self = shift; + my $parent = shift; + my $fallback_id = shift; + @@ Diff output truncated at 10000 characters. @@ From lstein at dev.open-bio.org Mon Apr 6 13:24:59 2009 From: lstein at dev.open-bio.org (Lincoln Stein) Date: Mon, 6 Apr 2009 13:24:59 -0400 Subject: [Bioperl-guts-l] [15625] bioperl-live/trunk/Bio: backed out changes to primary_id that broke seqfeature::store Message-ID: <200904061724.n36HOxRp026827@dev.open-bio.org> Revision: 15625 Author: lstein Date: 2009-04-06 13:24:58 -0400 (Mon, 06 Apr 2009) Log Message: ----------- backed out changes to primary_id that broke seqfeature::store Modified Paths: -------------- bioperl-live/trunk/Bio/DB/SeqFeature/Store/berkeleydb.pm bioperl-live/trunk/Bio/SeqFeature/Lite.pm Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store/berkeleydb.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/Store/berkeleydb.pm 2009-04-04 08:57:40 UTC (rev 15624) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Store/berkeleydb.pm 2009-04-06 17:24:58 UTC (rev 15625) @@ -12,7 +12,7 @@ use File::Path 'rmtree','mkpath'; use File::Basename; use File::Spec; -use Carp 'carp'; +use Carp 'carp','croak'; use constant BINSIZE => 10_000; use constant MININT => -999_999_999_999; @@ -61,7 +61,7 @@ # use the GFF3 loader to do a bulk write: my $loader = Bio::DB::SeqFeature::Store::GFF3Loader->new(-store => $db, - -verbose => 1, + -verbose => 0, -fast => 1); $loader->load('/home/fly4.3/dmel-all.gff'); @@ -189,6 +189,9 @@ -temp Pass true to create temporary index files that will be deleted once the script exits. + -verbose Pass true to report autoindexing operations on STDERR. + (default is true). + Examples: To create an empty database which will be populated using calls to @@ -257,6 +260,8 @@ 'VERBOSE' ], at _); + $verbose = 1 unless defined $verbose; + if ($autoindex) { -d $autoindex or $self->throw("Invalid directory $autoindex"); $directory ||= "$autoindex/indexes"; @@ -301,13 +306,15 @@ my $autodir = shift; my $result = $self->needs_auto_reindexing($autodir); - if (%$result) { + if ($result && %$result) { + $self->flag_autoindexing(1); $self->lock('exclusive'); $self->reindex_wigfiles($result->{wig},$autodir) if $result->{wig}; $self->reindex_ffffiles($result->{fff},$autodir) if $result->{fff}; $self->reindex_gfffiles($result->{gff},$autodir) if $result->{gff}; $self->dna_db(Bio::DB::Fasta::Subdir->new($autodir)); $self->unlock; + $self->flag_autoindexing(0); } else { @@ -315,6 +322,47 @@ } } +sub autoindex_flagfile { + return File::Spec->catfile(shift->directory,'autoindex.pid'); +} +sub auto_index_in_process { + my $self = shift; + my $flag_file = $self->autoindex_flagfile; + return unless -e $flag_file; + + # if flagfile exists, then check that PID still exists + open my $fh,$flag_file or die "Couldn't open $flag_file: $!"; + my $pid = <$fh>; + close $fh; + return 1 if kill 0=>$pid; + warn "Autoindexing seems to be running in another process, but the process has gone away. Trying to override..."; + if (unlink $flag_file) { + warn "Successfully removed stale PID file." if $self->verbose; + warn "Assuming partial reindexing process. Rebuilding indexes from scratch..." if $self->verbose; + my $glob = File::Spec->catfile($self->directory,'*'); + unlink glob($glob); + return; + } else { + croak ("Cannot recover from apparent aborted autoindex process. Please remove files in ", + $self->directory, + " and allow the adaptor to reindex"); + return 1; + } +} + +sub flag_autoindexing { + my $self = shift; + my $doit = shift; + my $flag_file = $self->autoindex_flagfile; + if ($doit) { + open my $fh,'>',$flag_file or die "Couldn't open $flag_file: $!"; + print $fh $$; + close $fh; + } else { + unlink $flag_file; + } +} + sub reindex_gfffiles { my $self = shift; my $files = shift; @@ -323,11 +371,13 @@ warn "Reindexing GFF files...\n" if $self->verbose; $self->_permissions(1,1); $self->_close_databases(); - $self->_open_databases(1,1); + $self->_open_databases(1,0); require Bio::DB::SeqFeature::Store::GFF3Loader unless Bio::DB::SeqFeature::Store::GFF3Loader->can('new'); my $loader = Bio::DB::SeqFeature::Store::GFF3Loader->new(-store => $self, - -sf_class => $self->seqfeature_class) + -sf_class => $self->seqfeature_class, + -verbose => $self->verbose, + ) or $self->throw("Couldn't create GFF3Loader"); my %seen; $loader->load(grep {!$seen{$_}++} @$files); @@ -346,7 +396,9 @@ require Bio::DB::SeqFeature::Store::FeatureFileLoader unless Bio::DB::SeqFeature::Store::FeatureFileLoader->can('new'); my $loader = Bio::DB::SeqFeature::Store::FeatureFileLoader->new(-store => $self, - -sf_class => $self->seqfeature_class) + -sf_class => $self->seqfeature_class, + -verbose => $self->verbose, + ) or $self->throw("Couldn't create FeatureFileLoader"); my %seen; $loader->load(grep {!$seen{$_}++} @$files); @@ -401,6 +453,9 @@ my $autodir = shift; my $result = {}; + # don't allow two processes to reindex simultaneously + $self->auto_index_in_process and croak "Autoindexing in process. Try again later"; + # first interrogate the GFF3 files, using the timestamp file # as modification comparison my (@gff3, at fff, at wig,$fasta,$fasta_index_time); @@ -408,6 +463,7 @@ or $self->throw("Couldn't open directory $autodir for reading: $!"); my $maxtime = 0; + my $timestamp_time = _mtime($self->_mtime_path) || 0; while (defined (my $node = readdir($D))) { next if $node =~ /^\./; my $path = File::Spec->catfile($autodir,$node); @@ -417,6 +473,7 @@ my $mtime = _mtime(\*_); # not a typo $maxtime = $mtime if $mtime > $maxtime; push @gff3,$path; +# push @gff3,$path if $mtime > $timestamp_time; } @@ -424,6 +481,7 @@ my $mtime = _mtime(\*_); # not a typo $maxtime = $mtime if $mtime > $maxtime; push @fff,$path; +# push @fff,$path if $mtime > $timestamp_time; } elsif ($path =~ /\.wig$/i) { @@ -443,9 +501,9 @@ } } closedir $D; - my $timestamp_time = _mtime($self->_mtime_path) || 0; $result->{gff} = \@gff3 if $maxtime > $timestamp_time; +# $result->{gff} = \@gff3 if @gff3; $result->{wig} = \@wig if @wig; $result->{fff} = \@fff if @fff; $result->{fasta}++ if $fasta; Modified: bioperl-live/trunk/Bio/SeqFeature/Lite.pm =================================================================== --- bioperl-live/trunk/Bio/SeqFeature/Lite.pm 2009-04-04 08:57:40 UTC (rev 15624) +++ bioperl-live/trunk/Bio/SeqFeature/Lite.pm 2009-04-06 17:24:58 UTC (rev 15625) @@ -514,8 +514,9 @@ my $self = shift; my $d = $self->{primary_id}; $self->{primary_id} = shift if @_; - return $d if defined $d; - return (overload::StrVal($self) =~ /0x([a-f0-9]+)/)[0]; + return $d; +# return $d if defined $d; +# return (overload::StrVal($self) =~ /0x([a-f0-9]+)/)[0]; } sub notes { @@ -690,7 +691,7 @@ if ($recurse) { $self->_traverse($parent_tree) unless %$parent_tree; # this will record parents of all children - my $primary_id = defined $force_id ? $force_id : $self->primary_id; + my $primary_id = defined $force_id ? $force_id : $self->_real_or_dummy_id; return if $seenit->{$primary_id}++; @@ -728,13 +729,20 @@ map {$_->gff3_string(1,$parent_tree,$seenit)} @rsf); } +sub _real_or_dummy_id { + my $self = shift; + my $id = $self->primary_id; + return $id if defined $id; + return return (overload::StrVal($self) =~ /0x([a-f0-9]+)/)[0]; +} + sub _traverse { my $self = shift; my $tree = shift; # tree => {$child}{$parent} = 1 my $parent = shift; - my $id = $self->primary_id; + my $id = $self->_real_or_dummy_id; defined $id or return; - $tree->{$id}{$parent->primary_id}++ if $parent; + $tree->{$id}{$parent->_real_or_dummy_id}++ if $parent; $_->_traverse($tree,$self) foreach $self->get_SeqFeatures; } @@ -808,7 +816,7 @@ my @values = $self->each_tag_value($t); push @result,join '=',$self->escape($t),join(',', map {$self->escape($_)} @values) if @values; } - my $id = $self->escape($self->primary_id) || $fallback_id; + my $id = $self->escape($self->_real_or_dummy_id) || $fallback_id; my $parent_id; if (@$parent) { From cjfields at dev.open-bio.org Tue Apr 7 01:27:17 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Tue, 7 Apr 2009 01:27:17 -0400 Subject: [Bioperl-guts-l] [15626] bioperl-live/trunk/Bio/Search/HSP/HSPI.pm: typo Message-ID: <200904070527.n375RHY8031330@dev.open-bio.org> Revision: 15626 Author: cjfields Date: 2009-04-07 01:27:17 -0400 (Tue, 07 Apr 2009) Log Message: ----------- typo Modified Paths: -------------- bioperl-live/trunk/Bio/Search/HSP/HSPI.pm Modified: bioperl-live/trunk/Bio/Search/HSP/HSPI.pm =================================================================== --- bioperl-live/trunk/Bio/Search/HSP/HSPI.pm 2009-04-06 17:24:58 UTC (rev 15625) +++ bioperl-live/trunk/Bio/Search/HSP/HSPI.pm 2009-04-07 05:27:17 UTC (rev 15626) @@ -248,7 +248,7 @@ Title : gaps Usage : my $gaps = $hsp->gaps( ['query'|'hit'|'total'] ); - Function : Get the number of gap charcters in the query, hit, or total alignment. + Function : Get the number of gap characters in the query, hit, or total alignment. Returns : Integer, number of gap characters or 0 if none Args : 'query' = num conserved / length of query seq (without gaps) 'hit' = num conserved / length of hit seq (without gaps) From cjfields at dev.open-bio.org Tue Apr 7 01:28:38 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Tue, 7 Apr 2009 01:28:38 -0400 Subject: [Bioperl-guts-l] [15627] bioperl-live/trunk/Bio/AnnotatableI.pm: remove unnecessary use statements Message-ID: <200904070528.n375Scqw031361@dev.open-bio.org> Revision: 15627 Author: cjfields Date: 2009-04-07 01:28:38 -0400 (Tue, 07 Apr 2009) Log Message: ----------- remove unnecessary use statements Modified Paths: -------------- bioperl-live/trunk/Bio/AnnotatableI.pm Modified: bioperl-live/trunk/Bio/AnnotatableI.pm =================================================================== --- bioperl-live/trunk/Bio/AnnotatableI.pm 2009-04-07 05:27:17 UTC (rev 15626) +++ bioperl-live/trunk/Bio/AnnotatableI.pm 2009-04-07 05:28:38 UTC (rev 15627) @@ -84,12 +84,6 @@ package Bio::AnnotatableI; use strict; -use Bio::Annotation::Comment; -use Bio::Annotation::DBLink; -#use Bio::Annotation::OntologyTerm; -use Bio::Annotation::Reference; -use Bio::Annotation::SimpleValue; - use base qw(Bio::Root::RootI); =head2 annotation From cjfields at dev.open-bio.org Tue Apr 7 01:30:21 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Tue, 7 Apr 2009 01:30:21 -0400 Subject: [Bioperl-guts-l] [15628] bioperl-live/trunk: fix for mail list location bug along with a test Message-ID: <200904070530.n375ULvX031392@dev.open-bio.org> Revision: 15628 Author: cjfields Date: 2009-04-07 01:30:21 -0400 (Tue, 07 Apr 2009) Log Message: ----------- fix for mail list location bug along with a test Modified Paths: -------------- bioperl-live/trunk/Bio/Factory/FTLocationFactory.pm bioperl-live/trunk/t/SeqFeature/LocationFactory.t Modified: bioperl-live/trunk/Bio/Factory/FTLocationFactory.pm =================================================================== --- bioperl-live/trunk/Bio/Factory/FTLocationFactory.pm 2009-04-07 05:28:38 UTC (rev 15627) +++ bioperl-live/trunk/Bio/Factory/FTLocationFactory.pm 2009-04-07 05:30:21 UTC (rev 15628) @@ -172,18 +172,17 @@ next if !$subloc; my $oparg = ($subloc eq 'join' || $subloc eq 'bond' || $subloc eq 'order' || $subloc eq 'complement') ? $subloc : undef; - # has operator, requires further work (recurse) if ($oparg) { my $sub = shift @sublocs; + # simple split operators (no recursive calls needed) if (($oparg eq 'join' || $oparg eq 'order' || $oparg eq 'bond' ) - && $sub !~ m{$oparg}) { + && $sub !~ m{(?:join|order|bond)}) { my @splitlocs = split(q(,), $sub); - $loc_obj = Bio::Location::Split->new(); + $loc_obj = Bio::Location::Split->new(-verbose => 1, + -splittype => $oparg); while (my $splitloc = shift @splitlocs) { next unless $splitloc; - #$loc_obj->add_sub_Location($self->from_string($splitloc, 1)); - # this should work but doesn't my $sobj; if ($splitloc =~ m{\(($LOCREG)\)}) { my $comploc = $1; @@ -196,6 +195,9 @@ } } else { $loc_obj = $self->from_string($sub, $oparg); + # reinsure the operator is set correctly for this level + # unless it is complement + $loc_obj->splittype($oparg) unless $oparg eq 'complement'; } } # no operator, simple or fuzzy Modified: bioperl-live/trunk/t/SeqFeature/LocationFactory.t =================================================================== --- bioperl-live/trunk/t/SeqFeature/LocationFactory.t 2009-04-07 05:28:38 UTC (rev 15627) +++ bioperl-live/trunk/t/SeqFeature/LocationFactory.t 2009-04-07 05:30:21 UTC (rev 15628) @@ -7,7 +7,7 @@ use lib '.'; use Bio::Root::Test; - test_begin(-tests => 271); + test_begin(-tests => 272); use_ok('Bio::Factory::FTLocationFactory'); } @@ -134,7 +134,8 @@ 'join(20464..20694,21548..22763,complement(join(314652..314672,232596..232990,231520..231669)))', 'join(20464..20694,21548..22763,complement(join(314652..314672,232596..232990,231520..231669)))', # this is just seen once - 'join(1000..2000,join(3000..4000,join(5000..6000,7000..8000)),9000..10000)' + 'join(1000..2000,join(3000..4000,join(5000..6000,7000..8000)),9000..10000)', + 'order(S67862.1:72..75,join(S67863.1:1..788,1..19))' ); for my $locstr ( @@ -142,7 +143,8 @@ 'join(11025..11049,complement(join(315036..315294,251354..251412,241499..241580,239890..240081)))', 'join(20464..20694,21548..22763,complement(join(314652..314672,232596..232990,231520..231669)))', 'join(20464..20694,21548..22763,join(complement(231520..231669),complement(232596..232990),complement(314652..314672)))', - 'join(1000..2000,join(3000..4000,join(5000..6000,7000..8000)),9000..10000)' + 'join(1000..2000,join(3000..4000,join(5000..6000,7000..8000)),9000..10000)', + 'order(S67862.1:72..75,join(S67863.1:1..788,1..19))' ) { my $loc = $locfac->from_string($locstr); my $ftstr = $loc->to_FTstring(); From maj at dev.open-bio.org Wed Apr 8 12:19:54 2009 From: maj at dev.open-bio.org (Mark Allen Jensen) Date: Wed, 8 Apr 2009 12:19:54 -0400 Subject: [Bioperl-guts-l] [15629] bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm: pod mod Message-ID: <200904081619.n38GJsqf008617@dev.open-bio.org> Revision: 15629 Author: maj Date: 2009-04-08 12:19:53 -0400 (Wed, 08 Apr 2009) Log Message: ----------- pod mod Modified Paths: -------------- bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm Modified: bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm =================================================================== --- bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm 2009-04-07 05:30:21 UTC (rev 15628) +++ bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm 2009-04-08 16:19:53 UTC (rev 15629) @@ -76,7 +76,7 @@ use XML::LibXML; use Log::Report; -=head2 Bio::DB::HIV::HIVQuery::make_xml_from_query +=head2 Bio::DB::Query::HIVQuery::make_xml_from_query Title : make_XML_from_query Usage : $q->make_XML_from_query() From maj at dev.open-bio.org Wed Apr 8 13:37:44 2009 From: maj at dev.open-bio.org (Mark Allen Jensen) Date: Wed, 8 Apr 2009 13:37:44 -0400 Subject: [Bioperl-guts-l] [15630] bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm: keyword subst Message-ID: <200904081737.n38HbiFB008768@dev.open-bio.org> Revision: 15630 Author: maj Date: 2009-04-08 13:37:44 -0400 (Wed, 08 Apr 2009) Log Message: ----------- keyword subst Property Changed: ---------------- bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm From maj at dev.open-bio.org Thu Apr 9 00:45:21 2009 From: maj at dev.open-bio.org (Mark Allen Jensen) Date: Thu, 9 Apr 2009 00:45:21 -0400 Subject: [Bioperl-guts-l] [15631] bioperl-live/trunk/Bio/SimpleAlign.pm: per Tristan' s patch Bug #2805: in array context, Message-ID: <200904090445.n394jLJC010503@dev.open-bio.org> Revision: 15631 Author: maj Date: 2009-04-09 00:45:21 -0400 (Thu, 09 Apr 2009) Log Message: ----------- per Tristan's patch Bug #2805: in array context, uniq_seq() now returns a sequence type hashref along with the new aln object Modified Paths: -------------- bioperl-live/trunk/Bio/SimpleAlign.pm Modified: bioperl-live/trunk/Bio/SimpleAlign.pm =================================================================== --- bioperl-live/trunk/Bio/SimpleAlign.pm 2009-04-08 17:37:44 UTC (rev 15630) +++ bioperl-live/trunk/Bio/SimpleAlign.pm 2009-04-09 04:45:21 UTC (rev 15631) @@ -619,9 +619,14 @@ leading and ending gaps ("-") are NOT counted as differences. Function : Make a new alignment of unique sequence types (STs) - Returns : 1. a new Bio::SimpleAlign object (all sequences renamed as "ST") + Returns : 1a. if called in a scalar context, + a new Bio::SimpleAlign object (all sequences renamed as "ST") + 1b. if called in an array context, + a new Bio::SimpleAlign object, and a hashref whose keys + are sequence types, and whose values are arrayrefs to + lists of sequence ids within the corresponding sequence type 2. if $aln->verbose > 0, ST of each sequence is sent to - STDERR + STDERR (in a tabular format) Argument : None =cut @@ -629,9 +634,10 @@ sub uniq_seq { my ($self, $seqid) = @_; my $aln = $self->new; - my (%member, %order, @seq, @uniq_str); + my (%member, %order, @seq, @uniq_str, $st); my $order=0; my $len = $self->length(); + $st = {}; foreach my $seq ( $self->each_seq() ) { my $str = $seq->seq(); @@ -686,10 +692,11 @@ ); $aln->add_seq($new); foreach (@{$member{$str}}) { + push @{$$st{$order{$str}}}, $_->id(); # per Tristan's patch/Bug #2805 $self->debug($_->id(), "\t", "ST", $order{$str}, "\n"); } } - return $aln; + return wantarray ? ($aln, $st) : $aln; } sub _check_uniq { # check if same seq exists in the alignment From bugzilla-daemon at portal.open-bio.org Thu Apr 9 00:47:42 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 9 Apr 2009 00:47:42 -0400 Subject: [Bioperl-guts-l] [Bug 2805] Bio::SimpleAlign, uniq_seq, and ST composition In-Reply-To: Message-ID: <200904090447.n394lg4K031945@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2805 maj at fortinbras.us changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |maj at fortinbras.us Status|ASSIGNED |RESOLVED Resolution| |FIXED ------- Comment #2 from maj at fortinbras.us 2009-04-09 00:47 EST ------- SimpleAlign.pm updated: uniq_seq() will now return ($aln_obj, $seq_type_hashref) per Tristan's patch when called in an array context; in a scalar context, returns just the new alignment object. Passes all tests. Thanks for the patch! -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From lstein at dev.open-bio.org Thu Apr 9 13:13:06 2009 From: lstein at dev.open-bio.org (Lincoln Stein) Date: Thu, 9 Apr 2009 13:13:06 -0400 Subject: [Bioperl-guts-l] [15632] bioperl-live/trunk/Bio/DB: fix for broken types() method Message-ID: <200904091713.n39HD6JH012881@dev.open-bio.org> Revision: 15632 Author: lstein Date: 2009-04-09 13:13:05 -0400 (Thu, 09 Apr 2009) Log Message: ----------- fix for broken types() method Modified Paths: -------------- bioperl-live/trunk/Bio/DB/GFF/Adaptor/memory.pm bioperl-live/trunk/Bio/DB/GFF/RelSegment.pm bioperl-live/trunk/Bio/DB/SeqFeature/Store/GFF3Loader.pm bioperl-live/trunk/Bio/DB/SeqFeature/Store/LoadHelper.pm bioperl-live/trunk/Bio/DB/SeqFeature/Store/Loader.pm Modified: bioperl-live/trunk/Bio/DB/GFF/Adaptor/memory.pm =================================================================== --- bioperl-live/trunk/Bio/DB/GFF/Adaptor/memory.pm 2009-04-09 04:45:21 UTC (rev 15631) +++ bioperl-live/trunk/Bio/DB/GFF/Adaptor/memory.pm 2009-04-09 17:13:05 UTC (rev 15632) @@ -549,7 +549,7 @@ sub get_types { my $self = shift; my ($srcseq,$class,$start,$stop,$want_count,$typelist) = @_; - + my(%result,%obj); for my $feature (@{$self->{data}}) { Modified: bioperl-live/trunk/Bio/DB/GFF/RelSegment.pm =================================================================== --- bioperl-live/trunk/Bio/DB/GFF/RelSegment.pm 2009-04-09 04:45:21 UTC (rev 15631) +++ bioperl-live/trunk/Bio/DB/GFF/RelSegment.pm 2009-04-09 17:13:05 UTC (rev 15632) @@ -912,7 +912,6 @@ @args = @_; } $self->factory->types(-ref => $ref, - -class => $class, -start=> $start, -stop => $stop, @args); Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store/GFF3Loader.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/Store/GFF3Loader.pm 2009-04-09 04:45:21 UTC (rev 15631) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Store/GFF3Loader.pm 2009-04-09 17:13:05 UTC (rev 15632) @@ -325,7 +325,8 @@ $self->msg(sprintf "%5.2fs\n",$self->time()-$start); } eval {$self->store->commit}; - delete $self->{load_data}; + # don't delete load data so that caller can ask for the loaded IDs + # $self->delete_load_data; } =item do_load @@ -948,7 +949,37 @@ return $self->{load_data}{Helper}->local2global(@_); } +=item local_ids + my $ids = $self->local_ids; + my $id_cnt = @$ids; + +After performing a load, this returns an array ref containing all the +load file IDs that were contained within the file just loaded. + +=cut + +sub local_ids { # override + my $self = shift; + return $self->{load_data}{Helper}->local_ids(@_); +} + +=item loaded_ids + + my $ids = $loader->loaded_ids; + my $id_cnt = @$ids; + +After performing a load, this returns an array ref containing all the +feature primary ids that were created during the load. + +=cut + +sub loaded_ids { # override + my $self = shift; + return $self->{load_data}{Helper}->loaded_ids(@_); +} + + 1; __END__ Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store/LoadHelper.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/Store/LoadHelper.pm 2009-04-09 04:45:21 UTC (rev 15631) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Store/LoadHelper.pm 2009-04-09 17:13:05 UTC (rev 15632) @@ -173,4 +173,18 @@ return ($parent,\@children); } +sub local_ids { + my $self = shift; + my @ids = keys %{$self->{Local2Global}} + if $self->{Local2Global}; + return \@ids; +} + +sub loaded_ids { + my $self = shift; + my @ids = values %{$self->{Local2Global}} + if $self->{Local2Global}; + return \@ids; +} + 1; Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store/Loader.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/Store/Loader.pm 2009-04-09 04:45:21 UTC (rev 15631) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Store/Loader.pm 2009-04-09 17:13:05 UTC (rev 15632) @@ -367,7 +367,9 @@ } eval {$self->store->commit}; $self->msg(sprintf "%5.2fs\n",$self->time()-$self->{load_data}{start_time}); - $self->delete_load_data; + + # don't delete load data so that caller can ask for the loaded IDs + # $self->delete_load_data; } =item do_load @@ -471,7 +473,6 @@ =cut - sub store_current_feature { my $self = shift; @@ -611,6 +612,40 @@ print STDERR @msg; } +=item loaded_ids + + my $ids = $loader->loaded_ids; + my $id_cnt = @$ids; + +After performing a load, this returns an array ref containing all the +feature primary ids that were created during the load. + +=cut + +sub loaded_ids { + my $self = shift; + my @ids = values %{$self->{load_data}{Local2GlobalID}} + if $self->{load_data}; + return \@ids; +} + +=item local_ids + + my $ids = $self->local_ids; + my $id_cnt = @$ids; + +After performing a load, this returns an array ref containing all the +load file IDs that were contained within the file just loaded. + +=cut + +sub local_ids { + my $self = shift; + my @ids = keys %{$self->{load_data}{Local2GlobalID}} + if $self->{load_data}; + return \@ids; +} + =item time my $time = $loader->time From maj at dev.open-bio.org Fri Apr 10 09:02:51 2009 From: maj at dev.open-bio.org (Mark Allen Jensen) Date: Fri, 10 Apr 2009 09:02:51 -0400 Subject: [Bioperl-guts-l] [15633] bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm: add code to locate . xsd files Message-ID: <200904101302.n3AD2pke018327@dev.open-bio.org> Revision: 15633 Author: maj Date: 2009-04-10 09:02:50 -0400 (Fri, 10 Apr 2009) Log Message: ----------- add code to locate .xsd files Modified Paths: -------------- bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm Modified: bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm =================================================================== --- bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm 2009-04-09 17:13:05 UTC (rev 15632) +++ bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm 2009-04-10 13:02:50 UTC (rev 15633) @@ -72,7 +72,7 @@ package Bio::DB::Query::HIVQuery; use strict; -use HIVXmlSchemaHelper; # fully qualify the ns when necessary +use Bio::DB::HIV::HIVXmlSchemaHelper; use XML::LibXML; use Log::Report; @@ -109,7 +109,7 @@ my $self = shift; my @ids = @_; my @hashes; - unless ($self->_run_option == 2) { + unless ($self->_run_level == 2) { $self->warn("Method requires that query be run at level 2"); return undef; } @@ -130,9 +130,8 @@ $guts = $wri->($doc, { 'annotHivqSeq' => [@hashes] }) }; if ($@) { - $@->reportAll; - exit(0); # handle XML::Compile::Schema error + $self->throw($@->reportAll); } else { $doc->addChild($guts); @@ -149,7 +148,7 @@ package Bio::DB::HIV; use strict; -use HIVXmlSchemaHelper; # fully qualify the ns when necessary +use Bio::DB::HIV::HIVXmlSchemaHelper; # fully qualify the ns when necessary use XML::LibXML::Reader; use XML::LibXML; use Bio::Phylo::Factory; @@ -271,17 +270,27 @@ use strict; use constant HIVNS => 'http://fortinbras.us/HIVDBSchema/1.0'; use constant NEXML => 'http://www.nexml.org/1.0'; - +use File::Spec; use XML::LibXML; use XML::Compile; use XML::Compile::Util qw( SCHEMA2001 SCHEMA2001i pack_type ); use Exporter; +use vars qw( $XSDDIR ); use base qw(XML::Compile::Schema Bio::Root::Root); BEGIN { our (@ISA, @EXPORT_OK); push @ISA, qw( Exporter ); - @EXPORT_OK = qw( HIVNS NEXML ); + + for (@INC) { + $XSDDIR = File::Spec->catdir($_,"Bio/DB/HIV"); + last if ( -d $XSDDIR ); + undef $XSDDIR; + } + + @EXPORT_OK = qw( HIVNS NEXML $XSDDIR ); + + } our @schemata = qw( @@ -311,7 +320,7 @@ my ($class, at args) = @_; my ($schema_dir,$XSC_args) = $class->SUPER::_rearrange([qw(SCHEMADIR,XSCARGS)], @args); my @XSDDIRs = ($schema_dir and ref($schema_dir) eq 'ARRAY') ? @$schema_dir : ($schema_dir); - my @XSDDIRS = (@INC, $schema_dir); + my @XSDDIRS = (@INC, $XSDDIR, $schema_dir); my $self = $class->SUPER::new([SCHEMA2001,SCHEMA2001i, @schemata], 'schema_dirs' => [@XSDDIRS], @$XSC_args); From bugzilla-daemon at portal.open-bio.org Sun Apr 12 06:54:30 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 12 Apr 2009 06:54:30 -0400 Subject: [Bioperl-guts-l] [Bug 2810] New: SeqUtils not copying SeqFeature tags - part 2 - revcom this time Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2810 Summary: SeqUtils not copying SeqFeature tags - part 2 - revcom this time Product: BioPerl Version: 1.6 branch Platform: Other OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Core Components AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: david_j at users.sourceforge.net CC: david_j at users.sourceforge.net BugsThisDependsOn: 2766 +++ This bug was initially created as a clone of Bug #2766 +++ Bio::SeqUtils->revcom_with_features The tags associated with a SeqFeature on a Seq are not copied to the new SeqFeature on the new reverse complemented Seq. This bug is in 1.6 and trunk but not in 1.5. I missed this when reporting 2766 where tags weren't copied for cat and trunc_with_features functions. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Sun Apr 12 06:54:31 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 12 Apr 2009 06:54:31 -0400 Subject: [Bioperl-guts-l] [Bug 2766] SeqUtils not copying SeqFeature tags In-Reply-To: Message-ID: <200904121054.n3CAsVLo012671@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2766 david_j at users.sourceforge.net changed: What |Removed |Added ---------------------------------------------------------------------------- OtherBugsDependingO| |2810 nThis| | -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Sun Apr 12 06:58:43 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 12 Apr 2009 06:58:43 -0400 Subject: [Bioperl-guts-l] [Bug 2810] SeqUtils not copying SeqFeature tags - part 2 - revcom this time In-Reply-To: Message-ID: <200904121058.n3CAwhWh012897@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2810 ------- Comment #1 from david_j at users.sourceforge.net 2009-04-12 06:58 EST ------- Created an attachment (id=1276) --> (http://bugzilla.open-bio.org/attachment.cgi?id=1276&action=view) Test case Not for adding to BioPerl test suite - just an independent file to allow easy confirmation that this is a regression from 1.5 in 1.6 and HEAD. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Sun Apr 12 07:33:45 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 12 Apr 2009 07:33:45 -0400 Subject: [Bioperl-guts-l] [Bug 2810] SeqUtils not copying SeqFeature tags - part 2 - revcom this time In-Reply-To: Message-ID: <200904121133.n3CBXj95014931@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2810 ------- Comment #2 from david_j at users.sourceforge.net 2009-04-12 07:33 EST ------- Created an attachment (id=1277) --> (http://bugzilla.open-bio.org/attachment.cgi?id=1277&action=view) Patch to revision 15633. Patch to problem in Bio/SeqUtils.pm . Patch for testing this problem, and addition checks for previous 2766, in Bio/SeqUtils.pm . Patch for addition tag related tests in t/SeqFeature/SeqFeature.t created when I was tracking this problem down. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From maj at dev.open-bio.org Sun Apr 12 21:14:47 2009 From: maj at dev.open-bio.org (Mark Allen Jensen) Date: Sun, 12 Apr 2009 21:14:47 -0400 Subject: [Bioperl-guts-l] [15634] bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm: changed pod: Bio:: Phylo best repository link Message-ID: <200904130114.n3D1ElSA009212@dev.open-bio.org> Revision: 15634 Author: maj Date: 2009-04-12 21:14:46 -0400 (Sun, 12 Apr 2009) Log Message: ----------- changed pod: Bio::Phylo best repository link Modified Paths: -------------- bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm Modified: bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm =================================================================== --- bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm 2009-04-10 13:02:50 UTC (rev 15633) +++ bioperl-dev/trunk/Bio/DB/HIV/HIVXmlSchema.pm 2009-04-13 01:14:46 UTC (rev 15634) @@ -47,7 +47,8 @@ =head1 IMPLEMENTATION NOTES These routines depend on the NeXML parser/writers in the C package -by Rutger Vos. It can be obtained at L. +by Rutger Vos. It can be obtained at +L. XML manipulations here currently employ the C package of Petr Pajas and C and C of Mark Overbeek. These require the presence of C libraries, which can be obtained for many platforms at L From cjfields at dev.open-bio.org Wed Apr 15 11:33:49 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Wed, 15 Apr 2009 11:33:49 -0400 Subject: [Bioperl-guts-l] [15636] bioperl-live/trunk/Bio/SimpleAlign.pm: [bug 2801] Message-ID: <200904151533.n3FFXned020899@dev.open-bio.org> Revision: 15636 Author: cjfields Date: 2009-04-15 11:33:48 -0400 (Wed, 15 Apr 2009) Log Message: ----------- [bug 2801] * don't return self silently; throw an exception when args are required Modified Paths: -------------- bioperl-live/trunk/Bio/SimpleAlign.pm Modified: bioperl-live/trunk/Bio/SimpleAlign.pm =================================================================== --- bioperl-live/trunk/Bio/SimpleAlign.pm 2009-04-14 19:11:13 UTC (rev 15635) +++ bioperl-live/trunk/Bio/SimpleAlign.pm 2009-04-15 15:33:48 UTC (rev 15636) @@ -1177,24 +1177,23 @@ Args : Array ref of types ('match'|'weak'|'strong'|'mismatch'|'gaps'| 'all_gaps_columns') or array ref where the referenced array contains a pair of integers that specify a range. - The first column is 0, - + The first column is 0 =cut sub remove_columns { - my ($self, at args) = @_; - @args || return $self; - my $aln; + my ($self, at args) = @_; + @args || $self->throw("Must supply column ranges or column types"); + my $aln; - if ($args[0][0] =~ /^[a-z_]+$/i) { - $aln = $self->_remove_columns_by_type($args[0]); - } elsif ($args[0][0] =~ /^\d+$/) { - $aln = $self->_remove_columns_by_num(\@args); - } else { - $self->throw("You must pass array references to remove_columns(), not @args"); - } + if ($args[0][0] =~ /^[a-z_]+$/i) { + $aln = $self->_remove_columns_by_type($args[0]); + } elsif ($args[0][0] =~ /^\d+$/) { + $aln = $self->_remove_columns_by_num(\@args); + } else { + $self->throw("You must pass array references to remove_columns(), not @args"); + } # fix for meta, sf, ann - $aln; + $aln; } From bugzilla-daemon at portal.open-bio.org Wed Apr 15 11:34:27 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 15 Apr 2009 11:34:27 -0400 Subject: [Bioperl-guts-l] [Bug 2801] SimpleAlig remove_columns In-Reply-To: Message-ID: <200904151534.n3FFYREu030388@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2801 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #1 from cjfields at bioperl.org 2009-04-15 11:34 EST ------- Yes, returning self is not what should occur. In this case it makes more sense to throw an exception; use of this method indicates arguments are required, so the user should be violently informed. I have committed a tentative fix to svn that passes tests. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Wed Apr 15 11:38:40 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Wed, 15 Apr 2009 11:38:40 -0400 Subject: [Bioperl-guts-l] [15637] bioperl-live/trunk: [bug 2810] Message-ID: <200904151538.n3FFcekH020971@dev.open-bio.org> Revision: 15637 Author: cjfields Date: 2009-04-15 11:38:39 -0400 (Wed, 15 Apr 2009) Log Message: ----------- [bug 2810] * copy over tags properly from seqfeatures * patch courtesy of David Jackson Modified Paths: -------------- bioperl-live/trunk/Bio/SeqUtils.pm bioperl-live/trunk/t/SeqFeature/SeqFeature.t bioperl-live/trunk/t/SeqTools/SeqUtils.t Modified: bioperl-live/trunk/Bio/SeqUtils.pm =================================================================== --- bioperl-live/trunk/Bio/SeqUtils.pm 2009-04-15 15:33:48 UTC (rev 15636) +++ bioperl-live/trunk/Bio/SeqUtils.pm 2009-04-15 15:38:39 UTC (rev 15637) @@ -649,6 +649,9 @@ $newfeat->annotation->add_Annotation($key, $value); } } + foreach my $key ( $feat->get_all_tags() ) { + $newfeat->add_tag_value($key, $feat->get_tag_values($key)); + } if (@loc==1) { $newfeat->location($loc[0]) } else { Modified: bioperl-live/trunk/t/SeqFeature/SeqFeature.t =================================================================== --- bioperl-live/trunk/t/SeqFeature/SeqFeature.t 2009-04-15 15:33:48 UTC (rev 15636) +++ bioperl-live/trunk/t/SeqFeature/SeqFeature.t 2009-04-15 15:38:39 UTC (rev 15637) @@ -7,7 +7,7 @@ use lib '.'; use Bio::Root::Test; - test_begin(-tests => 214); + test_begin(-tests => 222); use_ok('Bio::Seq'); use_ok('Bio::SeqIO'); @@ -363,3 +363,17 @@ is ($sfseq->translate->subseq(1,10), $phase_check{$sfseq->subseq(1,10)}, 'phase check'); } +# tags +$sf->add_tag_value('note','n1'); +$sf->add_tag_value('note','n2'); +$sf->add_tag_value('comment','c1'); +is_deeply( [sort $sf->get_all_tags()], [sort qw(note comment)] , 'tags found'); +is_deeply( [sort $sf->get_tagset_values('note')], [sort qw(n1 n2)] , 'get_tagset_values tag values found'); +is_deeply( [sort $sf->get_tagset_values(qw(note comment))], [sort qw(c1 n1 n2)] , 'get_tagset_values tag values for multiple tags found'); +lives_ok { + is_deeply( [sort $sf->get_tag_values('note')], [sort qw(n1 n2)] , 'get_tag_values tag values found'); +} 'get_tag_values lives with tag'; +lives_ok { + is_deeply( [$sf->get_tagset_values('notag') ], [], 'get_tagset_values no tag values found'); +} 'get_tagset_values lives with no tag'; +throws_ok { $sf->get_tag_values('notag') } qr/tag value that does not exist/, 'get_tag_values throws with no tag'; Modified: bioperl-live/trunk/t/SeqTools/SeqUtils.t =================================================================== --- bioperl-live/trunk/t/SeqTools/SeqUtils.t 2009-04-15 15:33:48 UTC (rev 15636) +++ bioperl-live/trunk/t/SeqTools/SeqUtils.t 2009-04-15 15:38:39 UTC (rev 15637) @@ -5,9 +5,10 @@ BEGIN { use lib '.'; + use List::MoreUtils qw(uniq); use Bio::Root::Test; - test_begin(-tests => 43); + test_begin(-tests => 49); use_ok('Bio::PrimarySeq'); use_ok('Bio::SeqUtils'); @@ -193,7 +194,8 @@ -end => 3, -strand => 1, -primary => 'hotspot', - -tag => {note => ['note3a','note3b']}, + -tag => {note => ['note3a','note3b'], + comment => 'c1'}, ); $seq2->add_SeqFeature($ft2); @@ -203,11 +205,13 @@ ok (Bio::SeqUtils->cat($seq1, $seq2)); is $seq1->seq, 'aaaattttcccctttt'; is scalar $seq1->annotation->get_Annotations, 5; +is_deeply([uniq sort map{$_->get_all_tags}$seq1->get_SeqFeatures], [sort qw(note comment)], 'cat - has expected tags'); +is_deeply([sort map{$_->get_tagset_values('note')}$seq1->get_SeqFeatures], [sort qw(note2 note3a note3b)], 'cat - has expected tag values'); my @tags; lives_ok { @tags = map{$_->get_tag_values(q(note))}$seq1->get_SeqFeatures ; -} 'tags transfered (no throw)'; -cmp_ok(scalar(@tags),'==',3, 'tags transfered (correct count)') ; +} 'cat - note tag transfered (no throw)'; +cmp_ok(scalar(@tags),'==',3, 'cat - note tag values transfered (correct count)') ; my $protseq = Bio::PrimarySeq->new(-id => 2, -seq => 'MVTF'); # protein seq @@ -243,6 +247,7 @@ -end => 4, -strand => 1, -primary => 'source', + -tag => {note => 'note2'}, ); @@ -250,6 +255,8 @@ -end => 8, -strand => -1, -primary => 'hotspot', + -tag => {note => ['note3a','note3b'], + comment => 'c1'}, ); $seq2->add_SeqFeature($ft2); $seq2->add_SeqFeature($ft3); @@ -259,9 +266,13 @@ my @feat=$trunc->get_SeqFeatures; is $feat[0]->location->to_FTstring, '<1..3'; is $feat[1]->location->to_FTstring, 'complement(4..>6)'; +is_deeply([uniq sort map{$_->get_all_tags}$trunc->get_SeqFeatures], [sort qw(note comment)], 'trunc_with_features - has expected tags'); +is_deeply([sort map{$_->get_tagset_values('note')}$trunc->get_SeqFeatures], [sort qw(note2 note3a note3b)], 'trunc_with_features - has expected tag values'); my $revcom=Bio::SeqUtils->revcom_with_features($seq2); is $revcom->seq, 'ttttaacc'; my @revfeat=$revcom->get_SeqFeatures; is $revfeat[0]->location->to_FTstring, 'complement(5..8)'; is $revfeat[1]->location->to_FTstring, '1..4'; +is_deeply([uniq sort map{$_->get_all_tags}$revcom->get_SeqFeatures], [sort qw(note comment)], 'revcom_with_features - has expected tags'); +is_deeply([sort map{$_->get_tagset_values('note')}$revcom->get_SeqFeatures], [sort qw(note2 note3a note3b)], 'revcom_with_features - has expected tag values'); From bugzilla-daemon at portal.open-bio.org Wed Apr 15 11:38:55 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 15 Apr 2009 11:38:55 -0400 Subject: [Bioperl-guts-l] [Bug 2810] SeqUtils not copying SeqFeature tags - part 2 - revcom this time In-Reply-To: Message-ID: <200904151538.n3FFct87030796@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2810 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #3 from cjfields at bioperl.org 2009-04-15 11:38 EST ------- Patch committed to svn. Thanks David! -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Apr 15 11:44:01 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 15 Apr 2009 11:44:01 -0400 Subject: [Bioperl-guts-l] [Bug 2735] bp_load_gff.pl will not shut up In-Reply-To: Message-ID: <200904151544.n3FFi1BE031318@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2735 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|1.6.1 point release |1.6.x point release ------- Comment #3 from cjfields at bioperl.org 2009-04-15 11:44 EST ------- Dan, I'm bumping this to 1.6.x to include it in with the GFF refactor proposed here: http://www.bioperl.org/wiki/GFF_Refactor It may be pushed further along to 1.7 depending on how in-depth the refactor is (i.e. whether it requires a significant API change in already-existing modules) -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Thu Apr 16 12:26:33 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Thu, 16 Apr 2009 12:26:33 -0400 Subject: [Bioperl-guts-l] [15638] bioperl-live/trunk/Bio/Matrix/IO/phylip.pm: [bug 2800] Message-ID: <200904161626.n3GGQXO4029570@dev.open-bio.org> Revision: 15638 Author: cjfields Date: 2009-04-16 12:26:32 -0400 (Thu, 16 Apr 2009) Log Message: ----------- [bug 2800] * don't use $_ unless it's actually being used * patch courtesy of Wei Zou Modified Paths: -------------- bioperl-live/trunk/Bio/Matrix/IO/phylip.pm Modified: bioperl-live/trunk/Bio/Matrix/IO/phylip.pm =================================================================== --- bioperl-live/trunk/Bio/Matrix/IO/phylip.pm 2009-04-15 15:38:39 UTC (rev 15637) +++ bioperl-live/trunk/Bio/Matrix/IO/phylip.pm 2009-04-16 16:26:32 UTC (rev 15638) @@ -122,7 +122,7 @@ my $size = 0; while ($entry=$self->_readline) { if($#names >=0 && $entry =~/^\s+\d+\n$/){ - $self->_pushback($_); + $self->_pushback($entry); last; } elsif($entry=~/^\s+(\d+)\n$/){ $size = $1; From bugzilla-daemon at portal.open-bio.org Thu Apr 16 12:26:51 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 16 Apr 2009 12:26:51 -0400 Subject: [Bioperl-guts-l] [Bug 2800] Bio::Matrix::IO::phylip warn "The number of entries N is not the same 0" when parsing multiple data sets In-Reply-To: Message-ID: <200904161626.n3GGQpZg008146@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2800 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #2 from cjfields at bioperl.org 2009-04-16 12:26 EST ------- Patch committed to svn. thanks! -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Thu Apr 16 12:39:30 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Thu, 16 Apr 2009 12:39:30 -0400 Subject: [Bioperl-guts-l] [15639] bioperl-live/trunk/Bio/Root/Utilities.pm: [bug 2737] Message-ID: <200904161639.n3GGdUrb029626@dev.open-bio.org> Revision: 15639 Author: cjfields Date: 2009-04-16 12:39:30 -0400 (Thu, 16 Apr 2009) Log Message: ----------- [bug 2737] * make warning a little more informative Modified Paths: -------------- bioperl-live/trunk/Bio/Root/Utilities.pm Modified: bioperl-live/trunk/Bio/Root/Utilities.pm =================================================================== --- bioperl-live/trunk/Bio/Root/Utilities.pm 2009-04-16 16:26:32 UTC (rev 15638) +++ bioperl-live/trunk/Bio/Root/Utilities.pm 2009-04-16 16:39:30 UTC (rev 15639) @@ -1245,7 +1245,7 @@ if (scalar @exes) { $exe = $exes[0]; if (defined $exes[1]) { - $self->warn("find_exe: Multiple paths to '$name' found. Using first."); + $self->warn("find_exe: Multiple paths to '$name' found. Using $exe."); } } } From bugzilla-daemon at portal.open-bio.org Thu Apr 16 12:39:51 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 16 Apr 2009 12:39:51 -0400 Subject: [Bioperl-guts-l] [Bug 2737] t/Root/Utilities test is fooled by softlinks to /usr/bin/gunzip In-Reply-To: Message-ID: <200904161639.n3GGdpAM009125@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2737 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #4 from cjfields at bioperl.org 2009-04-16 12:39 EST ------- Making an executive decision. The warning will stay, but will be slightly more informative (i.e. indicate exactly what executable is being used). The warning is minor, we can downgrade it to debugging if there are enough complaints. Closing out... -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Apr 16 12:41:45 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 16 Apr 2009 12:41:45 -0400 Subject: [Bioperl-guts-l] [Bug 2753] Unknown format of PAML output did not see seqtype in Yn00 In-Reply-To: Message-ID: <200904161641.n3GGfjJC009346@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2753 ------- Comment #1 from cjfields at bioperl.org 2009-04-16 12:41 EST ------- We can't fix the issue without example data to replicate the problem. We'll need that or I will have to close out the bug report. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Apr 16 12:45:43 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 16 Apr 2009 12:45:43 -0400 Subject: [Bioperl-guts-l] [Bug 2514] Run tests with binaries installed In-Reply-To: Message-ID: <200904161645.n3GGjhMX009680@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2514 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #5 from cjfields at bioperl.org 2009-04-16 12:45 EST ------- Closing out. Appears to be resolved. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Apr 16 12:46:49 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 16 Apr 2009 12:46:49 -0400 Subject: [Bioperl-guts-l] [Bug 2498] Add HSP sorting to Bio::Search::Hit In-Reply-To: Message-ID: <200904161646.n3GGknNF009814@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2498 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|1.6 release |1.6.x point release ------- Comment #5 from cjfields at bioperl.org 2009-04-16 12:46 EST ------- Pushing to later in the branch. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Apr 16 12:47:24 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 16 Apr 2009 12:47:24 -0400 Subject: [Bioperl-guts-l] [Bug 2575] Easy request - a frac_identical_total column in HSPTableWriter.pm? In-Reply-To: Message-ID: <200904161647.n3GGlOKE009863@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2575 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|1.6 release |1.6.1 point release ------- Comment #2 from cjfields at bioperl.org 2009-04-16 12:47 EST ------- Will attempt to add this for 1.6.1. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Thu Apr 16 13:25:09 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Thu, 16 Apr 2009 13:25:09 -0400 Subject: [Bioperl-guts-l] [15640] bioperl-live/trunk: [bug RT 44536] Message-ID: <200904161725.n3GHP99V029724@dev.open-bio.org> Revision: 15640 Author: cjfields Date: 2009-04-16 13:25:09 -0400 (Thu, 16 Apr 2009) Log Message: ----------- [bug RT 44536] * support for UniProt/UniProtKB * tests and data added Modified Paths: -------------- bioperl-live/trunk/Bio/SeqIO/Handler/GenericRichSeqHandler.pm bioperl-live/trunk/Bio/SeqIO/genbank.pm bioperl-live/trunk/t/SeqIO/Handler.t bioperl-live/trunk/t/SeqIO/genbank.t Added Paths: ----------- bioperl-live/trunk/t/data/P39765.gb Modified: bioperl-live/trunk/Bio/SeqIO/Handler/GenericRichSeqHandler.pm =================================================================== --- bioperl-live/trunk/Bio/SeqIO/Handler/GenericRichSeqHandler.pm 2009-04-16 16:39:30 UTC (rev 15639) +++ bioperl-live/trunk/Bio/SeqIO/Handler/GenericRichSeqHandler.pm 2009-04-16 17:25:09 UTC (rev 15640) @@ -874,7 +874,7 @@ my $annotation = $self->annotation_collection; # deal with swissprot dbsources # we could possibly parcel these out to subhandlers... - if( $dbsource =~ s/(UniProtKB|swissprot):\s+locus\s+(\S+)\,.+\n// ) { + if( $dbsource =~ s/(UniProt(?:KB)|swissprot):\s+locus\s+(\S+)\,.+\n// ) { $annotation->add_Annotation ('dblink', Bio::Annotation::DBLink->new Modified: bioperl-live/trunk/Bio/SeqIO/genbank.pm =================================================================== --- bioperl-live/trunk/Bio/SeqIO/genbank.pm 2009-04-16 16:39:30 UTC (rev 15639) +++ bioperl-live/trunk/Bio/SeqIO/genbank.pm 2009-04-16 17:25:09 UTC (rev 15640) @@ -474,7 +474,7 @@ $dbsource .= $_; } # deal with UniProKB dbsources - if( $dbsource =~ s/(UniProtKB|swissprot):\s+locus\s+(\S+)\,.+\n// ) { + if( $dbsource =~ s/(UniProt(?:KB)?|swissprot):\s+locus\s+(\S+)\,.+\n// ) { $annotation->add_Annotation ('dblink', Bio::Annotation::DBLink->new Modified: bioperl-live/trunk/t/SeqIO/Handler.t =================================================================== --- bioperl-live/trunk/t/SeqIO/Handler.t 2009-04-16 16:39:30 UTC (rev 15639) +++ bioperl-live/trunk/t/SeqIO/Handler.t 2009-04-16 17:25:09 UTC (rev 15640) @@ -7,7 +7,7 @@ use lib '.'; use Bio::Root::Test; - test_begin(-tests => 550); + test_begin(-tests => 561); use_ok('Bio::SeqIO'); } @@ -1082,3 +1082,28 @@ 'Acetobacter;Acetobacteraceae;Rhodospirillales;Alphaproteobacteria;'. 'Proteobacteria;Bacteria'); } + +# test for GenBank swissprot/UniProt/UniProtKB DBSOURCE line (Bug : RT 44536) +$ast = Bio::SeqIO->new(-format => 'genbank', + -verbose => $verbose, + -file => test_input_file('P39765.gb')); +$ast->verbose($verbose); +$as = $ast->next_seq(); +is $as->molecule, 'linear',$as->accession_number;; +is $as->alphabet, 'protein'; +# Though older GenBank releases indicate SOURCE contains only the common name, +# this is no longer true. In general, this line will contain an abbreviated +# form of the full organism name (but may contain the full length name), +# as well as the optional common name and organelle. There is no get/set +# for the abbreviated name but it is accessible via name() +ok defined($as->species->name('abbreviated')->[0]); +is $as->species->name('abbreviated')->[0], 'Bacillus subtilis'; +is($as->primary_id, 20141743); +$ac = $as->annotation; +ok defined $ac; + at dblinks = $ac->get_Annotations('dblink'); +is(scalar @dblinks,31); +is($dblinks[0]->database, 'UniProtKB'); +is($dblinks[0]->primary_id, 'PYRR_BACSU'); +is($dblinks[0]->version, undef); +is($dblinks[0]->display_text, 'UniProtKB:PYRR_BACSU','operator overloading in AnnotationI is deprecated'); Modified: bioperl-live/trunk/t/SeqIO/genbank.t =================================================================== --- bioperl-live/trunk/t/SeqIO/genbank.t 2009-04-16 16:39:30 UTC (rev 15639) +++ bioperl-live/trunk/t/SeqIO/genbank.t 2009-04-16 17:25:09 UTC (rev 15640) @@ -7,7 +7,7 @@ use lib '.'; use Bio::Root::Test; - test_begin(-tests => 249); + test_begin(-tests => 260); use_ok('Bio::SeqIO::genbank'); } @@ -536,3 +536,28 @@ } else { ok(0, "Roundtrip test failed"); } + +# test for swissprot/UniProt/UniProtKB DBSOURCE line (Bug : RT 44536) +$ast = Bio::SeqIO->new(-format => 'genbank', + -verbose => $verbose, + -file => test_input_file('P39765.gb')); +$ast->verbose($verbose); +$as = $ast->next_seq(); +is $as->molecule, 'linear',$as->accession_number;; +is $as->alphabet, 'protein'; +# Though older GenBank releases indicate SOURCE contains only the common name, +# this is no longer true. In general, this line will contain an abbreviated +# form of the full organism name (but may contain the full length name), +# as well as the optional common name and organelle. There is no get/set +# for the abbreviated name but it is accessible via name() +ok defined($as->species->name('abbreviated')->[0]); +is $as->species->name('abbreviated')->[0], 'Bacillus subtilis'; +is($as->primary_id, 20141743); +$ac = $as->annotation; +ok defined $ac; + at dblinks = $ac->get_Annotations('dblink'); +is(scalar @dblinks,31); +is($dblinks[0]->database, 'UniProtKB'); +is($dblinks[0]->primary_id, 'PYRR_BACSU'); +is($dblinks[0]->version, undef); +is($dblinks[0]->display_text, 'UniProtKB:PYRR_BACSU','operator overloading in AnnotationI is deprecated'); Added: bioperl-live/trunk/t/data/P39765.gb =================================================================== --- bioperl-live/trunk/t/data/P39765.gb (rev 0) +++ bioperl-live/trunk/t/data/P39765.gb 2009-04-16 17:25:09 UTC (rev 15640) @@ -0,0 +1,384 @@ +LOCUS P39765 181 aa linear BCT 03-MAR-2009 +DEFINITION RecName: Full=Bifunctional protein pyrR; Includes: RecName: + Full=Pyrimidine operon regulatory protein; Includes: RecName: + Full=Uracil phosphoribosyltransferase; Short=UPRTase. +ACCESSION P39765 +VERSION P39765.2 GI:20141743 +DBSOURCE UniProtKB: locus PYRR_BACSU, accession P39765; + class: standard. + extra accessions:P25982,Q45483 + created: Feb 1, 1995. + sequence updated: Jan 31, 2002. + annotation updated: Mar 3, 2009. + xrefs: M59757.2, AAA21265.2, Z99112.2, CAB13421.1, U48870.1, + AAB57770.1, B57986, NP_389430.1, 1A3C_A, 1A4X_A, 1A4X_B + xrefs (non-sequence databases): PDBsum:1A3C, PDBsum:1A4X, + GeneID:938030, GenomeReviews:AL009126_GR, KEGG:bsu:BSU15470, + NMPDR:fig|224308.1.peg.1549, SubtiList:BG10712, HOGENOM:P39765, + BioCyc:BSUB224308:BSU1549-MON, BRENDA:2.4.2.9, GO:0003723, + GO:0004845, GO:0009116, GO:0006355, GO:0006353, HAMAP:MF_01219, + InterPro:IPR000836, Pfam:PF00156, PROSITE:PS00103 +KEYWORDS 3D-structure; Complete proteome; Glycosyltransferase; RNA-binding; + Transcription; Transcription regulation; Transcription termination; + Transferase. +SOURCE Bacillus subtilis + ORGANISM Bacillus subtilis + Bacteria; Firmicutes; Bacillales; Bacillaceae; Bacillus. +REFERENCE 1 (residues 1 to 181) + AUTHORS Quinn,C.L., Stephenson,B.T. and Switzer,R.L. + TITLE Functional organization and nucleotide sequence of the Bacillus + subtilis pyrimidine biosynthetic operon + JOURNAL J. Biol. Chem. 266 (14), 9113-9127 (1991) + PUBMED 1709162 + REMARK NUCLEOTIDE SEQUENCE [GENOMIC DNA]. + STRAIN=1A610, and JH861 +REFERENCE 2 (residues 1 to 181) + AUTHORS Turner,R.J., Lu,Y. and Switzer,R.L. + TITLE Regulation of the Bacillus subtilis pyrimidine biosynthetic (pyr) + gene cluster by an autogenous transcriptional attenuation mechanism + JOURNAL J. Bacteriol. 176 (12), 3708-3722 (1994) + PUBMED 8206849 + REMARK SEQUENCE REVISION, AND CHARACTERIZATION. +REFERENCE 3 (residues 1 to 181) + AUTHORS Switzer,R.L. + TITLE Direct Submission + JOURNAL Submitted (??-MAY-1999) + REMARK SEQUENCE REVISION TO 34 AND 53. +REFERENCE 4 (residues 1 to 181) + AUTHORS Kunst,F., Ogasawara,N., Moszer,I., Albertini,A.M., Alloni,G., + Azevedo,V., Bertero,M.G., Bessieres,P., Bolotin,A., Borchert,S., + Borriss,R., Boursier,L., Brans,A., Braun,M., Brignell,S.C., + Bron,S., Brouillet,S., Bruschi,C.V., Caldwell,B., Capuano,V., + Carter,N.M., Choi,S.-K., Codani,J.-J., Connerton,I.F., + Cummings,N.J., Daniel,R.A., Denizot,F., Devine,K.M., + Duesterhoeft,A., Ehrlich,S.D., Emmerson,P.T., Entian,K.-D., + Errington,J., Fabret,C., Ferrari,E., Foulger,D., Fritz,C., + Fujita,M., Fujita,Y., Fuma,S., Galizzi,A., Galleron,N., Ghim,S.-Y., + Glaser,P., Goffeau,A., Golightly,E.J., Grandi,G., Guiseppi,G., + Guy,B.J., Haga,K., Haiech,J., Harwood,C.R., Henaut,A., Hilbert,H., + Holsappel,S., Hosono,S., Hullo,M.-F., Itaya,M., Jones,L.-M., + Joris,B., Karamata,D., Kasahara,Y., Klaerr-Blanchard,M., Klein,C., + Kobayashi,Y., Koetter,P., Koningstein,G., Krogh,S., Kumano,M., + Kurita,K., Lapidus,A., Lardinois,S., Lauber,J., Lazarevic,V., + Lee,S.-M., Levine,A., Liu,H., Masuda,S., Mauel,C., Medigue,C., + Medina,N., Mellado,R.P., Mizuno,M., Moestl,D., Nakai,S., Noback,M., + Noone,D., O'Reilly,M., Ogawa,K., Ogiwara,A., Oudega,B., Park,S.-H., + Parro,V., Pohl,T.M., Portetelle,D., Porwollik,S., Prescott,A.M., + Presecan,E., Pujic,P., Purnelle,B., Rapoport,G., Rey,M., + Reynolds,S., Rieger,M., Rivolta,C., Rocha,E., Roche,B., Rose,M., + Sadaie,Y., Sato,T., Scanlan,E., Schleich,S., Schroeter,R., + Scoffone,F., Sekiguchi,J., Sekowska,A., Seror,S.J., Serror,P., + Shin,B.-S., Soldo,B., Sorokin,A., Tacconi,E., Takagi,T., + Takahashi,H., Takemaru,K., Takeuchi,M., Tamakoshi,A., Tanaka,T., + Terpstra,P., Tognoni,A., Tosato,V., Uchiyama,S., Vandenbol,M., + Vannier,F., Vassarotti,A., Viari,A., Wambutt,R., Wedler,E., + Wedler,H., Weitzenegger,T., Winters,P., Wipat,A., Yamamoto,H., + Yamane,K., Yasumoto,K., Yata,K., Yoshida,K., Yoshikawa,H.-F., + Zumstein,E., Yoshikawa,H. and Danchin,A. + TITLE The complete genome sequence of the gram-positive bacterium + Bacillus subtilis + JOURNAL Nature 390 (6657), 249-256 (1997) + PUBMED 9384377 + REMARK NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]. + STRAIN=168 +REFERENCE 5 (residues 1 to 181) @@ Diff output truncated at 10000 characters. @@ From cjfields at dev.open-bio.org Thu Apr 16 16:17:27 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Thu, 16 Apr 2009 16:17:27 -0400 Subject: [Bioperl-guts-l] [15641] bioperl-live/trunk/Bio/SearchIO/Writer/GbrowseGFF.pm: [bug RT 44782 ] Message-ID: <200904162017.n3GKHRam030454@dev.open-bio.org> Revision: 15641 Author: cjfields Date: 2009-04-16 16:17:27 -0400 (Thu, 16 Apr 2009) Log Message: ----------- [bug RT 44782] * evalue now caught * patch courtesy Allen Day Modified Paths: -------------- bioperl-live/trunk/Bio/SearchIO/Writer/GbrowseGFF.pm Modified: bioperl-live/trunk/Bio/SearchIO/Writer/GbrowseGFF.pm =================================================================== --- bioperl-live/trunk/Bio/SearchIO/Writer/GbrowseGFF.pm 2009-04-16 17:25:09 UTC (rev 15640) +++ bioperl-live/trunk/Bio/SearchIO/Writer/GbrowseGFF.pm 2009-04-16 20:17:27 UTC (rev 15641) @@ -107,7 +107,7 @@ ($self->{'_evalue'}, $self->{'_cigar'}, $self->{'_prefix'}, - $self->{'signif'} ) = $self->_rearrange([qw(E_VALUE OUTPUT_CIGAR PREFIX + $self->{'_signif'} ) = $self->_rearrange([qw(E_VALUE OUTPUT_CIGAR PREFIX OUTPUT_SIGNIF)], @args); $self->{'_evalue'} && warn( "Use of the -e_value argument is deprecated.\nIn future, use \$writer->filter(\"type\", \&code) instead.\n\tparsing will proceed correctly with this e_value\n"); $self->{Gbrowse_HSPID} = 0; From cjfields at dev.open-bio.org Thu Apr 16 16:25:42 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Thu, 16 Apr 2009 16:25:42 -0400 Subject: [Bioperl-guts-l] [15642] bioperl-live/trunk/Bio/DB/SeqFeature/Store/FeatureFileLoader.pm: [ bug RT 44535] Message-ID: <200904162025.n3GKPgOp030487@dev.open-bio.org> Revision: 15642 Author: cjfields Date: 2009-04-16 16:25:42 -0400 (Thu, 16 Apr 2009) Log Message: ----------- [bug RT 44535] * strand '.' == 0 * patch courtesy Cathy Gresham Modified Paths: -------------- bioperl-live/trunk/Bio/DB/SeqFeature/Store/FeatureFileLoader.pm Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store/FeatureFileLoader.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/Store/FeatureFileLoader.pm 2009-04-16 20:17:27 UTC (rev 15641) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Store/FeatureFileLoader.pm 2009-04-16 20:25:42 UTC (rev 15642) @@ -639,6 +639,7 @@ -strand => $strand eq '+' ? 1 :$strand eq '-' ? -1 :$strand eq '' ? 0 + :$strand eq '.' ? 0 :$strand == 1 ? 1 :$strand == -1 ? -1 :0, From bugzilla-daemon at portal.open-bio.org Thu Apr 16 19:58:01 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 16 Apr 2009 19:58:01 -0400 Subject: [Bioperl-guts-l] [Bug 2753] Unknown format of PAML output did not see seqtype in Yn00 In-Reply-To: Message-ID: <200904162358.n3GNw1tX005893@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2753 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|1.6.1 point release |1.6.x point release ------- Comment #2 from cjfields at bioperl.org 2009-04-16 19:58 EST ------- Pushing to a later release (per my last comments and to wait on respondent). -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Apr 16 20:00:10 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 16 Apr 2009 20:00:10 -0400 Subject: [Bioperl-guts-l] [Bug 2505] Add in a sort feature for SeqFeatureI get_all_tags() In-Reply-To: Message-ID: <200904170000.n3H00AaQ006108@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2505 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|1.6 release |1.6.x point release ------- Comment #1 from cjfields at bioperl.org 2009-04-16 20:00 EST ------- push to 1.6.x -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Thu Apr 16 21:43:48 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Thu, 16 Apr 2009 21:43:48 -0400 Subject: [Bioperl-guts-l] [15643] bioperl-live/trunk: [bug 2575] Message-ID: <200904170143.n3H1hmTL031927@dev.open-bio.org> Revision: 15643 Author: cjfields Date: 2009-04-16 21:43:48 -0400 (Thu, 16 Apr 2009) Log Message: ----------- [bug 2575] * added frac_identical_total, frac_conserved_total to end of table (so not to break scripts) * moving GbrowseGFF to Writer Modified Paths: -------------- bioperl-live/trunk/Bio/SearchIO/Writer/HSPTableWriter.pm Added Paths: ----------- bioperl-live/trunk/t/SearchIO/Writer/GbrowseGFF.t bioperl-live/trunk/t/SearchIO/Writer/HSPTableWriter.t Removed Paths: ------------- bioperl-live/trunk/t/SearchIO/GbrowseGFF.t Modified: bioperl-live/trunk/Bio/SearchIO/Writer/HSPTableWriter.pm =================================================================== --- bioperl-live/trunk/Bio/SearchIO/Writer/HSPTableWriter.pm 2009-04-16 20:25:42 UTC (rev 15642) +++ bioperl-live/trunk/Bio/SearchIO/Writer/HSPTableWriter.pm 2009-04-17 01:43:48 UTC (rev 15643) @@ -103,6 +103,8 @@ frame # Reading frame of the aligned query sequence hit_description # Full description of the hit sequence query_description # Full description of the query sequence + frac_identical_total # fraction of total identical substitutions + frac_conserved_total # fraction of total conserved substitutions For more details about these columns, see the documentation for the corresponding method in Bio::Search::HSP::HSPI. @@ -213,6 +215,8 @@ 'frame_query' => ['26', 'hsp', 'frame/query', 's', 'FRAME_Q'], 'hit_description' => ['27', 'hit', 'hit_description', 's', 'DESC_H'], 'query_description' => ['28', 'result', 'query_description', 's', 'DESC_Q'], + 'frac_identical_total' => ['29', 'hsp', 'frac_identical/total', '.2f', 'FR_IDT'], + 'frac_conserved_total' => ['30', 'hsp', 'frac_conserved/total', '.2f', 'FR_CNT'], ); sub column_map { return %column_map } @@ -257,7 +261,7 @@ while(my $hsp = $hit->next_hsp) { next if ( defined $hspfilter && ! &{$hspfilter}($hsp)); my @row_data = &{$func_ref}($result, $hit, $hsp); - $str .= sprintf "$printf_fmt\n", @row_data; + $str .= sprintf("$printf_fmt\n", map {$_ || ($printf_fmt eq 's' ? '' : 0)} @row_data); } } } Deleted: bioperl-live/trunk/t/SearchIO/GbrowseGFF.t =================================================================== --- bioperl-live/trunk/t/SearchIO/GbrowseGFF.t 2009-04-16 20:25:42 UTC (rev 15642) +++ bioperl-live/trunk/t/SearchIO/GbrowseGFF.t 2009-04-17 01:43:48 UTC (rev 15643) @@ -1,25 +0,0 @@ -# -*-Perl-*- Test Harness script for Bioperl -# $Id$ - -use strict; - -BEGIN { - use lib '.'; - use Bio::Root::Test; - - test_begin(-tests => 3); - - use_ok('Bio::SearchIO'); -} - -my $in = Bio::SearchIO->new(-format => 'blast', - -file => test_input_file('brassica_ATH.WUBLASTN')); -my $out = Bio::SearchIO->new(-output_format => 'GbrowseGFF', - -prefix => 'Sequence', - -output_cigar => 1, - -output_signif => 1, - -file => ">".test_output_file()); -ok($out); -while( my $r = $in->next_result ) { - ok($out->write_result($r)); -} Copied: bioperl-live/trunk/t/SearchIO/Writer/GbrowseGFF.t (from rev 15634, bioperl-live/trunk/t/SearchIO/GbrowseGFF.t) =================================================================== --- bioperl-live/trunk/t/SearchIO/Writer/GbrowseGFF.t (rev 0) +++ bioperl-live/trunk/t/SearchIO/Writer/GbrowseGFF.t 2009-04-17 01:43:48 UTC (rev 15643) @@ -0,0 +1,32 @@ +# -*-Perl-*- Test Harness script for Bioperl +# $Id$ + +use strict; + +BEGIN { + use lib '.'; + use Bio::Root::Test; + + test_begin(-tests => 4); + + use_ok('Bio::SearchIO'); +} + +my $in = Bio::SearchIO->new(-format => 'blast', + -file => test_input_file('brassica_ATH.WUBLASTN')); + +my $outfile = test_output_file(); + +my $out = Bio::SearchIO->new(-output_format => 'GbrowseGFF', + -prefix => 'Sequence', + -output_cigar => 1, + -output_signif => 1, + -file => ">$outfile"); +ok($out); +while( my $r = $in->next_result ) { + ok($out->write_result($r)); +} + +ok(-s $outfile); + +# tests checking file output? Added: bioperl-live/trunk/t/SearchIO/Writer/HSPTableWriter.t =================================================================== --- bioperl-live/trunk/t/SearchIO/Writer/HSPTableWriter.t (rev 0) +++ bioperl-live/trunk/t/SearchIO/Writer/HSPTableWriter.t 2009-04-17 01:43:48 UTC (rev 15643) @@ -0,0 +1,48 @@ +# -*-Perl-*- Test Harness script for Bioperl +# $Id$ + +use strict; + +BEGIN { + use lib '.'; + use Bio::Root::Test; + + test_begin(-tests => 8); + + use_ok('Bio::SearchIO'); + use_ok('Bio::SearchIO::Writer::HSPTableWriter'); +} + +my ($searchio, $result, $hit, $hsp); + +$searchio = Bio::SearchIO->new('-format' => 'blast', + '-file' => test_input_file('HUMBETGLOA.tblastx')); + +$result = $searchio->next_result; + +isa_ok($result,'Bio::Search::Result::ResultI'); +$hit = $result->next_hit; +is($hit->accession, 'AE000479'); +is($hit->bits, 33.6); +$hsp = $hit->next_hsp; +is($hit->hsp->bits,$hsp->bits); +isa_ok($hsp->get_aln,'Bio::Align::AlignI'); + +my $writer = Bio::SearchIO::Writer::HSPTableWriter->new( +-columns => [qw( + query_name + query_length + hit_name + hit_length + rank + frac_identical_query + expect + )] ); + +my $outfile = test_output_file(); +my $out = Bio::SearchIO->new(-writer => $writer, + -file => ">$outfile"); +$out->write_result($result, 1); +ok(-s $outfile); + +# tests checking file output? Property changes on: bioperl-live/trunk/t/SearchIO/Writer/HSPTableWriter.t ___________________________________________________________________ Name: svn:executable + * Name: svn:mime-type + text/x-perl Name: svn:keywords + "Author Date Id Rev URL" Name: svn:eol-style + native From bugzilla-daemon at portal.open-bio.org Thu Apr 16 21:44:15 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 16 Apr 2009 21:44:15 -0400 Subject: [Bioperl-guts-l] [Bug 2575] Easy request - a frac_identical_total column in HSPTableWriter.pm? In-Reply-To: Message-ID: <200904170144.n3H1iFn5012914@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2575 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #3 from cjfields at bioperl.org 2009-04-16 21:44 EST ------- Janet I've added frac_identical_total and frac_conserved_total to HSPTableWriter, but I've added them to the end of the columns so it doesn't affect scripts relying on the order. I believe you can reorder the columns as you need, though. Closing out. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Thu Apr 16 22:16:07 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Thu, 16 Apr 2009 22:16:07 -0400 Subject: [Bioperl-guts-l] [15644] bioperl-live/trunk/Bio/Search/SearchUtils.pm: [bug RT 36480] Message-ID: <200904170216.n3H2G7IG032035@dev.open-bio.org> Revision: 15644 Author: cjfields Date: 2009-04-16 22:16:07 -0400 (Thu, 16 Apr 2009) Log Message: ----------- [bug RT 36480] * small fix for SearchUtils to deal with overlap issues with ambiguous_aln * fix courtesy of C?\195?\169dric Cabau Modified Paths: -------------- bioperl-live/trunk/Bio/Search/SearchUtils.pm Modified: bioperl-live/trunk/Bio/Search/SearchUtils.pm =================================================================== --- bioperl-live/trunk/Bio/Search/SearchUtils.pm 2009-04-17 01:43:48 UTC (rev 15643) +++ bioperl-live/trunk/Bio/Search/SearchUtils.pm 2009-04-17 02:16:07 UTC (rev 15644) @@ -197,13 +197,13 @@ $hit_len_aln += $hsp->length; ## Collect contigs in the query sequence. - $qoverlap = &_adjust_contigs('query', $hsp, $qstart, $qstop, + $qoverlap += &_adjust_contigs('query', $hsp, $qstart, $qstop, \@qcontigs, $max_overlap, $frame, $qstrand); ## Collect contigs in the sbjct sequence # (needed for domain data and gapped Blast). - $soverlap = &_adjust_contigs('sbjct', $hsp, $sstart, $sstop, + $soverlap += &_adjust_contigs('sbjct', $hsp, $sstart, $sstop, \@scontigs, $max_overlap, $frame, $sstrand); From bugzilla-daemon at portal.open-bio.org Thu Apr 16 22:53:14 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 16 Apr 2009 22:53:14 -0400 Subject: [Bioperl-guts-l] [Bug 2739] using nonstandard characters in protein alignment results in residue count warning In-Reply-To: Message-ID: <200904170253.n3H2rELN016773@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2739 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|1.6.1 point release |1.6.x point release ------- Comment #4 from cjfields at bioperl.org 2009-04-16 22:53 EST ------- Janet, I'll look into changing this so it doesn't affect things globally but it relates to LocatableSeq refactoring and use of global package variables, so it'll be pushed back a small bit. However, you can work around the issue by explicitly calling Bio::LocatableSeq and amending/resetting the specific package variable directly: #!/usr/bin/perl use warnings; use strict; use Bio::AlignIO; use Bio::LocatableSeq; $Bio::LocatableSeq::GAP_SYMBOLS .= '\*\?'; foreach my $infile (@ARGV) { my $outfile = "$infile" . ".out"; my $in = Bio::AlignIO->new(-file => $infile , -format => 'fasta'); my $out = Bio::AlignIO->new(-file => ">$outfile" , -format => 'fasta'); while (my $aln = $in->next_aln()) { $out -> write_aln($aln); } } -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Fri Apr 17 09:31:55 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Fri, 17 Apr 2009 09:31:55 -0400 Subject: [Bioperl-guts-l] [15645] bioperl-live/trunk/Bio/DB/SwissProt.pm: [bug 2764] Message-ID: <200904171331.n3HDVtUK005004@dev.open-bio.org> Revision: 15645 Author: cjfields Date: 2009-04-17 09:31:54 -0400 (Fri, 17 Apr 2009) Log Message: ----------- [bug 2764] * add simple idtracker() method to SwissProt to retrieve current ID * based on Neil Saunders's script: http://nsaunders.wordpress.com/2008/03/07/missing-links-using-swissprot-idtracker-in-your-code/ * not sure how this will work when everything transitions to the new UniProt site: http://www.uniprot.org/ Modified Paths: -------------- bioperl-live/trunk/Bio/DB/SwissProt.pm Modified: bioperl-live/trunk/Bio/DB/SwissProt.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SwissProt.pm 2009-04-17 02:16:07 UTC (rev 15644) +++ bioperl-live/trunk/Bio/DB/SwissProt.pm 2009-04-17 13:31:54 UTC (rev 15645) @@ -110,19 +110,19 @@ package Bio::DB::SwissProt; use strict; -use vars qw($MODVERSION %HOSTS $DEFAULTFORMAT $DEFAULTSERVERTYPE); -$MODVERSION = '0.8.1'; use HTTP::Request::Common; +our $MODVERSION = '0.8.1'; use base qw(Bio::DB::WebDBSeqI); # global vars -$DEFAULTSERVERTYPE = 'ebi'; -$DEFAULTFORMAT = 'swissprot'; +our $DEFAULTSERVERTYPE = 'ebi'; +our $DEFAULTFORMAT = 'swissprot'; +our $DEFAULTIDTRACKER = 'http://www.expasy.ch'; # you can add your own here theoretically. -%HOSTS = ( +our %HOSTS = ( 'expasy' => { 'default' => 'us', 'baseurl' => 'http://%s/cgi-bin/sprot-retrieve-list.pl', @@ -456,6 +456,40 @@ return @{$self->{'_format'}}; } +=head2 idtracker + + Title : idtracker + Usage : my ($newid) = $self->idtracker($oldid); + Function: Retrieve new ID using old ID. + Returns : single ID if one is found + Args : ID to look for + +=cut + +sub idtracker { + my ($self, $id) = @_; + return unless defined $id; + my $st = $self->servertype; + my $base = ($st eq 'expasy') ? "http://".$HOSTS{$st}->{'hosts'}->{$self->hostlocation} + : $DEFAULTIDTRACKER; + my $url = $base.'/cgi-bin/idtracker?id='.$id; + my $response; + eval {$response = $self->ua->get($url)}; + if ($@ || $response->is_error) { + my $error = $@ || $response->error_as_HTML; + $self->throw("Error:\n".$error); + } + if ($response->content =~ /was renamed to (.*?)<\/b>/) { + return $1; + } elsif ($response->content =~ /Entry name<\/th>Accession number<\/th>Release created<\/th><\/tr>/){ + # output indicates no mapping needed, return original ID + return $id; + } else { + $self->warn("Unknown response:\n".$response->content); + return + } +} + 1; __END__ From bugzilla-daemon at portal.open-bio.org Fri Apr 17 09:39:45 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 17 Apr 2009 09:39:45 -0400 Subject: [Bioperl-guts-l] [Bug 2764] enhance SwissProt retreival by id using IDTracker In-Reply-To: Message-ID: <200904171339.n3HDdj7R028756@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2764 ------- Comment #2 from cjfields at bioperl.org 2009-04-17 09:39 EST ------- I have added a simple idtracker() method that uses a modification of Neil's code. Incorporating it into get_Seq_by_id is a little tricky as it delegates to get_Stream_by_id in WebDBSeqI. It would be nice to handle streams as well but it may be asking too much, and apparently the new UniProt site (http://www.uniprot.org/) allows both old and new IDs. Maybe our efforts are better spent on updating SwissProt to use that site instead? -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Fri Apr 17 10:28:10 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 17 Apr 2009 10:28:10 -0400 Subject: [Bioperl-guts-l] [Bug 2813] New: AlignIO returns undef when last sequence has zero end Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2813 Summary: AlignIO returns undef when last sequence has zero end Product: BioPerl Version: unspecified Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Core Components AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: bernd at bio.vu.nl AlignIO parsers as clustalw and fasta do not return an alignment object when the end coordinate of the last sequence is undefined or zero. This can happen if the last sequence contains gaps only as a result of taking a slice with SimpleAlign earlier. This is caused by the follwing code (present in 1.5 and 1.6). As a quick work-around I changed <= 0 to < 0 in the code of clustalw.pm. In fasta.pm: # If $end <= 0, we have either reached the end of # file in <> or we have encountered some other error if ( $end <= 0 ) { undef $aln; return $aln; } In clustalw.pm # not sure if this should be a default option - or we can pass in # an option to do this in the future? --jason stajich # $aln->map_chars('\.','-'); undef $aln if ( !defined $end || $end <= 0 ); return $aln; Example code and input: use Bio::AlignIO; my $in = Bio::AlignIO->new(-file => 'test.aln', -format => 'clustalw'); my $out = Bio::AlignIO->new(-file => '>testerr.ALN', -format => 'clustalw'); my $aln = $in->next_aln(); print $aln->length, "\n"; test.aln contains: CLUSTAL W(1.81) multiple sequence alignment QUERY/7-143 PETLE-ARINRATNPLNKEL--DWASI 7082547/1-128 ---------ERATNDMLIGP--DWAVN 1_3265048/1-0 --------------------------- 3265047/2-138 QTSLE-ALLLKATNSQNQNI--DTAAV 1_3265047/1-0 --------------------------- -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From thm09830 at dev.open-bio.org Fri Apr 17 11:01:23 2009 From: thm09830 at dev.open-bio.org (Marian Thieme) Date: Fri, 17 Apr 2009 11:01:23 -0400 Subject: [Bioperl-guts-l] [15646] bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm: Documentation , Bugfixes, Minor Changes Message-ID: <200904171501.n3HF1NNk005282@dev.open-bio.org> Revision: 15646 Author: thm09830 Date: 2009-04-17 11:01:23 -0400 (Fri, 17 Apr 2009) Log Message: ----------- Documentation, Bugfixes, Minor Changes Modified Paths: -------------- bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm Modified: bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm =================================================================== --- bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm 2009-04-17 13:31:54 UTC (rev 15645) +++ bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm 2009-04-17 15:01:23 UTC (rev 15646) @@ -17,9 +17,16 @@ =head1 SYNOPSIS use Bio::Microarray::Tools::ReseqChip; - - + + # fasta file with reference sequence context useed for the Chip (MitoChip in that case) + my $refseq_chip=...; + #... and design file that describes the addional probes, grouped by consecutive probes covering 20-100 consecutive positions referred to the reference sequence + my $Affy_frags_design_filename=...; + #format of the design file + $format="xls"; + # positions that are missing with respect to the reference sequence (rCRS - cambridge reference sequence) are going to be marked, so numbering with respect to the rCRS is conform my %ref_seq_max_ins_hash=(3106 => 1); + my $reseqfragSample=Bio::Tools::ReseqChip->new( $Affy_frags_design_filename, $format, @@ -94,7 +101,12 @@ Assumption: Gaps which are inserted in several fragments and in the reference sequence itself refer to the reference sequence. The reference sequence is given as input parameter. -Optionshash, specifying the explained parameter and some further options is provided by the user. +Optionshash, specifying the explained parameter and some further options is provided by the user: +we recommend to only specify the following hash-values that correnspod to the hash-keys resp. parameters: + flank_right (K-neighbourhood up and downstream of the genotype position in question) + call_threshold (minU) + depth (minP) + allowed_n_flank (maxN in K-neighbourhood) This module depends on the following modules: @@ -111,7 +123,7 @@ =head1 AUTHORS -Marian Thieme (marian.thieme at arcor.de) +Marian Thieme (marian.thieme at gmail.com) =head1 COPYRIGHT @@ -779,15 +791,10 @@ my ($self, $pos, $sequence) = @_; my $swap=0; - #print "\ninside swap_insertion ($pos)\n"; - #print "offset: ".$self->_get_start_pos($pos-$self->{_oligo_flank_length}, '_frags_max_ins_hash', 1)."\n"; my $pos1=$pos; my $offset=$self->_get_start_pos($pos-$self->{_oligo_flank_length}); $pos+=$self->{_oligo_flank_length}-$offset; foreach my $key (sort{$a<=>$b} keys %{$self->{_max_ins_hash}}) { - #print "\tstart_gap: ".($key)."\tend_gap: "; - #print ($key+$self->{_max_ins_hash}{$key}-1); - #print " vs ".($pos)."\n"; if ($key+$self->{_max_ins_hash}{$key}-1>=$pos and $key+2<=$pos) { my $isgap=1; my $i=1; @@ -800,7 +807,6 @@ $isgap=0; } } - #print "swap: (#positions): $swap\n"; last; } } @@ -811,7 +817,7 @@ Title : calc_sequence() Usage : $myReseqFrags->calc_sequence($aln, $options_hash [, $filename]); - Function : iterates over each position of the sequence (first element in the alignment) and change calls if sufficient evidence can be obtained by the alignment of redundant fragments + Function : iterates over each position of the sequence (first element in the alignment) and changes calls if sufficient evidence can be obtained by the alignment of redundant fragments Returns : sequence (string) Args : $als : locatable sequence object (redundand fragment) $options_hash : hash of options for redundant fragments analysis @@ -821,11 +827,8 @@ sub calc_sequence() { - my ($self, $aln, $seq_ins_hash_ref, $options_hash, $filename_rawrow) = @_; -# $filename_rawrow="test_reference_output.txt"; -# print("1. $aln (ReseqChip)\n"); -# print("============calc_sequence===========\n"); - #$aln_ins=0; + my ($self, $aln, $options_hash, $filename_rawrow) = @_; + my $final_seq=""; my $start_c=1; my $stop_c=0; @@ -865,14 +868,11 @@ my $count=0; my $output_rawrow_tmp=""; my $not_only_ref=0; - my $seq_ins; - -# if ($i % 3 == 0) { - + my $alt_seq; foreach my $seq ($aln->each_seq) { - if ($seq_ins_hash_ref) { - $seq_ins=$seq_ins_hash_ref->{$seq->id()}; + if ($options_hash->{alternative_sequence_hash}) { + $alt_seq=$options_hash->{alternative_sequence_hash}{$seq->id()}; } my $offset=$self->_get_start_pos($seq->start()); if ($seq_no==1) { @@ -882,48 +882,29 @@ } if ($i>$stop_c and $stop_c>0) { $stop=1; - #print("cut: $i bis zum ende"); - if ($options_hash->{QT_no}) { - #print("QT_no: ".$options_hash->{QT_no}."\n"); - if ($options_hash->{QT_no} eq "ins" and $seq_ins_hash_ref) { - $final_seq.=$seq_ins->subseq($i,$seq->length()); - #print(" mit ins (QT3)"); - } else { - $final_seq.=$seq->subseq($i,$seq->length()); - } + if ($options_hash->{alternative_sequence_hash}) { + $final_seq.=$alt_seq->subseq($i,$seq->length()); } else { $final_seq.=$seq->subseq($i,$seq->length()); - #print(" mit sub (QT12)"); } - #print("\n"); - #print("$i <".$seq->start()." + $offset\n"); last; } - #print "$i: $ref_base\n"; if ($i_neu != $i) { - #$i_neu=$i_neu; - #print("cut: $i bis ".($i_neu-1).""); - if ($options_hash->{QT_no}) { - if ($options_hash->{QT_no} eq "ins" and $seq_ins_hash_ref) { - $final_seq.=$seq_ins->subseq($i,$i_neu-1); - #print(" mit ins (QT3)"); - } else { - $final_seq.=$seq->subseq($i,$i_neu-1); - } + if ($options_hash->{alternative_sequence_hash}) { + $final_seq.=$alt_seq->subseq($i,$i_neu-1); } else { $final_seq.=$seq->subseq($i,$i_neu-1); - #print(" mit sub (QT12)"); } - #print("\n"); } $i=$i_neu; } ##add base to basearray if it fullfill the criteria - ##differ between insertions - if ($ref_base eq "-" and $seq_ins_hash_ref) { - ($not_only_ref, $count, $output_rawrow_tmp)=$self->_augment_base_array($seq_ins, $ref_base, \@base_array, $not_only_ref, $count, $i, $offset, $seq_no, $options_hash, $filename_rawrow, $output_rawrow_tmp); - ##and non insertions (sub/del) + ##differ between alternative and + #if ($ref_base eq "-" and $options_hash->{alternative_sequence_hash}) { + if ($options_hash->{alternative_sequence_hash}) { + ($not_only_ref, $count, $output_rawrow_tmp)=$self->_augment_base_array($alt_seq, $ref_base, \@base_array, $not_only_ref, $count, $i, $offset, $seq_no, $options_hash, $filename_rawrow, $output_rawrow_tmp); + ##and "normal" sequence } else { ($not_only_ref, $count, $output_rawrow_tmp)=$self->_augment_base_array($seq, $ref_base, \@base_array, $not_only_ref, $count, $i, $offset, $seq_no, $options_hash, $filename_rawrow, $output_rawrow_tmp); } @@ -933,7 +914,7 @@ $aln->remove_seq($seq); #$aln->sort_alphabetically; } - #finish iteration, if startpos of current sequence is outside of current position + #finish iteration, if startpos of current fragment/sequence is outside of current position in the alignment if ($i<$seq->start()+$offset) { last; } @@ -946,58 +927,51 @@ } #at least one nonref base is available if ($not_only_ref) { - #if ($ref_base eq "n" or ($i>4767 and $i<4780)) { - # my $offset=$self->_get_start_pos($i); - # print("$i (+$offset), get consensus call:.".$ref_base." @base_array\n"); - #} - #print($output_rawrow_tmp); - $output_rawrow.=$output_rawrow_tmp; - ($final_seq, $output_rawrow)=$self->_get_consensus_call($ref_base, \@base_array, $count, $final_seq, $options_hash, $filename_rawrow, $output_rawrow); -# print(substr($final_seq, $i-1)." (ref: $ref_base)\n"); + ($final_seq, $output_rawrow)=$self->_get_consensus_call($ref_base, \@base_array, $count, $final_seq, $options_hash, $filename_rawrow, $output_rawrow, $i); } else { - #if ($options_hash->{call_n}) { - # if (@base_array>0) { - # $final_seq.="n"; - # } - #} else { $final_seq.=$ref_base; - #} } -# } $i++; - #print("$i\n"); } if ($filename_rawrow) { open (my $RAWROW, ">>$filename_rawrow") or die "Cannot open file\n $!\n"; print $RAWROW $output_rawrow; close($RAWROW); } -# open (RAWROW, ">test_reference_seq.txt"); -# print RAWROW $final_seq; -# close(RAWROW); + return $final_seq; } + +=head2 _augment_base_array() + + Title : _augment_base_array() + Usage : dont call directly + Function : pushs base into the array of candidate bases, if the base meets criteria (options_hash->{allowed_n_in_flank}, options_hash->{flank_left/right}) + Returns : 3 values: + $not_only_ref 0/1: whether the array of candidate bases contains only reference base + $count integer: number of candidate base currently in the array + $output_rawrow_tmp string: log message, currently not used. + Args : lot of args, see the function itself + +=cut + sub _augment_base_array() { my ($self, $seq, $ref_base, $base_array, $not_only_ref, $count, $i, $offset, $seq_no, $options_hash, $filename_rawrow, $output_rawrow_tmp) = @_; if ($seq->start() < $i-$offset and ($seq->end()+$offset+$self->{_oligo_flank_length}) > $i) { - #if ($seq->start() < $i-$offset and $seq->end()+$offset+2+$self->{_oligo_flank_length} > $i) { - #print "$i id: ".$seq->id()." ";#.$seq->seq()."\n"; my $cleared_pos=$i-($seq->start()+$offset); if ($cleared_pos<=$seq->length()) { my $base=$seq->subseq($cleared_pos, $cleared_pos); if ($base ne 'n') { - #print $seq->subseq($cleared_pos,$cleared_pos); if ($filename_rawrow) { $output_rawrow_tmp.= $base; $count++; } - #print $base;#."\n"; @@ Diff output truncated at 10000 characters. @@ From thm09830 at dev.open-bio.org Fri Apr 17 11:01:52 2009 From: thm09830 at dev.open-bio.org (Marian Thieme) Date: Fri, 17 Apr 2009 11:01:52 -0400 Subject: [Bioperl-guts-l] [15647] bioperl-live/trunk/Bio/Microarray/Tools/MitoChipV2Parser.pm: Documentation, Bugfixes, Minor Changes Message-ID: <200904171501.n3HF1q4C005313@dev.open-bio.org> Revision: 15647 Author: thm09830 Date: 2009-04-17 11:01:51 -0400 (Fri, 17 Apr 2009) Log Message: ----------- Documentation, Bugfixes, Minor Changes Modified Paths: -------------- bioperl-live/trunk/Bio/Microarray/Tools/MitoChipV2Parser.pm Modified: bioperl-live/trunk/Bio/Microarray/Tools/MitoChipV2Parser.pm =================================================================== --- bioperl-live/trunk/Bio/Microarray/Tools/MitoChipV2Parser.pm 2009-04-17 15:01:23 UTC (rev 15646) +++ bioperl-live/trunk/Bio/Microarray/Tools/MitoChipV2Parser.pm 2009-04-17 15:01:51 UTC (rev 15647) @@ -314,17 +314,17 @@ $subc+=$diff_var_hash{$pos}{$type}{$var}; } - else {$self->warn("doesnt exist!\n");} + else {print "doesnt exist!\n";} #print " $var: $count_var ($count_tot_pos)"; } } #print("$pos sub $subc del $delc ins $insc\n"); - #print("$pos sub $count_var_sub del $count_var_del ins $count_var_ins\n"); + print("$pos sub $count_var_sub del $count_var_del ins $count_var_ins\n"); } - #print "\n\nPos: $count_pos/Different Types: $count_type/Different Variants: $count_var/Total No. of Variants: $count_tot\n\n"; - #print "ins: $count_var_ins $count_var_ins_tot\n"; - #print "del: $count_var_del $count_var_del_tot\n"; - #print "sub: $count_var_sub $count_var_sub_tot\n"; + print "\n\nPos: $count_pos/Different Types: $count_type/Different Variants: $count_var/Total No. of Variants: $count_tot\n\n"; + print "ins: $count_var_ins $count_var_ins_tot\n"; + print "del: $count_var_del $count_var_del_tot\n"; + print "sub: $count_var_sub $count_var_sub_tot\n"; } =head2 _calc_oligo_region_hash() From thm09830 at dev.open-bio.org Fri Apr 17 11:08:35 2009 From: thm09830 at dev.open-bio.org (Marian Thieme) Date: Fri, 17 Apr 2009 11:08:35 -0400 Subject: [Bioperl-guts-l] [15648] bioperl-live/trunk/t/Microarray/Tools/ReseqChip.t: Test routine fixed Message-ID: <200904171508.n3HF8Zg8005485@dev.open-bio.org> Revision: 15648 Author: thm09830 Date: 2009-04-17 11:08:34 -0400 (Fri, 17 Apr 2009) Log Message: ----------- Test routine fixed Modified Paths: -------------- bioperl-live/trunk/t/Microarray/Tools/ReseqChip.t Modified: bioperl-live/trunk/t/Microarray/Tools/ReseqChip.t =================================================================== --- bioperl-live/trunk/t/Microarray/Tools/ReseqChip.t 2009-04-17 15:01:51 UTC (rev 15647) +++ bioperl-live/trunk/t/Microarray/Tools/ReseqChip.t 2009-04-17 15:08:34 UTC (rev 15648) @@ -5,7 +5,7 @@ BEGIN { use lib '.'; use Bio::Root::Test; - test_begin(-tests => 7, + test_begin(-tests => 10, -requires_modules => [qw(Statistics::Frequency Spreadsheet::ParseExcel Spreadsheet::WriteExcel)] @@ -15,6 +15,58 @@ my $DEBUG = test_debug(); +sub read_params($$$$$) { + + my ($params_filename, $options_hash, $ncall_threshold, $ploidy, $mut_type) = @_; + + open(PARAMSFILE, $params_filename); + if (!(-e $params_filename)) { + print "File ($params_filename) does not exists !!!\n"; + } + my $old_ncall=0; + my $cur_ncall=0; + my @params; + my @p_array; + my $test=0; + while () { + if (/$ploidy/) { + if (/$mut_type/) { + + $test=1; + @params = split(' ', $_); + @p_array=split('_', $params[0]); + $cur_ncall=$params[1]; +# print("Jo da wernma hier: ".$_." Crit: $ncall_threshold, $old_ncall, $cur_ncall\n"); + if ($ncall_threshold<$cur_ncall and $ncall_threshold>=$old_ncall) { + + last; + } + $old_ncall=$cur_ncall; + } + } + } + close(ROCFILE); + ok($test, 'read_params'); + + if ($mut_type eq 'Subdel') { + $options_hash->{depth}=$p_array[3]; + $options_hash->{depth_del}=$p_array[3]; + $options_hash->{flank_left}=$p_array[4]; + $options_hash->{flank_right}=$p_array[5]; + $options_hash->{allowed_n_in_flank}=$p_array[6]; + $options_hash->{call_threshold}=$p_array[7]; + $options_hash->{del_threshold}=$p_array[7]; + } else { + $options_hash->{depth_ins}=$p_array[3]; + $options_hash->{flank_left_ins}=$p_array[4]; + $options_hash->{flank_right_ins}=$p_array[5]; + $options_hash->{allowed_n_in_flank_ins}=$p_array[6]; + $options_hash->{ins_threshold}=$p_array[7]; + } + +} + + sub process_sample($$$$$$$) { my ($myReseqChip, $aln, $ind_id, $options_hash, $newseq_output_filename, $recalls_output_filename, $workbook) = @_; @@ -22,12 +74,18 @@ $aln->sort_alphabetically; $myReseqChip->write_alignment2xls($aln, $workbook, $ind_id, 'human_mtDNA_RCRS', 1); - ok(-e $workbook, 'write_alignment2xls'); - my ($newseq,$dummy); - $newseq=$myReseqChip->calc_sequence($aln, $dummy, $options_hash, $recalls_output_filename); - ok(-e $recalls_output_filename, 'calc_sequence'); + + my $newseq=$myReseqChip->calc_sequence($aln, $options_hash, $recalls_output_filename); + ##test if $newseq has expected length + ok(length($newseq)==16565, 'calc_sequence: length'); + ##test if logfile has expected size + #print((-s $recalls_output_filename)."logfile size\n"); + ok(((-s $recalls_output_filename)==9242 or (-s $recalls_output_filename)==20622), 'calc_sequence: logfile'); + $myReseqChip->write2fasta($newseq, $ind_id, $newseq_output_filename, 1); - ok(-e $newseq_output_filename, 'write2fasta'); + ##test if fastafile has expected size + #print((-s $newseq_output_filename)."fastafile size\n"); + ok(((-s $newseq_output_filename)==16873 or (-s $newseq_output_filename)==33746), 'write2fasta'); } @@ -35,7 +93,9 @@ my $Affy_frags_design_filename = test_input_file('ReseqChip_mtDNA_design_annotation_file_FINAL.xls'); my $format = 'affy_mitochip_v2'; my $Affy_sample_fasta_file = test_input_file('ReseqChip_ExampleData.fasta'); +my $Affy_alternative_sample_fasta_file = test_input_file('ReseqChip_ExampleData.fasta'); my $Mito_reference_fasta_file = test_input_file('ReseqChip_RefSeq.fasta'); +my $Parameter_file = test_input_file('ReseqChip_ParamsNcall.csv'); #my $tmpdir = File::Spec->catfile(qw(t tmp)); #mkdir($tmpdir,0777); @@ -45,8 +105,12 @@ my $recalls_output_filename=test_output_file(); my $newseq_output_filename=test_output_file(); -my $in = Bio::SeqIO->new(-file => $Affy_sample_fasta_file, +my $in = Bio::SeqIO->new(-file => $Affy_sample_fasta_file, -format => 'Fasta'); + +my $in_alt = Bio::SeqIO->new(-file => $Affy_alternative_sample_fasta_file, + -format => 'Fasta'); + my $in_refseq = Bio::SeqIO->new(-file => $Mito_reference_fasta_file, -format => 'Fasta'); my $refseq = $in_refseq->next_seq(); @@ -54,93 +118,37 @@ my $myReseqChip = Bio::Microarray::Tools::ReseqChip->new($Affy_frags_design_filename, $format, \%ref_seq_max_ins_hash, $refseq); -my %options_hash=( - include_main_sequence => 1, - insertions => 1, - deletions => 1, - depth_ins => 1, - depth_del => 9, - depth => 1, -# depth_ins => 0, -# depth_del => 0, -# depth => 0, - consider_context => 1, - flank_left => 10, - flank_right => 10, - allowed_n_in_flank => 0, -# allowed_n_in_flank => 20, - flank_left_ins => 4, - flank_right_ins => 4, - allowed_n_in_flank_ins => 1, -# allowed_n_in_flank_ins => 21, - flank_size_weak => 1, - call_threshold => 55, - ins_threshold => 35, - del_threshold => 75, - swap_ins => 1, -# start_pos=> 300, -# stop_pos=> 320, - ); - ##general options -$options_hash{include_main_sequence}=1; -$options_hash{insertions}=1; -$options_hash{deletions}=1; -$options_hash{flank_size_weak}=1; -$options_hash{consider_context}=1; -$options_hash{swap_ins}=1; +my %options_hash=( + include_main_sequence => 1, + insertions => 1, + deletions => 1, + swap_ins => 1, + flank_size_weak => 1, + consider_context => 1 +); -##options for diploid model -$options_hash{depth}=10; -$options_hash{depth_del}=10; -$options_hash{call_threshold}=60; -$options_hash{del_threshold}=60; +##data specific options have to set by parsing parameter file +##subdel +read_params($Parameter_file, \%options_hash, 4.5, 'Hap', 'Subdel'); +##insertions +read_params($Parameter_file, \%options_hash, 5.4, 'Hap', 'Ins'); +#for my $pos (sort{$a<=>$b}keys %options_hash) { +# print "$pos :".$options_hash{$pos}."\n"; +#} + -$options_hash{flank_left}=1; -$options_hash{flank_right}=1; - -$options_hash{allowed_n_in_flank}=2; -##ins options -$options_hash{depth_ins}=1; -$options_hash{ins_threshold}=60; -$options_hash{flank_left_ins}=2; -$options_hash{flank_right_ins}=2; -$options_hash{allowed_n_in_flank_ins}=2; - -##options for haploid model #1 -$options_hash{depth}=1; -$options_hash{depth_del}=1; - -$options_hash{call_threshold}=90; -$options_hash{del_threshold}=90; - -$options_hash{flank_left}=5; -$options_hash{flank_right}=5; - -$options_hash{allowed_n_in_flank}=0; - -##hap model 2 -$options_hash{depth}=10; -$options_hash{depth_del}=10; - -$options_hash{call_threshold}=90; -$options_hash{del_threshold}=90; - -$options_hash{flank_left}=1; -$options_hash{flank_right}=1; - -$options_hash{allowed_n_in_flank}=1; - my $ind_id=""; my $ind_id_old=""; my $workbook = Spreadsheet::WriteExcel->new($xls_filename); my $j=1; my $aln = Bio::SimpleAlign->new(); while ( (my $seq = $in->next_seq())) { - my $locseq; + my ($locseq, $locseq_alt); + my $seq_alt=$in_alt->next_seq(); my $test_complete_seq=($seq->id =~ /human_mtDNA_RCRS/); if ($test_complete_seq==1) { $ind_id=$seq->id; @@ -149,8 +157,15 @@ if (!$test_complete_seq) { $locseq=$myReseqChip->insert_gaps2frag($seq); $aln->add_seq($locseq); - } - else { + + ##alternative basecalls + $locseq_alt=$myReseqChip->insert_gaps2frag($seq_alt); + $options_hash{alternative_sequence_hash}->{$locseq_alt->id}=$locseq_alt; + #my $start=$options_hash{alternative_sequence_hash}->{$locseq_alt->id}->start; + #my $end=$options_hash{alternative_sequence_hash}->{$locseq_alt->id}->end; + #print($options_hash{alternative_sequence_hash}->{$locseq_alt->id}->subseq(1, 12)."\n") + + } else { if ($aln->length>0) { process_sample($myReseqChip, $aln, $ind_id_old, \%options_hash, $newseq_output_filename, $recalls_output_filename, $workbook); @@ -160,6 +175,10 @@ $aln = new Bio::SimpleAlign(); $locseq=$myReseqChip->insert_gaps2reference_sequence($seq); $aln->add_seq($locseq); + ##alternative basecalls + $locseq_alt=$myReseqChip->insert_gaps2reference_sequence($seq_alt); + $options_hash{alternative_sequence_hash}->{$locseq_alt->id}=$locseq_alt; + $j++; } } @@ -168,3 +187,6 @@ $newseq_output_filename, $recalls_output_filename, $workbook); $workbook->close(); +##test if xls file has expected size +#print((-s $xls_filename)."xlsfile size\n"); +ok((-s $xls_filename)==1144832, 'write_alignment2xls'); \ No newline at end of file From cjfields at dev.open-bio.org Fri Apr 17 11:08:37 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Fri, 17 Apr 2009 11:08:37 -0400 Subject: [Bioperl-guts-l] [15649] bioperl-live/trunk/t/RemoteDB: * split up and clean DB.t tests ( easier to debug) Message-ID: <200904171508.n3HF8bC9005496@dev.open-bio.org> Revision: 15649 Author: cjfields Date: 2009-04-17 11:08:36 -0400 (Fri, 17 Apr 2009) Log Message: ----------- * split up and clean DB.t tests (easier to debug) * move Query::GenBank-related tests to a separate directory --this line, and those below, will be ignored-- AM RemoteDB/SwissProt.t D RemoteDB/DB.t AM RemoteDB/MeSH.t AM RemoteDB/GenPept.t A RemoteDB/Query AM RemoteDB/Query/GenBank.t AM RemoteDB/GenBank.t AM RemoteDB/EntrezGene.t Added Paths: ----------- bioperl-live/trunk/t/RemoteDB/EntrezGene.t bioperl-live/trunk/t/RemoteDB/GenBank.t bioperl-live/trunk/t/RemoteDB/GenPept.t bioperl-live/trunk/t/RemoteDB/MeSH.t bioperl-live/trunk/t/RemoteDB/Query/ bioperl-live/trunk/t/RemoteDB/Query/GenBank.t bioperl-live/trunk/t/RemoteDB/SwissProt.t Removed Paths: ------------- bioperl-live/trunk/t/RemoteDB/DB.t Deleted: bioperl-live/trunk/t/RemoteDB/DB.t =================================================================== --- bioperl-live/trunk/t/RemoteDB/DB.t 2009-04-17 15:08:34 UTC (rev 15648) +++ bioperl-live/trunk/t/RemoteDB/DB.t 2009-04-17 15:08:36 UTC (rev 15649) @@ -1,381 +0,0 @@ -# -*-Perl-*- Test Harness script for Bioperl -# $Id$ - -use strict; - -BEGIN { - use lib '.'; - use Bio::Root::Test; - - test_begin(-tests => 113, - -requires_modules => [qw(IO::String - LWP::UserAgent - HTTP::Request::Common)], - -requires_networking => 1); - - use_ok('Bio::DB::GenBank'); - use_ok('Bio::DB::GenPept'); - use_ok('Bio::DB::SwissProt'); - use_ok('Bio::DB::MeSH'); -} - -my %expected_lengths = ('NDP_MOUSE' => 131, - 'NDP_HUMAN' => 133, - 'MUSIGHBA1' => 408, - 'AF303112' => 1611, - 'J00522' => 408, - 'AF303112' => 1611, - 'AF303112.1' => 1611, - '2981014' => 1156, - 'AF041456' => 1156, - 'AY080910' => 798, - 'AY080909' => 1042, - 'AF155220' => 1172, - '405830' => 1743, - 'CELRABGDI' => 1743, - '195055' => 136, - 'AAD15290' => 136, - 'AAC06201' => 353, - 'P43780' => 103, - 'BOLA_HAEIN'=> 103, - 'YNB3_YEAST'=> 125, - 'O39869' => 56, - 'P18584' => 497, - 'DEGP_CHLTR'=> 497, - 'AF442768' => 2547, - 'P31383' => 635, - 'CH402638' => 5041); - -my ($gb, $seq, $seqio, $seqin, $query); - -# -# Bio::DB::GenBank -# -ok $gb = Bio::DB::GenBank->new('-delay'=>0), 'Bio::DB::GenBank'; - -# get a single seq -SKIP: { - eval {$seq = $gb->get_Seq_by_id('MUSIGHBA1');}; - skip "Couldn't connect to Genbank with Bio::DB::GenBank.pm. Do you have network access? Skipping GenBank tests", 4 if $@; - is $seq->length, $expected_lengths{$seq->display_id}; - eval {$seq = $gb->get_Seq_by_acc('AF303112');}; - skip "Couldn't connect to Genbank with Bio::DB::GenBank.pm. Transient network problems? Skipping GenBank tests", 3 if $@; - is $seq->length, $expected_lengths{$seq->display_id}; - eval {$seq = $gb->get_Seq_by_version('AF303112.1');}; - skip "Couldn't connect to Genbank with Bio::DB::GenBank.pm. Transient network problems? Skipping GenBank tests", 2 if $@; - is $seq->length, $expected_lengths{$seq->display_id}; - eval {$seq = $gb->get_Seq_by_gi('405830');}; - skip "Couldn't connect to Genbank with Bio::DB::GenBank.pm. Transient network problems? Skipping GenBank tests", 1 if $@; - is $seq->length, $expected_lengths{$seq->display_id}; -} - -$seq = $seqio = undef; - -# batch mode -SKIP: { - eval {$seqio = $gb->get_Stream_by_id([qw(J00522 AF303112 2981014)]);}; - skip "Batch access test failed for Genbank. Skipping those tests", 4 if $@; - my $done = 0; - while (my $s = $seqio->next_seq) { - is $s->length, $expected_lengths{$s->display_id}; - $done++; - } - skip('No seqs returned', 4) if !$done; - is $done, 3; -} - -$seq = $seqio = undef; - -# test the temporary file creation and fasta -ok $gb = Bio::DB::GenBank->new('-format' => 'fasta', '-retrievaltype' => 'tempfile', '-delay' => 0); -SKIP: { - eval {$seq = $gb->get_Seq_by_id('MUSIGHBA1');}; - skip "Couldn't connect to complete GenBank tests with a tempfile with Bio::DB::GenBank.pm. Skipping those tests", 6 if $@; - # last part of id holds the key - is $seq->length, $expected_lengths{(split(/\|/,$seq->display_id))[-1]}; - eval {$seq = $gb->get_Seq_by_acc('AF303112');}; - skip "Couldn't connect to complete GenBank tests with a tempfile with Bio::DB::GenBank.pm. Skipping those tests", 5 if $@; - # last part of id holds the key - is $seq->length, $expected_lengths{(split(/\|/,$seq->display_id))[-1]}; - # batch mode requires genbank format - $gb->request_format("gb"); - eval {$seqio = $gb->get_Stream_by_id([qw(J00522 AF303112 2981014)]);}; - skip "Couldn't connect to complete GenBank batch tests with a tempfile with Bio::DB::GenBank.pm. Skipping those tests", 4 if $@; - my $done = 0; - while (my $s = $seqio->next_seq) { - is $s->length, $expected_lengths{$s->display_id}; - undef $gb; # test the case where the db is gone, - # but a temp file should remain until seqio goes away. - $done++; - } - skip('No seqs returned', 4) if !$done; - is $done, 3; -} - -$seq = $seqio = undef; - -# test pipeline creation -ok $gb = Bio::DB::GenBank->new('-retrievaltype' => 'pipeline', '-delay' => 0); -SKIP: { - eval {$seq = $gb->get_Seq_by_id('MUSIGHBA1');}; - skip "Couldn't connect to complete GenBank tests with a pipeline with Bio::DB::GenBank.pm. Skipping those tests", 6 if $@; - is $seq->length, $expected_lengths{$seq->display_id}; - eval {$seq = $gb->get_Seq_by_acc('AF303112');}; - skip "Couldn't connect to complete GenBank tests with a pipeline with Bio::DB::GenBank.pm. Skipping those tests", 5 if $@; - is $seq->length, $expected_lengths{$seq->display_id}; - eval {$seqio = $gb->get_Stream_by_id([qw(J00522 AF303112 2981014)]);}; - skip "Couldn't connect to complete GenBank tests with a pipeline with Bio::DB::GenBank.pm. Skipping those tests", 4 if $@; - my $done = 0; - while (my $s = $seqio->next_seq) { - is $s->length, $expected_lengths{$s->display_id}; - undef $gb; # test the case where the db is gone, - # but the pipeline should remain until seqio goes away - $done++; - } - skip('No seqs returned', 4) if !$done; - is $done, 3; -} - -$seq = $seqio = undef; - -# test query facility -ok $query = Bio::DB::Query::GenBank->new('-db' => 'nucleotide', - '-query' => 'Onchocerca volvulus[Organism]', - '-mindate' => '2002/1/1', - '-maxdate' => '2002/12/31'), 'Bio::DB::Query::GenBank'; -SKIP: { - cmp_ok $query->count, '>', 0; - my @ids = $query->ids; - cmp_ok @ids, '>', 0; - is @ids, $query->count; - ok $gb = Bio::DB::GenBank->new('-delay' => 0); - eval {$seqio = $gb->get_Stream_by_query($query);}; - skip "Couldn't connect to complete GenBank query tests. Skipping those tests", 5 if $@; - my $done = 0; - while (my $s = $seqio->next_seq) { - is $s->length, $expected_lengths{$s->display_id}; - undef $gb; # test the case where the db is gone, - # but the pipeline should remain until seqio goes away - $done++; - } - skip('No seqs returned', 5) if !$done; - is $done, 4; -} - -$seq = $seqio = undef; - -# test query facility (again) -ok $query = Bio::DB::Query::GenBank->new('-db' => 'nucleotide', - '-ids' => [qw(J00522 AF303112 2981014)]); -SKIP: { - cmp_ok $query->count, '>', 0; - my @ids = $query->ids; - cmp_ok @ids, '>', 0; - is @ids, $query->count; - $gb = Bio::DB::GenBank->new('-delay' => 0); - eval {$seqio = $gb->get_Stream_by_query($query);}; - skip "Couldn't connect to complete GenBank query tests. Skipping those tests: $@", 4 if $@; - my $done = 0; - while (my $s = $seqio->next_seq) { - is $s->length, $expected_lengths{$s->display_id}; - $done++; - } - skip('No seqs returned', 4) if !$done; - is $done, 3; - $seqio->close(); # the key to preventing errors during make test, no idea why -} - -$seq = $seqio = undef; - -# and yet again, for bug 2133 -$query = Bio::DB::Query::GenBank->new('-query' => 'AF303112', - '-ids' => [qw(J00522 AF303112 2981014)]); -is $query->query, 'J00522[PACC]|AF303112[PACC]|2981014[UID]'; - -# test contig retrieval -ok $gb = Bio::DB::GenBank->new('-delay' => 0, '-format' => 'gbwithparts'); -SKIP: { - eval {$seq = $gb->get_Seq_by_id('CH402638');}; - skip "Couldn't connect to GenBank with Bio::DB::GenBank.pm. Skipping those tests", 3 if $@; - is $seq->length, $expected_lengths{$seq->display_id}; - # now to check that postprocess_data in NCBIHelper catches CONTIG... - ok $gb = Bio::DB::GenBank->new('-delay' => 0, '-format' => 'gb'); - eval {$seq = $gb->get_Seq_by_id('CH402638');}; - skip "Couldn't connect to GenBank with Bio::DB::GenBank.pm. Skipping those tests", 1 if $@; - is $seq->length, $expected_lengths{$seq->display_id}; -} - -$seq = $seqio = undef; - -# bug 1405 -my @result; -ok $gb = Bio::DB::GenBank->new(-format => 'Fasta', -seq_start => 2, -seq_stop => 7); -SKIP: { - eval {$seq = $gb->get_Seq_by_acc("A11111");}; - skip "Couldn't connect to complete GenBank tests. Skipping those tests", 15 if $@; - is $seq->length, 6; - # complexity tests - ok $gb = Bio::DB::GenBank->new(-format => 'Fasta', -complexity => 0); - eval {$seqin = $gb->get_Stream_by_acc("5");}; - skip "Couldn't connect to complete GenBank tests. Skipping those tests", 13 if $@; - @result = (1136, 'dna', 342, 'protein'); - while ($seq = $seqin->next_seq) { - is $seq->length, shift(@result); - is $seq->alphabet, shift(@result); - } - is @result, 0; - # Real batch retrieval using epost/efetch - # these tests may change if integrated further into Bio::DB::Gen* - # Currently only useful for retrieving GI's via get_seq_stream - $gb = Bio::DB::GenBank->new(); - eval {$seqin = $gb->get_seq_stream(-uids => [4887706 ,431229, 147460], -mode => 'batch');}; - skip "Couldn't connect to complete GenBank batchmode epost/efetch tests. Skipping those tests", 8 if $@; - my %result = ('M59757' => 12611 ,'X76083'=> 3140, 'J01670'=> 1593); - my $ct = 0; - while ($seq = $seqin->next_seq) { - $ct++; - my $acc = $seq->accession; - ok exists $result{ $acc }; - is $seq->length, $result{ $acc }; - delete $result{$acc}; - } @@ Diff output truncated at 10000 characters. @@ From cjfields at dev.open-bio.org Fri Apr 17 11:14:32 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Fri, 17 Apr 2009 11:14:32 -0400 Subject: [Bioperl-guts-l] [15650] bioperl-live/trunk/t/RemoteDB/GenBank.t: remove extraneous text Message-ID: <200904171514.n3HFEWwb005527@dev.open-bio.org> Revision: 15650 Author: cjfields Date: 2009-04-17 11:14:32 -0400 (Fri, 17 Apr 2009) Log Message: ----------- remove extraneous text Modified Paths: -------------- bioperl-live/trunk/t/RemoteDB/GenBank.t Modified: bioperl-live/trunk/t/RemoteDB/GenBank.t =================================================================== --- bioperl-live/trunk/t/RemoteDB/GenBank.t 2009-04-17 15:08:36 UTC (rev 15649) +++ bioperl-live/trunk/t/RemoteDB/GenBank.t 2009-04-17 15:14:32 UTC (rev 15650) @@ -16,31 +16,6 @@ use_ok('Bio::DB::GenBank'); } -#ok 3 - MUSIGHBA1 -#ok 4 - AF303112 -#ok 5 - AF303112 -#ok 6 - CELRABGDI -# -#ok 7 - MUSIGHBA1 -# -#ok 8 - AF303112 -#ok 9 - AF041456 -#ok 10 -#ok 11 -#ok 12 - gi|195052|gb|J00522.1|MUSIGHBA1 -#ok 13 - gi|11127914|gb|AF303112.1| -#ok 14 -#ok 15 -#ok 16 -#ok 17 -#ok 18 -#ok 19 - MUSIGHBA1 -#ok 20 - AF303112 -#ok 21 - AF303112 -#ok 22 - AF303112 -#ok 23 - AF303112 - - my %expected_lengths = ( 'MUSIGHBA1' => 408, 'AF303112' => 1611, From bugzilla-daemon at portal.open-bio.org Fri Apr 17 11:15:30 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 17 Apr 2009 11:15:30 -0400 Subject: [Bioperl-guts-l] [Bug 2764] enhance SwissProt retreival by id using IDTracker In-Reply-To: Message-ID: <200904171515.n3HFFUZs005675@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2764 ------- Comment #3 from cjfields at bioperl.org 2009-04-17 11:15 EST ------- Tests added to t/RemoteDB/SwissProt.t for idtracker() -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From thm09830 at dev.open-bio.org Fri Apr 17 11:20:05 2009 From: thm09830 at dev.open-bio.org (Marian Thieme) Date: Fri, 17 Apr 2009 11:20:05 -0400 Subject: [Bioperl-guts-l] [15651] bioperl-live/trunk/t/data/ReseqChip_ParamsNcall.csv: CSV files with Parameters covering different levels of ncalls Message-ID: <200904171520.n3HFK5wx005598@dev.open-bio.org> Revision: 15651 Author: thm09830 Date: 2009-04-17 11:20:04 -0400 (Fri, 17 Apr 2009) Log Message: ----------- CSV files with Parameters covering different levels of ncalls Added Paths: ----------- bioperl-live/trunk/t/data/ReseqChip_ParamsNcall.csv Added: bioperl-live/trunk/t/data/ReseqChip_ParamsNcall.csv =================================================================== --- bioperl-live/trunk/t/data/ReseqChip_ParamsNcall.csv (rev 0) +++ bioperl-live/trunk/t/data/ReseqChip_ParamsNcall.csv 2009-04-17 15:20:04 UTC (rev 15651) @@ -0,0 +1,166 @@ +3_Hap_Subdel_0_3_7_7_5_30 0.1 +3_Hap_Subdel_0_1_3_3_0_50 0.2 +3_Hap_Subdel_0_1_4_4_0_50 0.3 +3_Hap_Subdel_0_2_4_4_0_50 0.5 +3_Hap_Subdel_0_3_3_3_0_50 0.6 +3_Hap_Subdel_0_3_4_4_0_50 0.7 +3_Hap_Subdel_0_1_4_4_0_60 0.8 +3_Hap_Subdel_0_10_0_0_0_40 0.9 +3_Hap_Subdel_0_10_5_5_5_50 1 +3_Hap_Subdel_0_10_6_6_5_50 1.1 +3_Hap_Subdel_0_10_7_7_5_30 1.2 +3_Hap_Subdel_0_10_5_5_3_50 1.3 +3_Hap_Subdel_0_10_10_10_5_50 1.4 +3_Hap_Subdel_0_10_10_10_4_50 1.5 +3_Hap_Subdel_0_10_11_11_4_50 1.6 +3_Hap_Subdel_0_10_10_10_3_30 1.7 +3_Hap_Subdel_0_10_0_0_0_60 1.8 +3_Hap_Subdel_0_10_7_7_5_60 2 +3_Hap_Subdel_0_10_0_0_0_70 2.1 +3_Hap_Subdel_0_10_7_7_5_70 2.2 +6_Hap_Subdel_0_1_4_4_2_80 2.3 +3_Hap_Subdel_0_2_7_7_2_80 2.4 +3_Hap_Subdel_0_2_11_11_2_80 2.5 +3_Hap_Subdel_0_2_10_10_2_80 2.6 +3_Hap_Subdel_0_3_10_10_2_80 2.7 +3_Hap_Subdel_0_4_10_10_2_80 2.8 +3_Hap_Subdel_0_7_10_10_2_80 3.3 +3_Hap_Subdel_0_1_4_4_1_90 3.4 +3_Hap_Subdel_0_2_4_4_1_90 3.5 +3_Hap_Subdel_0_3_4_4_1_90 3.6 +3_Hap_Subdel_0_2_11_11_2_90 3.7 +3_Hap_Subdel_0_3_11_11_2_90 3.8 +3_Hap_Subdel_0_3_10_10_3_90 3.9 +3_Hap_Subdel_0_4_10_10_3_90 4 +3_Hap_Subdel_0_10_9_9_3_90 4.5 +3_Hap_Subdel_0_10_10_10_3_90 4.6 +3_Hap_Subdel_0_3_4_4_1_100 8.4 +3_Hap_Subdel_0_3_11_11_3_100 8.8 +3_Hap_Subdel_0_3_10_10_3_100 8.9 +3_Hap_Subdel_0_4_10_10_3_100 9 +3_Hap_Subdel_0_10_4_4_1_100 9.3 +3_Hap_Subdel_0_10_11_11_3_100 9.6 +3_Hap_Subdel_0_10_10_10_3_100 9.7 +3_Hap_Subdel_0_3_3_3_2_100 11.2 +3_Hap_Subdel_0_10_3_3_2_100 11.7 +3_Hap_Subdel_0_10_2_2_2_100 13.5 +3_Hap_Subdel_0_10_1_1_1_100 14.3 +3_Hap_Subdel_0_10_0_0_0_100 16.9 +3_Hap_Ins_0_1_4_4_1_30 0.1 +3_Hap_Ins_0_10_4_4_1_30 0.5 +3_Hap_Ins_0_1_6_6_1_60 1.7 +3_Hap_Ins_0_1_5_5_1_60 1.9 +3_Hap_Ins_0_4_5_5_1_60 2.1 +3_Hap_Ins_0_10_5_5_1_60 2.7 +3_Hap_Ins_0_10_7_7_3_70 3.9 +3_Hap_Ins_0_10_10_10_5_70 4 +3_Hap_Ins_0_10_3_3_2_70 4.2 +3_Hap_Ins_0_10_6_6_4_70 4.5 +3_Hap_Ins_0_10_4_4_3_70 4.7 +3_Hap_Ins_0_1_6_6_2_70 5.1 +3_Hap_Ins_0_10_6_6_2_70 5.2 +3_Hap_Ins_0_10_4_4_1_70 5.4 +3_Hap_Ins_0_10_5_5_2_70 5.5 +3_Hap_Ins_0_10_10_10_5_80 10.6 +3_Hap_Ins_0_10_2_2_2_80 11.2 +3_Hap_Ins_0_10_0_0_0_90 21 +3_Hap_Ins_0_10_5_5_2_90 23.6 +3_Hap_Ins_0_10_3_3_1_100 50.5 +3_Hap_Ins_0_10_12_12_4_100 52.7 +9_Hap_Ins_0_10_3_3_5_100 52.9 +9_Hap_Ins_0_10_0_0_0_100 55.6 +3_Dip_Subdel_0_1_2_2_3_40 0.1 +3_Dip_Subdel_0_2_3_3_5_40 0.2 +3_Dip_Subdel_0_3_3_3_5_40 0.3 +3_Dip_Subdel_0_1_2_2_1_40 0.4 +3_Dip_Subdel_0_1_3_3_5_50 0.5 +3_Dip_Subdel_0_1_3_3_2_50 0.6 +3_Dip_Subdel_0_3_4_4_5_50 0.7 +3_Dip_Subdel_0_3_1_1_1_50 0.8 +3_Dip_Subdel_0_4_4_4_5_50 0.9 +3_Dip_Subdel_0_3_4_4_3_50 1.1 +3_Dip_Subdel_0_3_3_3_2_50 1.2 +3_Dip_Subdel_0_1_6_6_5_60 1.3 +3_Dip_Subdel_0_1_3_3_4_60 1.4 +3_Dip_Subdel_0_2_4_4_4_60 1.5 +3_Dip_Subdel_0_3_3_3_5_60 1.6 +3_Dip_Subdel_0_3_1_1_1_60 1.7 +3_Dip_Subdel_0_4_3_3_4_60 1.8 +3_Dip_Subdel_0_1_5_5_5_70 1.9 +3_Dip_Subdel_0_2_6_6_5_70 2 +3_Dip_Subdel_0_3_2_2_3_70 2.2 +3_Dip_Subdel_0_4_2_2_3_70 2.3 +3_Dip_Subdel_0_10_0_0_0_40 2.5 +3_Dip_Subdel_0_10_0_0_0_50 2.7 +3_Dip_Subdel_0_10_3_3_5_50 2.8 +3_Dip_Subdel_0_10_2_2_3_50 2.9 +3_Dip_Subdel_0_1_3_3_0_70 3 +3_Dip_Subdel_0_1_4_4_2_80 3.4 +3_Dip_Subdel_0_1_4_4_3_80 3.5 +3_Dip_Subdel_0_2_6_6_5_80 3.7 +3_Dip_Subdel_0_2_4_4_3_80 3.8 +3_Dip_Subdel_0_3_5_5_4_80 3.9 +3_Dip_Subdel_0_3_5_5_5_80 4 +3_Dip_Subdel_0_4_5_5_5_80 4.1 +3_Dip_Subdel_0_4_4_4_5_80 4.2 +3_Dip_Subdel_0_10_4_4_5_80 5.5 +3_Dip_Subdel_0_11_4_4_5_80 5.7 +3_Dip_Subdel_0_2_2_2_1_90 6.4 +3_Dip_Subdel_0_3_4_4_3_90 6.5 +3_Dip_Subdel_0_3_6_6_5_90 6.6 +3_Dip_Subdel_0_3_2_2_1_90 6.7 +3_Dip_Subdel_0_3_4_4_4_90 6.8 +3_Dip_Subdel_0_3_2_2_2_90 7 +3_Dip_Subdel_0_4_2_2_2_90 7.1 +3_Dip_Subdel_0_3_3_3_4_90 7.2 +3_Dip_Subdel_0_4_3_3_4_90 7.4 +3_Dip_Subdel_0_4_2_2_3_90 7.5 +3_Dip_Subdel_0_10_3_3_4_90 8.3 +3_Dip_Subdel_0_11_3_3_4_90 8.5 +3_Dip_Subdel_0_11_2_2_3_90 8.7 +6_Dip_Subdel_0_8_4_4_5_90 10.4 +6_Dip_Subdel_0_10_4_4_5_90 11.2 +6_Dip_Subdel_0_11_4_4_5_90 11.5 +3_Dip_Subdel_0_5_6_6_4_100 14.2 +3_Dip_Subdel_0_8_6_6_4_100 14.9 +3_Dip_Subdel_0_9_6_6_4_100 15.2 +3_Dip_Subdel_0_5_6_6_5_100 15.4 +3_Dip_Subdel_0_10_6_6_4_100 15.5 +6_Dip_Subdel_0_7_1_1_1_100 16.1 +6_Dip_Subdel_0_8_4_4_5_100 16.3 +3_Dip_Subdel_0_4_2_2_2_100 16.6 +6_Dip_Subdel_0_11_4_4_5_100 17.4 +3_Dip_Subdel_0_4_3_3_4_100 18 +3_Dip_Subdel_0_3_2_2_3_100 18.6 +3_Dip_Subdel_0_4_2_2_3_100 18.8 +3_Dip_Subdel_0_10_2_2_3_100 19.7 +3_Dip_Subdel_0_3_0_0_0_100 20 +3_Dip_Subdel_0_4_0_0_0_100 20.1 +3_Dip_Subdel_0_10_0_0_0_100 20.9 +3_Dip_Ins_0_1_5_5_3_30 0.1 +3_Dip_Ins_0_1_4_4_2_60 0.6 +3_Dip_Ins_0_1_6_6_2_60 0.7 +3_Dip_Ins_0_1_5_5_2_60 0.8 +3_Dip_Ins_0_1_6_6_3_60 0.9 +3_Dip_Ins_0_10_6_6_3_60 1.7 +3_Dip_Ins_0_10_6_6_2_70 4 +3_Dip_Ins_0_1_3_3_1_70 4.8 +3_Dip_Ins_0_1_6_6_3_70 5.2 +3_Dip_Ins_0_10_6_6_3_70 5.6 +3_Dip_Ins_0_1_5_5_4_70 6.5 +3_Dip_Ins_0_10_5_5_4_70 6.6 +3_Dip_Ins_0_10_3_3_1_80 6.9 +3_Dip_Ins_0_10_3_3_3_80 7.2 +3_Dip_Ins_0_10_4_4_4_80 7.5 +3_Dip_Ins_0_10_2_2_1_90 12.3 +3_Dip_Ins_0_10_3_3_2_90 12.4 +3_Dip_Ins_0_10_2_2_2_90 12.8 +3_Dip_Ins_0_10_4_4_4_90 13.6 +3_Dip_Ins_0_10_2_2_3_90 13.7 +3_Dip_Ins_0_10_3_3_5_90 13.9 +3_Dip_Ins_0_10_0_0_0_90 14 +3_Dip_Ins_0_10_3_3_3_100 46.2 +3_Dip_Ins_0_10_2_2_2_100 48.4 +3_Dip_Ins_0_10_1_1_1_100 50.2 +3_Dip_Ins_0_10_3_3_5_100 53.2 +3_Dip_Ins_0_10_0_0_0_100 54.9 From thm09830 at dev.open-bio.org Fri Apr 17 13:23:16 2009 From: thm09830 at dev.open-bio.org (Marian Thieme) Date: Fri, 17 Apr 2009 13:23:16 -0400 Subject: [Bioperl-guts-l] [15652] bioperl-live/trunk/t/Microarray/Tools/ReseqChip.t: bug fixes Message-ID: <200904171723.n3HHNGat006127@dev.open-bio.org> Revision: 15652 Author: thm09830 Date: 2009-04-17 13:23:16 -0400 (Fri, 17 Apr 2009) Log Message: ----------- bug fixes Modified Paths: -------------- bioperl-live/trunk/t/Microarray/Tools/ReseqChip.t Modified: bioperl-live/trunk/t/Microarray/Tools/ReseqChip.t =================================================================== --- bioperl-live/trunk/t/Microarray/Tools/ReseqChip.t 2009-04-17 15:20:04 UTC (rev 15651) +++ bioperl-live/trunk/t/Microarray/Tools/ReseqChip.t 2009-04-17 17:23:16 UTC (rev 15652) @@ -31,14 +31,11 @@ while () { if (/$ploidy/) { if (/$mut_type/) { - $test=1; @params = split(' ', $_); @p_array=split('_', $params[0]); $cur_ncall=$params[1]; -# print("Jo da wernma hier: ".$_." Crit: $ncall_threshold, $old_ncall, $cur_ncall\n"); - if ($ncall_threshold<$cur_ncall and $ncall_threshold>=$old_ncall) { - + if (($ncall_threshold<$cur_ncall and $ncall_threshold>=$old_ncall)) { last; } $old_ncall=$cur_ncall; @@ -46,6 +43,7 @@ } } close(ROCFILE); + #print("the ncalllevel: $cur_ncall\n"); ok($test, 'read_params'); if ($mut_type eq 'Subdel') { @@ -131,10 +129,16 @@ ##data specific options have to set by parsing parameter file + ##subdel -read_params($Parameter_file, \%options_hash, 4.5, 'Hap', 'Subdel'); +my $subdel_ncalls=4.5; #specify value between 0 and 16.9 for Hap + # and 0 and 55.6 for Dip model, respectively +read_params($Parameter_file, \%options_hash, $subdel_ncalls, 'Hap', 'Subdel'); + ##insertions -read_params($Parameter_file, \%options_hash, 5.4, 'Hap', 'Ins'); +my $ins_ncalls=5.4; #specify value between 0 and 20.9 + # and 0.1 and 54.9 +read_params($Parameter_file, \%options_hash, $ins_ncalls, 'Hap', 'Ins'); #for my $pos (sort{$a<=>$b}keys %options_hash) { # print "$pos :".$options_hash{$pos}."\n"; #} @@ -161,9 +165,6 @@ ##alternative basecalls $locseq_alt=$myReseqChip->insert_gaps2frag($seq_alt); $options_hash{alternative_sequence_hash}->{$locseq_alt->id}=$locseq_alt; - #my $start=$options_hash{alternative_sequence_hash}->{$locseq_alt->id}->start; - #my $end=$options_hash{alternative_sequence_hash}->{$locseq_alt->id}->end; - #print($options_hash{alternative_sequence_hash}->{$locseq_alt->id}->subseq(1, 12)."\n") } else { if ($aln->length>0) { @@ -175,7 +176,8 @@ $aln = new Bio::SimpleAlign(); $locseq=$myReseqChip->insert_gaps2reference_sequence($seq); $aln->add_seq($locseq); - ##alternative basecalls + + ##alternative primary basecalls for insertions $locseq_alt=$myReseqChip->insert_gaps2reference_sequence($seq_alt); $options_hash{alternative_sequence_hash}->{$locseq_alt->id}=$locseq_alt; @@ -189,4 +191,4 @@ $workbook->close(); ##test if xls file has expected size #print((-s $xls_filename)."xlsfile size\n"); -ok((-s $xls_filename)==1144832, 'write_alignment2xls'); \ No newline at end of file +ok((-s $xls_filename)==1144832, 'write_alignment2xls'); From thm09830 at dev.open-bio.org Fri Apr 17 13:23:40 2009 From: thm09830 at dev.open-bio.org (Marian Thieme) Date: Fri, 17 Apr 2009 13:23:40 -0400 Subject: [Bioperl-guts-l] [15653] bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm: bug fixes Message-ID: <200904171723.n3HHNeNC006158@dev.open-bio.org> Revision: 15653 Author: thm09830 Date: 2009-04-17 13:23:40 -0400 (Fri, 17 Apr 2009) Log Message: ----------- bug fixes Modified Paths: -------------- bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm Modified: bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm =================================================================== --- bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm 2009-04-17 17:23:16 UTC (rev 15652) +++ bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm 2009-04-17 17:23:40 UTC (rev 15653) @@ -23,7 +23,7 @@ #... and design file that describes the addional probes, grouped by consecutive probes covering 20-100 consecutive positions referred to the reference sequence my $Affy_frags_design_filename=...; #format of the design file - $format="xls"; + $format='affy_mitochip_v2'; # positions that are missing with respect to the reference sequence (rCRS - cambridge reference sequence) are going to be marked, so numbering with respect to the rCRS is conform my %ref_seq_max_ins_hash=(3106 => 1); @@ -626,13 +626,7 @@ return $startpos; } } - #if ($startpos < $i and $endpos > $i) { - # $test=1; - # last; - #} -# } else if ($i$self->{_oligos2calc_hash}{$startpos}) { -# -# } + $startpos_old=$startpos; $endpos_old=$endpos; if ($i>$endpos_old) { @@ -640,10 +634,8 @@ } } if ($lastone==0) { - #print("return lastone=0\n"); return 0; } - #print("return i: $i\n"); return $i; } @@ -882,29 +874,20 @@ } if ($i>$stop_c and $stop_c>0) { $stop=1; - if ($options_hash->{alternative_sequence_hash}) { - $final_seq.=$alt_seq->subseq($i,$seq->length()); - } else { - $final_seq.=$seq->subseq($i,$seq->length()); - } + $final_seq.=$seq->subseq($i,$seq->length()); last; } if ($i_neu != $i) { - if ($options_hash->{alternative_sequence_hash}) { - $final_seq.=$alt_seq->subseq($i,$i_neu-1); - } else { - $final_seq.=$seq->subseq($i,$i_neu-1); - } + $final_seq.=$seq->subseq($i,$i_neu-1); } $i=$i_neu; } ##add base to basearray if it fullfill the criteria - ##differ between alternative and - #if ($ref_base eq "-" and $options_hash->{alternative_sequence_hash}) { - if ($options_hash->{alternative_sequence_hash}) { + ##differ between alternative base calls (in case of an insertion) + if ($ref_base eq "-" and $options_hash->{alternative_sequence_hash}) { ($not_only_ref, $count, $output_rawrow_tmp)=$self->_augment_base_array($alt_seq, $ref_base, \@base_array, $not_only_ref, $count, $i, $offset, $seq_no, $options_hash, $filename_rawrow, $output_rawrow_tmp); - ##and "normal" sequence + ##and "normal" base calls for the alternative probes } else { ($not_only_ref, $count, $output_rawrow_tmp)=$self->_augment_base_array($seq, $ref_base, \@base_array, $not_only_ref, $count, $i, $offset, $seq_no, $options_hash, $filename_rawrow, $output_rawrow_tmp); } @@ -912,14 +895,12 @@ #remove no more needed sequences if (($i) >= ($seq->end()+$offset+2+$self->{_oligo_flank_length})) { $aln->remove_seq($seq); - #$aln->sort_alphabetically; } #finish iteration, if startpos of current fragment/sequence is outside of current position in the alignment if ($i<$seq->start()+$offset) { last; } $seq_no++; - #print "\n"; } if ($stop) { From thm09830 at dev.open-bio.org Fri Apr 17 13:24:49 2009 From: thm09830 at dev.open-bio.org (Marian Thieme) Date: Fri, 17 Apr 2009 13:24:49 -0400 Subject: [Bioperl-guts-l] [15654] bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm: remove comments Message-ID: <200904171724.n3HHOnGC006189@dev.open-bio.org> Revision: 15654 Author: thm09830 Date: 2009-04-17 13:24:49 -0400 (Fri, 17 Apr 2009) Log Message: ----------- remove comments Modified Paths: -------------- bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm Modified: bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm =================================================================== --- bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm 2009-04-17 17:23:40 UTC (rev 15653) +++ bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm 2009-04-17 17:24:49 UTC (rev 15654) @@ -906,7 +906,7 @@ if ($stop) { last; } - #at least one nonref base is available + #at least one nonref base is available if ($not_only_ref) { $output_rawrow.=$output_rawrow_tmp; ($final_seq, $output_rawrow)=$self->_get_consensus_call($ref_base, \@base_array, $count, $final_seq, $options_hash, $filename_rawrow, $output_rawrow, $i); @@ -1012,7 +1012,6 @@ ###insertions if ($options_hash->{insertions}==1) { if ($vote ne "-" and $ref_base eq "-" and $options_hash->{depth_ins}<=$alignment_depth ) { - #print "insertion: @$base_array $ref_base => $vote\n"; $newbase=$vote; } } @@ -1025,7 +1024,6 @@ if ($filename_rawrow) { $output_rawrow.= "\t$ref_base vs $vote => $newbase"; } - #} elsif ($vote eq "n") { } else { if ($options_hash->{call_n}) { $newbase=$vote; @@ -1043,16 +1041,14 @@ $final_seq=substr($final_seq,0,0-$swap).$newbase.substr($final_seq,0-$swap); } else { - #print($newbase." $i \n"); $final_seq.=$newbase; } } else { - #print - if ($options_hash->{call_n}) { - $final_seq.="n"; - } else { - $final_seq.=$ref_base; - } + if ($options_hash->{call_n}) { + $final_seq.="n"; + } else { + $final_seq.=$ref_base; + } } } else { From bugzilla-daemon at portal.open-bio.org Mon Apr 20 14:08:02 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 20 Apr 2009 14:08:02 -0400 Subject: [Bioperl-guts-l] [Bug 2816] New: Bio::SeqIO::entrezgene no longer contains Features Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2816 Summary: Bio::SeqIO::entrezgene no longer contains Features Product: BioPerl Version: main-trunk Platform: All OS/Version: All Status: NEW Severity: normal Priority: P2 Component: Bio::SeqIO AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: cjfields at bioperl.org Just noticed per a discussion on #bioperl that Bio::DB::EntrezGene (and by extension, Bio::SeqIO::EntrezGene) is returning sequences that lack features, including exons. This is probably due to some changes with mapping but needs to be delved into further. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Apr 20 14:11:26 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 20 Apr 2009 14:11:26 -0400 Subject: [Bioperl-guts-l] [Bug 2816] Bio::SeqIO::entrezgene no longer contains Features In-Reply-To: Message-ID: <200904201811.n3KIBQTL007248@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2816 ------- Comment #1 from cjfields at bioperl.org 2009-04-20 14:11 EST ------- s/Bio::SeqIO::EntrezGene/Bio::SeqIO::entrezgene/ -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Mon Apr 20 16:23:12 2009 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Mon, 20 Apr 2009 16:23:12 -0400 Subject: [Bioperl-guts-l] [15655] bioperl-live/trunk/Bio/SeqIO/entrezgene.pm: perltidy Message-ID: <200904202023.n3KKNChm020464@dev.open-bio.org> Revision: 15655 Author: cjfields Date: 2009-04-20 16:23:10 -0400 (Mon, 20 Apr 2009) Log Message: ----------- perltidy Modified Paths: -------------- bioperl-live/trunk/Bio/SeqIO/entrezgene.pm Modified: bioperl-live/trunk/Bio/SeqIO/entrezgene.pm =================================================================== --- bioperl-live/trunk/Bio/SeqIO/entrezgene.pm 2009-04-17 17:24:49 UTC (rev 15654) +++ bioperl-live/trunk/Bio/SeqIO/entrezgene.pm 2009-04-20 20:23:10 UTC (rev 15655) @@ -132,235 +132,280 @@ use Bio::SeqFeature::Gene::Transcript; use Bio::SeqFeature::Gene::GeneStructure; use Bio::Cluster::SequenceFamily; + #use Bio::Ontology::Ontology; Relationships.... later use Bio::Ontology::Term; use Bio::Annotation::OntologyTerm; use Data::Dumper; use base qw(Bio::SeqIO); -%main::eg_to_ll =('Official Full Name' => 'OFFICIAL_GENE_NAME', - 'chromosome' => 'CHR', - 'cyto' => 'MAP', - 'Official Symbol' => 'OFFICIAL_SYMBOL'); +%main::eg_to_ll = ( + 'Official Full Name' => 'OFFICIAL_GENE_NAME', + 'chromosome' => 'CHR', + 'cyto' => 'MAP', + 'Official Symbol' => 'OFFICIAL_SYMBOL' +); @main::egonly = keys %main::eg_to_ll; -# We define $xval and some other variables so we don't have + +# We define $xval and some other variables so we don't have # to pass them as arguments -my ($seq,$ann,$xval,%seqcollection,$buf); +my ( $seq, $ann, $xval, %seqcollection, $buf ); sub _initialize { - my($self, at args) = @_; - $self->SUPER::_initialize(@args); - my %param = @args; - @param{ map { lc $_ } keys %param } = values %param; # lowercase keys - $self->{_debug} = $param{-debug} || 'off'; - $self->{_locuslink} = $param{-locuslink}||'no'; - $self->{_service_record} = $param{-service_record}||'no'; - $self->{_parser} = Bio::ASN1::EntrezGene->new(file=>$param{-file}); - #Instantiate the low level parser here (it is -file in Bioperl - #-should tell M.) - #$self->{_parser}->next_seq; #First empty record- bug in Bio::ASN::Parser + my ( $self, @args ) = @_; + $self->SUPER::_initialize(@args); + my %param = @args; + @param{ map { lc $_ } keys %param } = values %param; # lowercase keys + $self->{_debug} = $param{-debug} || 'off'; + $self->{_locuslink} = $param{-locuslink} || 'no'; + $self->{_service_record} = $param{-service_record} || 'no'; + $self->{_parser} = Bio::ASN1::EntrezGene->new( file => $param{-file} ); + + #Instantiate the low level parser here (it is -file in Bioperl + #-should tell M.) + #$self->{_parser}->next_seq; #First empty record- bug in Bio::ASN::Parser } +sub next_seq { + my $self = shift; + my $value = $self->{_parser}->next_seq(1); -sub next_seq { - my $self=shift; - my $value = $self->{_parser}->next_seq(1); - # $value contains data structure for the - # record being parsed. 2 indicates the recommended - # trimming mode of the data structure - #I use 1 as I prefer not to descend into size 0 arrays - return unless ($value); - my $debug=$self->{_debug}; - $self->{_ann} = Bio::Annotation::Collection->new(); + # $value contains data structure for the + # record being parsed. 2 indicates the recommended + # trimming mode of the data structure + #I use 1 as I prefer not to descend into size 0 arrays + return unless ($value); + my $debug = $self->{_debug}; + $self->{_ann} = Bio::Annotation::Collection->new(); $self->{_currentann} = Bio::Annotation::Collection->new(); my @alluncaptured; + # parse the entry #my @keys=keys %{$value}; obsolete - $xval=$value->[0]; - #return unless ($xval->{gene}->{desc} eq 'albumin'); - #return new Bio::Seq (-id=>'Generif service record', -seq=>'') unless ($xval->{'track-info'}{geneid}== 283); - return Bio::Seq->new(-id=>'Generif service record', -seq=>'') if (($self->{_service_record} ne 'yes')&& - ($xval->{gene}->{desc} =~ /record to support submission of generifs for a gene not in entrez/i)); + $xval = $value->[0]; + + #return unless ($xval->{gene}->{desc} eq 'albumin'); + #return new Bio::Seq (-id=>'Generif service record', -seq=>'') + # unless ($xval->{'track-info'}{geneid}== 283); + return Bio::Seq->new( -id => 'Generif service record', -seq => '' ) + if ( + ( $self->{_service_record} ne 'yes' ) + && ( $xval->{gene}->{desc} =~ + /record to support submission of generifs for a gene not in entrez/i + ) + ); + #Basic data - #$xval->{summary}=~s/\n//g; + #$xval->{summary}=~s/\n//g; my $seq = Bio::Seq->new( - -display_id => $xval->{gene}{locus}, - -accession_number =>$xval->{'track-info'}{geneid}, - -desc=>$xval->{summary} - ); + -display_id => $xval->{gene}{locus}, + -accession_number => $xval->{'track-info'}{geneid}, + -desc => $xval->{summary} + ); + #Source data here - $self->_add_to_ann($xval->{'track-info'}->{status},'Entrez Gene Status'); - my $lineage=$xval->{source}{org}{orgname}{lineage}; - $lineage=~s/[\s\n]//g; - my ($comp, at lineage); + $self->_add_to_ann( $xval->{'track-info'}->{status}, 'Entrez Gene Status' ); + my $lineage = $xval->{source}{org}{orgname}{lineage}; + $lineage =~ s/[\s\n]//g; + my ( $comp, @lineage ); while ($lineage) { - ($comp,$lineage)=split(/;/,$lineage,2); - unshift @lineage,$comp; + ( $comp, $lineage ) = split( /;/, $lineage, 2 ); + unshift @lineage, $comp; } - unless (exists($xval->{source}{org}{orgname}{name}{binomial})) { + unless ( exists( $xval->{source}{org}{orgname}{name}{binomial} ) ) { shift @lineage; - my ($gen,$sp)=split(/\s/, $xval->{source}{org}{taxname}); - if (($sp)&&($sp ne '')) { - if ($gen=~/plasmid/i) { - $sp=$gen.$sp; + my ( $gen, $sp ) = split( /\s/, $xval->{source}{org}{taxname} ); + if ( ($sp) && ( $sp ne '' ) ) { + if ( $gen =~ /plasmid/i ) { + $sp = $gen . $sp; } - unshift @lineage,$sp; + unshift @lineage, $sp; } else { - unshift @lineage,'unknown'; + unshift @lineage, 'unknown'; } } else { - my $sp=$xval->{source}{org}{orgname}{name}{binomial}{species}; - if (($sp)&&($sp ne '')) { - my ($spc,$strain)=split('sp.',$sp);#Do we need strain? - $spc=~s/\s//g; - if (($spc)&&($spc ne '')) { - unshift @lineage,$spc; + my $sp = $xval->{source}{org}{orgname}{name}{binomial}{species}; + if ( ($sp) && ( $sp ne '' ) ) { + my ( $spc, $strain ) = split( 'sp.', $sp ); #Do we need strain? + $spc =~ s/\s//g; + if ( ($spc) && ( $spc ne '' ) ) { + unshift @lineage, $spc; } else { - unshift @lineage,'unknown'; + unshift @lineage, 'unknown'; } } else { - unshift @lineage,'unknown'; + unshift @lineage, 'unknown'; } } + #print Dumper($xval->{source}{org}); my $ncbiid; - if (ref($xval->{source}{org}{db}) eq 'ARRAY') { - foreach my $taxonomy (@{$xval->{source}{org}{db}}) { - if (lc($taxonomy->{db}) eq 'taxon') { - $ncbiid=$taxonomy->{tag}{id}; - } - else { - push @alluncaptured,$taxonomy; - } - delete $xval->{source}{org}{db}; - } - } - $ncbiid= $ncbiid||$xval->{source}{org}{db}{tag}{id}; - my $s1=shift @lineage; - my $s2=shift @lineage; - my $specie=Bio::Species->new(-classification=>[$s1 , $s2], - -ncbi_taxid=>$ncbiid); - $specie->common_name($xval->{source}{org}{common}); - if (exists($xval->{source}->{subtype}) && ($xval->{source}->{subtype})) { - if (ref($xval->{source}->{subtype}) eq 'ARRAY') { - foreach my $subtype (@{$xval->{source}->{subtype}}) { - $self->_add_to_ann($subtype->{name},$subtype->{subtype}); + if ( ref( $xval->{source}{org}{db} ) eq 'ARRAY' ) { + foreach my $taxonomy ( @{ $xval->{source}{org}{db} } ) { + if ( lc( $taxonomy->{db} ) eq 'taxon' ) { + $ncbiid = $taxonomy->{tag}{id}; } + else { + push @alluncaptured, $taxonomy; + } + delete $xval->{source}{org}{db}; } + } + $ncbiid = $ncbiid || $xval->{source}{org}{db}{tag}{id}; + my $s1 = shift @lineage; + my $s2 = shift @lineage; + my $specie = Bio::Species->new( + -classification => [ $s1, $s2 ], + -ncbi_taxid => $ncbiid + ); + $specie->common_name( $xval->{source}{org}{common} ); + if ( exists( $xval->{source}->{subtype} ) + && ( $xval->{source}->{subtype} ) ) + { + if ( ref( $xval->{source}->{subtype} ) eq 'ARRAY' ) { + foreach my $subtype ( @{ $xval->{source}->{subtype} } ) { + $self->_add_to_ann( $subtype->{name}, $subtype->{subtype} ); + } + } else { - $self->_add_to_ann($xval->{source}->{subtype}->{name},$xval->{source}->{subtype}->{subtype}); + $self->_add_to_ann( + $xval->{source}->{subtype}->{name}, + $xval->{source}->{subtype}->{subtype} + ); } } + #Synonyms - if (ref($xval->{gene}->{syn}) eq 'ARRAY') { - foreach my $symsyn (@{$xval->{gene}->{syn}}) { - $self->_add_to_ann($symsyn,'ALIAS_SYMBOL'); + if ( ref( $xval->{gene}->{syn} ) eq 'ARRAY' ) { + foreach my $symsyn ( @{ $xval->{gene}->{syn} } ) { + $self->_add_to_ann( $symsyn, 'ALIAS_SYMBOL' ); } } else { - $self->_add_to_ann($xval->{gene}->{syn},'ALIAS_SYMBOL') if ($xval->{gene}->{syn}); + $self->_add_to_ann( $xval->{gene}->{syn}, 'ALIAS_SYMBOL' ) + if ( $xval->{gene}->{syn} ); } - + #COMMENTS (STS not dealt with yet) - if (exists($xval->{comments})) { - if (ref($xval->{comments}) eq 'ARRAY') { - for my $i (0..$#{$xval->{comments}}) { - $self->{_current}=$xval->{comments}->[$i]; - push @alluncaptured,$self->_process_all_comments(); - } - } - else { - $self->{_current}=$xval->{comments}; - push @alluncaptured,$self->_process_all_comments(); - } - } @@ Diff output truncated at 10000 characters. @@ From bugzilla-daemon at portal.open-bio.org Mon Apr 20 16:29:30 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 20 Apr 2009 16:29:30 -0400 Subject: [Bioperl-guts-l] [Bug 2816] Bio::SeqIO::entrezgene no longer contains Features In-Reply-To: Message-ID: <200904202029.n3KKTUkR018786@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2816 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |INVALID ------- Comment #2 from cjfields at bioperl.org 2009-04-20 16:29 EST ------- Invalid. The gene structure is returned separately from the gene itself: while (my ($gene,$genestructure,$uncaptured) = $io->next_seq) { # get exons from $genestructure } which wasn't immediately apparent. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Apr 22 08:37:41 2009 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 22 Apr 2009 08:37:41 -0400 Subject: [Bioperl-guts-l] [Bug 2668] bioperl-ext fails to compile on x86_64 In-Reply-To: Message-ID: <200904221237.n3MCbfH8008942@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2668 andrea at familie-schafferhans.de changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |andrea at familie- | |schafferhans.de ------- Comment #2 from andrea at familie-schafferhans.de 2009-04-22 08:37 EST ------- I have run across this bug when installing bioperl-ext on a x86_64 architecture. The solution is to go to the Bio/Ext/Align/libs subdirectory and add -fPIC as a flag to CFLAGS (as indicated in the comment in that file). -- Unfortunately I have no idea how to automate this in the build process. Therefore I cannot resolve the bug. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From heikki at dev.open-bio.org Wed Apr 22 08:57:33 2009 From: heikki at dev.open-bio.org (Heikki Lehvaslaiho) Date: Wed, 22 Apr 2009 08:57:33 -0400 Subject: [Bioperl-guts-l] [15656] bioperl-run/trunk/Bio/Tools/Run/Alignment/Exonerate.pm: tested under exonerate 2.2 Message-ID: <200904221257.n3MCvXdc027946@dev.open-bio.org> Revision: 15656 Author: heikki Date: 2009-04-22 08:57:32 -0400 (Wed, 22 Apr 2009) Log Message: ----------- tested under exonerate 2.2 Modified Paths: -------------- bioperl-run/trunk/Bio/Tools/Run/Alignment/Exonerate.pm Modified: bioperl-run/trunk/Bio/Tools/Run/Alignment/Exonerate.pm =================================================================== --- bioperl-run/trunk/Bio/Tools/Run/Alignment/Exonerate.pm 2009-04-20 20:23:10 UTC (rev 15655) +++ bioperl-run/trunk/Bio/Tools/Run/Alignment/Exonerate.pm 2009-04-22 12:57:32 UTC (rev 15656) @@ -1,8 +1,7 @@ -# Wrapper module for Exonerate Bio::Tools::Run::Alignment::Exonerate # $Id$ -# -# Please direct questions and support issues to # +# Please direct questions and support issues to +# # Cared for by Shawn Hoon # # Copyright Shawn Hoon @@ -17,7 +16,6 @@ =head1 SYNOPSIS - use Bio::Tools::Run::Alignment::Exonerate; use Bio::SeqIO; @@ -29,7 +27,8 @@ #exonerate parameters can all be passed via arguments parameter. #parameters passed are not checked for validity - my $run = Bio::Tools::Run::Alignment::Exonerate->new(arguments=>'--model est2genome --bestn 10'); + my $run = Bio::Tools::Run::Alignment::Exonerate-> + new(arguments=>'--model est2genome --bestn 10'); my $searchio_obj = $run->run($query,$target); while(my $result = $searchio->next_result){ @@ -42,40 +41,43 @@ =head1 DESCRIPTION - Wrapper for Exonerate alignment program. You can get exonerate at - http://www.ebi.ac.uk/~guy/exonerate/. - This wrapper is written without parameter checking. All parameters are passed - via the arugment parameter that is passed in the constructor. See SYNOPSIS. - For exonerate parameters, run exonerate --help for more details. +Wrapper for Exonerate alignment program. You can get exonerate at +http://www.ebi.ac.uk/~guy/exonerate/. This wrapper is written without +parameter checking. All parameters are passed via the arugment +parameter that is passed in the constructor. See SYNOPSIS. For +exonerate parameters, run exonerate --help for more details. +=head1 PROGRAM VERSIONS +The tests have been shown to pass with exonorate versions 2.0 - 2.2. + =head1 FEEDBACK =head2 Mailing Lists - User feedback is an integral part of the evolution of this and other - Bioperl modules. Send your comments and suggestions preferably to one - of the Bioperl mailing lists. Your participation is much appreciated. +User feedback is an integral part of the evolution of this and other +Bioperl modules. Send your comments and suggestions preferably to one +of the Bioperl mailing lists. Your participation is much appreciated. bioperl-l at bioperl.org - General discussion http://bioperl.org/wiki/Mailing_lists - About the mailing lists -=head2 Support - +=head2 Support + Please direct usage questions or support issues to the mailing list: - + L - -rather than to the module maintainer directly. Many experienced and -reponsive experts will be able look at the problem and quickly -address it. Please include a thorough description of the problem + +rather than to the module maintainer directly. Many experienced and +reponsive experts will be able look at the problem and quickly +address it. Please include a thorough description of the problem with code and data examples if at all possible. =head2 Reporting Bugs - Report bugs to the Bioperl bug tracking system to help us keep track - the bugs and their resolution. Bug reports can be submitted via the - web: +Report bugs to the Bioperl bug tracking system to help us keep track +the bugs and their resolution. Bug reports can be submitted via the +web: http://bugzilla.open-bio.org/ @@ -85,8 +87,8 @@ =head1 APPENDIX - The rest of the documentation details each of the object - methods. Internal methods are usually preceded with a _ +The rest of the documentation details each of the object +methods. Internal methods are usually preceded with a _ =cut @@ -206,7 +208,7 @@ my ($self,$query,$target) = @_; my @feats; my ($file1) = $self->_writeInput($query); - my ($file2) = $self->_writeInput($target); + my ($file2) = $self->_writeInput($target); my $assembly = $self->_run($file1,$file2); return $assembly; } @@ -250,9 +252,9 @@ my $exonerate_obj = Bio::SearchIO->new(-file=>"$outfile",-format=>'exonerate'); close($tfh); - undef $tfh; + undef $tfh; unlink $outfile; - + return $exonerate_obj; } From heikki at dev.open-bio.org Mon Apr 27 11:41:28 2009 From: heikki at dev.open-bio.org (Heikki Lehvaslaiho) Date: Mon, 27 Apr 2009 11:41:28 -0400 Subject: [Bioperl-guts-l] [15657] bioperl-live/trunk: added methods to extract regions based on quality threshold value. Message-ID: <200904271541.n3RFfSfj015908@dev.open-bio.org> Revision: 15657 Author: heikki Date: 2009-04-27 11:41:27 -0400 (Mon, 27 Apr 2009) Log Message: ----------- added methods to extract regions based on quality threshold value. Based on code by Dan Bolser. Modified Paths: -------------- bioperl-live/trunk/Bio/Seq/Quality.pm bioperl-live/trunk/Bio/Tree/Statistics.pm bioperl-live/trunk/t/Seq/Quality.t bioperl-live/trunk/t/Tree/TreeStatistics.t Modified: bioperl-live/trunk/Bio/Seq/Quality.pm =================================================================== --- bioperl-live/trunk/Bio/Seq/Quality.pm 2009-04-22 12:57:32 UTC (rev 15656) +++ bioperl-live/trunk/Bio/Seq/Quality.pm 2009-04-27 15:41:27 UTC (rev 15657) @@ -152,8 +152,10 @@ =head1 CONTRIBUTORS -Chad Matsalla, bioinformatics at dieselwurks.com +Chad Matsalla, bioinformatics at dieselwurks dot com +Dan Bolser, dan dot bolser at gmail dot com + =head1 APPENDIX The rest of the documentation details each of the object methods. @@ -240,12 +242,18 @@ Returns : reference to an array of meta data Args : new value, string or array ref or Bio::Seq::PrimaryQual, optional +Setting quality values resets the cached good quality ranges that +depend on the set threshold value. + =cut sub qual { my $self = shift; my $value = shift; - $value = $value->qual if ref($value) and ref($value) ne 'ARRAY' and $value->isa('Bio::Seq::PrimaryQual'); + $value = $value->qual + if ref($value) and ref($value) ne 'ARRAY' and + $value->isa('Bio::Seq::PrimaryQual'); + $self->_empty_cache if $value; $self->named_meta($DEFAULT_NAME, $value); } @@ -288,7 +296,7 @@ =cut sub subqual { - shift->named_submeta($DEFAULT_NAME, @_); + shift->named_submeta($DEFAULT_NAME, @_); } =head2 subqual_text @@ -454,14 +462,18 @@ =head2 get_trace_graph Title : get_trace_graph - Usage : @trace_values = $obj->get_trace_graph( -trace => 'a', -scale => 100) - Function : Returns array of raw trace values for a trace file, or false if no trace data exists. - Requires a value for trace to get either the a, g, c or t trace information, and an - optional value for scale, which rescales the data between 0 and the provided value, - a scale value of '0' performs no scaling + Usage : @trace_values = $obj->get_trace_graph( -trace => 'a', + -scale => 100) + Function : Returns array of raw trace values for a trace file, or + false if no trace data exists. Requires a value for trace + to get either the a, g, c or t trace information, and an + optional value for scale, which rescales the data between + 0 and the provided value, a scale value of '0' performs no + scaling Returns : Array or 0 Args : string, trace to retrieve, one of a, g, c or t - integer, scale, for scaling of trace between 0 and scale, or 0 for no scaling, optional + integer, scale, for scaling of trace between 0 and scale, + or 0 for no scaling, optional =cut @@ -484,9 +496,205 @@ @trace_data = map { $_ / $max * $scale } @trace_data; } return @trace_data; -} +} +=head2 threshold + + Title : threshold + Usage : $qual->threshold($value); + Function: Sets the threshold for good quality values. + Returns : an integer + Args : new value, optional + +Value used by *clear_range* method below. + +=cut + +sub threshold { + my $self = shift; + my $value = shift; + if (defined $value) { + $self->throw("Threshold needs to be an integer [$value]") + unless $value =~ /^[-+]?\d+?$/; + $self->_empty_cache + if defined $self->{_threshold} and $self->{_threshold} ne $value; + $self->{_threshold} = $value; + } + return $self->{_threshold}; +} + + +=head2 count_clear_ranges + + Title : count_clear_ranges + Usage : $count = $obj->count_clear_ranges($threshold); + Function: Counts number of ranges in the sequence where quality + values are above the threshold + Returns : count integer + Args : threshold integer, optional + +Set threshold first using method L. + +=cut + +sub count_clear_ranges { + my $self = shift; + my $threshold = shift; + $self->threshold($threshold) if defined $threshold; + + # populate the cache if needed + $self->_find_clear_ranges unless defined $self->{_ranges}; + + return scalar @{$self->{_ranges}} +} + +=head2 clear_ranges_length + + Title : clear_ranges_length + Usage : $total_lenght = $obj->clear_ranges_length($threshold); + Function: Return number of residues with quality values above + the threshold in all clear ranges + Returns : an integer + Args : threshold, optional + +Set threshold first using method L. + +=cut + +sub clear_ranges_length { + my $self = shift; + my $threshold = shift; + $self->threshold($threshold) if defined $threshold; + + # populate the cache if needed + $self->_find_clear_ranges unless defined $self->{_ranges}; + + my $sum; + map {$sum += $_->{length}} @{$self->{_ranges}}; + return $sum; +} + +=head2 get_clear_range + + Title : get_clear_range + Usage : $newqualobj = $obj->get_clear_range($threshold); + Function: Return longest subsequence that has quality values above + the threshold + Returns : a new Bio::Seq::Quality object + Args : threshold, optional + +Set threshold first using method L. + +=cut + +sub get_clear_range { + my $self = shift; + my $threshold = shift; + $self->threshold($threshold) if defined $threshold; + + # populate the cache if needed + $self->_find_clear_ranges unless defined $self->{_ranges}; + + # pick the longest + for (sort {$b->{length} <=> $a->{length} } @{$self->{_ranges}} ){ + + return Bio::Seq::Quality->new + ( -seq => $self->subseq( $_->{start}, $_->{end}), + -qual => $self->subqual($_->{start}, $_->{end}) + ); + } +} + + + +=head2 get_all_clean_ranges + + Title : get_all_clean_ranges + Usage : @ranges = $obj->get_all_clean_ranges($minlength); + Function: Return all ranges where quality values are above + the threshold. Original ordering. + Returns : an ordered array of new Bio::Seq::Quality objects + Args : minimum length , optional + +Set threshold first using method L. + +=cut + +sub get_all_clean_ranges { + my $self = shift; + my $minl = shift; + + $minl ||= 0; + $self->throw("Mimimum length needs to be zero or a positive integer [$minl]") + unless $minl =~ /^\+?\d+?$/; + + # populate the cache if needed + $self->_find_clear_ranges unless defined $self->{_ranges}; + + # return in the order of occurrence + my @ranges; + for my $r (sort {$b->{start} <=> $a->{start} } @{$self->{_ranges}} ){ + next if $r->{length} < $minl; + + push @ranges, Bio::Seq::Quality->new + ( -seq => $self->subseq( $r->{start}, $r->{end}), + -qual => $self->subqual($r->{start}, $r->{end}) + ); + } + return @ranges; +} + + +# +# _find_clear_ranges: where range/threshold calculations happen +# + +sub _find_clear_ranges { + my $self = shift; + + $self->throw("You need to set the threshold value first") + unless defined $self->threshold; + + my $flag = 0; + my $threshold = $self->threshold; + my $i = 0; + foreach my $q (@{$self->qual}) { + $i++; + # print "$i -- $q\n"; + if ( $flag ){ + if ($q < $threshold) { + my $range->{end} = $i-1; + $range->{start} = $flag; + $range->{length} = $i - $flag; + push @{$self->{_ranges}}, $range; + $flag = 0; # reset flag + } + } else { + $flag = $i if $q >= $threshold; + } + } + + if( $flag ){ + ## Log the range + my $range->{end} = $i; + $range->{start} = $flag; + $range->{length} = $i - $flag + 1; + push @{$self->{_ranges}}, $range; + } + + 1; +} + + +sub _empty_cache { + my $self = shift; + undef $self->{_ranges}; +} + + + + ################## deprecated methods ################## Modified: bioperl-live/trunk/Bio/Tree/Statistics.pm =================================================================== --- bioperl-live/trunk/Bio/Tree/Statistics.pm 2009-04-22 12:57:32 UTC (rev 15656) +++ bioperl-live/trunk/Bio/Tree/Statistics.pm 2009-04-27 15:41:27 UTC (rev 15657) @@ -153,8 +153,8 @@ in a binary tree Returns : integer Exceptions : - Args : Bio::Tree::TreeI object - Bio::Tree::NodeI object within the tree, optional + Args : 1. Bio::Tree::TreeI object + 2. Bio::Tree::NodeI object within the tree, optional Commonly used statistics assume a binary tree, but this methods returns a value even for trees with polytomies. @@ -187,13 +187,60 @@ =head2 Tree-Trait statistics The following methods produce desciptors of trait distribution among -leaf nodes within the trees. They require that a trait has to be set +leaf nodes within the trees. They require that a trait has been set for each leaf node. The tag methods of Bio::Tree::Node are used to store them as key/value pairs. In this way, one tree can store more -than on trait. +than one trait. Trees have method add_traits() to set trait values from a file. + +=head2 fitch + + Example : fitch($tree, $key, $node); + Description: Calculates Parsimony Score (PS) and internal trait + values using the Fitch 1971 parsimony algorithm for + the subtree a defined by the (internal) node. + Node defaults to the root. + Returns : true on success + Exceptions : leaf nodes have to have the trait defined + Args : 1. Bio::Tree::TreeI object + 2. trait name string + 3. Bio::Tree::NodeI object within the tree, optional + +Runs first L that calculats parsimony scores and then +L that should resolve most of the trait/character state +ambiguities. + +Fitch, W.M., 1971. Toward de?ning the course of evolution: minimal +change for a speci?c tree topology. Syst. Zool. 20, 406-416. + +You can access calculated parsimony values using: + + $score = $node->->get_tag_values('ps_score'); + +and the trait value with: + + $traitvalue = $node->->get_tag_values('ps_trait'); + @traitvalues = $node->->get_tag_values('ps_trait'); + +Note that there can be more that one trait values, especially for the +root node. + +=cut + +sub fitch { + my $self = shift; + my $tree = shift; + my $key = shift || $self->throw("Trait name is needed"); + my $node = shift || $tree->get_root_node; + + $self->fitch_up($tree, $key, $node); + $self->fitch_down($tree, $node); +} + + + =head2 ps Example : ps($tree, $key, $node); @@ Diff output truncated at 10000 characters. @@ From heikki at dev.open-bio.org Tue Apr 28 04:52:08 2009 From: heikki at dev.open-bio.org (Heikki Lehvaslaiho) Date: Tue, 28 Apr 2009 04:52:08 -0400 Subject: [Bioperl-guts-l] [15658] bioperl-live/trunk/Bio/Tree/Statistics.pm: fix illegal division by zero. Message-ID: <200904280852.n3S8q8Pp018822@dev.open-bio.org> Revision: 15658 Author: heikki Date: 2009-04-28 04:52:06 -0400 (Tue, 28 Apr 2009) Log Message: ----------- fix illegal division by zero. Note: the code was committed by accident and is still under rapid development. Still, it is mostly functional. Modified Paths: -------------- bioperl-live/trunk/Bio/Tree/Statistics.pm Modified: bioperl-live/trunk/Bio/Tree/Statistics.pm =================================================================== --- bioperl-live/trunk/Bio/Tree/Statistics.pm 2009-04-27 15:41:27 UTC (rev 15657) +++ bioperl-live/trunk/Bio/Tree/Statistics.pm 2009-04-28 08:52:06 UTC (rev 15658) @@ -685,8 +685,10 @@ my $tree = shift; my $node = shift || $tree->get_root_node; - return $node->branch_length / - $self->genetic_diversity($tree, $node); + my $div = $self->genetic_diversity($tree, $node); + return 0 if $div == 0; + return $node->branch_length / $div; + } From thm09830 at dev.open-bio.org Tue Apr 28 11:59:46 2009 From: thm09830 at dev.open-bio.org (Marian Thieme) Date: Tue, 28 Apr 2009 11:59:46 -0400 Subject: [Bioperl-guts-l] [15659] bioperl-live/trunk/Bio/Microarray/Tools/MitoChipV2Parser.pm: fix authors email address Message-ID: <200904281559.n3SFxkPo020528@dev.open-bio.org> Revision: 15659 Author: thm09830 Date: 2009-04-28 11:59:46 -0400 (Tue, 28 Apr 2009) Log Message: ----------- fix authors email address Modified Paths: -------------- bioperl-live/trunk/Bio/Microarray/Tools/MitoChipV2Parser.pm Modified: bioperl-live/trunk/Bio/Microarray/Tools/MitoChipV2Parser.pm =================================================================== --- bioperl-live/trunk/Bio/Microarray/Tools/MitoChipV2Parser.pm 2009-04-28 08:52:06 UTC (rev 15658) +++ bioperl-live/trunk/Bio/Microarray/Tools/MitoChipV2Parser.pm 2009-04-28 15:59:46 UTC (rev 15659) @@ -21,7 +21,7 @@ =head1 AUTHORS -Marian Thieme (marian.thieme at arcor.de) +Marian Thieme (marian.thieme at gmail.com) =head1 COPYRIGHT