From bugzilla-daemon at portal.open-bio.org Tue Apr 1 10:14:18 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 1 Apr 2008 10:14:18 -0400
Subject: [Bioperl-guts-l] [Bug 2466] NCBIHelper redirecting RefSeq sequence download to EBI server
In-Reply-To:
Message-ID: <200804011414.m31EEICY014764@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2466

------- Comment #2 from nsoranzo at tiscali.it 2008-04-01 10:14 EST -------
(In reply to comment #1)
> Not sure why the redirect is in place; I'll try looking back to determine why
> it was added in. If needed we can leave the code in but change the default
> 'no_redirect' setting to 1.

I don't know why it was added, surely changing the default would help, but
anyway I think eliminating the code in question would be better.

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org Tue Apr 1 10:26:02 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 1 Apr 2008 10:26:02 -0400
Subject: [Bioperl-guts-l] [Bug 2466] NCBIHelper redirecting RefSeq sequence download to EBI server
In-Reply-To:
Message-ID: <200804011426.m31EQ2sa016746@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2466

------- Comment #3 from cjfields at uiuc.edu 2008-04-01 10:26 EST -------
I agree. I still need to track down the reason (I think it has something to do
with better annotation with EMBL format). However, if it isn't working as
advertised then the best fix is removing or commenting out the offending code.
I'm picking the latter option with a comment that points to this bug report;
I'll close this out when the fix is committed.
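As a rough illustration of what is being debated in this thread, the check that triggers the redirect can be sketched in plain Perl. The pattern /N._/ is the one visible in the NCBIHelper.pm diff of revision 14638 further down in this archive; would_redirect() is a hypothetical stand-in for the logic inside the module and is not BioPerl API, and the accessions are only sample inputs.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not part of BioPerl):
# returns 1 when a request for $id would be redirected under the
# pre-fix logic, 0 otherwise.
sub would_redirect {
    my ($id, $no_redirect) = @_;
    return 0 if $no_redirect;        # flag set: stay with NCBI
    return $id =~ /N._/ ? 1 : 0;     # pattern as it appears in the diff
}

# Two RefSeq-style accessions and one plain GenBank-style accession.
for my $id (qw(NM_006732 NT_006732 J00522)) {
    printf "%-10s default: %d  no_redirect=1: %d\n",
        $id, would_redirect($id, 0), would_redirect($id, 1);
}
```

Under these assumptions, the "change the default" proposal corresponds to passing a true second argument everywhere, while "eliminate the code" corresponds to dropping the pattern check entirely.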
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Tue Apr 1 12:31:17 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Tue, 1 Apr 2008 12:31:17 -0400 Subject: [Bioperl-guts-l] [14638] bioperl-live/trunk/Bio/DB/NCBIHelper.pm: bug 2466 : Message-ID: <200804011631.m31GVHRG009990@dev.open-bio.org> Revision: 14638 Author: cjfields Date: 2008-04-01 12:31:16 -0400 (Tue, 01 Apr 2008) Log Message: ----------- bug 2466 : * change default behavior of Bio::DB::GenBank to always retrieve from NCBI * deprecate 'no_redirect' in favor of 'redirect_refseq', which must be set for RefSeq redirection (see note above) * make explicit getter/setters out of redirect_refseq, no_redirect, seq_start, seq_end, strand, complexity along with docs Modified Paths: -------------- bioperl-live/trunk/Bio/DB/NCBIHelper.pm Modified: bioperl-live/trunk/Bio/DB/NCBIHelper.pm =================================================================== --- bioperl-live/trunk/Bio/DB/NCBIHelper.pm 2008-04-01 00:02:24 UTC (rev 14637) +++ bioperl-live/trunk/Bio/DB/NCBIHelper.pm 2008-04-01 16:31:16 UTC (rev 14638) @@ -104,30 +104,21 @@ 'gbwithparts' => 'genbank', ); $DEFAULTFORMAT = 'gb'; - @ATTRIBUTES = qw(complexity strand seq_start seq_stop no_redirect); - for my $method (@ATTRIBUTES) { - eval <{'_$method'}; - \$self->{'_$method'} = shift if \@_; - \$d; + @ATTRIBUTES = qw(complexity strand seq_start seq_stop); } -END - } -} # the new way to make modules a little more lightweight sub new { my ($class, @args ) = @_; my $self = $class->SUPER::new(@args); - my ($seq_start,$seq_stop,$no_redirect,$complexity,$strand) = - $self->_rearrange([qw(SEQ_START SEQ_STOP NO_REDIRECT COMPLEXITY STRAND)], + my ($seq_start,$seq_stop,$no_redirect, $redirect, $complexity,$strand) = + $self->_rearrange([qw(SEQ_START SEQ_STOP NO_REDIRECT 
REDIRECT_REFSEQ COMPLEXITY STRAND)], @args); $seq_start && $self->seq_start($seq_start); $seq_stop && $self->seq_stop($seq_stop); $no_redirect && $self->no_redirect($no_redirect); + $redirect && $self->redirect_refseq($redirect); $strand && $self->strand($strand); # adjust statement to accept zero value defined $complexity && ($complexity >=0 && $complexity <=4) @@ -336,6 +327,121 @@ return @{$self->{'_format'}}; } +=head2 redirect_refseq + + Title : redirect_refseq + Usage : $db->redirect_refseq(1) + Function: simple getter/setter which redirects RefSeqs to use Bio::DB::RefSeq + Returns : Boolean value + Args : Boolean value (optional) + Throws : 'unparseable output exception' + Note : This replaces 'no_redirect' as a more straightforward flag to + redirect possible RefSeqs to use Bio::DB::RefSeq (EBI interface) + instead of retrievign the NCBI records + +=cut + +sub redirect_refseq { + my $self = shift; + return $self->{'_redirect_refseq'} = shift if @_; + return $self->{'_redirect_refseq'}; +} + +=head2 complexity + + Title : complexity + Usage : $db->complexity(3) + Function: get/set complexity value + Returns : value from 0-4 indicating level of complexity + Args : value from 0-4 (optional); if unset server assumes 1 + Throws : if arg is not an integer or falls outside of noted range above + Note : From efetch docs: + + Complexity regulates the display: + + * 0 - get the whole blob + * 1 - get the bioseq for gi of interest (default in Entrez) + * 2 - get the minimal bioseq-set containing the gi of interest + * 3 - get the minimal nuc-prot containing the gi of interest + * 4 - get the minimal pub-set containing the gi of interest + +=cut + +sub complexity { + my ($self, $comp) = @_; + if (defined $comp) { + $self->throw("Complexity value must be integer between 0 and 4") if + $comp !~ /^\d+$/ || $comp < 0 || $comp > 4; + $self->{'_complexity'} = $comp; + } + return $self->{'_complexity'}; +} + +=head2 strand + + Title : strand + Usage : $db->strand(1) + 
Function: get/set strand value + Returns : strand value if set + Args : value of 1 (plus) or 2 (minus); if unset server assumes 1 + Throws : if arg is not an integer or is not 1 or 2 + Note : This differs from BioPerl's use of strand: 1 = plus, -1 = minus 0 = not relevant. + We should probably add in some functionality to convert over in the future. + +=cut + +sub strand { + my ($self, $str) = @_; + if ($str) { + $self->throw("strand() must be integer value of 1 (plus strand) or 2 (minus strand) if set") if + $str !~ /^\d+$/ || $str < 1 || $str > 2; + $self->{'_strand'} = $str; + } + return $self->{'_strand'}; +} + +=head2 seq_start + + Title : seq_start + Usage : $db->seq_start(123) + Function: get/set sequence start location + Returns : sequence start value if set + Args : integer; if unset server assumes 1 + Throws : if arg is not an integer + +=cut + +sub seq_start { + my ($self, $start) = @_; + if ($start) { + $self->throw("seq_start() must be integer value if set") if + $start !~ /^\d+$/; + $self->{'_seq_start'} = $start; + } + return $self->{'_seq_start'}; +} + +=head2 seq_stop + + Title : seq_stop + Usage : $db->seq_stop(456) + Function: get/set sequence stop (end) location + Returns : sequence stop (end) value if set + Args : integer; if unset server assumes 1 + Throws : if arg is not an integer + +=cut + +sub seq_stop { + my ($self, $stop) = @_; + if ($stop) { + $self->throw("seq_stop() must be integer if set") if + $stop !~ /^\d+$/; + $self->{'_seq_stop'} = $stop; + } + return $self->{'_seq_stop'}; +} + =head2 Bio::DB::WebDBSeqI methods Overriding WebDBSeqI method to help newbies to retrieve sequences @@ -383,7 +489,7 @@ # Asking for a RefSeq from EMBL/GenBank - unless ($self->no_redirect) { + if ($self->redirect_refseq) { if ($ids =~ /N._/) { $self->warn("[$ids] is not a normal sequence database but a RefSeq entry.". 
" Redirecting the request.\n") @@ -461,6 +567,29 @@ my ($querykey) = $content =~ m!(\d+)!; $self->cookie(uri_unescape($cookie),$querykey); } + +########### DEPRECATED!!!! ########### + +=head2 no_redirect + + Title : no_redirect + Usage : $db->no_redirect($content) + Function: Used to indicate that Bio::DB::GenBank instance retrieves + possible RefSeqs from EBI instead; default behavior is now to + retrieve directly from NCBI + Returns : None + Args : None + Throws : Method is deprecated in favor of positive flag method 'redirect_refseq' + +=cut + +sub no_redirect { + shift->throw( + "Use of no_redirect() is deprecated. Bio::DB::GenBank default is to always\n". + "retrieve from NCBI. In order to redirect possible RefSeqs to EBI, set\n". + "redirect_refseq flag to 1"); +} + 1; __END__ From cjfields at dev.open-bio.org Tue Apr 1 12:39:11 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Tue, 1 Apr 2008 12:39:11 -0400 Subject: [Bioperl-guts-l] [14639] bioperl-live/trunk/Bio/DB: Remove extraneous code; update docs Message-ID: <200804011639.m31GdBkW010041@dev.open-bio.org> Revision: 14639 Author: cjfields Date: 2008-04-01 12:39:11 -0400 (Tue, 01 Apr 2008) Log Message: ----------- Remove extraneous code; update docs Modified Paths: -------------- bioperl-live/trunk/Bio/DB/GenBank.pm bioperl-live/trunk/Bio/DB/NCBIHelper.pm Modified: bioperl-live/trunk/Bio/DB/GenBank.pm =================================================================== --- bioperl-live/trunk/Bio/DB/GenBank.pm 2008-04-01 16:31:16 UTC (rev 14638) +++ bioperl-live/trunk/Bio/DB/GenBank.pm 2008-04-01 16:39:11 UTC (rev 14639) @@ -106,10 +106,11 @@ (the reason is that NT contigs are rather annotation with references to clones). -Some work has been done to automatically detect and retrieve whole NT_ -clones when the data is in that format (NCBI RefSeq clones). More -testing and feedback from users is needed to achieve a good fit of -functionality and ease of use. 
+Some work has been done to automatically detect and retrieve whole NT_ clones +when the data is in that format (NCBI RefSeq clones). The former behavior prior +to bioperl 1.6 was to retrieve these from EBI, but now these are retrieved +directly from NCBI. The older behavior can be regained by setting the +'redirect_refseq' flag to a value evaluating to TRUE. =head1 FEEDBACK Modified: bioperl-live/trunk/Bio/DB/NCBIHelper.pm =================================================================== --- bioperl-live/trunk/Bio/DB/NCBIHelper.pm 2008-04-01 16:31:16 UTC (rev 14638) +++ bioperl-live/trunk/Bio/DB/NCBIHelper.pm 2008-04-01 16:39:11 UTC (rev 14639) @@ -35,7 +35,7 @@ common HTML stripping done in L(). The base NCBI query URL used is: -http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi +http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi =head1 FEEDBACK @@ -74,7 +74,7 @@ package Bio::DB::NCBIHelper; use strict; -use vars qw($HOSTBASE %CGILOCATION %FORMATMAP $DEFAULTFORMAT $MAX_ENTRIES $VERSION @ATTRIBUTES); +use vars qw($HOSTBASE %CGILOCATION %FORMATMAP $DEFAULTFORMAT $MAX_ENTRIES $VERSION); use Bio::DB::Query::GenBank; use HTTP::Request::Common; @@ -104,7 +104,6 @@ 'gbwithparts' => 'genbank', ); $DEFAULTFORMAT = 'gb'; - @ATTRIBUTES = qw(complexity strand seq_start seq_stop); } # the new way to make modules a little more lightweight From bugzilla-daemon at portal.open-bio.org Tue Apr 1 12:45:28 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 1 Apr 2008 12:45:28 -0400 Subject: [Bioperl-guts-l] [Bug 2466] NCBIHelper redirecting RefSeq sequence download to EBI server In-Reply-To: Message-ID: <200804011645.m31GjS3T032091@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2466 cjfields at uiuc.edu changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #4 from cjfields at uiuc.edu 
2008-04-01 12:45 EST ------- The redirection was added in to retrieve RefSeqs from EBI (which has slightly better annotation). However, my take on that is one should use Bio::DB::RefSeq under these circumstances; using Bio::DB::GenBank implies the sequences are retrieved from GenBank by default. As a compromise, I have deprecated use of the 'no_redirect' flag in favor of always retrieving the RefSeq from GenBank. If one wants the old redirection behavior (present in the last few BioPerl dev. releases) they must explicitly code for it using the 'redirect_refseq' flag. I have also cleaned up some of the implicit getter/setters and added relevant docs where needed. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Tue Apr 1 23:57:25 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Tue, 1 Apr 2008 23:57:25 -0400 Subject: [Bioperl-guts-l] [14640] bioperl-live/trunk/Bio/SearchIO/hmmer.pm: Squash uninit. Message-ID: <200804020357.m323vP9B011020@dev.open-bio.org> Revision: 14640 Author: cjfields Date: 2008-04-01 23:57:25 -0400 (Tue, 01 Apr 2008) Log Message: ----------- Squash uninit. value warnings Modified Paths: -------------- bioperl-live/trunk/Bio/SearchIO/hmmer.pm Modified: bioperl-live/trunk/Bio/SearchIO/hmmer.pm =================================================================== --- bioperl-live/trunk/Bio/SearchIO/hmmer.pm 2008-04-01 16:39:11 UTC (rev 14639) +++ bioperl-live/trunk/Bio/SearchIO/hmmer.pm 2008-04-02 03:57:25 UTC (rev 14640) @@ -1116,7 +1116,7 @@ if ( $nm eq 'Hsp' ) { foreach (qw(Hsp_qseq Hsp_midline Hsp_hseq)) { my $data = $self->{'_last_hspdata'}->{$_}; - if ($_ eq 'Hsp_hseq') { + if ($data && $_ eq 'Hsp_hseq') { # replace hmm '.' 
gap symbol by '-' $data =~ s/\./-/g; } From cjfields at dev.open-bio.org Wed Apr 2 00:14:00 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Wed, 2 Apr 2008 00:14:00 -0400 Subject: [Bioperl-guts-l] [14641] bioperl-live/trunk/t/RefSeq.t: RefSeq redirection now requires explicit setting ( note that sequence length returned by Bio::DB::RefSeq is off by one, needs investigating). Message-ID: <200804020414.m324E0XT011115@dev.open-bio.org> Revision: 14641 Author: cjfields Date: 2008-04-02 00:14:00 -0400 (Wed, 02 Apr 2008) Log Message: ----------- RefSeq redirection now requires explicit setting (note that sequence length returned by Bio::DB::RefSeq is off by one, needs investigating). Modified Paths: -------------- bioperl-live/trunk/t/RefSeq.t Modified: bioperl-live/trunk/t/RefSeq.t =================================================================== --- bioperl-live/trunk/t/RefSeq.t 2008-04-02 03:57:25 UTC (rev 14640) +++ bioperl-live/trunk/t/RefSeq.t 2008-04-02 04:14:00 UTC (rev 14641) @@ -26,9 +26,9 @@ #test redirection from GenBank and EMBL #GenBank -ok $db = Bio::DB::GenBank->new('-verbose'=>$verbose); +ok $db = Bio::DB::GenBank->new('-verbose'=> $verbose, -redirect_refseq => 1); #EMBL -ok $db2 = Bio::DB::EMBL->new('-verbose'=>$verbose); +ok $db2 = Bio::DB::EMBL->new('-verbose'=> $verbose, -redirect_refseq => 1); eval { $seq = $db->get_Seq_by_acc('NT_006732'); @@ -41,19 +41,19 @@ eval { ok($seq = $db->get_Seq_by_acc('NM_006732')); - is($seq->length, 3776); + is($seq->length, 3775); ok $seq2 = $db2->get_Seq_by_acc('NM_006732'); - is($seq2->length, 3776); + is($seq2->length, 3775); }; skip "Warning: Couldn't connect to RefSeq with Bio::DB::RefSeq.pm!", 4 if $@; eval { ok defined($db = Bio::DB::RefSeq->new(-verbose=>$verbose)); ok(defined($seq = $db->get_Seq_by_acc('NM_006732'))); - is( $seq->length, 3776); + is( $seq->length, 3775); ok defined ($db->request_format('fasta')); ok(defined($seq = $db->get_Seq_by_acc('NM_006732'))); - is( 
$seq->length, 3776); + is( $seq->length, 3775); }; skip "Warning: Couldn't connect to RefSeq with Bio::DB::RefSeq.pm!", 6 if $@; } From cjfields at dev.open-bio.org Wed Apr 2 00:22:48 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Wed, 2 Apr 2008 00:22:48 -0400 Subject: [Bioperl-guts-l] [14642] bioperl-live/trunk/t/RestrictionAnalysis.t: test not matching error message (oops!) Message-ID: <200804020422.m324MmHZ011143@dev.open-bio.org> Revision: 14642 Author: cjfields Date: 2008-04-02 00:22:48 -0400 (Wed, 02 Apr 2008) Log Message: ----------- test not matching error message (oops!) Modified Paths: -------------- bioperl-live/trunk/t/RestrictionAnalysis.t Modified: bioperl-live/trunk/t/RestrictionAnalysis.t =================================================================== --- bioperl-live/trunk/t/RestrictionAnalysis.t 2008-04-02 04:14:00 UTC (rev 14641) +++ bioperl-live/trunk/t/RestrictionAnalysis.t 2008-04-02 04:22:48 UTC (rev 14642) @@ -88,7 +88,7 @@ eval {$re->is_prototype}; ok($@); -like($@, qr/Couldn't unequivicably assign prototype/, 'bug 2179'); +like($@, qr/Can't unequivocally assign prototype based on input format alone/, 'bug 2179'); $re->verbose(2); is $re->is_prototype(0), 0; From cjfields at dev.open-bio.org Wed Apr 2 00:23:23 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Wed, 2 Apr 2008 00:23:23 -0400 Subject: [Bioperl-guts-l] [14643] bioperl-live/trunk/t/Genpred.t: get rid of variable redefined warnings Message-ID: <200804020423.m324NNVA011171@dev.open-bio.org> Revision: 14643 Author: cjfields Date: 2008-04-02 00:23:23 -0400 (Wed, 02 Apr 2008) Log Message: ----------- get rid of variable redefined warnings Modified Paths: -------------- bioperl-live/trunk/t/Genpred.t Modified: bioperl-live/trunk/t/Genpred.t =================================================================== --- bioperl-live/trunk/t/Genpred.t 2008-04-02 04:22:48 UTC (rev 14642) +++ bioperl-live/trunk/t/Genpred.t 2008-04-02 04:23:23 
UTC (rev 14643)
@@ -302,8 +302,8 @@
 is($fghgene->end(), 1869);
 cmp_ok($fghgene->strand(), '<', 0);
-my $i = 0;
-my @num_exons = (2,5,4,8);
+$i = 0;
+@num_exons = (2,5,4,8);
 while ($fghgene = $fgh->next_prediction()) {

From bugzilla-daemon at portal.open-bio.org Wed Apr 2 02:12:33 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 2 Apr 2008 02:12:33 -0400
Subject: [Bioperl-guts-l] [Bug 2338] The first 4 bytes of flatfile index is wrong (--indextype flat)
In-Reply-To:
Message-ID: <200804020612.m326CXLq006569@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2338

chad-bioperl-bugzilla at superfrink.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |chad-bioperl-
                   |                            |bugzilla at superfrink.net

------- Comment #4 from chad-bioperl-bugzilla at superfrink.net 2008-04-02 02:12 EST -------
I tried this out on a Fedora 6 machine and saw the described behaviour. I made
the following change to Bio::DB::Flat::BinarySearch and the problem went away.
I have not tested other programs and do not know how this change will impact
programs that rely on the original file format.
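For readers skimming the patch attached to this comment: it only changes a sprintf format between "%4d" and "%04d". Judging by the file timestamps, the diff appears to list the modified file first, so the change seems to go from "%4d" to "%04d", though that reading is an inference. A minimal sketch of the difference, using a made-up record length:

```perl
use strict;
use warnings;

# Hypothetical record length, chosen only for illustration.
my $recordlength = 42;

# "%4d" pads to width 4 with spaces; "%04d" pads to width 4 with zeros.
my $space_padded = sprintf("%4d",  $recordlength);
my $zero_padded  = sprintf("%04d", $recordlength);

print "space-padded: [$space_padded]\n";   # prints [  42]
print "zero-padded:  [$zero_padded]\n";    # prints [0042]
```

Both strings are four bytes long, so the record layout keeps its width; only the padding character in the first four bytes of the index differs, which is why readers of the existing file format might be affected.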

Regards,
Chad

# head -2 BinarySearch.pm
# $Id: BinarySearch.pm,v 1.23.4.1 2006/10/02 23:10:16 sendu Exp $

# diff -u BinarySearch.pm BinarySearch.pm.orig
--- BinarySearch.pm 2008-04-01 23:59:29.000000000 -0600
+++ BinarySearch.pm.orig 2007-04-19 22:10:40.000000000 -0600
@@ -915,7 +915,7 @@
 $self->{_maxlengthlength} + 3;
- print $INDEX sprintf("%04d",$recordlength);
+ print $INDEX sprintf("%4d",$recordlength);
 foreach my $id (@ids) {
@@ -982,7 +982,7 @@
 my $fh = $self->new_secondary_filehandle($name);
- print $fh sprintf("%04d",$length);
+ print $fh sprintf("%4d",$length);
 @seconds = sort @seconds;
 foreach my $second (@seconds) {

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From avilella at dev.open-bio.org Wed Apr 2 09:50:46 2008
From: avilella at dev.open-bio.org (Albert Vilella)
Date: Wed, 2 Apr 2008 09:50:46 -0400
Subject: [Bioperl-guts-l] [14644] bioperl-run/trunk/Bio/Tools/Run/Phylo/Hyphy/FEL.pm: recently seems to need comma-separated values for the header in the results -- not sure if this was a problem at the very beginning
Message-ID: <200804021350.m32DokM4018801@dev.open-bio.org>

Revision: 14644
Author: avilella
Date: 2008-04-02 09:50:45 -0400 (Wed, 02 Apr 2008)

Log Message:
-----------
recently seems to need comma-separated values for the header in the results -- not sure if this was a problem at the very beginning

Modified Paths:
--------------
bioperl-run/trunk/Bio/Tools/Run/Phylo/Hyphy/FEL.pm

Modified: bioperl-run/trunk/Bio/Tools/Run/Phylo/Hyphy/FEL.pm
===================================================================
--- bioperl-run/trunk/Bio/Tools/Run/Phylo/Hyphy/FEL.pm 2008-04-02 04:23:23 UTC (rev 14643)
+++ bioperl-run/trunk/Bio/Tools/Run/Phylo/Hyphy/FEL.pm 2008-04-02 13:50:45 UTC (rev 14644)
@@ -239,7 +239,7 @@
 push @{$results->{$elems[$i]}}, $values[$i];
 }
 } else {
- @elems = split("\t",$_);
+ @elems =
split("\,",$_); $readed_header = 1; } } From cjfields at dev.open-bio.org Wed Apr 2 11:52:41 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Wed, 2 Apr 2008 11:52:41 -0400 Subject: [Bioperl-guts-l] [14645] bioperl-live/trunk: merge back previous commits from Sendu and I Message-ID: <200804021552.m32Fqf1a019207@dev.open-bio.org> Revision: 14645 Author: cjfields Date: 2008-04-02 11:52:41 -0400 (Wed, 02 Apr 2008) Log Message: ----------- merge back previous commits from Sendu and I Modified Paths: -------------- bioperl-live/trunk/Build.PL bioperl-live/trunk/ModuleBuildBioperl.pm Modified: bioperl-live/trunk/Build.PL =================================================================== --- bioperl-live/trunk/Build.PL 2008-04-02 13:50:45 UTC (rev 14644) +++ bioperl-live/trunk/Build.PL 2008-04-02 15:52:41 UTC (rev 14645) @@ -15,6 +15,8 @@ our @drivers; +my $mysql_ok = 0; + # Set up the ModuleBuildBioperl object my $build = ModuleBuildBioperl->new( module_name => 'Bio', @@ -92,7 +94,7 @@ BioDBSeqFeature_mysql => { description => "MySQL tests for Bio::DB::SeqFeature::Store", feature_requires => { 'DBI' => 0, 'DBD::mysql' => 0 }, - test => \&test_db + test => \&test_db_sf }, Network => { description => "Enable tests that need an internet connection", @@ -109,8 +111,12 @@ my $accept = $build->args->{accept}; +prompt_for_biodb($accept) if $build->feature('BioDBGFF') || $build->feature('BioDBSeqFeature_mysql'); + # Handle auto features -if ($build->feature('BioDBSeqFeature_BDB')) { +if ($build->feature('BioDBSeqFeature_BDB') && $mysql_ok) { + # will return without doing anything if user chose not to run tests during + # prompt_for_biodb() above make_bdb_test(); } if ($build->feature('BioDBSeqFeature_mysql')) { @@ -119,7 +125,7 @@ # Ask questions $build->choose_scripts($accept); -prompt_for_biodbgff($accept) if $build->feature('BioDBGFF'); +#prompt_for_biodbgff($accept) if $build->feature('BioDBGFF'); { if ($build->args('network')) { if 
($build->feature('Network')) { @@ -155,7 +161,8 @@ sub make_bdb_test { my $path0 = File::Spec->catfile('t', 'BioDBSeqFeature.t'); my $path = File::Spec->catfile('t', 'BioDBSeqFeature_BDB.t'); - open my $F, ">$path"; + unlink($path) if (-e $path); + open(my $F, ">", $path) || die "Can't create test file\n"; print $F <add_to_manifest_skip($path); } -sub test_db { +sub test_db_sf { eval {require DBI;}; # if not installed, this sub won't actually be called - unless (eval {DBI->connect('dbi:mysql:test',undef,undef,{RaiseError=>0,PrintError=>0})}) { - return "Could not connect to test database"; + @drivers = DBI->available_drivers; + unless (grep {/mysql/i} @drivers) { + $mysql_ok = 0; + return "Only MySQL DBI driver supported for BioDBSeqFeature_mysql tests"; } + $mysql_ok = 1; return; } sub make_dbi_test { + my $dsn = $build->notes('test_dsn') || return; my $path0 = File::Spec->catfile('t', 'BioDBSeqFeature.t'); my $path = File::Spec->catfile('t', 'BioDBSeqFeature_mysql.t'); + my $test_db = $build->notes('test_db'); + my $user = $build->notes('test_user'); + my $pass = $build->notes('test_pass'); open my $F,">$path"; + my $str = "$path0 -adaptor DBI::mysql -create 1 -temp 1 -dsn $dsn"; + $str .= " -user $user" if $user; + $str .= " -password $pass" if $pass; print $F <add_to_cleanup($path); @@ -193,10 +210,11 @@ return; } -sub prompt_for_biodbgff { +sub prompt_for_biodb { my $accept = shift; - my $proceed = $accept - ? 0 : $build->y_n("Do you want to run the BioDBGFF live database tests? y/n", 'n'); + my $proceed = $accept ? 0 : $build->y_n("Do you want to run the BioDBGFF or ". + "BioDBSeqFeature_mysql live database tests? ". 
+ "y/n", 'n'); if ($proceed) { my @driver_choices; @@ -239,9 +257,11 @@ my $test_dsn; if ($driver eq 'Pg') { $test_dsn = "dbi:$driver:dbname=$test_db"; + $mysql_ok = 0; } else { $test_dsn = "dbi:$driver:database=$test_db"; + $mysql_ok = 0; } if ($use_host) { $test_dsn .= ";host=$test_host"; @@ -254,15 +274,17 @@ $build->notes(test_pass => $test_pass eq 'undef' ? undef : $test_pass); $build->notes(test_dsn => $test_dsn); - $build->log_info(" - will run the BioDBGFF tests with database driver '$driver' and these settings:\n", + $build->log_info(" - will run tests with database driver '$driver' and these settings:\n", " Database $test_db\n", " Host $test_host\n", " DSN $test_dsn\n", " User $test_user\n", " Password $test_pass\n"); + $build->log_info(" - will not run the BioDBSeqFeature_mysql live ". + "database tests (requires MySQL driver)\n") unless $mysql_ok; } else { - $build->log_info(" - will not run the BioDBGFF live database tests\n"); + $build->log_info(" - will not run the BioDBGFF or BioDBSeqFeature live database tests\n"); } $build->log_info("\n"); Modified: bioperl-live/trunk/ModuleBuildBioperl.pm =================================================================== --- bioperl-live/trunk/ModuleBuildBioperl.pm 2008-04-02 13:50:45 UTC (rev 14644) +++ bioperl-live/trunk/ModuleBuildBioperl.pm 2008-04-02 15:52:41 UTC (rev 14645) @@ -282,7 +282,17 @@ my $status = {}; if ($type eq 'test') { unless (keys %$out) { - $status->{message} = &{$prereqs}; + if (ref($prereqs) eq 'CODE') { + $status->{message} = &{$prereqs}; + + # drop the code-ref to avoid Module::Build trying to store + # it with Data::Dumper, generating warnings. (And also, may + # be expensive to run the sub multiple times.) 
+ $info->{$type} = $status->{message}; + } + else { + $status->{message} = $prereqs; + } $out->{$type}{'test'} = $status if $status->{message}; } } @@ -336,6 +346,11 @@ } elsif ($type =~ /^feature_requires/) { next if $status->{ok}; + + # if there is a test code-ref, drop it to avoid + # Module::Build trying to store it with Data::Dumper, + # generating warnings. + delete $info->{test}; } else { next if $status->{ok}; From cjfields at dev.open-bio.org Wed Apr 2 12:24:49 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Wed, 2 Apr 2008 12:24:49 -0400 Subject: [Bioperl-guts-l] [14646] bioperl-live/trunk/Bio/Tools/Fgenesh.pm: Fix bad silent bug which doesn't set exon tags correctly ( showed up when using -W flag with tests) Message-ID: <200804021624.m32GOn9I019260@dev.open-bio.org> Revision: 14646 Author: cjfields Date: 2008-04-02 12:24:49 -0400 (Wed, 02 Apr 2008) Log Message: ----------- Fix bad silent bug which doesn't set exon tags correctly (showed up when using -W flag with tests) Modified Paths: -------------- bioperl-live/trunk/Bio/Tools/Fgenesh.pm Modified: bioperl-live/trunk/Bio/Tools/Fgenesh.pm =================================================================== --- bioperl-live/trunk/Bio/Tools/Fgenesh.pm 2008-04-02 15:52:41 UTC (rev 14645) +++ bioperl-live/trunk/Bio/Tools/Fgenesh.pm 2008-04-02 16:24:49 UTC (rev 14646) @@ -275,7 +275,7 @@ } # split into fields chomp(); - my @flds = split(/\s+/, ' ' . $line); + my @flds = split(/\s+/, ' ' . $line); ## NB - the above adds leading whitespace before the gene ## number in case there was none (as quick patch to code ## below which expects it but it is not present after 999 @@ -320,7 +320,7 @@ # are set, in order to allow for proper expansion of the range) if($is_exon) { # first, set fields unique to exons - $predobj->primary_tag($ExonTags{$flds[3]} . 'Exon'); + $predobj->primary_tag($ExonTags{$flds[4]} . 
'Exon');
 $predobj->is_coding(1);
 my $cod_offset;
 if($predobj->strand() == 1) {

From cjfields at dev.open-bio.org Wed Apr 2 12:28:57 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Wed, 2 Apr 2008 12:28:57 -0400
Subject: [Bioperl-guts-l] [14647] bioperl-live/trunk/t/Genpred.t: Add some tests to catch tag naming
Message-ID: <200804021628.m32GSvPn019288@dev.open-bio.org>

Revision: 14647
Author: cjfields
Date: 2008-04-02 12:28:57 -0400 (Wed, 02 Apr 2008)

Log Message:
-----------
Add some tests to catch tag naming

Modified Paths:
--------------
bioperl-live/trunk/t/Genpred.t

Modified: bioperl-live/trunk/t/Genpred.t
===================================================================
--- bioperl-live/trunk/t/Genpred.t 2008-04-02 16:24:49 UTC (rev 14646)
+++ bioperl-live/trunk/t/Genpred.t 2008-04-02 16:28:57 UTC (rev 14647)
@@ -7,7 +7,7 @@
 use lib 't/lib';
 use BioperlTest;
- test_begin(-tests => 180);
+ test_begin(-tests => 182);
 use_ok('Bio::Tools::Fgenesh');
 use_ok('Bio::Tools::Genscan');
@@ -313,9 +313,11 @@
 if ($i == 2) {
 cmp_ok($fghexons[0]->strand(), '>', 0);
+ is($fghexons[0]->primary_tag(), 'InitialExon');
 is($fghexons[0]->start(), 14778);
 is($fghexons[0]->end(), 15104);
 cmp_ok($fghexons[3]->strand(), '>', 0);
+ is($fghexons[3]->primary_tag(), 'TerminalExon');
 is($fghexons[3]->start(), 16988);
 is($fghexons[3]->end(), 17212);
 }

From fangly at dev.open-bio.org Wed Apr 2 17:36:46 2008
From: fangly at dev.open-bio.org (Florent E Angly)
Date: Wed, 2 Apr 2008 17:36:46 -0400
Subject: [Bioperl-guts-l] [14648] bioperl-live/trunk/Bio/Assembly: Misc cleaning, and bug fixing
Message-ID: <200804022136.m32Lakuo019678@dev.open-bio.org>

Revision: 14648
Author: fangly
Date: 2008-04-02 17:36:46 -0400 (Wed, 02 Apr 2008)

Log Message:
-----------
Misc cleaning, and bug fixing
Bio::Assembly::Contig
  Fixed bug: replaced an occurrence of 'elem' by '_elem'
Bio::Assembly::Scaffold
  Fixed bug: scaffold source is implemented, stored in the scaffold at scaffold creation
  Fixed bug: contig or singlet addition to a scaffold now really creates a reference to that scaffold as a contig or singlet attribute
  Improvement: a list of singlets can be given at scaffold creation (just like a list of contigs can be specified)
  Improvement: adding a non singlet object using the add_singlet method is now a fatal error (like all other such errors)
  Improvement: adding singlet to scaffold now generates a singlet name if singlet is unnamed
  Improvement: method update_seq_list now also updates singlets
  Improvement: adding a singlet to a scaffold now puts the singlet sequence in the list of sequences belonging to the scaffold (just like for contigs)
  Improvement: implemented 'remove_features_collection' method
Bio::Assembly::IO::Ace
  Fixed bug: under certain conditions, the list of scaffold sequences ('_seqs') was not populated

Modified Paths:
--------------
bioperl-live/trunk/Bio/Assembly/Contig.pm
bioperl-live/trunk/Bio/Assembly/IO/ace.pm
bioperl-live/trunk/Bio/Assembly/Scaffold.pm
bioperl-live/trunk/Bio/Assembly/Singlet.pm

Modified: bioperl-live/trunk/Bio/Assembly/Contig.pm
===================================================================
--- bioperl-live/trunk/Bio/Assembly/Contig.pm 2008-04-02 16:28:57 UTC (rev 14647)
+++ bioperl-live/trunk/Bio/Assembly/Contig.pm 2008-04-02 21:36:46 UTC (rev 14648)
@@ -220,17 +220,16 @@
 Usage : my $contig = Bio::Assembly::Contig->new();
 Function : Creates a new contig object
 Returns : Bio::Assembly::Contig
- Args : -source => string representing the source
- program where this contig came
- from
- -id => contig unique ID
+ Args : -id => contig unique ID
+ -source => string for the sequence assembly program used
+
 =cut
 #-----------
 sub new {
 #-----------
- my ($class,@args) = @_;
+ my ($class, @args) = @_;
 my $self = $class->SUPER::new(@args);
@@ -243,8 +242,8 @@
 # Bio::SimpleAlign derived fields (check which ones are needed for AlignI compatibility)
 $self->{'_elem'} = {}; # contig elements: aligned sequence objects
(keyed by ID) $self->{'_order'} = {}; # store sequence order -# $self->{'start_end_lists'} = {}; # References to entries in {'_seq'}. Keyed by seq ids. -# $self->{'_dis_name'} = {}; # Display names for each sequence + # $self->{'start_end_lists'} = {}; # References to entries in {'_seq'}. Keyed by seq ids. + # $self->{'_dis_name'} = {}; # Display names for each sequence $self->{'_symbols'} = {}; # List of symbols #Contig specific slots @@ -252,10 +251,10 @@ $self->{'_consensus_quality'} = undef; $self->{'_nof_residues'} = 0; $self->{'_nof_seqs'} = 0; -# $self->{'_nof_segments'} = 0; # Let's not make it heavier than needed by now... + # $self->{'_nof_segments'} = 0; # Let's not make it heavier than needed by now... $self->{'_sfc'} = Bio::SeqFeature::Collection->new(); - # Assembly specifcs + # Assembly specifics $self->{'_assembly'} = undef; # Reference to a Bio::Assembly::Scaffold object, if contig belongs to one. $self->{'_strand'} = 0; # Reverse (-1) or forward (1), if contig is in a scaffold. 0 otherwise $self->{'_neighbor_start'} = undef; # Will hold a reference to another contig @@ -305,7 +304,7 @@ my $assembly = shift; $self->throw("Using non Bio::Assembly::Scaffold object when assign contig to assembly") - if (defined $assembly && ! $assembly->isa("Bio::Assembly::Scaffold")); + if (defined $assembly && ! $assembly->isa("Bio::Assembly::Scaffold")); $self->{'_assembly'} = $assembly if (defined $assembly); return $self->{'_assembly'}; @@ -330,11 +329,10 @@ my $self = shift; my $ori = shift; - if (defined $ori) { - $self->throw("Contig strand must be either 1, -1 or 0") + if (defined $ori) { + $self->throw("Contig strand must be either 1, -1 or 0") unless $ori == 1 || $ori == 0 || $ori == -1; - - $self->{'_strand'} = $ori; + $self->{'_strand'} = $ori; } return $self->{'_strand'}; @@ -357,7 +355,7 @@ my $ref = shift; $self->throw("Trying to assign a non Bio::Assembly::Contig object to upstream contig") - if (defined $ref && ! 
$ref->isa("Bio::Assembly::Contig")); + if (defined $ref && ! $ref->isa("Bio::Assembly::Contig")); $self->{'_neighbor_start'} = $ref if (defined $ref); return $self->{'_neighbor_start'}; @@ -380,7 +378,7 @@ my $ref = shift; $self->throw("Trying to assign a non Bio::Assembly::Contig object to downstream contig") - if (defined $ref && ! $ref->isa("Bio::Assembly::Contig")); + if (defined $ref && ! $ref->isa("Bio::Assembly::Contig")); $self->{'_neighbor_end'} = $ref if (defined $ref); return $self->{'_neighbor_end'}; } @@ -424,20 +422,20 @@ # Adding shortcuts for aligned sequence features $flag = 0 unless (defined $flag); if ($flag && defined $self->{'_consensus_sequence'}) { - foreach my $feat (@$args) { - next if (defined $feat->seq); - $feat->attach_seq($self->{'_consensus_sequence'}); - } + foreach my $feat (@$args) { + next if (defined $feat->seq); + $feat->attach_seq($self->{'_consensus_sequence'}); + } } elsif (!$flag) { # Register aligned sequence features - foreach my $feat (@$args) { - if (my $seq = $feat->entire_seq()) { - my $seqID = $seq->id() || $seq->display_id || $seq->primary_id; - $self->warn("Adding contig feature attached to unknown sequence $seqID!") - unless (exists $self->{'_elem'}{$seqID}); - my $tag = $feat->primary_tag; - $self->{'_elem'}{$seqID}{'_feat'}{$tag} = $feat; - } - } + foreach my $feat (@$args) { + if (my $seq = $feat->entire_seq()) { + my $seqID = $seq->id() || $seq->display_id || $seq->primary_id; + $self->warn("Adding contig feature attached to unknown sequence $seqID!") + unless (exists $self->{'_elem'}{$seqID}); + my $tag = $feat->primary_tag; + $self->{'_elem'}{$seqID}{'_feat'}{$tag} = $feat; + } + } } # Add feature to feature collection @@ -461,16 +459,17 @@ # Removing shortcuts for aligned sequence features foreach my $feat (@args) { - if (my $seq = $feat->entire_seq()) { - my $seqID = $seq->id() || $seq->display_id || $seq->primary_id; - my $tag = $feat->primary_tag; - $tag =~ s/:$seqID$/$1/g; - delete( 
$self->{'_elem'}{$seqID}{'_feat'}{$tag} ) - if (exists $self->{'_elem'}{$seqID}{'_feat'}{$tag} && - $self->{'_elem'}{$seqID}{'_feat'}{$tag} eq $feat); - } + if (my $seq = $feat->entire_seq()) { + my $seqID = $seq->id() || $seq->display_id || $seq->primary_id; + my $tag = $feat->primary_tag; + $tag =~ s/:$seqID$/$1/g; + delete( $self->{'_elem'}{$seqID}{'_feat'}{$tag} ) + if (exists $self->{'_elem'}{$seqID}{'_feat'}{$tag} && + $self->{'_elem'}{$seqID}{'_feat'}{$tag} eq $feat); + } } - + + # Removing Bio::SeqFeature::Collection features return $self->{'_sfc'}->remove_features(\@args); } @@ -486,10 +485,31 @@ sub get_features_collection { my $self = shift; - return $self->{'_sfc'}; } +=head2 remove_features_collection + + Title : remove_features_collection + Usage : $contig->remove_features_collection() + Function : Remove the collection of all contig features. It is useful + to save some memory (when contig features are not needed). + Returns : none + Argument : none + +=cut + +sub remove_features_collection { + my $self = shift; + # Removing shortcuts for aligned sequence features + for my $seqID (keys %{$self->{'_elem'}}) { + delete $self->{'_elem'}{$seqID}; + } + # Removing Bio::SeqFeature::Collection features + $self->{'_sfc'} = {}; + return; +} + =head1 Coordinate system's related methods See L above. 
@@ -533,140 +553,140 @@ my $out_ID = ( split(' ',$type_out) )[1]; if ($in_ID ne 'consensus') { - $read_in = $self->get_seq_coord( $self->get_seq_by_name($in_ID) ); - $self->throw("Can't change coordinates without sequence location for $in_ID") - unless (defined $read_in); + $read_in = $self->get_seq_coord( $self->get_seq_by_name($in_ID) ); + $self->throw("Can't change coordinates without sequence location for $in_ID") + unless (defined $read_in); } if ($out_ID ne 'consensus') { - $read_out = $self->get_seq_coord( $self->get_seq_by_name($out_ID) ); - $self->throw("Can't change coordinates without sequence location for $out_ID") - unless (defined $read_out); + $read_out = $self->get_seq_coord( $self->get_seq_by_name($out_ID) ); + $self->throw("Can't change coordinates without sequence location for $out_ID") + unless (defined $read_out); } # Performing transformation between coordinates - SWITCH1: { + SWITCH1: { - # Transformations between contig padded and contig unpadded - (($type_in eq 'gapped consensus') && ($type_out eq 'ungapped consensus')) && do { - $self->throw("Can't use ungapped consensus coordinates without a consensus sequence") - unless (defined $self->{'_consensus_sequence'}); - $query = &_padded_unpadded($self->{'_consensus_gaps'}, $query); - last SWITCH1; - }; - (($type_in eq 'ungapped consensus') && ($type_out eq 'gapped consensus')) && do { - $self->throw("Can't use ungapped consensus coordinates without a consensus sequence") - unless (defined $self->{'_consensus_sequence'}); - $query = &_unpadded_padded($self->{'_consensus_gaps'},$query); - last SWITCH1; - }; + # Transformations between contig padded and contig unpadded + (($type_in eq 'gapped consensus') && ($type_out eq 'ungapped consensus')) && do { + $self->throw("Can't use ungapped consensus coordinates without a consensus sequence") + unless (defined $self->{'_consensus_sequence'}); + $query = &_padded_unpadded($self->{'_consensus_gaps'}, $query); + last SWITCH1; + }; + (($type_in eq 
'ungapped consensus') && ($type_out eq 'gapped consensus')) && do { + $self->throw("Can't use ungapped consensus coordinates without a consensus sequence") + unless (defined $self->{'_consensus_sequence'}); + $query = &_unpadded_padded($self->{'_consensus_gaps'},$query); + last SWITCH1; + }; - # Transformations between contig (padded) and read (padded) - (($type_in eq 'gapped consensus') && - ($type_out =~ /^aligned /) && defined($read_out)) && do { - $query = $query - $read_out->start() + 1; - last SWITCH1; - }; - (($type_in =~ /^aligned /) && defined($read_in) && - ($type_out eq 'gapped consensus')) && do { - $query = $query + $read_in->start() - 1; - last SWITCH1; - }; + # Transformations between contig (padded) and read (padded) + (($type_in eq 'gapped consensus') && + ($type_out =~ /^aligned /) && defined($read_out)) && do { @@ Diff output truncated at 10000 characters. @@ From lstein at dev.open-bio.org Thu Apr 3 09:24:40 2008 From: lstein at dev.open-bio.org (Lincoln Stein) Date: Thu, 3 Apr 2008 09:24:40 -0400 Subject: [Bioperl-guts-l] [14649] bioperl-live/trunk/Bio/Graphics/FeatureFile.pm: restored the ability to have a "tag=value #comment" style comment Message-ID: <200804031324.m33DOekg022361@dev.open-bio.org> Revision: 14649 Author: lstein Date: 2008-04-03 09:24:40 -0400 (Thu, 03 Apr 2008) Log Message: ----------- restored the ability to have a "tag=value #comment" style comment Modified Paths: -------------- bioperl-live/trunk/Bio/Graphics/FeatureFile.pm Modified: bioperl-live/trunk/Bio/Graphics/FeatureFile.pm =================================================================== --- bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-04-02 21:36:46 UTC (rev 14648) +++ bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-04-03 13:24:40 UTC (rev 14649) @@ -514,6 +514,8 @@ my $self = shift; local $_ = shift; + s/\s+\#.*$//; # strip right-column comments + if (/^\s+(.+)/ && $self->{current_tag}) { # configuration continuation line my $value = $1; my $cc = 
$self->{current_config} ||= 'general'; # in case no configuration named From lstein at dev.open-bio.org Thu Apr 3 09:58:49 2008 From: lstein at dev.open-bio.org (Lincoln Stein) Date: Thu, 3 Apr 2008 09:58:49 -0400 Subject: [Bioperl-guts-l] [14650] bioperl-live/trunk/Bio: added an API to prevent FeatureFile->render () from rendering indiscriminately without paying attention to the seq_id of the underlying reference sequence Message-ID: <200804031358.m33Dwns6022456@dev.open-bio.org> Revision: 14650 Author: lstein Date: 2008-04-03 09:58:49 -0400 (Thu, 03 Apr 2008) Log Message: ----------- added an API to prevent FeatureFile->render() from rendering indiscriminately without paying attention to the seq_id of the underlying reference sequence Modified Paths: -------------- bioperl-live/trunk/Bio/DB/SeqFeature/Store/FeatureFileLoader.pm bioperl-live/trunk/Bio/Graphics/FeatureFile.pm Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store/FeatureFileLoader.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/Store/FeatureFileLoader.pm 2008-04-03 13:24:40 UTC (rev 14649) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Store/FeatureFileLoader.pm 2008-04-03 13:58:49 UTC (rev 14650) @@ -435,7 +435,7 @@ # either create a new feature or add a segment to it my $feature = $ld->{CurrentFeature}; if ($feature) { - + # if this is a different feature from what we have now, then we # store the current one, and create a new one if ($feature->display_name ne $name || Modified: bioperl-live/trunk/Bio/Graphics/FeatureFile.pm =================================================================== --- bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-04-03 13:24:40 UTC (rev 14649) +++ bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-04-03 13:58:49 UTC (rev 14650) @@ -279,7 +279,9 @@ sub render { my $self = shift; my $panel = shift; - my ($position_to_insert,$options,$max_bump,$max_label,$selector) = @_; + my ($position_to_insert,$options, + 
$max_bump,$max_label,
+ $selector,$range) = @_;
 my %seenit;
 $panel ||= $self->new_panel;
@@ -296,7 +298,7 @@
 } map { shellwords ($self->setting($_=>'feature')||$_) } @labels;
 my %lc_types = map {lc($_)}%types;
-
+
 my @unconfigured_types = sort grep {!exists $lc_types{lc $_} && !exists $lc_types{lc $_->method} } $self->types;
@@ -332,7 +334,12 @@
 next if defined $selector and !$selector->($self,$label);
- my @features = grep {$self->_visible($_)} $self->features(\@types);
+ my @features = !$range ? grep {$self->_visible($_)} $self->features(\@types)
+ : $self->features(-types => \@types,
+ -seq_id => $range->seq_id,
+ -start => $range->start,
+ -end => $range->end
+ );
 next unless @features; # suppress tracks for features that don't appear
@@ -343,7 +350,6 @@
 my @auto_bump;
 push @auto_bump,(-bump => @$features < $max_bump) if defined $max_bump;
 push @auto_bump,(-label => @$features < $max_label) if defined $max_label;
-
 my @config = ( -glyph => 'segments', # really generic
 -bgcolor => $COLORS[$color++ % @COLORS],
@@ -944,6 +950,8 @@
 $features = $features-E<gt>features(-type=>'a type');
 $iterator = $features-E<gt>features(-type=>'a type',-iterator=>1);
+ $iterator = $features-E<gt>features(-type=>'a type',-seq_id=>$id,-start=>$start,-end=>$end);
+
 =back
 =cut
@@ -951,10 +959,16 @@
 # return features
 sub features {
 my $self = shift;
- my ($types,$iterator,@rest) = defined($_[0] && $_[0]=~/^-/)
- ? rearrange([['TYPE','TYPES']],@_) : (\@_);
+ my ($types,$iterator,$seq_id,$start,$end,@rest) = defined($_[0] && $_[0]=~/^-/)
+ ? rearrange([['TYPE','TYPES'],'ITERATOR','SEQ_ID','START','END'],@_) : (\@_);
+
 $types = [$types] if $types && !ref($types);
 my @args = $types && @$types ?
(-type=>$types) : (); + + push @args,(-seq_id => $seq_id) if $seq_id; + push @args,(-start => $start) if defined $start; + push @args,(-end => $end) if defined $end; + my $db = $self->db; if ($iterator) { From bugzilla-daemon at portal.open-bio.org Sun Apr 6 23:17:03 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 6 Apr 2008 23:17:03 -0400 Subject: [Bioperl-guts-l] [Bug 2337] BDB flatfile index should store global configuration data in BDB In-Reply-To: Message-ID: <200804070317.m373H35T008698@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2337 ------- Comment #3 from cjfields at uiuc.edu 2008-04-06 23:17 EST ------- (In reply to comment #2) > Naohisa is right, the spec says that some meta-data should go into a BerkeleyDB > db. But it looks like what Lincoln did was to put information on primary and > secondary namespaces in a human-readable file called config.dat. He did not put > it into a BerkeleyDB db. His decision was the correct one, this information > should be human-readable. Of course for OBDA to work across platforms Bioperl > has to do what the other platforms do, even if it's not the right approach. > Hmm. I agree the namespace info should be human-readable. Maybe the best solution is to require BDB storage as stated in the OBDA spec, and modify the spec to (optionally) allow adding human-readable data to config.dat. Might be something to bring up at BOSC, to see how other OBDA implementations in other Bio* langs are doing this. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
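
As a rough illustration of the range-aware argument handling that revision 14650 adds to features() (see the FeatureFile.pm diff above), the following stand-alone Perl sketch builds the same kind of argument list without BioPerl. The helper name build_feature_args and its lower-case named options are assumptions made for this example; only the -type, -seq_id, -start and -end keys are taken from the diff.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: mirrors how the patched
# features() assembles the arguments it passes on to the database.
sub build_feature_args {
    my (%opt) = @_;

    # Accept a single type name or an array reference of type names.
    my $types = $opt{types};
    $types = [$types] if $types && !ref($types);

    my @args = $types && @$types ? (-type => $types) : ();

    # seq_id is added only when true; start and end only when defined,
    # so a start of 0 is still passed through.
    push @args, (-seq_id => $opt{seq_id}) if $opt{seq_id};
    push @args, (-start  => $opt{start})  if defined $opt{start};
    push @args, (-end    => $opt{end})    if defined $opt{end};

    return @args;
}

my @args = build_feature_args(
    types  => 'gene',
    seq_id => 'chr1',
    start  => 100,
    end    => 200,
);
print scalar(@args), " arguments\n";    # prints "8 arguments"
```

Calling the helper with no options returns an empty list, which corresponds to the unrestricted query that features() performed before the range arguments were introduced.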
From cjfields at dev.open-bio.org Mon Apr 7 14:24:43 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Mon, 7 Apr 2008 14:24:43 -0400 Subject: [Bioperl-guts-l] [14651] bioperl-live/trunk/Bio: updates; may switch some data objects over to (lightweight) Data:: Stag implementation at some point Message-ID: <200804071824.m37IOheV006309@dev.open-bio.org> Revision: 14651 Author: cjfields Date: 2008-04-07 14:24:42 -0400 (Mon, 07 Apr 2008) Log Message: ----------- updates; may switch some data objects over to (lightweight) Data::Stag implementation at some point Modified Paths: -------------- bioperl-live/trunk/Bio/DB/EUtilParameters.pm bioperl-live/trunk/Bio/DB/GenericWebAgent.pm bioperl-live/trunk/Bio/Tools/EUtilities/Summary/DocSum.pm bioperl-live/trunk/Bio/Tools/EUtilities/Summary/Item.pm bioperl-live/trunk/Bio/Tools/EUtilities.pm Modified: bioperl-live/trunk/Bio/DB/EUtilParameters.pm =================================================================== --- bioperl-live/trunk/Bio/DB/EUtilParameters.pm 2008-04-03 13:58:49 UTC (rev 14650) +++ bioperl-live/trunk/Bio/DB/EUtilParameters.pm 2008-04-07 18:24:42 UTC (rev 14651) @@ -561,7 +561,6 @@ { # default retmode if one is not supplied my %NCBI_DATABASE = ( - 'pubmed' => 'xml', 'protein' => 'text', 'nucleotide' => 'text', 'nuccore' => 'text', @@ -569,42 +568,16 @@ 'nucest' => 'text', 'structure' => 'text', 'genome' => 'text', - 'books' => 'xml', - 'cancerchromosomes'=> 'xml', - 'cdd' => 'xml', - 'domains' => 'xml', 'gene' => 'asn1', - 'genomeprj' => 'xml', - 'gensat' => 'xml', - 'geo' => 'xml', - 'gds' => 'xml', - 'homologene' => 'xml', 'journals' => 'text', - 'mesh' => 'xml', - 'ncbisearch' => 'xml', - 'nlmcatalog' => 'xml', - 'omia' => 'xml', - 'omim' => 'xml', - 'pmc' => 'xml', - 'popset' => 'xml', - 'probe' => 'xml', - 'pcassay' => 'xml', - 'pccompound' => 'xml', - 'pcsubstance' => 'xml', - 'snp' => 'xml', - 'taxonomy' => 'xml', - 'unigene' => 'xml', - 'unists' => 'xml', ); sub set_default_retmode { 
my $self = shift; if ($self->eutil eq 'efetch') { my $db = $self->db || return; # assume retmode will be set along with db - $self->throw('Database $db not recognized') - if !exists $NCBI_DATABASE{$db}; - # set efetch-based retmode - $self->retmode($NCBI_DATABASE{$db}); + my $mode = exists $NCBI_DATABASE{$db} ? $NCBI_DATABASE{$db} : 'xml'; + $self->retmode($mode); } else { $self->retmode('xml'); } Modified: bioperl-live/trunk/Bio/DB/GenericWebAgent.pm =================================================================== --- bioperl-live/trunk/Bio/DB/GenericWebAgent.pm 2008-04-03 13:58:49 UTC (rev 14650) +++ bioperl-live/trunk/Bio/DB/GenericWebAgent.pm 2008-04-07 18:24:42 UTC (rev 14651) @@ -1,6 +1,6 @@ # $Id$ # -# BioPerl module for Bio::DB::EUtilities +# BioPerl module for Bio::DB::GenericWebAgent # # Cared for by Chris Fields # @@ -10,11 +10,11 @@ # # POD documentation - main docs before the code # -# Interfaces with new GenericWebDBI interface +# Interfaces with new GenericWebAgent interface =head1 NAME -Bio::DB::GenericWebDBI - helper base class for parameter-based remote server +Bio::DB::GenericWebAgent - helper base class for parameter-based remote server access and response retrieval. 
=head1 SYNOPSIS Modified: bioperl-live/trunk/Bio/Tools/EUtilities/Summary/DocSum.pm =================================================================== --- bioperl-live/trunk/Bio/Tools/EUtilities/Summary/DocSum.pm 2008-04-03 13:58:49 UTC (rev 14650) +++ bioperl-live/trunk/Bio/Tools/EUtilities/Summary/DocSum.pm 2008-04-07 18:24:42 UTC (rev 14651) @@ -17,6 +17,8 @@ Bio::DB::EUtilities::Summary::DocSum - data object for document summary data from esummary +############ NOTE : Undergoing reimplementation to use simple Data::Stag ############ + =head1 SYNOPSIS @@ -128,7 +130,7 @@ Function : iterates through Items (nested layer of Item) Returns : single Item Args : [optional] single arg (string) - 'flattened' - iterates through a flattened list ala + 'flatten' - iterates through a flattened list ala get_all_DocSum_Items() =cut @@ -138,7 +140,7 @@ unless ($self->{"_items_it"}) { #my @items = $self->get_Items; my @items = ($request && $request eq 'flatten') ? - $self->get_all_DocSum_Items : + $self->get_all_Items : $self->get_Items ; $self->{"_items_it"} = sub {return shift @items} } @@ -160,10 +162,10 @@ return ref $self->{'_items'} ? 
@{ $self->{'_items'} } : return (); } -=head2 get_all_DocSum_Items +=head2 get_all_Items - Title : get_all_DocSum_Items - Usage : my @items = $docsum->get_all_DocSum_Items + Title : get_all_Items + Usage : my @items = $docsum->get_all_Items Function : returns flattened list of all Item objects (Items, ListItems, StructureItems) Returns : array of Items @@ -182,21 +184,60 @@ =cut -sub get_all_DocSum_Items { +sub get_all_Items { my $self = shift; - my @items; - for my $item ($self->get_Items) { - push @items, $item; - for my $ls ($item->get_ListItems) { - push @items, $ls; - for my $st ($ls->get_StructureItems) { - push @items, $st; - } + unless ($self->{'_ordered_items'}) { + for my $item ($self->get_Items) { + push @{$self->{'_ordered_items'}}, $item; + for my $ls ($item->get_ListItems) { + push @{$self->{'_ordered_items'}}, $ls; + for my $st ($ls->get_StructureItems) { + push @{$self->{'_ordered_items'}}, $st; + } + } } } - return @items; + return @{$self->{'_ordered_items'}}; } +=head2 get_content_by_name + + Title : get_content_by_Item_name + Usage : my $data = get_content_by_name('CreateDate') + Function : Returns scalar content for named Item in DocSum (indicated by + passed argument) + Returns : scalar value (string) if present + Args : string (Item name) + Warns : If Item with name is not found + +=cut + +sub get_content_by_name { + my ($self, $key) = @_; + return unless $key; + my ($it) = grep {$_->get_name eq $key} $self->get_all_Items; + return $it->get_content; +} + +=head2 get_type_by_name + + Title : get_type_by_name + Usage : my $data = get_type_by_name('CreateDate') + Function : Returns data type for named Item in DocSum (indicated by + passed argument) + Returns : scalar value (string) if present + Args : string (Item name) + Warns : If Item with name is not found + +=cut + +sub get_type_by_name { + my ($self, $key) = @_; + return unless $key; + my ($it) = grep {$_->get_name eq $key} $self->get_all_Items; + return $it->get_type; +} + =head2 rewind 
Title : rewind Modified: bioperl-live/trunk/Bio/Tools/EUtilities/Summary/Item.pm =================================================================== --- bioperl-live/trunk/Bio/Tools/EUtilities/Summary/Item.pm 2008-04-03 13:58:49 UTC (rev 14650) +++ bioperl-live/trunk/Bio/Tools/EUtilities/Summary/Item.pm 2008-04-07 18:24:42 UTC (rev 14651) @@ -241,7 +241,7 @@ Returns : string Args : none Note : this is not the same as the datatype(), which describes the - group this Item ojbect belongs to + group this Item object belongs to =cut Modified: bioperl-live/trunk/Bio/Tools/EUtilities.pm =================================================================== --- bioperl-live/trunk/Bio/Tools/EUtilities.pm 2008-04-03 13:58:49 UTC (rev 14650) +++ bioperl-live/trunk/Bio/Tools/EUtilities.pm 2008-04-07 18:24:42 UTC (rev 14651) @@ -827,7 +827,7 @@ my $ds = shift; my $string = sprintf("UID: %s\n",$ds->get_id); # flattened mode - while (my $item = $ds->next_Item('flattened')) { + while (my $item = $ds->next_Item('flatten')) { # not all Items have content, so need to check... my $content = $item->get_content || ''; $string .= sprintf("%-20s%s\n",$item->get_name(), From bugzilla-daemon at portal.open-bio.org Mon Apr 7 14:50:15 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 7 Apr 2008 14:50:15 -0400 Subject: [Bioperl-guts-l] [Bug 2482] New: paml4 mlc file fails to parse Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2482 Summary: paml4 mlc file fails to parse Product: BioPerl Version: 1.5 branch Platform: Other OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Core Components AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: jayoung at fhcrc.org CC: jayoung at fhcrc.org Hi, I have just updated our version of PAML to v4, and now have problems parsing the mlc file with Bio::Tools::Phylo::PAML. 
I think I have also updated to the latest version of bioperl:
$Bio::Tools::Phylo::PAML::VERSION gives 1.0050021

My script is based on http://bioperl.org/wiki/HOWTO:PAML, and the basics of it
are here:

--------------------
#!/usr/bin/perl
#specify the mlc file(s) on the command line

use Bio::Tools::Phylo::PAML;
use warnings;

print "PAML version ", $Bio::Tools::Phylo::PAML::VERSION, "\n\n";

foreach my $file (@ARGV) {
    my $outcodeml = $file;
    if (!-e $outcodeml) {die "\ncan't find the file you specified $outcodeml - terminating\n\n";}
    my $out = "$outcodeml.treeinfo";
    print "file $file - output will be in $out\n";
    my $paml_parser = new Bio::Tools::Phylo::PAML(-file => $outcodeml,
                                                  -dir => "./",
                                                  -ctlf => "./codeml.ctl");
    open (OUT, "> $out");
    print OUT "Descendants\tt\tS\tN\tdN/dS\tdN\tdS\tS*dS\tN*dN\n";
    if( my $result = $paml_parser->next_result() ) {
        print "got a result\n";
        while ( my $tree = $result->next_tree ) {
            print "found a tree\n";
            my $newtree = new Bio::TreeIO(-file=>'> temp.xml', -format=>'svggraph');
            $newtree->write_tree($tree);
            #do stuff with the tree here....
        }
    } else {print "no results\n";}
    close OUT;
}
--------------------

It works fine on output from paml 3.15 but on output from paml4 I get the
following:

PAML version 1.0050021

file mlc - output will be in mlc.treeinfo
no results

which tells me that the parser didn't recognize the output. I'll attach the mlc
file in a few minutes.

thanks in advance for any help,

Janet Young

-------------------------------------------------------------------

Dr. Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.
tel: (206) 667 1471
fax: (206) 667 6524
email: jayoung at fhcrc.org

http://www.fhcrc.org/labs/trask/

-------------------------------------------------------------------

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org Mon Apr 7 14:52:59 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 7 Apr 2008 14:52:59 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804071852.m37IqxcP025782@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2482

------- Comment #1 from jayoung at fhcrc.org 2008-04-07 14:52 EST -------
Created an attachment (id=897)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=897&action=view)
mlc file, can't parse this one

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org Tue Apr 8 03:26:19 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 03:26:19 -0400
Subject: [Bioperl-guts-l] [Bug 2474] postgres 8.3 - load_seqdatabase.pl / swissprot
In-Reply-To:
Message-ID: <200804080726.m387QJ4N005423@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2474

------- Comment #2 from Bank.Beszteri at awi.de 2008-04-08 03:26 EST -------
Created an attachment (id=898)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=898&action=view)
Another output from load_seqdatabase.pl illustrating taxonomic conflicts
between Swissprot flat file (v.13.1) & NCBI taxonomy

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org Tue Apr 8 03:32:32 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 03:32:32 -0400
Subject: [Bioperl-guts-l] [Bug 2474] postgres 8.3 - load_seqdatabase.pl / swissprot
In-Reply-To:
Message-ID: <200804080732.m387WWVk005848@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2474

------- Comment #3 from Bank.Beszteri at awi.de 2008-04-08 03:32 EST -------
(From update of attachment 898)
Forgot to add: MySQL this time (client v.4.0.18, server v.5.0.45)

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From cjfields at dev.open-bio.org Tue Apr 8 11:58:20 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Tue, 8 Apr 2008 11:58:20 -0400 Subject: [Bioperl-guts-l] [14652] bioperl-live/trunk/Bio/Seq/Meta/Array.pm: bug 2478 Message-ID: <200804081558.m38FwKOA008026@dev.open-bio.org> Revision: 14652 Author: cjfields Date: 2008-04-08 11:58:19 -0400 (Tue, 08 Apr 2008) Log Message: ----------- bug 2478 Modified Paths: -------------- bioperl-live/trunk/Bio/Seq/Meta/Array.pm Modified: bioperl-live/trunk/Bio/Seq/Meta/Array.pm =================================================================== --- bioperl-live/trunk/Bio/Seq/Meta/Array.pm 2008-04-07 18:24:42 UTC (rev 14651) +++ bioperl-live/trunk/Bio/Seq/Meta/Array.pm 2008-04-08 15:58:19 UTC (rev 14652) @@ -394,7 +394,7 @@ $start =~ /^[+]?\d+$/ and $start > 0 or $self->throw("Need at least a positive integer start value"); $start--; - + my $meta_len = scalar(@{$self->{_meta}->{$name}}); if (defined $value) { my $arrayref; @@ -428,12 +428,17 @@ return $arrayref; } else { - - $end or $end = $self->length; - $end = $self->length if $end > $self->length; + # don't set by seq length; use meta array length instead; bug 2478 + $end ||= $meta_len; + if ($end > $meta_len) { + $self->warn("End is longer than meta sequence $name length; resetting to $meta_len"); + $end = $meta_len; + } + # warn but don't reset (push use of trunc() instead) + $self->warn("End is longer than sequence length; use trunc() \n". 
+ "if you want a fully truncated object") if $end > $self->length; $end--; return [@{$self->{_meta}->{$name}}[$start..$end]]; - } } @@ -661,15 +666,14 @@ # test arguments $start =~ /^[+]?\d+$/ and $start > 0 or - $self->throw("Need at least a positive integer start value as start"); + $self->throw("Need at least a positive integer start value as start; got [$start]"); $end =~ /^[+]?\d+$/ and $end > 0 or - $self->throw("Need at least a positive integer start value as end"); + $self->throw("Need at least a positive integer start value as end; got [$end]"); $end >= $start or - $self->throw("End position has to be larger or equal to start"); + $self->throw("End position has to be larger or equal to start; got [$start..$end]"); $end <= $self->length or - $self->throw("End position can not be larger than sequence length"); + $self->throw("End position can not be larger than sequence length; got [$end]"); - my $new = $self->SUPER::trunc($start, $end); $start--; $end--; From bugzilla-daemon at portal.open-bio.org Tue Apr 8 12:03:27 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 8 Apr 2008 12:03:27 -0400 Subject: [Bioperl-guts-l] [Bug 2478] Bio::SeqIO::fastq subqual reassignment of qualities results in replacement of end characters with '!' In-Reply-To: Message-ID: <200804081603.m38G3Ri0013170@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2478 cjfields at uiuc.edu changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #4 from cjfields at uiuc.edu 2008-04-08 12:03 EST ------- subqual() currently checks the passed start/end coordinates against the sequence coordinates (start=1, end=seq length). When resetting the sequence and then calling subqual(), this resets the qual end to the newly set seq(). 
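
The end-coordinate handling that revision 14652 introduces in Bio::Seq::Meta::Array can be sketched in plain Perl as follows. This is a minimal stand-alone approximation, not the module's code: the function name submeta_slice and its calling convention are assumptions for illustration, and only the clamping rule (compare the end against the meta array length, warn, then reset) follows the diff above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: returns the elements of a meta
# array between 1-based inclusive coordinates, clamping the end to the
# length of the meta array itself rather than to any sequence length.
sub submeta_slice {
    my ($meta, $start, $end) = @_;

    die "Need at least a positive integer start value\n"
        unless defined $start && $start =~ /^[+]?\d+$/ && $start > 0;

    my $meta_len = scalar @$meta;

    # A missing end means "to the end of the meta array".
    $end ||= $meta_len;
    if ($end > $meta_len) {
        warn "End is longer than meta array length; resetting to $meta_len\n";
        $end = $meta_len;
    }

    # Convert to 0-based indices for the array slice.
    return [ @{$meta}[ $start - 1 .. $end - 1 ] ];
}

my @qual = (30, 31, 32, 33, 34);
print join(' ', @{ submeta_slice(\@qual, 2, 4) }), "\n";    # prints "31 32 33"
print join(' ', @{ submeta_slice(\@qual, 4) }), "\n";       # prints "33 34"
print join(' ', @{ submeta_slice(\@qual, 1, 99) }), "\n";   # warns, prints "30 31 32 33 34"
```

Because the slice depends only on the array passed in, changing the sequence afterwards cannot shorten or pad the quality values returned by this sketch.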
subqual() (and submeta(), by extension) should be independent of the sequence and checked against the qual array length, then warn if it doesn't match the seq length. I've added a fix for that as well as a few warnings. For future reference, the proper (easier) method to retrieve a fully truncated object of the same class is trunc(). Removing the calls to subseq/subqual and using trunc directly like so: $out->write_fastq($seq->trunc($opt_b+1,$seq_length-$opt_e)); gets: @fake header 1 trimmed by 3 at beginning and 2 at end gacaatatat +fake header 1 trimmed by 3 at beginning and 2 at end sfiojeq%!@ @fake header 2 trimmed by 3 at beginning and 2 at end ctagagagg +fake header 2 trimmed by 3 at beginning and 2 at end 2v1cty1f5 -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Tue Apr 8 12:05:13 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Tue, 8 Apr 2008 12:05:13 -0400 Subject: [Bioperl-guts-l] [14653] bioperl-live/trunk/scripts/Bio-DB-GFF/bulk_load_gff.PLS: bug 2479 Message-ID: <200804081605.m38G5DCI008078@dev.open-bio.org> Revision: 14653 Author: cjfields Date: 2008-04-08 12:05:13 -0400 (Tue, 08 Apr 2008) Log Message: ----------- bug 2479 Modified Paths: -------------- bioperl-live/trunk/scripts/Bio-DB-GFF/bulk_load_gff.PLS Modified: bioperl-live/trunk/scripts/Bio-DB-GFF/bulk_load_gff.PLS =================================================================== --- bioperl-live/trunk/scripts/Bio-DB-GFF/bulk_load_gff.PLS 2008-04-08 15:58:19 UTC (rev 14652) +++ bioperl-live/trunk/scripts/Bio-DB-GFF/bulk_load_gff.PLS 2008-04-08 16:05:13 UTC (rev 14653) @@ -91,7 +91,8 @@ GFF and/or FASTA files --password Password to use for authentication (Does not work with Postgres, password must be - supplied interactively) + supplied interactively or be left empty for + ident authentication) 
--maxbin Set the value of the maximum bin size --local Flag to indicate that the data source is local --maxfeature Set the value of the maximum feature size (power of 10) @@ -207,7 +208,7 @@ # If called as pg_bulk_load_gff.pl behave as that did. if ($0 =~/pg_bulk_load_gff.pl/){ - $ADAPTOR ||= 'pg'; + $ADAPTOR ||= 'Pg'; $DSN ||= 'test'; } $DSN ||= 'dbi:mysql:test'; @@ -227,7 +228,13 @@ die "Aborted\n" unless $f =~ /^[yY]/; close TTY; } +# postgres DBD::Pg allows 'database', but also 'dbname', and 'db': +# and it must be Pg (not pg) +$DSN=~s/pg:database=/Pg:/i; +$DSN=~s/pg:dbname=/Pg:/i; +$DSN=~s/pg:db=/Pg:/i; +# leave these lines for mysql $DSN=~s/database=//i; $DSN=~s/;host=/:/i; #cater for dsn in the form of "dbi:mysql:database=$dbname;host=$host" @@ -237,6 +244,12 @@ $ADAPTOR ||= $DBD; $ADAPTOR ||= 'mysql'; +if ($DBD eq 'Pg') { + # rebuild DSN, DBD::Pg requires full dbname= format + $DSN = "dbi:Pg:dbname=$DBNAME"; + if ($HOST) { $DSN .= ";host=$HOST"; } +} + my ($use_mysql,$use_mysqlcmap,$use_pg) = (0,0,0); if ( $ADAPTOR eq 'mysqlcmap' ) { $use_mysqlcmap = 1; @@ -244,7 +257,7 @@ elsif ( $ADAPTOR =~ /^mysql/ ) { $use_mysql = 1; } -elsif ( $ADAPTOR eq "pg" ) { +elsif ( $ADAPTOR eq "Pg" ) { $use_pg = 1; } else{ @@ -575,8 +588,8 @@ foreach (@files) { my $file = "$tmpdir/$_.$$"; - $AUTH ? system("psql $AUTH -f $file $DSN") - : system('psql','-f', $file, $DSN); + $AUTH ? 
system("psql $AUTH -f $file $DBNAME")
+          : system('psql','-f', $file, $DBNAME);
     unlink $file;
   }

From bugzilla-daemon at portal.open-bio.org Tue Apr 8 12:05:48 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 12:05:48 -0400
Subject: [Bioperl-guts-l] [Bug 2479] bp_pg_bulk_load_gff.pl postgres GFF bulk loader broken
In-Reply-To:
Message-ID: <200804081605.m38G5mFM013343@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2479

cjfields at uiuc.edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #2 from cjfields at uiuc.edu 2008-04-08 12:05 EST -------
Committed to svn. Thanks!

From bugzilla-daemon at portal.open-bio.org Tue Apr 8 14:34:56 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 14:34:56 -0400
Subject: [Bioperl-guts-l] [Bug 2483] New: request for implementation of write_assembly
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2483

           Summary: request for implementation of write_assembly
           Product: BioPerl
           Version: 1.5 branch
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Core Components
        AssignedTo: bioperl-guts-l at bioperl.org
        ReportedBy: jayoung at fhcrc.org
                CC: jayoung at fhcrc.org

Hi,

I would really love it if write_assembly (ace format) could be implemented in
Bio::Assembly::IO::ace

I realise it's probably a low priority thing, but thought I'd just throw it out
there in case anyone is able to do it.

thanks,

Janet

-------------------------------------------------------------------
Dr.
Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung at fhcrc.org

http://www.fhcrc.org/labs/trask/
-------------------------------------------------------------------

From bugzilla-daemon at portal.open-bio.org Tue Apr 8 14:48:39 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 14:48:39 -0400
Subject: [Bioperl-guts-l] [Bug 2483] request for implementation of write_assembly
In-Reply-To:
Message-ID: <200804081848.m38Imd5W022232@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2483

cjfields at uiuc.edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|1.6 release                 |1.7 release

------- Comment #1 from cjfields at uiuc.edu 2008-04-08 14:48 EST -------
Bio::Assembly issues will be tackled in the next dev release. I agree it would
be nice to have this, though.
From bugzilla-daemon at portal.open-bio.org Tue Apr 8 21:44:41 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 21:44:41 -0400
Subject: [Bioperl-guts-l] [Bug 2350] Bio::Assembly::Scaffold->add_singlet has a bug
In-Reply-To:
Message-ID: <200804090144.m391ifkD010215@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2350

cjfields at uiuc.edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|1.6 release                 |1.7 release

------- Comment #4 from cjfields at uiuc.edu 2008-04-08 21:44 EST -------
Changing milestone to 1.7, along with other Bio::Assembly

From bugzilla-daemon at portal.open-bio.org Tue Apr 8 21:46:38 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 21:46:38 -0400
Subject: [Bioperl-guts-l] [Bug 2370] Bio::Assembly::Scaffold Source
In-Reply-To:
Message-ID: <200804090146.m391kc5S010408@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2370

cjfields at uiuc.edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|1.6 release                 |1.7 release

------- Comment #5 from cjfields at uiuc.edu 2008-04-08 21:46 EST -------
Pushing to 1.7.
From bugzilla-daemon at portal.open-bio.org Wed Apr 9 17:44:36 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 9 Apr 2008 17:44:36 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804092144.m39LiaTX009507@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2482

------- Comment #2 from cjfields at uiuc.edu 2008-04-09 17:44 EST -------
Confirmed in bioperl-live. Anyone familiar with PAML want to comment?

From bugzilla-daemon at portal.open-bio.org Wed Apr 9 18:03:07 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 9 Apr 2008 18:03:07 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804092203.m39M37wZ010482@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2482

------- Comment #3 from jason at bioperl.org 2008-04-09 18:03 EST -------
uh, it sucks that the output format changes way too much between any possible
release so that keeping this up-to-date is a bit of a losing battle....

Someone just has to spend a few minutes figuring out which extra lines are
breaking it or if the order is changing. Stefan had reported problems with the
different order of the sequence header lines in different versions making it
really hard to make one parser that worked.
From bugzilla-daemon at portal.open-bio.org Wed Apr 9 18:35:58 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 9 Apr 2008 18:35:58 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804092235.m39MZw1Y012196@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2482

------- Comment #4 from cjfields at uiuc.edu 2008-04-09 18:35 EST -------
(In reply to comment #3)
> uh, it sucks that the output format changes way too much between any possible
> release so that keeping this up-to-date is a bit of a losing battle....
>
> Someone just has to spend a few minutes figuring out which extra lines are
> breaking it or if the order is changing. Stefan had reported problems with the
> different order of the sequence header lines in different versions making it
> really hard to make one parser that worked.

I remember something about that, maybe from the list. I can try looking into
it when I can, just not too familiar with the code (yet).

From bugzilla-daemon at portal.open-bio.org Thu Apr 10 12:11:38 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 12:11:38 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804101611.m3AGBcbj030260@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2482

------- Comment #5 from kirovs at gmail.com 2008-04-10 12:11 EST -------
Seems like a different problem: in next_result the tree is not parsed, so %data
is empty (but the branch data is read). Reason for that- no model line is
detected before the tree data comes. Sorry cannot follow further for now.
It would be useful if there is an old mlc file and what were the params with
which codeml was called.

From bugzilla-daemon at portal.open-bio.org Thu Apr 10 13:01:02 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 13:01:02 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804101701.m3AH12hf000491@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2482

------- Comment #6 from jayoung at fhcrc.org 2008-04-10 13:01 EST -------
Thanks for looking into this. I'll add a couple more attachments, as suggested
by Stefan. One is the codeml.ctl file associated with that output. The other
is the mlc file generated using paml 3.15 for the same data, same parameters -
this one parses fine.

From bugzilla-daemon at portal.open-bio.org Thu Apr 10 13:06:48 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 13:06:48 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804101706.m3AH6mZ7000881@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2482

------- Comment #7 from jayoung at fhcrc.org 2008-04-10 13:06 EST -------
Created an attachment (id=899)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=899&action=view)
codeml.ctl file (parameters used)
From bugzilla-daemon at portal.open-bio.org Thu Apr 10 13:53:40 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 13:53:40 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804101753.m3AHreeX003063@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2482

------- Comment #8 from jayoung at fhcrc.org 2008-04-10 13:53 EST -------
Hi again,

the plot thickens...

Before uploading the mlc file from PAML v 3.15, I checked again whether it
would parse. It did better than for the v4 mlc file, but also failed.
next_result succeeded (unlike for the PAML v4 output, which failed at this
step) but next_tree failed.

Using an older version of bioperl I had successfully parsed that mlc file (PAML
v 3.15) and got the tree information out, but when the PAML v4 mlc file failed
to parse, I updated bioperl and now I can't parse the file I could parse
before. I still have the output of the first parsing so I know it worked...

I've been trying to figure out what older version of bioperl I was using but am
having some trouble. I only recently finally figured out how to get Build.PL
working on our system so I could do updates myself - before that I was using a
version of Bioperl that the sysadmins people installed for me. I also used
uninst when I built bioperl, so I think it removed any older versions of the
modules it could find.

Sorry! I know that's not helpful. I'm still not a very advanced user.

Janet
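
The two failure points described in comment #8 (next_result versus next_tree)
can be told apart with a small check. What follows is only a minimal sketch,
assuming Bio::Tools::Phylo::PAML is installed; the helper name
paml_parse_stage and its return labels are invented here for illustration and
are not part of BioPerl. It is not the fix for this bug, just a way to report
which step fails for a given mlc file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): report how far parsing of a
# codeml mlc file gets. Returns one of:
#   'no_bioperl' - Bio::Tools::Phylo::PAML could not be loaded
#   'no_result'  - the parser could not be built or next_result gave nothing
#   'no_tree'    - next_result worked but next_tree gave nothing
#   'ok'         - both steps returned an object
sub paml_parse_stage {
    my ($mlc_file) = @_;

    # Load the parser lazily so the script still runs without BioPerl.
    eval { require Bio::Tools::Phylo::PAML; 1 } or return 'no_bioperl';

    # Each step is wrapped in eval because the parser may throw an
    # exception instead of returning undef.
    my $parser = eval { Bio::Tools::Phylo::PAML->new(-file => $mlc_file) }
        or return 'no_result';
    my $result = eval { $parser->next_result } or return 'no_result';
    my $tree   = eval { $result->next_tree }   or return 'no_tree';

    return 'ok';
}

# Report the stage reached for every file named on the command line.
print "$_: ", paml_parse_stage($_), "\n" for @ARGV;
```

Run against the two attached mlc files, a script like this would be expected
to separate the PAML v4 case (failure at next_result) from the PAML v 3.15
case (failure at next_tree), going by the description in comment #8.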
From bugzilla-daemon at portal.open-bio.org Thu Apr 10 13:55:48 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 13:55:48 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804101755.m3AHtmVd003174@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2482

------- Comment #9 from jayoung at fhcrc.org 2008-04-10 13:55 EST -------
Created an attachment (id=900)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=900&action=view)
mlc file from PAML v 3.15. next_result succeeds, next_tree fails

From lstein at dev.open-bio.org Thu Apr 10 17:10:39 2008
From: lstein at dev.open-bio.org (Lincoln Stein)
Date: Thu, 10 Apr 2008 17:10:39 -0400
Subject: [Bioperl-guts-l] [14654] bioperl-live/trunk/Bio: corrected case in which seq_id of Bio::Location::Split could unintentionally be set to undef
Message-ID: <200804102110.m3ALAdhw014037@dev.open-bio.org>

Revision: 14654
Author:   lstein
Date:     2008-04-10 17:10:38 -0400 (Thu, 10 Apr 2008)

Log Message:
-----------
corrected case in which seq_id of Bio::Location::Split could unintentionally be set to undef

Modified Paths:
--------------
    bioperl-live/trunk/Bio/DB/GFF.pm
    bioperl-live/trunk/Bio/DB/SeqFeature/NormalizedFeature.pm
    bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm
    bioperl-live/trunk/Bio/Graphics/FeatureFile.pm
    bioperl-live/trunk/Bio/Graphics/Glyph/cds.pm
    bioperl-live/trunk/Bio/Graphics/Glyph/generic.pm
    bioperl-live/trunk/Bio/Location/Split.pm

Modified: bioperl-live/trunk/Bio/DB/GFF.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/GFF.pm 2008-04-08 16:05:13 UTC (rev 14653)
+++ bioperl-live/trunk/Bio/DB/GFF.pm 2008-04-10 21:10:38 UTC (rev 14654)
@@ -1031,15 +1031,20
@@ sub features { my $self = shift; - my ($types,$automerge,$sparse,$iterator,$other); + my ($types,$automerge,$sparse,$iterator,$refseq,$start,$end,$other); if (defined $_[0] && $_[0] =~ /^-/) { - ($types,$automerge,$sparse,$iterator,$other) = rearrange([ - [qw(TYPE TYPES)], - [qw(MERGE AUTOMERGE)], - [qw(RARE SPARSE)], - 'ITERATOR' - ], at _); + ($types,$automerge,$sparse,$iterator, + $refseq,$start,$end, + $other) = rearrange([ + [qw(TYPE TYPES)], + [qw(MERGE AUTOMERGE)], + [qw(RARE SPARSE)], + 'ITERATOR', + [qw(REFSEQ SEQ_ID)], + 'START', + [qw(STOP END)], + ], at _); } else { $types = \@_; } @@ -1048,8 +1053,11 @@ $automerge = $self->automerge unless defined $automerge; $other ||= {}; $self->_features({ - rangetype => 'contains', + rangetype => $refseq ? 'overlaps' : 'contains', types => $types, + refseq => $refseq, + start => $start, + stop => $end, }, { sparse => $sparse, automerge => $automerge, @@ -3377,6 +3385,7 @@ my ($search,$options,$parent) = @_; (@{$search}{qw(start stop)}) = (@{$search}{qw(stop start)}) if defined($search->{start}) && $search->{start} > $search->{stop}; + $search->{refseq} = $search->{seq_id} if exists $search->{seq_id}; my $types = $self->parse_types($search->{types}); # parse out list of types my @aggregated_types = @$types; # keep a copy Modified: bioperl-live/trunk/Bio/DB/SeqFeature/NormalizedFeature.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/NormalizedFeature.pm 2008-04-08 16:05:13 UTC (rev 14653) +++ bioperl-live/trunk/Bio/DB/SeqFeature/NormalizedFeature.pm 2008-04-10 21:10:38 UTC (rev 14654) @@ -176,7 +176,7 @@ return Bio::PrimarySeq->new(-seq => $store->fetch_sequence($self->seq_id,$start,$end) || '', -id => $self->display_name); } else { - return $self->SUPER::seq($self->seq_id,$start,$end); + return $self->SUPER::seq($self->seq_id,$start,$end); } } Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm 
=================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm 2008-04-08 16:05:13 UTC (rev 14653) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm 2008-04-10 21:10:38 UTC (rev 14654) @@ -1170,7 +1170,8 @@ # sub fetch_sequence { my $self = shift; - my ($seqid,$start,$end,$class,$bioseq) = rearrange([['NAME','SEQID','SEQ_ID'],'START',['END','STOP'],'CLASS','BIOSEQ'], at _); + my ($seqid,$start,$end,$class,$bioseq) = rearrange([['NAME','SEQID','SEQ_ID'], + 'START',['END','STOP'],'CLASS','BIOSEQ'], at _); $seqid = "$seqid:$class" if defined $class; my $seq = $self->_fetch_sequence($seqid,$start,$end); return $seq unless $bioseq; Modified: bioperl-live/trunk/Bio/Graphics/FeatureFile.pm =================================================================== --- bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-04-08 16:05:13 UTC (rev 14653) +++ bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-04-10 21:10:38 UTC (rev 14654) @@ -1326,7 +1326,8 @@ require CGI unless defined &CGI::escape; my $n; $linkrule ||= ''; # prevent uninit warning - my $seq_id = $feature->can('location') ? $feature->location->seq_id : $feature->seq_id; +# my $seq_id = $feature->can('location') ? $feature->location->seq_id : $feature->seq_id; + my $seq_id = $feature->can('seq_id') ? 
$feature->seq_id() : $feature->location->seq_id(); $seq_id ||= $feature->seq_id; #fallback $linkrule =~ s/\$(\w+)/ CGI::escape( Modified: bioperl-live/trunk/Bio/Graphics/Glyph/cds.pm =================================================================== --- bioperl-live/trunk/Bio/Graphics/Glyph/cds.pm 2008-04-08 16:05:13 UTC (rev 14653) +++ bioperl-live/trunk/Bio/Graphics/Glyph/cds.pm 2008-04-10 21:10:38 UTC (rev 14654) @@ -139,33 +139,33 @@ $part->{cds_frame} = $frame; $part->{cds_offset} = $offset; - if ($fits && $part->feature->seq) { + if ($fits && (my $seq = $feature->seq)) { + BLOCK: { + $seq = $self->get_seq($seq); - # do in silico splicing in order to find the codon that - # arises from the splice - my $seq = $self->get_seq($part->feature->seq); - my $protein = $seq->translate(undef,undef,$phase,$codon_table)->seq; - $part->{cds_translation} = $protein; + # do in silico splicing in order to find the codon that + # arises from the splice + my $protein = $seq->translate(undef,undef,$phase,$codon_table)->seq; + $part->{cds_translation} = $protein; - BLOCK: { - length $protein >= $feature->length/3 and last BLOCK; - ($feature->length - $phase) % 3 == 0 and last BLOCK; - - my $next_part = $parts[$i+1] - or do { - $part->{cds_splice_residue} = '?'; - last BLOCK; }; - - my $next_feature = $next_part->feature or last BLOCK; - my $next_phase = eval {$next_feature->phase} or last BLOCK; - my $splice_codon = ''; - my $left_of_splice = substr($self->get_seq($feature->seq), -$next_phase, $next_phase); - my $right_of_splice = substr($self->get_seq($next_feature->seq),0 , 3-$next_phase); - $splice_codon = $left_of_splice . 
$right_of_splice; - length $splice_codon == 3 or last BLOCK; - my $amino_acid = $translate_table->translate($splice_codon); - $part->{cds_splice_residue} = $amino_acid; - } + length $protein >= $feature->length/3 and last BLOCK; + ($feature->length - $phase) % 3 == 0 and last BLOCK; + + my $next_part = $parts[$i+1] + or do { + $part->{cds_splice_residue} = '?'; + last BLOCK; }; + + my $next_feature = $next_part->feature or last BLOCK; + my $next_phase = eval {$next_feature->phase} or last BLOCK; + my $splice_codon = ''; + my $left_of_splice = substr($self->get_seq($feature->seq), -$next_phase, $next_phase); + my $right_of_splice = substr($self->get_seq($next_feature->seq),0 , 3-$next_phase); + $splice_codon = $left_of_splice . $right_of_splice; + length $splice_codon == 3 or last BLOCK; + my $amino_acid = $translate_table->translate($splice_codon); + $part->{cds_splice_residue} = $amino_acid; + } } } @@ -184,7 +184,7 @@ my $frame = $self->{cds_frame}; my $linecount = $self->sixframe ? 6 : 3; - unless ($self->protein_fits) { + unless ($self->protein_fits && $self->{cds_translation}) { my $height = ($y2-$y1)/$linecount; my $offset = $y1 + $height*$frame; $offset += ($y2-$y1)/2 if $self->sixframe && $self->strand < 0; Modified: bioperl-live/trunk/Bio/Graphics/Glyph/generic.pm =================================================================== --- bioperl-live/trunk/Bio/Graphics/Glyph/generic.pm 2008-04-08 16:05:13 UTC (rev 14653) +++ bioperl-live/trunk/Bio/Graphics/Glyph/generic.pm 2008-04-10 21:10:38 UTC (rev 14654) @@ -548,7 +548,8 @@ # hack around changed feature API sub get_seq { my $self = shift; - my $seq = shift; + my $seq = shift; + return unless $seq; return $seq if ref $seq && $seq->can('translate'); require Bio::PrimarySeq unless Bio::PrimarySeq->can('new'); return Bio::PrimarySeq->new(-seq=>$seq); Modified: bioperl-live/trunk/Bio/Location/Split.pm =================================================================== --- 
bioperl-live/trunk/Bio/Location/Split.pm 2008-04-08 16:05:13 UTC (rev 14653) +++ bioperl-live/trunk/Bio/Location/Split.pm 2008-04-10 21:10:38 UTC (rev 14654) @@ -563,14 +563,14 @@ =cut sub seq_id { - my ($self, $seqid) = @_; + my $self = shift; - if(! $self->is_remote()) { + if(@_ && !$self->is_remote()) { foreach my $subloc ($self->sub_Location(0)) { - $subloc->seq_id($seqid) if ! $subloc->is_remote(); + $subloc->seq_id(@_) if !$subloc->is_remote(); } } - return $self->SUPER::seq_id($seqid); + return $self->SUPER::seq_id(@_); } =head2 coordinate_policy From bugzilla-daemon at portal.open-bio.org Thu Apr 10 20:00:51 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 10 Apr 2008 20:00:51 -0400 Subject: [Bioperl-guts-l] [Bug 2485] New: Bio::SearchIO::Writer::HSPTableWriter - 'frame' column messes up the output Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2485 Summary: Bio::SearchIO::Writer::HSPTableWriter - 'frame' column messes up the output Product: BioPerl Version: 1.5 branch Platform: All OS/Version: All Status: NEW Severity: normal Priority: P2 Component: Bio::Search/Bio::SearchIO AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: jayoung at fhcrc.org CC: jayoung at fhcrc.org Hi, Sorry to keep bugging you all - I'm doing a lot of updates to old scripts at the moment so I keep finding small problems. I would imagine this will be a fairly easy one to fix. I'm using Bio::SearchIO and Bio::SearchIO::Writer::HSPTableWriter to parse blastall output (NCBI). I updated from bioperl-live last week. It's mostly working fine, except that for tblastn outputs, when I try to include frame in the output using HSPTableWriter. HSPs with frame 0 are OK, but if frame=1 or 2, the output gets messed up. The frame output includes an extra tab character before the 1 or 2. If there's an empty column after frame, I see the 1 or 2. If frame is the last column in the output, then the 1 or 2 is then lost. 
If other non-empty columns follow frame, data from one or two other columns
seems to get overwritten.

If I look at frame using $hsp->frame() it looks fine (no extra tabs, etc), so
it's parsing OK, just not being output properly.

I'll paste in my script in just a minute, and I'll also attach a sample blast
output.

thanks,

Janet

From bugzilla-daemon at portal.open-bio.org Thu Apr 10 20:01:06 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 20:01:06 -0400
Subject: [Bioperl-guts-l] [Bug 2485] Bio::SearchIO::Writer::HSPTableWriter - 'frame' column messes up the output
In-Reply-To:
Message-ID: <200804110001.m3B016qM022384@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2485

------- Comment #1 from jayoung at fhcrc.org 2008-04-10 20:01 EST -------
#!/usr/bin/perl

use warnings;
use strict;
use Bio::SearchIO;
use Bio::SearchIO::Writer::HSPTableWriter;

#set signif
#my $signif = "1e-3";
my $signif = "1e-5";
#my $signif = "100";

#--------------------------------

foreach my $file (@ARGV){
    print "file is $file\n";
    my $resultsfile = "$file.procnew.simple";
    print "output file is $resultsfile\n";
    my $blastObj = new Bio::SearchIO( -file => $file,
                                      -format => 'blast',
                                      -signif => $signif,
                                    );

    #note - the frame column messes things up.
    # Putting it in different positions in the column list is a little informative
    my $writer = Bio::SearchIO::Writer::HSPTableWriter->new(-columns => [qw(
                     query_name
                     query_length
                     hit_name
                     hit_length
                     expect
                     score
                     bits
                     rank
                     frac_identical_query
                     frac_conserved_query
                     length_aln_query
                     length_aln_hit
                     gaps_query
                     gaps_hit
                     start_query
                     end_query
                     start_hit
                     end_hit
                     strand_query
                     strand_hit
                     hit_description
                     frame
                     hit_description
                     )] );
    my $out = Bio::SearchIO->new( -writer => $writer,
                                  -file => ">$resultsfile" );
    while ( my $result = $blastObj->next_result() ) {
        while( my $hit = $result->next_hit ) {
            while( my $hsp = $hit->next_hsp ) {
                my $frame = $hsp->frame();
                print "frame $frame blah\n";
            }
        }
        $out -> write_result($result);
    }
}
print "done.\n";

From bugzilla-daemon at portal.open-bio.org Thu Apr 10 20:02:56 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 20:02:56 -0400
Subject: [Bioperl-guts-l] [Bug 2485] Bio::SearchIO::Writer::HSPTableWriter - 'frame' column messes up the output
In-Reply-To:
Message-ID: <200804110002.m3B02uwv022592@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2485

------- Comment #2 from jayoung at fhcrc.org 2008-04-10 20:02 EST -------
Created an attachment (id=901)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=901&action=view)
the script

(attaching the script too - formatting is a little messed up in the copy-paste)
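
The symptom reported for this bug (an extra tab before a frame value of 1 or
2) should show up as rows with the wrong number of tab-separated fields in the
table output. The following is a minimal sketch in plain Perl for spotting
such rows; it does not use BioPerl, and the helper name misaligned_rows and
the command-line form are assumptions made up for this illustration. Title or
header lines in the output would also be flagged if their field count differs
from the expected one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the 1-based line numbers of rows whose number
# of tab-separated fields differs from $expected. Blank lines are skipped.
sub misaligned_rows {
    my ($expected, @lines) = @_;
    my @bad;
    my $n = 0;
    for my $line (@lines) {
        $n++;
        chomp(my $row = $line);
        next if $row eq '';
        # A limit of -1 keeps trailing empty fields, so an empty last
        # column still counts as a column.
        my @fields = split /\t/, $row, -1;
        push @bad, $n if @fields != $expected;
    }
    return @bad;
}

# Usage (assumed form): check_columns.pl EXPECTED_COLUMN_COUNT TABLE_FILE
if (@ARGV == 2) {
    my ($expected, $file) = @ARGV;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my @bad = misaligned_rows($expected, <$fh>);
    close $fh;
    print @bad ? "misaligned rows: @bad\n"
               : "all rows have $expected fields\n";
}
```

With the column list from the script in comment #1 (23 entries), a count of 23
would be the expected value to pass in.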
From bugzilla-daemon at portal.open-bio.org Thu Apr 10 20:03:57 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 20:03:57 -0400
Subject: [Bioperl-guts-l] [Bug 2485] Bio::SearchIO::Writer::HSPTableWriter - 'frame' column messes up the output
In-Reply-To:
Message-ID: <200804110003.m3B03v8W022644@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2485

------- Comment #3 from jayoung at fhcrc.org 2008-04-10 20:03 EST -------
Created an attachment (id=902)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=902&action=view)
test case

From lapp at dev.open-bio.org Fri Apr 11 19:19:32 2008
From: lapp at dev.open-bio.org (Hilmar Lapp)
Date: Fri, 11 Apr 2008 19:19:32 -0400
Subject: [Bioperl-guts-l] [14655] bioperl-live/trunk/Bio/Factory/SequenceFactoryI.pm: Applied SYNOPSIS patch by Adam Sjogre (asjo at koldfront dot dk).
Message-ID: <200804112319.m3BNJWFP016816@dev.open-bio.org>

Revision: 14655
Author:   lapp
Date:     2008-04-11 19:19:31 -0400 (Fri, 11 Apr 2008)

Log Message:
-----------
Applied SYNOPSIS patch by Adam Sjogre (asjo at koldfront dot dk).
Modified Paths: -------------- bioperl-live/trunk/Bio/Factory/SequenceFactoryI.pm Modified: bioperl-live/trunk/Bio/Factory/SequenceFactoryI.pm =================================================================== --- bioperl-live/trunk/Bio/Factory/SequenceFactoryI.pm 2008-04-10 21:10:38 UTC (rev 14654) +++ bioperl-live/trunk/Bio/Factory/SequenceFactoryI.pm 2008-04-11 23:19:31 UTC (rev 14655) @@ -20,7 +20,7 @@ # get a Bio::Factory::SequenceFactoryI object like use Bio::Seq::SeqFactory; - my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq'); + my $seqbuilder = Bio::Seq::SeqFactory->new('-type' => 'Bio::PrimarySeq'); my $seq = $seqbuilder->create(-seq => 'ACTGAT', -display_id => 'exampleseq'); From lstein at dev.open-bio.org Mon Apr 14 11:05:38 2008 From: lstein at dev.open-bio.org (Lincoln Stein) Date: Mon, 14 Apr 2008 11:05:38 -0400 Subject: [Bioperl-guts-l] [14656] bioperl-live/trunk: added a clone() method to support the (uncommon ) case of passing database adaptors across a fork() Message-ID: <200804141505.m3EF5ctU029537@dev.open-bio.org> Revision: 14656 Author: lstein Date: 2008-04-14 11:05:37 -0400 (Mon, 14 Apr 2008) Log Message: ----------- added a clone() method to support the (uncommon) case of passing database adaptors across a fork() Modified Paths: -------------- bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi.pm bioperl-live/trunk/Bio/DB/GFF.pm bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm bioperl-live/trunk/t/BioDBGFF.t bioperl-live/trunk/t/BioDBSeqFeature.t Modified: bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm =================================================================== --- bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm 2008-04-11 23:19:31 UTC (rev 14655) +++ bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm 2008-04-14 15:05:37 UTC (rev 14656) @@ -144,6 +144,17 @@ $wrapper; } +# The 
clone method should only be called in child processes after a fork(). +# It does two things: (1) it sets the "real" dbh's InactiveDestroy to 1, +# thereby preventing the database connection from being destroyed in +# the parent when the dbh's destructor is called; (2) it replaces the +# "real" dbh with the result of dbh->clone(), so that we now have an +# independent handle. +sub clone { + my $self = shift; + foreach (@{$self->{dbh}}) { $_->clone }; +} + =head2 attribute Title : attribute @@ -213,6 +224,18 @@ shift->{dbh}->{ActiveKids}; } +# The clone method should only be called in child processes after a fork(). +# It does two things: (1) it sets the "real" dbh's InactiveDestroy to 1, +# thereby preventing the database connection from being destroyed in +# the parent when the dbh's destructor is called; (2) it replaces the +# "real" dbh with the result of dbh->clone(), so that we now have an +# independent handle. +sub clone { + my $self = shift; + $self->{dbh}{InactiveDestroy} = 1; + $self->{dbh} = $self->{dbh}->clone; +} + sub DESTROY { } sub AUTOLOAD { Modified: bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi.pm =================================================================== --- bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi.pm 2008-04-11 23:19:31 UTC (rev 14655) +++ bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi.pm 2008-04-14 15:05:37 UTC (rev 14656) @@ -1147,7 +1147,26 @@ } } +=head2 clone +The clone() method should be used when you want to pass the +Bio::DB::GFF object to a child process across a fork(). The child must +call clone() before making any queries. + +This method does two things: (1) it sets the underlying database +handle's InactiveDestroy parameter to 1, thereby preventing the +database connection from being destroyed in the parent when the dbh's +destructor is called; (2) it replaces the dbh with the result of +dbh->clone(), so that we now have an independent handle. 
+
+=cut
+
+sub clone {
+ my $self = shift;
+ $self->features_db->clone;
+}
+
+
 =head1 QUERIES TO IMPLEMENT
 The following astract methods either return DBI statement handles or

Modified: bioperl-live/trunk/Bio/DB/GFF.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/GFF.pm 2008-04-11 23:19:31 UTC (rev 14655)
+++ bioperl-live/trunk/Bio/DB/GFF.pm 2008-04-14 15:05:37 UTC (rev 14656)
@@ -3336,8 +3336,21 @@
 return ();
 }
+=head2 clone
+The clone() method should be used when you want to pass the
+Bio::DB::GFF object to a child process across a fork(). The child must
+call clone() before making any queries.
+The default behavior is to do nothing, but adaptors that use the DBI
+interface may need to implement this in order to avoid database handle
+errors. See the dbi adaptor for an example.
+
+=cut
+
+sub clone { }
+
+
 =head1 Internal Methods
 The following methods are internal to Bio::DB::GFF and are not

Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm 2008-04-11 23:19:31 UTC (rev 14655)
+++ bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm 2008-04-14 15:05:37 UTC (rev 14656)
@@ -339,6 +339,13 @@
 $d;
 }
+sub clone {
+ my $self = shift;
+ $self->{dbh}{InactiveDestroy} = 1;
+ $self->{dbh} = $self->{dbh}->clone
+ unless $self->is_temp;
+}
+
 ###
 # get/set directory for bulk load tables
 #

Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm 2008-04-11 23:19:31 UTC (rev 14655)
+++ bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm 2008-04-14 15:05:37 UTC (rev 14656)
@@ -1507,6 +1507,20 @@
 $d;
 }
+=head2 clone
+
+The clone() method should be used when you want to pass the
+Bio::DB::SeqFeature::Store object to a child process across a
+fork().
The child must call clone() before making any queries.
+
+The default behavior is to do nothing, but adaptors that use the DBI
+interface may need to implement this in order to avoid database handle
+errors. See the dbi adaptor for an example.
+
+=cut
+
+sub clone { }
+
 ################################# TIE interface ####################
 =head1 TIE Interface

Modified: bioperl-live/trunk/t/BioDBGFF.t
===================================================================
--- bioperl-live/trunk/t/BioDBGFF.t 2008-04-11 23:19:31 UTC (rev 14655)
+++ bioperl-live/trunk/t/BioDBGFF.t 2008-04-14 15:05:37 UTC (rev 14656)
@@ -8,7 +8,7 @@
 use lib 't/lib';
 use BioperlTest;
- test_begin(-tests => 277);
+ test_begin(-tests => 279);
 use_ok('Bio::DB::GFF');
 }
@@ -419,13 +419,29 @@
 }
 }
+ # test ability to pass adaptors across a fork
+ if (my $child = open(F,"-|")) { # parent reads from child
+ ok(scalar <F>);
+ close F;
+ }
+ else { # in child
+ $db->clone;
+ my @f = $db->features();
+ print @f>0;
+ exit 0;
+ }
+
 ok(!defined eval{$db->delete()});
 ok($db->delete(-force=>1));
 is(scalar $db->features,0);
 ok(!$db->segment('Contig1'));
+ }
+ }
+
+
 END {
 unlink $fasta_files."/directory.index";
 }

Modified: bioperl-live/trunk/t/BioDBSeqFeature.t
===================================================================
--- bioperl-live/trunk/t/BioDBSeqFeature.t 2008-04-11 23:19:31 UTC (rev 14655)
+++ bioperl-live/trunk/t/BioDBSeqFeature.t 2008-04-14 15:05:37 UTC (rev 14656)
@@ -2,7 +2,7 @@
 # $Id$
 use strict;
-use constant TEST_COUNT => 55;
+use constant TEST_COUNT => 57;
 BEGIN {
 use lib 't/lib';
@@ -169,4 +169,21 @@
 is (@lines, 2);
 ok("@lines" !~ /Parent=/s);
 ok("@lines" =~ /ID=/s);
+
+if (my $child = open(F,"-|")) { # parent reads from child
+ cmp_ok(scalar <F>,'>',0);
+ close F;
+ # The challenge is to make sure that the handle
+ # still works in the parent!
+ my @f = $db->features();
+ cmp_ok(scalar @f,'>',0);
 }
+else { # in child
+ $db->clone;
+ my @f = $db->features();
+ my $feature_count = @f;
+ print $feature_count;
+ exit 0;
+}
+
+}

From bugzilla-daemon at portal.open-bio.org Mon Apr 14 13:31:21 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 14 Apr 2008 13:31:21 -0400
Subject: [Bioperl-guts-l] [Bug 2484] This bug is repeated for several sequences
In-Reply-To:
Message-ID: <200804141731.m3EHVLQW008772@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2484

cjfields at uiuc.edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|birney at ebi.ac.uk         |bioperl-guts-l at bioperl.org
           Severity|normal                      |major
            Version|unspecified                 |main-trunk

------- Comment #1 from cjfields at uiuc.edu 2008-04-14 13:31 EST -------
Confirmed using bioperl-live. Attaching raw EMBL file generating the error
(from dbfetch).

From bugzilla-daemon at portal.open-bio.org Mon Apr 14 13:32:56 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 14 Apr 2008 13:32:56 -0400
Subject: [Bioperl-guts-l] [Bug 2484] This bug is repeated for several sequences
In-Reply-To:
Message-ID: <200804141732.m3EHWuoS008888@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2484

------- Comment #2 from cjfields at uiuc.edu 2008-04-14 13:32 EST -------
Created an attachment (id=905)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=905&action=view)
EMBL test case; pass through SeqIO to see error.
From bugzilla-daemon at portal.open-bio.org Mon Apr 14 13:42:02 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 14 Apr 2008 13:42:02 -0400
Subject: [Bioperl-guts-l] [Bug 2484] This bug is repeated for several sequences
In-Reply-To:
Message-ID: <200804141742.m3EHg2Qo009257@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2484

------- Comment #3 from cjfields at uiuc.edu 2008-04-14 13:42 EST -------
using the (experimental) 'embldriver' format uses a different parser which
appears to work (using perl 5.10):

use Bio::SeqIO;
use feature qw(say);

my $in = Bio::SeqIO->new(-format => 'embldriver',
                         -file => 'input.embl');

my $seq = $in->next_seq;

say $seq->species->scientific_name;
say join(';',$seq->species->classification);

From cjfields at dev.open-bio.org Mon Apr 14 13:56:14 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Mon, 14 Apr 2008 13:56:14 -0400
Subject: [Bioperl-guts-l] [14657] bioperl-live/trunk/Bio/SeqIO/embl.pm: bug 2484
Message-ID: <200804141756.m3EHuEhF029778@dev.open-bio.org>

Revision: 14657
Author: cjfields
Date: 2008-04-14 13:56:14 -0400 (Mon, 14 Apr 2008)

Log Message:
-----------
bug 2484

Modified Paths:
--------------
    bioperl-live/trunk/Bio/SeqIO/embl.pm

Modified: bioperl-live/trunk/Bio/SeqIO/embl.pm
===================================================================
--- bioperl-live/trunk/Bio/SeqIO/embl.pm 2008-04-14 15:05:37 UTC (rev 14656)
+++ bioperl-live/trunk/Bio/SeqIO/embl.pm 2008-04-14 17:56:14 UTC (rev 14657)
@@ -1038,7 +1038,7 @@
 # only split on ';' or '.'
so that classification that is 2 or more words
 # will still get matched, use map() to remove trailing/leading/intervening
 # spaces
- my @class = map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /[;\.]+/, $class_lines;
+ my @class = map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /(?
Message-ID: <200804141756.m3EHuiLj009980@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2484

cjfields at uiuc.edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #4 from cjfields at uiuc.edu 2008-04-14 13:56 EST -------
Fixed in subversion. thanks!

From jason at dev.open-bio.org Mon Apr 14 17:06:16 2008
From: jason at dev.open-bio.org (Jason Stajich)
Date: Mon, 14 Apr 2008 17:06:16 -0400
Subject: [Bioperl-guts-l] [14658] bioperl-live/trunk/Bio/Align/DNAStatistics.pm: typo
Message-ID: <200804142106.m3EL6G6F029990@dev.open-bio.org>

Revision: 14658
Author: jason
Date: 2008-04-14 17:06:16 -0400 (Mon, 14 Apr 2008)

Log Message:
-----------
typo

Modified Paths:
--------------
    bioperl-live/trunk/Bio/Align/DNAStatistics.pm

Modified: bioperl-live/trunk/Bio/Align/DNAStatistics.pm
===================================================================
--- bioperl-live/trunk/Bio/Align/DNAStatistics.pm 2008-04-14 17:56:14 UTC (rev 14657)
+++ bioperl-live/trunk/Bio/Align/DNAStatistics.pm 2008-04-14 21:06:16 UTC (rev 14658)
@@ -1548,7 +1548,7 @@
 =head2 get_syn_changes
 Title : get_syn_changes
- Usage : Bio::Align::DNAStatitics->get_syn_chnages
+ Usage : Bio::Align::DNAStatitics->get_syn_changes
 Function: Generate a hashref of all pairwise combinations of codns differing by 1
 Returns : Symetic matrix using hashes

From bugzilla-daemon at portal.open-bio.org Mon Apr 14 19:09:09 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 14 Apr 2008 19:09:09 -0400
Subject: [Bioperl-guts-l] [Bug 2332] Software for analysis of redundant fragments of affys human mitochip v2
In-Reply-To:
Message-ID: <200804142309.m3EN99E5026902@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2332

marian.thieme at lycos.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #704 is|0                           |1
           obsolete|                            |
 Attachment #706 is|0                           |1
           obsolete|                            |
 Attachment #772 is|0                           |1
           obsolete|                            |
 Attachment #773 is|0                           |1
           obsolete|                            |
 Attachment #774 is|0                           |1
           obsolete|                            |

------- Comment #15 from marian.thieme at lycos.de 2008-04-14 19:09 EST -------
Created an attachment (id=906)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=906&action=view)
Updated Version of TestScript

Due to changes of the module ReseqChip.pm a new version of a Testscript is
provided.
From thm09830 at dev.open-bio.org Mon Apr 14 19:42:46 2008
From: thm09830 at dev.open-bio.org (Marian Thieme)
Date: Mon, 14 Apr 2008 19:42:46 -0400
Subject: [Bioperl-guts-l] [14659] bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm: Minor changes in the main processing function calc_sequence(), Logmessages are buffered, hence only one file open/write/close operation is needed per processed chip, udpated version of testscript is provided via Bug #2332 in Bioperl bugzilla, Documentation updated
Message-ID: <200804142342.m3ENgkgv030357@dev.open-bio.org>

Revision: 14659
Author: thm09830
Date: 2008-04-14 19:42:46 -0400 (Mon, 14 Apr 2008)

Log Message:
-----------
Minor changes in the main processing function calc_sequence(), Logmessages are buffered, hence only one file open/write/close operation is needed per processed chip, udpated version of testscript is provided via Bug #2332 in Bioperl bugzilla, Documentation updated

Modified Paths:
--------------
    bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm

Modified: bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm
===================================================================
--- bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm 2008-04-14 21:06:16 UTC (rev 14658)
+++ bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm 2008-04-14 23:42:46 UTC (rev 14659)
@@ -1,6 +1,6 @@
 #------------------------------------------------------------------------------
 # PACKAGE : Bio::Microarray::Tools::ReseqChip
-# PURPOSE : Analyse redundant fragments of Affymetrix Resequencing Chip
+# PURPOSE : Analyse additional probe oligonucleotides of Resequencing Chips
 # AUTHOR : Marian Thieme
 # CREATED : 21.09.2007
 # REVISION:
@@ -12,16 +12,15 @@
 =head1 NAME
-Bio::Microarray::Tools::ReseqChip - Class for extraction and incorporation of information
- about redundant fragments of Affy Mitochip v2.0
+Bio::Microarray::Tools::ReseqChip - Class for analysing additional probe oligonucleotides of Resequencing Chips (for instance Affy
Mitochip v2.0)
 =head1 SYNOPSIS
- use RedundantFragments;
+ use ReseqChip;
 my %ref_seq_max_ins_hash=(3106 => 1);
- my $reseqfragSample=Bio::Tools::ReseqChipRedundantFragments->new(
+ my $reseqfragSample=Bio::Tools::ReseqChip->new(
 $Affy_frags_design_filename,
 $format,
 \%ref_seq_max_ins_hash,
@@ -30,6 +29,26 @@
 my $aln = new Bio::SimpleAlign();
 my $in = Bio::SeqIO->new(-file => $Affy_reseq_sample_fasta_file, -format => 'Fasta');
+
+ my %options_hash=(
+ include_main_sequence => 1,
+ insertions => 1,
+ deletions => 1,
+ depth_ins => 1,
+ depth_del => 9,
+ depth => 1,
+ consider_context => 1,
+ flank_left => 10,
+ flank_right => 10,
+ allowed_n_in_flank => 0,
+ flank_left_ins => 4,
+ flank_right_ins => 4,
+ allowed_n_in_flank_ins => 1,
+ flank_size_weak => 1,
+ call_threshold => 55,
+ ins_threshold => 35,
+ del_threshold => 75,
+ swap_ins => 1);
 while ( (my $seq = $in->next_seq())) {
@@ -43,25 +62,41 @@
 }
 $aln->add_seq($locseq);
 }
- my $new_sequence=$reseqfragSample->calc_sequence($aln, $options_hash [,"output_file"]);
+ my $new_sequence=$reseqfragSample->calc_sequence($aln, \%options_hash [,"output_file"]);
 =head1 DESCRIPTION
-Process Affy MitoChip v2 Data to create an alignment of the "redundant" fragments to the reference sequence,
-taking account for insertions/deletion which are defined by Affy mtDNA_Design_Annotion.xls file. Based on
-that alignment substitutions, deletions and insertion can be detected and initally not called bases can called
-as well possible falsly called bases can recalled. Moreover insertion and deletion as well as snps lying in highly
-variable regions can be detected. Calls are done depending on the depth at a certain position
-in the alignment and sequence reliability (in terms of certain number of allowed Ns in a k-base-window within
-each redundant fragment, contributing to a certain alignment position).
+This Software module aim to infer information of the addtional oligonucleotide probes, covering different known variants.
+Oligonucleotide Array based Resequencing is done in the local context of a reference sequence. Every position in
+the genomic areas of interest is interrogated using 8 different 25-mer oligonucleotide probes (forward and reverse strand).
+Their middle base varies across the four possible bases, while the flanking regions are identical
+with the reference sequence or its reverse strand respectively. For genomic regions with known variability across individuals,
+additional probes were added to the chip. They interrogate postions in the neighborhood of polymorphisms not only in the local context
+of the reference sequence but also in the context of its known variants.
+This software (ReseqChip.pm) is tested to work with MitoChip v2.0 Data, manufactured by Affymetrix and the parser (MitoChipV2Parser)
+reads the probe design file (Affy mtDNA_Design_Annotion.xls) wich describes the design of the probes.
+The software approaches the problem in the following way:
+1. An alignment of the addtional probes to the reference sequence is created (taking account for insertions/deletion)
+2. Based on that alignment each position, which is covered by at least one additional probe is investigated to find a consensus call.
+
+This is done indirectly by excluding those probes, which appear to be inadequate for the individual. An indication for
+inadaquacy is a local accumulation of N-calls. We investigate calls in neighborhoods of length K around
+each sequence position in all available local context probes and count the number of N-calls in them.
+That menas, in addition to the call obtained using the references sequence base call we obtain data from all alternative
+local background probes that were available for the current position. All probes with more then maxN N-calls in the
+K-neighborhood are excluded. Because it may happen that different candidate bases occur we introduce to more parameters minP and minU.
+If more then minP probes remain after filtering and more then minU percent of them call the base x,
+were x is the most frequently called base, then x is included in the final sequence, otherwise the letter N is included.
+
+
 Assumption: Gaps which are inserted in several fragments and in the reference sequence itself refer to the reference sequence. The reference sequence is given as input parameter.
-Optionshash, specifying conditions if a call is done is given when calculating the sequence respect to redundant
-fragments (calc_sequence()).
+Optionshash, specifying the explained parameter and some further options is provided by the user.
+
 This module depends on the following modules:
 use Bio::Microarray::Tools::MitoChipV2Parser
 use Bio::SeqIO;
@@ -107,6 +142,7 @@
 use base qw(Bio::Root::Root);
+
 use Bio::Microarray::Tools::MitoChipV2Parser;
 use Bio::SeqIO;
@@ -133,7 +169,7 @@
 member variables.
- Returns : Returns a new RedundantFragments object
+ Returns : Returns a new ReseqChip object
 Args : $Affy_frags_design_filename (Affymetrix xls design file, for instance:
 mtDNA_design_annotation_FINAL.xls for mitochondrial Genome)
@@ -158,7 +194,7 @@
 sub new {
- my ($class, $fi