From bugzilla-daemon at portal.open-bio.org Tue Apr 1 10:14:18 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 1 Apr 2008 10:14:18 -0400
Subject: [Bioperl-guts-l] [Bug 2466] NCBIHelper redirecting RefSeq sequence
download to EBI server
In-Reply-To:
Message-ID: <200804011414.m31EEICY014764@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2466
------- Comment #2 from nsoranzo at tiscali.it 2008-04-01 10:14 EST -------
(In reply to comment #1)
> Not sure why the redirect is in place; I'll try looking back to determine why
> it was added in. If needed we can leave the code in but change the default
> 'no_redirect' setting to 1.
I don't know why it was added. Changing the default would surely help, but
I think eliminating the code in question would be better.
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Tue Apr 1 10:26:02 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 1 Apr 2008 10:26:02 -0400
Subject: [Bioperl-guts-l] [Bug 2466] NCBIHelper redirecting RefSeq sequence
download to EBI server
In-Reply-To:
Message-ID: <200804011426.m31EQ2sa016746@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2466
------- Comment #3 from cjfields at uiuc.edu 2008-04-01 10:26 EST -------
I agree. I still need to track down the reason (I think it has something to do
with better annotation with EMBL format). However, if it isn't working as
advertised then the best fix is removing or commenting out the offending code.
I'm picking the latter option with a comment that points to this bug report;
I'll close this out when the fix is committed.
From cjfields at dev.open-bio.org Tue Apr 1 12:31:17 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 1 Apr 2008 12:31:17 -0400
Subject: [Bioperl-guts-l] [14638] bioperl-live/trunk/Bio/DB/NCBIHelper.pm:
bug 2466 :
Message-ID: <200804011631.m31GVHRG009990@dev.open-bio.org>
Revision: 14638
Author: cjfields
Date: 2008-04-01 12:31:16 -0400 (Tue, 01 Apr 2008)
Log Message:
-----------
bug 2466 :
* change default behavior of Bio::DB::GenBank to always retrieve from NCBI
* deprecate 'no_redirect' in favor of 'redirect_refseq', which must be set for RefSeq redirection (see note above)
* make explicit getter/setters out of redirect_refseq, no_redirect, seq_start, seq_end, strand, complexity along with docs
Modified Paths:
--------------
bioperl-live/trunk/Bio/DB/NCBIHelper.pm
Modified: bioperl-live/trunk/Bio/DB/NCBIHelper.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/NCBIHelper.pm 2008-04-01 00:02:24 UTC (rev 14637)
+++ bioperl-live/trunk/Bio/DB/NCBIHelper.pm 2008-04-01 16:31:16 UTC (rev 14638)
@@ -104,30 +104,21 @@
'gbwithparts' => 'genbank',
);
$DEFAULTFORMAT = 'gb';
- @ATTRIBUTES = qw(complexity strand seq_start seq_stop no_redirect);
- for my $method (@ATTRIBUTES) {
- eval <<END;
-sub $method {
- my \$self = shift;
- my \$d = \$self->{'_$method'};
- \$self->{'_$method'} = shift if \@_;
- \$d;
+ @ATTRIBUTES = qw(complexity strand seq_start seq_stop);
}
-END
- }
-}
# the new way to make modules a little more lightweight
sub new {
my ($class, @args ) = @_;
my $self = $class->SUPER::new(@args);
- my ($seq_start,$seq_stop,$no_redirect,$complexity,$strand) =
- $self->_rearrange([qw(SEQ_START SEQ_STOP NO_REDIRECT COMPLEXITY STRAND)],
+ my ($seq_start,$seq_stop,$no_redirect, $redirect, $complexity,$strand) =
+ $self->_rearrange([qw(SEQ_START SEQ_STOP NO_REDIRECT REDIRECT_REFSEQ COMPLEXITY STRAND)],
@args);
$seq_start && $self->seq_start($seq_start);
$seq_stop && $self->seq_stop($seq_stop);
$no_redirect && $self->no_redirect($no_redirect);
+ $redirect && $self->redirect_refseq($redirect);
$strand && $self->strand($strand);
# adjust statement to accept zero value
defined $complexity && ($complexity >=0 && $complexity <=4)
@@ -336,6 +327,121 @@
return @{$self->{'_format'}};
}
+=head2 redirect_refseq
+
+ Title : redirect_refseq
+ Usage : $db->redirect_refseq(1)
+ Function: simple getter/setter which redirects RefSeqs to use Bio::DB::RefSeq
+ Returns : Boolean value
+ Args : Boolean value (optional)
+ Throws : 'unparseable output exception'
+ Note : This replaces 'no_redirect' as a more straightforward flag to
+ redirect possible RefSeqs to use Bio::DB::RefSeq (EBI interface)
+ instead of retrieving the NCBI records
+
+=cut
+
+sub redirect_refseq {
+ my $self = shift;
+ return $self->{'_redirect_refseq'} = shift if @_;
+ return $self->{'_redirect_refseq'};
+}
+
+=head2 complexity
+
+ Title : complexity
+ Usage : $db->complexity(3)
+ Function: get/set complexity value
+ Returns : value from 0-4 indicating level of complexity
+ Args : value from 0-4 (optional); if unset server assumes 1
+ Throws : if arg is not an integer or falls outside of noted range above
+ Note : From efetch docs:
+
+ Complexity regulates the display:
+
+ * 0 - get the whole blob
+ * 1 - get the bioseq for gi of interest (default in Entrez)
+ * 2 - get the minimal bioseq-set containing the gi of interest
+ * 3 - get the minimal nuc-prot containing the gi of interest
+ * 4 - get the minimal pub-set containing the gi of interest
+
+=cut
+
+sub complexity {
+ my ($self, $comp) = @_;
+ if (defined $comp) {
+ $self->throw("Complexity value must be integer between 0 and 4") if
+ $comp !~ /^\d+$/ || $comp < 0 || $comp > 4;
+ $self->{'_complexity'} = $comp;
+ }
+ return $self->{'_complexity'};
+}
+
+=head2 strand
+
+ Title : strand
+ Usage : $db->strand(1)
+ Function: get/set strand value
+ Returns : strand value if set
+ Args : value of 1 (plus) or 2 (minus); if unset server assumes 1
+ Throws : if arg is not an integer or is not 1 or 2
+ Note : This differs from BioPerl's use of strand: 1 = plus, -1 = minus 0 = not relevant.
+ We should probably add in some functionality to convert over in the future.
+
+=cut
+
+sub strand {
+ my ($self, $str) = @_;
+ if ($str) {
+ $self->throw("strand() must be integer value of 1 (plus strand) or 2 (minus strand) if set") if
+ $str !~ /^\d+$/ || $str < 1 || $str > 2;
+ $self->{'_strand'} = $str;
+ }
+ return $self->{'_strand'};
+}
+
+=head2 seq_start
+
+ Title : seq_start
+ Usage : $db->seq_start(123)
+ Function: get/set sequence start location
+ Returns : sequence start value if set
+ Args : integer; if unset server assumes 1
+ Throws : if arg is not an integer
+
+=cut
+
+sub seq_start {
+ my ($self, $start) = @_;
+ if ($start) {
+ $self->throw("seq_start() must be integer value if set") if
+ $start !~ /^\d+$/;
+ $self->{'_seq_start'} = $start;
+ }
+ return $self->{'_seq_start'};
+}
+
+=head2 seq_stop
+
+ Title : seq_stop
+ Usage : $db->seq_stop(456)
+ Function: get/set sequence stop (end) location
+ Returns : sequence stop (end) value if set
+ Args : integer; if unset server assumes 1
+ Throws : if arg is not an integer
+
+=cut
+
+sub seq_stop {
+ my ($self, $stop) = @_;
+ if ($stop) {
+ $self->throw("seq_stop() must be integer if set") if
+ $stop !~ /^\d+$/;
+ $self->{'_seq_stop'} = $stop;
+ }
+ return $self->{'_seq_stop'};
+}
+
=head2 Bio::DB::WebDBSeqI methods
Overriding WebDBSeqI method to help newbies to retrieve sequences
@@ -383,7 +489,7 @@
# Asking for a RefSeq from EMBL/GenBank
- unless ($self->no_redirect) {
+ if ($self->redirect_refseq) {
if ($ids =~ /N._/) {
$self->warn("[$ids] is not a normal sequence database but a RefSeq entry.".
" Redirecting the request.\n")
@@ -461,6 +567,29 @@
my ($querykey) = $content =~ m!(\d+)!;
$self->cookie(uri_unescape($cookie),$querykey);
}
+
+########### DEPRECATED!!!! ###########
+
+=head2 no_redirect
+
+ Title : no_redirect
+ Usage : $db->no_redirect($content)
+ Function: Used to indicate that Bio::DB::GenBank instance retrieves
+ possible RefSeqs from EBI instead; default behavior is now to
+ retrieve directly from NCBI
+ Returns : None
+ Args : None
+ Throws : Method is deprecated in favor of positive flag method 'redirect_refseq'
+
+=cut
+
+sub no_redirect {
+ shift->throw(
+ "Use of no_redirect() is deprecated. Bio::DB::GenBank default is to always\n".
+ "retrieve from NCBI. In order to redirect possible RefSeqs to EBI, set\n".
+ "redirect_refseq flag to 1");
+}
+
1;
__END__
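The accessor pattern introduced in this commit can be sketched outside BioPerl roughly as follows. This is only an illustration under stated assumptions: the package name My::FetchOptions is hypothetical, and plain die stands in for the throw() method used in the real module.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; not part of BioPerl.
package My::FetchOptions;

sub new { return bless {}, shift }

# Range-checked getter/setter; testing 'defined' lets 0 through as a value.
sub complexity {
    my ($self, $comp) = @_;
    if (defined $comp) {
        die "complexity must be an integer between 0 and 4\n"
            if $comp !~ /^\d+$/ || $comp > 4;
        $self->{_complexity} = $comp;
    }
    return $self->{_complexity};
}

# Plain boolean flag; stays unset (false) unless the caller sets it.
sub redirect_refseq {
    my $self = shift;
    $self->{_redirect_refseq} = shift if @_;
    return $self->{_redirect_refseq};
}

# Deprecated accessor: fail loudly and point at the replacement.
sub no_redirect {
    die "no_redirect() is deprecated; set redirect_refseq to 1 instead\n";
}

package main;

my $opts = My::FetchOptions->new;
$opts->complexity(0);
$opts->redirect_refseq(1);
printf "complexity=%d redirect=%d\n", $opts->complexity, $opts->redirect_refseq;

eval { $opts->no_redirect(1) };
print "caught: $@";
```

Note that the strand, seq_start and seq_stop setters in the diff test the argument for truth rather than definedness, so a value of 0 is ignored by them instead of being rejected.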
From cjfields at dev.open-bio.org Tue Apr 1 12:39:11 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 1 Apr 2008 12:39:11 -0400
Subject: [Bioperl-guts-l] [14639] bioperl-live/trunk/Bio/DB: Remove
extraneous code; update docs
Message-ID: <200804011639.m31GdBkW010041@dev.open-bio.org>
Revision: 14639
Author: cjfields
Date: 2008-04-01 12:39:11 -0400 (Tue, 01 Apr 2008)
Log Message:
-----------
Remove extraneous code; update docs
Modified Paths:
--------------
bioperl-live/trunk/Bio/DB/GenBank.pm
bioperl-live/trunk/Bio/DB/NCBIHelper.pm
Modified: bioperl-live/trunk/Bio/DB/GenBank.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/GenBank.pm 2008-04-01 16:31:16 UTC (rev 14638)
+++ bioperl-live/trunk/Bio/DB/GenBank.pm 2008-04-01 16:39:11 UTC (rev 14639)
@@ -106,10 +106,11 @@
(the reason is that NT contigs are rather annotation with references
to clones).
-Some work has been done to automatically detect and retrieve whole NT_
-clones when the data is in that format (NCBI RefSeq clones). More
-testing and feedback from users is needed to achieve a good fit of
-functionality and ease of use.
+Some work has been done to automatically detect and retrieve whole NT_ clones
+when the data is in that format (NCBI RefSeq clones). The former behavior prior
+to bioperl 1.6 was to retrieve these from EBI, but now these are retrieved
+directly from NCBI. The older behavior can be regained by setting the
+'redirect_refseq' flag to a value evaluating to TRUE.
=head1 FEEDBACK
Modified: bioperl-live/trunk/Bio/DB/NCBIHelper.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/NCBIHelper.pm 2008-04-01 16:31:16 UTC (rev 14638)
+++ bioperl-live/trunk/Bio/DB/NCBIHelper.pm 2008-04-01 16:39:11 UTC (rev 14639)
@@ -35,7 +35,7 @@
common HTML stripping done in L().
The base NCBI query URL used is:
-http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi
+http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi
=head1 FEEDBACK
@@ -74,7 +74,7 @@
package Bio::DB::NCBIHelper;
use strict;
-use vars qw($HOSTBASE %CGILOCATION %FORMATMAP $DEFAULTFORMAT $MAX_ENTRIES $VERSION @ATTRIBUTES);
+use vars qw($HOSTBASE %CGILOCATION %FORMATMAP $DEFAULTFORMAT $MAX_ENTRIES $VERSION);
use Bio::DB::Query::GenBank;
use HTTP::Request::Common;
@@ -104,7 +104,6 @@
'gbwithparts' => 'genbank',
);
$DEFAULTFORMAT = 'gb';
- @ATTRIBUTES = qw(complexity strand seq_start seq_stop);
}
# the new way to make modules a little more lightweight
From bugzilla-daemon at portal.open-bio.org Tue Apr 1 12:45:28 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 1 Apr 2008 12:45:28 -0400
Subject: [Bioperl-guts-l] [Bug 2466] NCBIHelper redirecting RefSeq sequence
download to EBI server
In-Reply-To:
Message-ID: <200804011645.m31GjS3T032091@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2466
cjfields at uiuc.edu changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #4 from cjfields at uiuc.edu 2008-04-01 12:45 EST -------
The redirection was added in to retrieve RefSeqs from EBI (which has slightly
better annotation). However, my take on that is one should use Bio::DB::RefSeq
under these circumstances; using Bio::DB::GenBank implies the sequences are
retrieved from GenBank by default. As a compromise, I have deprecated use of
the 'no_redirect' flag in favor of always retrieving the RefSeq from GenBank.
If one wants the old redirection behavior (present in the last few BioPerl dev.
releases) they must explicitly code for it using the 'redirect_refseq' flag. I
have also cleaned up some of the implicit getter/setters and added relevant
docs where needed.
From cjfields at dev.open-bio.org Tue Apr 1 23:57:25 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 1 Apr 2008 23:57:25 -0400
Subject: [Bioperl-guts-l] [14640] bioperl-live/trunk/Bio/SearchIO/hmmer.pm:
Squash uninit.
Message-ID: <200804020357.m323vP9B011020@dev.open-bio.org>
Revision: 14640
Author: cjfields
Date: 2008-04-01 23:57:25 -0400 (Tue, 01 Apr 2008)
Log Message:
-----------
Squash uninit. value warnings
Modified Paths:
--------------
bioperl-live/trunk/Bio/SearchIO/hmmer.pm
Modified: bioperl-live/trunk/Bio/SearchIO/hmmer.pm
===================================================================
--- bioperl-live/trunk/Bio/SearchIO/hmmer.pm 2008-04-01 16:39:11 UTC (rev 14639)
+++ bioperl-live/trunk/Bio/SearchIO/hmmer.pm 2008-04-02 03:57:25 UTC (rev 14640)
@@ -1116,7 +1116,7 @@
if ( $nm eq 'Hsp' ) {
foreach (qw(Hsp_qseq Hsp_midline Hsp_hseq)) {
my $data = $self->{'_last_hspdata'}->{$_};
- if ($_ eq 'Hsp_hseq') {
+ if ($data && $_ eq 'Hsp_hseq') {
# replace hmm '.' gap symbol by '-'
$data =~ s/\./-/g;
}
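The guard added here can be illustrated in isolation. The following is a minimal sketch, assuming the value may simply be missing for some HSPs; normalize_hseq is a hypothetical helper, not a function in Bio::SearchIO::hmmer.

```perl
use strict;
use warnings;

# Collect any warnings so we can show that none are raised.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

# Hypothetical helper: replace the hmm '.' gap symbol by '-',
# but only when there is a value to work on.
sub normalize_hseq {
    my ($data) = @_;
    $data =~ s/\./-/g if $data;
    return $data;
}

my $fixed   = normalize_hseq('AC..GT');
my $missing = normalize_hseq(undef);

print "fixed: $fixed\n";
print "missing stays undefined\n" unless defined $missing;
print "warnings: ", scalar(@warnings), "\n";
```

Testing with defined instead of plain truth would also let through the strings '' and '0'; for sequence data the difference is unlikely to matter.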
From cjfields at dev.open-bio.org Wed Apr 2 00:14:00 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Wed, 2 Apr 2008 00:14:00 -0400
Subject: [Bioperl-guts-l] [14641] bioperl-live/trunk/t/RefSeq.t: RefSeq
redirection now requires explicit setting ( note that sequence length
returned by Bio::DB::RefSeq is off by one, needs investigating).
Message-ID: <200804020414.m324E0XT011115@dev.open-bio.org>
Revision: 14641
Author: cjfields
Date: 2008-04-02 00:14:00 -0400 (Wed, 02 Apr 2008)
Log Message:
-----------
RefSeq redirection now requires explicit setting (note that sequence length returned by Bio::DB::RefSeq is off by one, needs investigating).
Modified Paths:
--------------
bioperl-live/trunk/t/RefSeq.t
Modified: bioperl-live/trunk/t/RefSeq.t
===================================================================
--- bioperl-live/trunk/t/RefSeq.t 2008-04-02 03:57:25 UTC (rev 14640)
+++ bioperl-live/trunk/t/RefSeq.t 2008-04-02 04:14:00 UTC (rev 14641)
@@ -26,9 +26,9 @@
#test redirection from GenBank and EMBL
#GenBank
-ok $db = Bio::DB::GenBank->new('-verbose'=>$verbose);
+ok $db = Bio::DB::GenBank->new('-verbose'=> $verbose, -redirect_refseq => 1);
#EMBL
-ok $db2 = Bio::DB::EMBL->new('-verbose'=>$verbose);
+ok $db2 = Bio::DB::EMBL->new('-verbose'=> $verbose, -redirect_refseq => 1);
eval {
$seq = $db->get_Seq_by_acc('NT_006732');
@@ -41,19 +41,19 @@
eval {
ok($seq = $db->get_Seq_by_acc('NM_006732'));
- is($seq->length, 3776);
+ is($seq->length, 3775);
ok $seq2 = $db2->get_Seq_by_acc('NM_006732');
- is($seq2->length, 3776);
+ is($seq2->length, 3775);
};
skip "Warning: Couldn't connect to RefSeq with Bio::DB::RefSeq.pm!", 4 if $@;
eval {
ok defined($db = Bio::DB::RefSeq->new(-verbose=>$verbose));
ok(defined($seq = $db->get_Seq_by_acc('NM_006732')));
- is( $seq->length, 3776);
+ is( $seq->length, 3775);
ok defined ($db->request_format('fasta'));
ok(defined($seq = $db->get_Seq_by_acc('NM_006732')));
- is( $seq->length, 3776);
+ is( $seq->length, 3775);
};
skip "Warning: Couldn't connect to RefSeq with Bio::DB::RefSeq.pm!", 6 if $@;
}
From cjfields at dev.open-bio.org Wed Apr 2 00:22:48 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Wed, 2 Apr 2008 00:22:48 -0400
Subject: [Bioperl-guts-l] [14642]
bioperl-live/trunk/t/RestrictionAnalysis.t: test not matching error message
(oops!)
Message-ID: <200804020422.m324MmHZ011143@dev.open-bio.org>
Revision: 14642
Author: cjfields
Date: 2008-04-02 00:22:48 -0400 (Wed, 02 Apr 2008)
Log Message:
-----------
test not matching error message (oops!)
Modified Paths:
--------------
bioperl-live/trunk/t/RestrictionAnalysis.t
Modified: bioperl-live/trunk/t/RestrictionAnalysis.t
===================================================================
--- bioperl-live/trunk/t/RestrictionAnalysis.t 2008-04-02 04:14:00 UTC (rev 14641)
+++ bioperl-live/trunk/t/RestrictionAnalysis.t 2008-04-02 04:22:48 UTC (rev 14642)
@@ -88,7 +88,7 @@
eval {$re->is_prototype};
ok($@);
-like($@, qr/Couldn't unequivicably assign prototype/, 'bug 2179');
+like($@, qr/Can't unequivocally assign prototype based on input format alone/, 'bug 2179');
$re->verbose(2);
is $re->is_prototype(0), 0;
From cjfields at dev.open-bio.org Wed Apr 2 00:23:23 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Wed, 2 Apr 2008 00:23:23 -0400
Subject: [Bioperl-guts-l] [14643] bioperl-live/trunk/t/Genpred.t: get rid of
variable redefined warnings
Message-ID: <200804020423.m324NNVA011171@dev.open-bio.org>
Revision: 14643
Author: cjfields
Date: 2008-04-02 00:23:23 -0400 (Wed, 02 Apr 2008)
Log Message:
-----------
get rid of variable redefined warnings
Modified Paths:
--------------
bioperl-live/trunk/t/Genpred.t
Modified: bioperl-live/trunk/t/Genpred.t
===================================================================
--- bioperl-live/trunk/t/Genpred.t 2008-04-02 04:22:48 UTC (rev 14642)
+++ bioperl-live/trunk/t/Genpred.t 2008-04-02 04:23:23 UTC (rev 14643)
@@ -302,8 +302,8 @@
is($fghgene->end(), 1869);
cmp_ok($fghgene->strand(), '<', 0);
-my $i = 0;
-my @num_exons = (2,5,4,8);
+$i = 0;
+@num_exons = (2,5,4,8);
while ($fghgene = $fgh->next_prediction()) {
From bugzilla-daemon at portal.open-bio.org Wed Apr 2 02:12:33 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 2 Apr 2008 02:12:33 -0400
Subject: [Bioperl-guts-l] [Bug 2338] The first 4 bytes of flatfile index is
wrong (--indextype flat)
In-Reply-To:
Message-ID: <200804020612.m326CXLq006569@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2338
chad-bioperl-bugzilla at superfrink.net changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |chad-bioperl-
| |bugzilla at superfrink.net
------- Comment #4 from chad-bioperl-bugzilla at superfrink.net 2008-04-02 02:12 EST -------
I tried this out on a Fedora 6 machine and saw the described behaviour. I made
the following change to Bio::DB::Flat::BinarySearch and the problem went away.
I have not tested other programs and do not know how this change will impact
programs that rely on the original file format.
Regards,
Chad
# head -2 BinarySearch.pm
# $Id: BinarySearch.pm,v 1.23.4.1 2006/10/02 23:10:16 sendu Exp $
# diff -u BinarySearch.pm BinarySearch.pm.orig
--- BinarySearch.pm 2008-04-01 23:59:29.000000000 -0600
+++ BinarySearch.pm.orig 2007-04-19 22:10:40.000000000 -0600
@@ -915,7 +915,7 @@
$self->{_maxlengthlength} + 3;
- print $INDEX sprintf("%04d",$recordlength);
+ print $INDEX sprintf("%4d",$recordlength);
foreach my $id (@ids) {
@@ -982,7 +982,7 @@
my $fh = $self->new_secondary_filehandle($name);
- print $fh sprintf("%04d",$length);
+ print $fh sprintf("%4d",$length);
@seconds = sort @seconds;
foreach my $second (@seconds) {
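For reference, the two format strings differ only in the padding character. Because the diff above was run with the modified file first and the .orig file second, the lines marked '-' appear to be the patched version. A small sketch, with a hypothetical record length:

```perl
use strict;
use warnings;

my $recordlength = 87;    # hypothetical value, for illustration only

my $space_padded = sprintf("%4d",  $recordlength);   # padded with spaces
my $zero_padded  = sprintf("%04d", $recordlength);   # padded with zeros

print "[$space_padded] [$zero_padded]\n";
```

Both results are four characters wide, so the record layout itself is unchanged; only readers that expect digits in all four positions would notice the difference.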
From avilella at dev.open-bio.org Wed Apr 2 09:50:46 2008
From: avilella at dev.open-bio.org (Albert Vilella)
Date: Wed, 2 Apr 2008 09:50:46 -0400
Subject: [Bioperl-guts-l] [14644]
bioperl-run/trunk/Bio/Tools/Run/Phylo/Hyphy/FEL.pm: recently seems to need
comma-separated values for the header in the results -- not sure if this
was a problem at the very beginning
Message-ID: <200804021350.m32DokM4018801@dev.open-bio.org>
Revision: 14644
Author: avilella
Date: 2008-04-02 09:50:45 -0400 (Wed, 02 Apr 2008)
Log Message:
-----------
recently seems to need comma-separated values for the header in the results -- not sure if this was a problem at the very beginning
Modified Paths:
--------------
bioperl-run/trunk/Bio/Tools/Run/Phylo/Hyphy/FEL.pm
Modified: bioperl-run/trunk/Bio/Tools/Run/Phylo/Hyphy/FEL.pm
===================================================================
--- bioperl-run/trunk/Bio/Tools/Run/Phylo/Hyphy/FEL.pm 2008-04-02 04:23:23 UTC (rev 14643)
+++ bioperl-run/trunk/Bio/Tools/Run/Phylo/Hyphy/FEL.pm 2008-04-02 13:50:45 UTC (rev 14644)
@@ -239,7 +239,7 @@
push @{$results->{$elems[$i]}}, $values[$i];
}
} else {
- @elems = split("\t",$_);
+ @elems = split("\,",$_);
$readed_header = 1;
}
}
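The effect of the delimiter change can be seen with a made-up header line. This is a sketch only; the column names are hypothetical and not taken from real HyPhy output.

```perl
use strict;
use warnings;

my $header = "colA,colB,colC";    # hypothetical comma-separated header

my @by_tab   = split("\t", $header);   # no tabs present: one field
my @by_comma = split(",",  $header);   # three fields

print scalar(@by_tab), " field(s) by tab, ",
      scalar(@by_comma), " field(s) by comma\n";
```

With the old tab delimiter the whole header ends up in a single element, so the per-column arrays built from @elems would all be keyed under one name.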
From cjfields at dev.open-bio.org Wed Apr 2 11:52:41 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Wed, 2 Apr 2008 11:52:41 -0400
Subject: [Bioperl-guts-l] [14645] bioperl-live/trunk: merge back previous
commits from Sendu and I
Message-ID: <200804021552.m32Fqf1a019207@dev.open-bio.org>
Revision: 14645
Author: cjfields
Date: 2008-04-02 11:52:41 -0400 (Wed, 02 Apr 2008)
Log Message:
-----------
merge back previous commits from Sendu and I
Modified Paths:
--------------
bioperl-live/trunk/Build.PL
bioperl-live/trunk/ModuleBuildBioperl.pm
Modified: bioperl-live/trunk/Build.PL
===================================================================
--- bioperl-live/trunk/Build.PL 2008-04-02 13:50:45 UTC (rev 14644)
+++ bioperl-live/trunk/Build.PL 2008-04-02 15:52:41 UTC (rev 14645)
@@ -15,6 +15,8 @@
our @drivers;
+my $mysql_ok = 0;
+
# Set up the ModuleBuildBioperl object
my $build = ModuleBuildBioperl->new(
module_name => 'Bio',
@@ -92,7 +94,7 @@
BioDBSeqFeature_mysql => {
description => "MySQL tests for Bio::DB::SeqFeature::Store",
feature_requires => { 'DBI' => 0, 'DBD::mysql' => 0 },
- test => \&test_db
+ test => \&test_db_sf
},
Network => {
description => "Enable tests that need an internet connection",
@@ -109,8 +111,12 @@
my $accept = $build->args->{accept};
+prompt_for_biodb($accept) if $build->feature('BioDBGFF') || $build->feature('BioDBSeqFeature_mysql');
+
# Handle auto features
-if ($build->feature('BioDBSeqFeature_BDB')) {
+if ($build->feature('BioDBSeqFeature_BDB') && $mysql_ok) {
+ # will return without doing anything if user chose not to run tests during
+ # prompt_for_biodb() above
make_bdb_test();
}
if ($build->feature('BioDBSeqFeature_mysql')) {
@@ -119,7 +125,7 @@
# Ask questions
$build->choose_scripts($accept);
-prompt_for_biodbgff($accept) if $build->feature('BioDBGFF');
+#prompt_for_biodbgff($accept) if $build->feature('BioDBGFF');
{
if ($build->args('network')) {
if ($build->feature('Network')) {
@@ -155,7 +161,8 @@
sub make_bdb_test {
my $path0 = File::Spec->catfile('t', 'BioDBSeqFeature.t');
my $path = File::Spec->catfile('t', 'BioDBSeqFeature_BDB.t');
- open my $F, ">$path";
+ unlink($path) if (-e $path);
+ open(my $F, ">", $path) || die "Can't create test file\n";
print $F <add_to_manifest_skip($path);
}
-sub test_db {
+sub test_db_sf {
eval {require DBI;}; # if not installed, this sub won't actually be called
- unless (eval {DBI->connect('dbi:mysql:test',undef,undef,{RaiseError=>0,PrintError=>0})}) {
- return "Could not connect to test database";
+ @drivers = DBI->available_drivers;
+ unless (grep {/mysql/i} @drivers) {
+ $mysql_ok = 0;
+ return "Only MySQL DBI driver supported for BioDBSeqFeature_mysql tests";
}
+ $mysql_ok = 1;
return;
}
sub make_dbi_test {
+ my $dsn = $build->notes('test_dsn') || return;
my $path0 = File::Spec->catfile('t', 'BioDBSeqFeature.t');
my $path = File::Spec->catfile('t', 'BioDBSeqFeature_mysql.t');
+ my $test_db = $build->notes('test_db');
+ my $user = $build->notes('test_user');
+ my $pass = $build->notes('test_pass');
open my $F,">$path";
+ my $str = "$path0 -adaptor DBI::mysql -create 1 -temp 1 -dsn $dsn";
+ $str .= " -user $user" if $user;
+ $str .= " -password $pass" if $pass;
print $F <add_to_cleanup($path);
@@ -193,10 +210,11 @@
return;
}
-sub prompt_for_biodbgff {
+sub prompt_for_biodb {
my $accept = shift;
- my $proceed = $accept
- ? 0 : $build->y_n("Do you want to run the BioDBGFF live database tests? y/n", 'n');
+ my $proceed = $accept ? 0 : $build->y_n("Do you want to run the BioDBGFF or ".
+ "BioDBSeqFeature_mysql live database tests? ".
+ "y/n", 'n');
if ($proceed) {
my @driver_choices;
@@ -239,9 +257,11 @@
my $test_dsn;
if ($driver eq 'Pg') {
$test_dsn = "dbi:$driver:dbname=$test_db";
+ $mysql_ok = 0;
}
else {
$test_dsn = "dbi:$driver:database=$test_db";
+ $mysql_ok = 1;
}
if ($use_host) {
$test_dsn .= ";host=$test_host";
@@ -254,15 +274,17 @@
$build->notes(test_pass => $test_pass eq 'undef' ? undef : $test_pass);
$build->notes(test_dsn => $test_dsn);
- $build->log_info(" - will run the BioDBGFF tests with database driver '$driver' and these settings:\n",
+ $build->log_info(" - will run tests with database driver '$driver' and these settings:\n",
" Database $test_db\n",
" Host $test_host\n",
" DSN $test_dsn\n",
" User $test_user\n",
" Password $test_pass\n");
+ $build->log_info(" - will not run the BioDBSeqFeature_mysql live ".
+ "database tests (requires MySQL driver)\n") unless $mysql_ok;
}
else {
- $build->log_info(" - will not run the BioDBGFF live database tests\n");
+ $build->log_info(" - will not run the BioDBGFF or BioDBSeqFeature live database tests\n");
}
$build->log_info("\n");
Modified: bioperl-live/trunk/ModuleBuildBioperl.pm
===================================================================
--- bioperl-live/trunk/ModuleBuildBioperl.pm 2008-04-02 13:50:45 UTC (rev 14644)
+++ bioperl-live/trunk/ModuleBuildBioperl.pm 2008-04-02 15:52:41 UTC (rev 14645)
@@ -282,7 +282,17 @@
my $status = {};
if ($type eq 'test') {
unless (keys %$out) {
- $status->{message} = &{$prereqs};
+ if (ref($prereqs) eq 'CODE') {
+ $status->{message} = &{$prereqs};
+
+ # drop the code-ref to avoid Module::Build trying to store
+ # it with Data::Dumper, generating warnings. (And also, may
+ # be expensive to run the sub multiple times.)
+ $info->{$type} = $status->{message};
+ }
+ else {
+ $status->{message} = $prereqs;
+ }
$out->{$type}{'test'} = $status if $status->{message};
}
}
@@ -336,6 +346,11 @@
}
elsif ($type =~ /^feature_requires/) {
next if $status->{ok};
+
+ # if there is a test code-ref, drop it to avoid
+ # Module::Build trying to store it with Data::Dumper,
+ # generating warnings.
+ delete $info->{test};
}
else {
next if $status->{ok};
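The idea behind the ModuleBuildBioperl.pm change, running a code ref once and keeping only its result, can be sketched as below. The helper name resolve_test and the %info layout are assumptions made for illustration; they are not the actual Module::Build internals.

```perl
use strict;
use warnings;

# Hypothetical helper: the 'test' entry may be a code ref or an
# already-computed message (undef meaning "no problem").
sub resolve_test {
    my ($info) = @_;
    my $prereq = $info->{test};
    if (ref($prereq) eq 'CODE') {
        my $message = $prereq->();
        # Store the result in place of the code ref, so the sub is not
        # run again and nothing tries to serialize a code ref later.
        $info->{test} = $message;
        return $message;
    }
    return $prereq;
}

my $calls = 0;
my %info  = (
    test => sub { $calls++; return "Could not connect to test database" },
);

my $first  = resolve_test(\%info);
my $second = resolve_test(\%info);

print "$first (sub ran $calls time(s))\n";
```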
From cjfields at dev.open-bio.org Wed Apr 2 12:24:49 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Wed, 2 Apr 2008 12:24:49 -0400
Subject: [Bioperl-guts-l] [14646] bioperl-live/trunk/Bio/Tools/Fgenesh.pm:
Fix bad silent bug which doesn't set exon tags correctly ( showed up when
using -W flag with tests)
Message-ID: <200804021624.m32GOn9I019260@dev.open-bio.org>
Revision: 14646
Author: cjfields
Date: 2008-04-02 12:24:49 -0400 (Wed, 02 Apr 2008)
Log Message:
-----------
Fix bad silent bug which doesn't set exon tags correctly (showed up when using -W flag with tests)
Modified Paths:
--------------
bioperl-live/trunk/Bio/Tools/Fgenesh.pm
Modified: bioperl-live/trunk/Bio/Tools/Fgenesh.pm
===================================================================
--- bioperl-live/trunk/Bio/Tools/Fgenesh.pm 2008-04-02 15:52:41 UTC (rev 14645)
+++ bioperl-live/trunk/Bio/Tools/Fgenesh.pm 2008-04-02 16:24:49 UTC (rev 14646)
@@ -275,7 +275,7 @@
}
# split into fields
chomp();
- my @flds = split(/\s+/, ' ' . $line);
+ my @flds = split(/\s+/, ' ' . $line);
## NB - the above adds leading whitespace before the gene
## number in case there was none (as quick patch to code
## below which expects it but it is not present after 999
@@ -320,7 +320,7 @@
# are set, in order to allow for proper expansion of the range)
if($is_exon) {
# first, set fields unique to exons
- $predobj->primary_tag($ExonTags{$flds[3]} . 'Exon');
+ $predobj->primary_tag($ExonTags{$flds[4]} . 'Exon');
$predobj->is_coding(1);
my $cod_offset;
if($predobj->strand() == 1) {
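Why the index moves from 3 to 4 can be checked with a small sketch. The sample line and the %ExonTags entries below are assumptions for illustration (the tag names are inferred from the 'InitialExon' and 'TerminalExon' values checked in t/Genpred.t); they are not copied from Fgenesh.pm.

```perl
use strict;
use warnings;

# Assumed subset of the feature-to-tag mapping.
my %ExonTags = (CDSf => 'Initial', CDSl => 'Terminal');

# Hypothetical exon line: gene number, strand, exon number, feature, ...
my $line = "   1 +    1 CDSf     14778 -     15104   27.73";

# split on /\s+/ keeps an empty leading field when the string starts
# with whitespace, so every real column is shifted up by one.
my @flds = split(/\s+/, ' ' . $line);

my $tag = $ExonTags{ $flds[4] } . 'Exon';

print "field 0 is empty: ", ($flds[0] eq '' ? 'yes' : 'no'), "\n";
print "field 3: $flds[3]\n";
print "field 4: $flds[4] => $tag\n";
```

Under these assumptions field 3 holds the exon number, which has no entry in the mapping; that would explain why the old index produced a bare 'Exon' tag plus an uninitialized-value warning under -W.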
From cjfields at dev.open-bio.org Wed Apr 2 12:28:57 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Wed, 2 Apr 2008 12:28:57 -0400
Subject: [Bioperl-guts-l] [14647] bioperl-live/trunk/t/Genpred.t: Add some
tests to catch tag naming
Message-ID: <200804021628.m32GSvPn019288@dev.open-bio.org>
Revision: 14647
Author: cjfields
Date: 2008-04-02 12:28:57 -0400 (Wed, 02 Apr 2008)
Log Message:
-----------
Add some tests to catch tag naming
Modified Paths:
--------------
bioperl-live/trunk/t/Genpred.t
Modified: bioperl-live/trunk/t/Genpred.t
===================================================================
--- bioperl-live/trunk/t/Genpred.t 2008-04-02 16:24:49 UTC (rev 14646)
+++ bioperl-live/trunk/t/Genpred.t 2008-04-02 16:28:57 UTC (rev 14647)
@@ -7,7 +7,7 @@
use lib 't/lib';
use BioperlTest;
- test_begin(-tests => 180);
+ test_begin(-tests => 182);
use_ok('Bio::Tools::Fgenesh');
use_ok('Bio::Tools::Genscan');
@@ -313,9 +313,11 @@
if ($i == 2) {
cmp_ok($fghexons[0]->strand(), '>', 0);
+ is($fghexons[0]->primary_tag(), 'InitialExon');
is($fghexons[0]->start(), 14778);
is($fghexons[0]->end(), 15104);
cmp_ok($fghexons[3]->strand(), '>', 0);
+ is($fghexons[3]->primary_tag(), 'TerminalExon');
is($fghexons[3]->start(), 16988);
is($fghexons[3]->end(), 17212);
}
From fangly at dev.open-bio.org Wed Apr 2 17:36:46 2008
From: fangly at dev.open-bio.org (Florent E Angly)
Date: Wed, 2 Apr 2008 17:36:46 -0400
Subject: [Bioperl-guts-l] [14648] bioperl-live/trunk/Bio/Assembly: Misc
cleaning, and bug fixing
Message-ID: <200804022136.m32Lakuo019678@dev.open-bio.org>
Revision: 14648
Author: fangly
Date: 2008-04-02 17:36:46 -0400 (Wed, 02 Apr 2008)
Log Message:
-----------
Misc cleaning, and bug fixing
Bio::Assembly::Contig
Fixed bug: replaced an occurrence of 'elem' by '_elem'
Bio::Assembly::Scaffold
Fixed bug: scaffold source is implemented, stored in the scaffold at scaffold creation
Fixed bug: contig or singlet addition to a scaffold now really creates a reference to that scaffold as a contig or singlet attribute
Improvement: a list of singlets can be given at scaffold creation (just like a list of contigs can be specified)
Improvement: adding a non singlet object using the add_singlet method is now a fatal error (like all other such errors)
Improvement: adding singlet to scaffold now generates a singlet name if singlet is unnamed
Improvement: method update_seq_list now also updates singlets
Improvement: adding a singlet to a scaffold now puts the singlet sequence in the list of sequences belonging to the scaffold (just like for contigs)
Improvement: implemented 'remove_features_collection' method
Bio::Assembly::IO::Ace
Fixed bug: under certain conditions, the list of scaffold sequences ('_seqs') was not populated
Modified Paths:
--------------
bioperl-live/trunk/Bio/Assembly/Contig.pm
bioperl-live/trunk/Bio/Assembly/IO/ace.pm
bioperl-live/trunk/Bio/Assembly/Scaffold.pm
bioperl-live/trunk/Bio/Assembly/Singlet.pm
Modified: bioperl-live/trunk/Bio/Assembly/Contig.pm
===================================================================
--- bioperl-live/trunk/Bio/Assembly/Contig.pm 2008-04-02 16:28:57 UTC (rev 14647)
+++ bioperl-live/trunk/Bio/Assembly/Contig.pm 2008-04-02 21:36:46 UTC (rev 14648)
@@ -220,17 +220,16 @@
Usage : my $contig = Bio::Assembly::Contig->new();
Function : Creates a new contig object
Returns : Bio::Assembly::Contig
- Args : -source => string representing the source
- program where this contig came
- from
- -id => contig unique ID
+ Args : -id => contig unique ID
+ -source => string for the sequence assembly program used
+
=cut
#-----------
sub new {
#-----------
- my ($class,@args) = @_;
+ my ($class, @args) = @_;
my $self = $class->SUPER::new(@args);
@@ -243,8 +242,8 @@
# Bio::SimpleAlign derived fields (check which ones are needed for AlignI compatibility)
$self->{'_elem'} = {}; # contig elements: aligned sequence objects (keyed by ID)
$self->{'_order'} = {}; # store sequence order
-# $self->{'start_end_lists'} = {}; # References to entries in {'_seq'}. Keyed by seq ids.
-# $self->{'_dis_name'} = {}; # Display names for each sequence
+ # $self->{'start_end_lists'} = {}; # References to entries in {'_seq'}. Keyed by seq ids.
+ # $self->{'_dis_name'} = {}; # Display names for each sequence
$self->{'_symbols'} = {}; # List of symbols
#Contig specific slots
@@ -252,10 +251,10 @@
$self->{'_consensus_quality'} = undef;
$self->{'_nof_residues'} = 0;
$self->{'_nof_seqs'} = 0;
-# $self->{'_nof_segments'} = 0; # Let's not make it heavier than needed by now...
+ # $self->{'_nof_segments'} = 0; # Let's not make it heavier than needed by now...
$self->{'_sfc'} = Bio::SeqFeature::Collection->new();
- # Assembly specifcs
+ # Assembly specifics
$self->{'_assembly'} = undef; # Reference to a Bio::Assembly::Scaffold object, if contig belongs to one.
$self->{'_strand'} = 0; # Reverse (-1) or forward (1), if contig is in a scaffold. 0 otherwise
$self->{'_neighbor_start'} = undef; # Will hold a reference to another contig
@@ -305,7 +304,7 @@
my $assembly = shift;
$self->throw("Using non Bio::Assembly::Scaffold object when assign contig to assembly")
- if (defined $assembly && ! $assembly->isa("Bio::Assembly::Scaffold"));
+ if (defined $assembly && ! $assembly->isa("Bio::Assembly::Scaffold"));
$self->{'_assembly'} = $assembly if (defined $assembly);
return $self->{'_assembly'};
@@ -330,11 +329,10 @@
my $self = shift;
my $ori = shift;
- if (defined $ori) {
- $self->throw("Contig strand must be either 1, -1 or 0")
+ if (defined $ori) {
+ $self->throw("Contig strand must be either 1, -1 or 0")
unless $ori == 1 || $ori == 0 || $ori == -1;
-
- $self->{'_strand'} = $ori;
+ $self->{'_strand'} = $ori;
}
return $self->{'_strand'};
@@ -357,7 +355,7 @@
my $ref = shift;
$self->throw("Trying to assign a non Bio::Assembly::Contig object to upstream contig")
- if (defined $ref && ! $ref->isa("Bio::Assembly::Contig"));
+ if (defined $ref && ! $ref->isa("Bio::Assembly::Contig"));
$self->{'_neighbor_start'} = $ref if (defined $ref);
return $self->{'_neighbor_start'};
@@ -380,7 +378,7 @@
my $ref = shift;
$self->throw("Trying to assign a non Bio::Assembly::Contig object to downstream contig")
- if (defined $ref && ! $ref->isa("Bio::Assembly::Contig"));
+ if (defined $ref && ! $ref->isa("Bio::Assembly::Contig"));
$self->{'_neighbor_end'} = $ref if (defined $ref);
return $self->{'_neighbor_end'};
}
@@ -424,20 +422,20 @@
# Adding shortcuts for aligned sequence features
$flag = 0 unless (defined $flag);
if ($flag && defined $self->{'_consensus_sequence'}) {
- foreach my $feat (@$args) {
- next if (defined $feat->seq);
- $feat->attach_seq($self->{'_consensus_sequence'});
- }
+ foreach my $feat (@$args) {
+ next if (defined $feat->seq);
+ $feat->attach_seq($self->{'_consensus_sequence'});
+ }
} elsif (!$flag) { # Register aligned sequence features
- foreach my $feat (@$args) {
- if (my $seq = $feat->entire_seq()) {
- my $seqID = $seq->id() || $seq->display_id || $seq->primary_id;
- $self->warn("Adding contig feature attached to unknown sequence $seqID!")
- unless (exists $self->{'_elem'}{$seqID});
- my $tag = $feat->primary_tag;
- $self->{'_elem'}{$seqID}{'_feat'}{$tag} = $feat;
- }
- }
+ foreach my $feat (@$args) {
+ if (my $seq = $feat->entire_seq()) {
+ my $seqID = $seq->id() || $seq->display_id || $seq->primary_id;
+ $self->warn("Adding contig feature attached to unknown sequence $seqID!")
+ unless (exists $self->{'_elem'}{$seqID});
+ my $tag = $feat->primary_tag;
+ $self->{'_elem'}{$seqID}{'_feat'}{$tag} = $feat;
+ }
+ }
}
# Add feature to feature collection
@@ -461,16 +459,17 @@
# Removing shortcuts for aligned sequence features
foreach my $feat (@args) {
- if (my $seq = $feat->entire_seq()) {
- my $seqID = $seq->id() || $seq->display_id || $seq->primary_id;
- my $tag = $feat->primary_tag;
- $tag =~ s/:$seqID$/$1/g;
- delete( $self->{'_elem'}{$seqID}{'_feat'}{$tag} )
- if (exists $self->{'_elem'}{$seqID}{'_feat'}{$tag} &&
- $self->{'_elem'}{$seqID}{'_feat'}{$tag} eq $feat);
- }
+ if (my $seq = $feat->entire_seq()) {
+ my $seqID = $seq->id() || $seq->display_id || $seq->primary_id;
+ my $tag = $feat->primary_tag;
+ $tag =~ s/:$seqID$/$1/g;
+ delete( $self->{'_elem'}{$seqID}{'_feat'}{$tag} )
+ if (exists $self->{'_elem'}{$seqID}{'_feat'}{$tag} &&
+ $self->{'_elem'}{$seqID}{'_feat'}{$tag} eq $feat);
+ }
}
-
+
+ # Removing Bio::SeqFeature::Collection features
return $self->{'_sfc'}->remove_features(\@args);
}
@@ -486,10 +485,31 @@
sub get_features_collection {
my $self = shift;
-
return $self->{'_sfc'};
}
+=head2 remove_features_collection
+
+ Title : remove_features_collection
+ Usage : $contig->remove_features_collection()
+ Function : Remove the collection of all contig features. It is useful
+ to save some memory (when contig features are not needed).
+ Returns : none
+ Argument : none
+
+=cut
+
+sub remove_features_collection {
+ my $self = shift;
+ # Removing shortcuts for aligned sequence features
+ for my $seqID (keys %{$self->{'_elem'}}) {
+ delete $self->{'_elem'}{$seqID};
+ }
+ # Removing Bio::SeqFeature::Collection features
+ $self->{'_sfc'} = {};
+ return;
+}
+
=head1 Coordinate system's related methods
See L above.
@@ -533,140 +553,140 @@
my $out_ID = ( split(' ',$type_out) )[1];
if ($in_ID ne 'consensus') {
- $read_in = $self->get_seq_coord( $self->get_seq_by_name($in_ID) );
- $self->throw("Can't change coordinates without sequence location for $in_ID")
- unless (defined $read_in);
+ $read_in = $self->get_seq_coord( $self->get_seq_by_name($in_ID) );
+ $self->throw("Can't change coordinates without sequence location for $in_ID")
+ unless (defined $read_in);
}
if ($out_ID ne 'consensus') {
- $read_out = $self->get_seq_coord( $self->get_seq_by_name($out_ID) );
- $self->throw("Can't change coordinates without sequence location for $out_ID")
- unless (defined $read_out);
+ $read_out = $self->get_seq_coord( $self->get_seq_by_name($out_ID) );
+ $self->throw("Can't change coordinates without sequence location for $out_ID")
+ unless (defined $read_out);
}
# Performing transformation between coordinates
- SWITCH1: {
+ SWITCH1: {
- # Transformations between contig padded and contig unpadded
- (($type_in eq 'gapped consensus') && ($type_out eq 'ungapped consensus')) && do {
- $self->throw("Can't use ungapped consensus coordinates without a consensus sequence")
- unless (defined $self->{'_consensus_sequence'});
- $query = &_padded_unpadded($self->{'_consensus_gaps'}, $query);
- last SWITCH1;
- };
- (($type_in eq 'ungapped consensus') && ($type_out eq 'gapped consensus')) && do {
- $self->throw("Can't use ungapped consensus coordinates without a consensus sequence")
- unless (defined $self->{'_consensus_sequence'});
- $query = &_unpadded_padded($self->{'_consensus_gaps'},$query);
- last SWITCH1;
- };
+ # Transformations between contig padded and contig unpadded
+ (($type_in eq 'gapped consensus') && ($type_out eq 'ungapped consensus')) && do {
+ $self->throw("Can't use ungapped consensus coordinates without a consensus sequence")
+ unless (defined $self->{'_consensus_sequence'});
+ $query = &_padded_unpadded($self->{'_consensus_gaps'}, $query);
+ last SWITCH1;
+ };
+ (($type_in eq 'ungapped consensus') && ($type_out eq 'gapped consensus')) && do {
+ $self->throw("Can't use ungapped consensus coordinates without a consensus sequence")
+ unless (defined $self->{'_consensus_sequence'});
+ $query = &_unpadded_padded($self->{'_consensus_gaps'},$query);
+ last SWITCH1;
+ };
- # Transformations between contig (padded) and read (padded)
- (($type_in eq 'gapped consensus') &&
- ($type_out =~ /^aligned /) && defined($read_out)) && do {
- $query = $query - $read_out->start() + 1;
- last SWITCH1;
- };
- (($type_in =~ /^aligned /) && defined($read_in) &&
- ($type_out eq 'gapped consensus')) && do {
- $query = $query + $read_in->start() - 1;
- last SWITCH1;
- };
+ # Transformations between contig (padded) and read (padded)
+ (($type_in eq 'gapped consensus') &&
+ ($type_out =~ /^aligned /) && defined($read_out)) && do {
@@ Diff output truncated at 10000 characters. @@
From lstein at dev.open-bio.org Thu Apr 3 09:24:40 2008
From: lstein at dev.open-bio.org (Lincoln Stein)
Date: Thu, 3 Apr 2008 09:24:40 -0400
Subject: [Bioperl-guts-l] [14649]
bioperl-live/trunk/Bio/Graphics/FeatureFile.pm: restored the ability to
have a "tag=value #comment" style comment
Message-ID: <200804031324.m33DOekg022361@dev.open-bio.org>
Revision: 14649
Author: lstein
Date: 2008-04-03 09:24:40 -0400 (Thu, 03 Apr 2008)
Log Message:
-----------
restored the ability to have a "tag=value #comment" style comment
Modified Paths:
--------------
bioperl-live/trunk/Bio/Graphics/FeatureFile.pm
Modified: bioperl-live/trunk/Bio/Graphics/FeatureFile.pm
===================================================================
--- bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-04-02 21:36:46 UTC (rev 14648)
+++ bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-04-03 13:24:40 UTC (rev 14649)
@@ -514,6 +514,8 @@
my $self = shift;
local $_ = shift;
+ s/\s+\#.*$//; # strip right-column comments
+
if (/^\s+(.+)/ && $self->{current_tag}) { # configuration continuation line
my $value = $1;
my $cc = $self->{current_config} ||= 'general'; # in case no configuration named
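
Note on the substitution above: a minimal standalone sketch of what the added regex does to a "tag=value #comment" line. The helper name strip_right_comment and the sample lines are made up for illustration and are not part of FeatureFile.pm.

```perl
use strict;
use warnings;

# Strip a trailing "#comment" that is preceded by whitespace,
# using the same substitution that r14649 adds.
# (strip_right_comment is a hypothetical helper, not a BioPerl method.)
sub strip_right_comment {
    my ($line) = @_;
    $line =~ s/\s+\#.*$//;
    return $line;
}

for my $line ('bgcolor = blue   # track colour',
              'bgcolor=#FF0000',
              'glyph=generic') {
    printf "%-35s => '%s'\n", "'$line'", strip_right_comment($line);
}
```

Because the pattern requires whitespace before the '#', a value such as bgcolor=#FF0000 is left alone, whereas a '#' that follows a space is treated as the start of a comment.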
From lstein at dev.open-bio.org Thu Apr 3 09:58:49 2008
From: lstein at dev.open-bio.org (Lincoln Stein)
Date: Thu, 3 Apr 2008 09:58:49 -0400
Subject: [Bioperl-guts-l] [14650] bioperl-live/trunk/Bio: added an API to
prevent FeatureFile->render() from rendering indiscriminately without
paying attention to the seq_id of the underlying reference sequence
Message-ID: <200804031358.m33Dwns6022456@dev.open-bio.org>
Revision: 14650
Author: lstein
Date: 2008-04-03 09:58:49 -0400 (Thu, 03 Apr 2008)
Log Message:
-----------
added an API to prevent FeatureFile->render() from rendering indiscriminately without paying attention to the seq_id of the underlying reference sequence
Modified Paths:
--------------
bioperl-live/trunk/Bio/DB/SeqFeature/Store/FeatureFileLoader.pm
bioperl-live/trunk/Bio/Graphics/FeatureFile.pm
Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store/FeatureFileLoader.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/SeqFeature/Store/FeatureFileLoader.pm 2008-04-03 13:24:40 UTC (rev 14649)
+++ bioperl-live/trunk/Bio/DB/SeqFeature/Store/FeatureFileLoader.pm 2008-04-03 13:58:49 UTC (rev 14650)
@@ -435,7 +435,7 @@
# either create a new feature or add a segment to it
my $feature = $ld->{CurrentFeature};
if ($feature) {
-
+
# if this is a different feature from what we have now, then we
# store the current one, and create a new one
if ($feature->display_name ne $name ||
Modified: bioperl-live/trunk/Bio/Graphics/FeatureFile.pm
===================================================================
--- bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-04-03 13:24:40 UTC (rev 14649)
+++ bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-04-03 13:58:49 UTC (rev 14650)
@@ -279,7 +279,9 @@
sub render {
my $self = shift;
my $panel = shift;
- my ($position_to_insert,$options,$max_bump,$max_label,$selector) = @_;
+ my ($position_to_insert,$options,
+ $max_bump,$max_label,
+ $selector,$range) = @_;
my %seenit;
$panel ||= $self->new_panel;
@@ -296,7 +298,7 @@
} map {
shellwords ($self->setting($_=>'feature')||$_) } @labels;
my %lc_types = map {lc($_)}%types;
-
+
my @unconfigured_types = sort grep {!exists $lc_types{lc $_} &&
!exists $lc_types{lc $_->method}
} $self->types;
@@ -332,7 +334,12 @@
next if defined $selector and !$selector->($self,$label);
- my @features = grep {$self->_visible($_)} $self->features(\@types);
+ my @features = !$range ? grep {$self->_visible($_)} $self->features(\@types)
+ : $self->features(-types => \@types,
+ -seq_id => $range->seq_id,
+ -start => $range->start,
+ -end => $range->end
+ );
next unless @features; # suppress tracks for features that don't appear
@@ -343,7 +350,6 @@
my @auto_bump;
push @auto_bump,(-bump => @$features < $max_bump) if defined $max_bump;
push @auto_bump,(-label => @$features < $max_label) if defined $max_label;
-
my @config = ( -glyph => 'segments', # really generic
-bgcolor => $COLORS[$color++ % @COLORS],
@@ -944,6 +950,8 @@
 $features = $features-E<gt>features(-type=>'a type');
 $iterator = $features-E<gt>features(-type=>'a type',-iterator=>1);
+ $iterator = $features-E<gt>features(-type=>'a type',-seq_id=>$id,-start=>$start,-end=>$end);
+
=back
=cut
@@ -951,10 +959,16 @@
# return features
sub features {
my $self = shift;
- my ($types,$iterator,@rest) = defined($_[0] && $_[0]=~/^-/)
- ? rearrange([['TYPE','TYPES']],@_) : (\@_);
+ my ($types,$iterator,$seq_id,$start,$end,@rest) = defined($_[0] && $_[0]=~/^-/)
+ ? rearrange([['TYPE','TYPES'],'ITERATOR','SEQ_ID','START','END'],@_) : (\@_);
+
$types = [$types] if $types && !ref($types);
my @args = $types && @$types ? (-type=>$types) : ();
+
+ push @args,(-seq_id => $seq_id) if $seq_id;
+ push @args,(-start => $start) if defined $start;
+ push @args,(-end => $end) if defined $end;
+
my $db = $self->db;
if ($iterator) {
From bugzilla-daemon at portal.open-bio.org Sun Apr 6 23:17:03 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 6 Apr 2008 23:17:03 -0400
Subject: [Bioperl-guts-l] [Bug 2337] BDB flatfile index should store global
configuration data in BDB
In-Reply-To:
Message-ID: <200804070317.m373H35T008698@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2337
------- Comment #3 from cjfields at uiuc.edu 2008-04-06 23:17 EST -------
(In reply to comment #2)
> Naohisa is right, the spec says that some meta-data should go into a BerkeleyDB
> db. But it looks like what Lincoln did was to put information on primary and
> secondary namespaces in a human-readable file called config.dat. He did not put
> it into a BerkeleyDB db. His decision was the correct one, this information
> should be human-readable. Of course for OBDA to work across platforms Bioperl
> has to do what the other platforms do, even if it's not the right approach.
> Hmm.
I agree the namespace info should be human-readable. Maybe the best solution
is to require BDB storage as stated in the OBDA spec, and modify the spec to
(optionally) allow adding human-readable data to config.dat.
Might be something to bring up at BOSC, to see how other OBDA implementations
in other Bio* langs are doing this.
From cjfields at dev.open-bio.org Mon Apr 7 14:24:43 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Mon, 7 Apr 2008 14:24:43 -0400
Subject: [Bioperl-guts-l] [14651] bioperl-live/trunk/Bio: updates;
may switch some data objects over to (lightweight) Data::Stag
implementation at some point
Message-ID: <200804071824.m37IOheV006309@dev.open-bio.org>
Revision: 14651
Author: cjfields
Date: 2008-04-07 14:24:42 -0400 (Mon, 07 Apr 2008)
Log Message:
-----------
updates; may switch some data objects over to (lightweight) Data::Stag implementation at some point
Modified Paths:
--------------
bioperl-live/trunk/Bio/DB/EUtilParameters.pm
bioperl-live/trunk/Bio/DB/GenericWebAgent.pm
bioperl-live/trunk/Bio/Tools/EUtilities/Summary/DocSum.pm
bioperl-live/trunk/Bio/Tools/EUtilities/Summary/Item.pm
bioperl-live/trunk/Bio/Tools/EUtilities.pm
Modified: bioperl-live/trunk/Bio/DB/EUtilParameters.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/EUtilParameters.pm 2008-04-03 13:58:49 UTC (rev 14650)
+++ bioperl-live/trunk/Bio/DB/EUtilParameters.pm 2008-04-07 18:24:42 UTC (rev 14651)
@@ -561,7 +561,6 @@
{
# default retmode if one is not supplied
my %NCBI_DATABASE = (
- 'pubmed' => 'xml',
'protein' => 'text',
'nucleotide' => 'text',
'nuccore' => 'text',
@@ -569,42 +568,16 @@
'nucest' => 'text',
'structure' => 'text',
'genome' => 'text',
- 'books' => 'xml',
- 'cancerchromosomes'=> 'xml',
- 'cdd' => 'xml',
- 'domains' => 'xml',
'gene' => 'asn1',
- 'genomeprj' => 'xml',
- 'gensat' => 'xml',
- 'geo' => 'xml',
- 'gds' => 'xml',
- 'homologene' => 'xml',
'journals' => 'text',
- 'mesh' => 'xml',
- 'ncbisearch' => 'xml',
- 'nlmcatalog' => 'xml',
- 'omia' => 'xml',
- 'omim' => 'xml',
- 'pmc' => 'xml',
- 'popset' => 'xml',
- 'probe' => 'xml',
- 'pcassay' => 'xml',
- 'pccompound' => 'xml',
- 'pcsubstance' => 'xml',
- 'snp' => 'xml',
- 'taxonomy' => 'xml',
- 'unigene' => 'xml',
- 'unists' => 'xml',
);
sub set_default_retmode {
my $self = shift;
if ($self->eutil eq 'efetch') {
my $db = $self->db || return; # assume retmode will be set along with db
- $self->throw('Database $db not recognized')
- if !exists $NCBI_DATABASE{$db};
- # set efetch-based retmode
- $self->retmode($NCBI_DATABASE{$db});
+ my $mode = exists $NCBI_DATABASE{$db} ? $NCBI_DATABASE{$db} : 'xml';
+ $self->retmode($mode);
} else {
$self->retmode('xml');
}
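
Note on the hunk above: the table now lists only the databases whose default retmode is not XML, and anything else falls back to 'xml' instead of throwing. A minimal sketch of that lookup outside the class follows; the names %NON_XML_RETMODE and default_retmode are invented for illustration, and the table is only a subset of the one in the patch.

```perl
use strict;
use warnings;

# Only databases whose default differs from XML need an entry
# (illustrative subset of the table in the patch).
my %NON_XML_RETMODE = (
    protein    => 'text',
    nucleotide => 'text',
    gene       => 'asn1',
);

# Return the default retmode for $db, falling back to 'xml'
# for any database that is not listed.
sub default_retmode {
    my ($db) = @_;
    return unless defined $db;
    return exists $NON_XML_RETMODE{$db} ? $NON_XML_RETMODE{$db} : 'xml';
}

printf "%-10s %s\n", $_, default_retmode($_) for qw(protein gene pubmed);
```

One consequence of this design is that a misspelled database name is no longer caught here; it simply gets the XML default.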
Modified: bioperl-live/trunk/Bio/DB/GenericWebAgent.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/GenericWebAgent.pm 2008-04-03 13:58:49 UTC (rev 14650)
+++ bioperl-live/trunk/Bio/DB/GenericWebAgent.pm 2008-04-07 18:24:42 UTC (rev 14651)
@@ -1,6 +1,6 @@
# $Id$
#
-# BioPerl module for Bio::DB::EUtilities
+# BioPerl module for Bio::DB::GenericWebAgent
#
# Cared for by Chris Fields
#
@@ -10,11 +10,11 @@
#
# POD documentation - main docs before the code
#
-# Interfaces with new GenericWebDBI interface
+# Interfaces with new GenericWebAgent interface
=head1 NAME
-Bio::DB::GenericWebDBI - helper base class for parameter-based remote server
+Bio::DB::GenericWebAgent - helper base class for parameter-based remote server
access and response retrieval.
=head1 SYNOPSIS
Modified: bioperl-live/trunk/Bio/Tools/EUtilities/Summary/DocSum.pm
===================================================================
--- bioperl-live/trunk/Bio/Tools/EUtilities/Summary/DocSum.pm 2008-04-03 13:58:49 UTC (rev 14650)
+++ bioperl-live/trunk/Bio/Tools/EUtilities/Summary/DocSum.pm 2008-04-07 18:24:42 UTC (rev 14651)
@@ -17,6 +17,8 @@
Bio::DB::EUtilities::Summary::DocSum - data object for document summary data
from esummary
+############ NOTE : Undergoing reimplementation to use simple Data::Stag ############
+
=head1 SYNOPSIS
@@ -128,7 +130,7 @@
Function : iterates through Items (nested layer of Item)
Returns : single Item
Args : [optional] single arg (string)
- 'flattened' - iterates through a flattened list ala
+ 'flatten' - iterates through a flattened list ala
get_all_DocSum_Items()
=cut
@@ -138,7 +140,7 @@
unless ($self->{"_items_it"}) {
#my @items = $self->get_Items;
my @items = ($request && $request eq 'flatten') ?
- $self->get_all_DocSum_Items :
+ $self->get_all_Items :
$self->get_Items ;
$self->{"_items_it"} = sub {return shift @items}
}
@@ -160,10 +162,10 @@
return ref $self->{'_items'} ? @{ $self->{'_items'} } : return ();
}
-=head2 get_all_DocSum_Items
+=head2 get_all_Items
- Title : get_all_DocSum_Items
- Usage : my @items = $docsum->get_all_DocSum_Items
+ Title : get_all_Items
+ Usage : my @items = $docsum->get_all_Items
Function : returns flattened list of all Item objects (Items, ListItems,
StructureItems)
Returns : array of Items
@@ -182,21 +184,60 @@
=cut
-sub get_all_DocSum_Items {
+sub get_all_Items {
my $self = shift;
- my @items;
- for my $item ($self->get_Items) {
- push @items, $item;
- for my $ls ($item->get_ListItems) {
- push @items, $ls;
- for my $st ($ls->get_StructureItems) {
- push @items, $st;
- }
+ unless ($self->{'_ordered_items'}) {
+ for my $item ($self->get_Items) {
+ push @{$self->{'_ordered_items'}}, $item;
+ for my $ls ($item->get_ListItems) {
+ push @{$self->{'_ordered_items'}}, $ls;
+ for my $st ($ls->get_StructureItems) {
+ push @{$self->{'_ordered_items'}}, $st;
+ }
+ }
}
}
- return @items;
+ return @{$self->{'_ordered_items'}};
}
+=head2 get_content_by_name
+
+ Title : get_content_by_Item_name
+ Usage : my $data = get_content_by_name('CreateDate')
+ Function : Returns scalar content for named Item in DocSum (indicated by
+ passed argument)
+ Returns : scalar value (string) if present
+ Args : string (Item name)
+ Warns : If Item with name is not found
+
+=cut
+
+sub get_content_by_name {
+ my ($self, $key) = @_;
+ return unless $key;
+ my ($it) = grep {$_->get_name eq $key} $self->get_all_Items;
+ return $it->get_content;
+}
+
+=head2 get_type_by_name
+
+ Title : get_type_by_name
+ Usage : my $data = get_type_by_name('CreateDate')
+ Function : Returns data type for named Item in DocSum (indicated by
+ passed argument)
+ Returns : scalar value (string) if present
+ Args : string (Item name)
+ Warns : If Item with name is not found
+
+=cut
+
+sub get_type_by_name {
+ my ($self, $key) = @_;
+ return unless $key;
+ my ($it) = grep {$_->get_name eq $key} $self->get_all_Items;
+ return $it->get_type;
+}
+
=head2 rewind
Title : rewind
Modified: bioperl-live/trunk/Bio/Tools/EUtilities/Summary/Item.pm
===================================================================
--- bioperl-live/trunk/Bio/Tools/EUtilities/Summary/Item.pm 2008-04-03 13:58:49 UTC (rev 14650)
+++ bioperl-live/trunk/Bio/Tools/EUtilities/Summary/Item.pm 2008-04-07 18:24:42 UTC (rev 14651)
@@ -241,7 +241,7 @@
Returns : string
Args : none
Note : this is not the same as the datatype(), which describes the
- group this Item ojbect belongs to
+ group this Item object belongs to
=cut
Modified: bioperl-live/trunk/Bio/Tools/EUtilities.pm
===================================================================
--- bioperl-live/trunk/Bio/Tools/EUtilities.pm 2008-04-03 13:58:49 UTC (rev 14650)
+++ bioperl-live/trunk/Bio/Tools/EUtilities.pm 2008-04-07 18:24:42 UTC (rev 14651)
@@ -827,7 +827,7 @@
my $ds = shift;
my $string = sprintf("UID: %s\n",$ds->get_id);
# flattened mode
- while (my $item = $ds->next_Item('flattened')) {
+ while (my $item = $ds->next_Item('flatten')) {
# not all Items have content, so need to check...
my $content = $item->get_content || '';
$string .= sprintf("%-20s%s\n",$item->get_name(),
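
Note on get_content_by_name and get_type_by_name in the DocSum.pm diff above: the POD says they warn when no Item with the given name exists, but as committed the grep result is used without a check. A minimal sketch of a guarded lookup, assuming nothing beyond core Perl; My::Item and content_by_name are hypothetical stand-ins, not the BioPerl classes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a DocSum Item: just a name and content.
package My::Item;
sub new         { my ($class, %args) = @_; return bless {%args}, $class }
sub get_name    { return $_[0]{name} }
sub get_content { return $_[0]{content} }

package main;

# Return the content of the first item called $key; warn and return
# nothing if no such item exists.
sub content_by_name {
    my ($items, $key) = @_;
    return unless defined $key && length $key;
    my ($it) = grep { $_->get_name eq $key } @$items;
    unless ($it) {
        warn "No Item named '$key' found\n";
        return;
    }
    return $it->get_content;
}

my @items = (
    My::Item->new(name => 'CreateDate', content => '2008/04/07'),
    My::Item->new(name => 'Title',      content => 'Example'),
);

print content_by_name(\@items, 'CreateDate'), "\n";
my $missing = content_by_name(\@items, 'NoSuchItem');   # warns, returns undef
print defined $missing ? "found\n" : "not found\n";
```

Without the check, calling a method on the undefined grep result would be a fatal error rather than a warning.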
From bugzilla-daemon at portal.open-bio.org Mon Apr 7 14:50:15 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 7 Apr 2008 14:50:15 -0400
Subject: [Bioperl-guts-l] [Bug 2482] New: paml4 mlc file fails to parse
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2482
Summary: paml4 mlc file fails to parse
Product: BioPerl
Version: 1.5 branch
Platform: Other
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Core Components
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: jayoung at fhcrc.org
CC: jayoung at fhcrc.org
Hi,
I have just updated our version of PAML to v4, and now have problems parsing
the mlc file with Bio::Tools::Phylo::PAML.
I think I have also updated to the latest version of bioperl:
$Bio::Tools::Phylo::PAML::VERSION gives 1.0050021
My script is based on http://bioperl.org/wiki/HOWTO:PAML, and the basics of it
are here:
--------------------
#!/usr/bin/perl
#specify the mlc file(s) on the command line
use Bio::Tools::Phylo::PAML;
use warnings;
print "PAML version ", $Bio::Tools::Phylo::PAML::VERSION, "\n\n";
foreach my $file (@ARGV) {
my $outcodeml = $file;
if (!-e $outcodeml) {die "\ncan't find the file you specified $outcodeml -
terminating\n\n";}
my $out = "$outcodeml.treeinfo";
print "file $file - output will be in $out\n";
my $paml_parser = new Bio::Tools::Phylo::PAML(-file => $outcodeml,
-dir => "./",
-ctlf => "./codeml.ctl");
open (OUT, "> $out");
print OUT "Descendants\tt\tS\tN\tdN/dS\tdN\tdS\tS*dS\tN*dN\n";
if( my $result = $paml_parser->next_result() ) {
print "got a result\n";
while ( my $tree = $result->next_tree ) {
print "found a tree\n";
my $newtree = new Bio::TreeIO(-file=>'> temp.xml', -format=>'svggraph');
$newtree->write_tree($tree);
#do stuff with the tree here....
}
} else {print "no results\n";}
close OUT;
}
--------------------
It works fine on output from paml 3.15 but on output from paml4 I get the
following:
PAML version 1.0050021
file mlc - output will be in mlc.treeinfo
no results
which tells me that the parser didn't recognize the output.
I'll attach the mlc file in a few minutes.
thanks in advance for any help,
Janet Young
-------------------------------------------------------------------
Dr. Janet Young (Trask lab)
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.
tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung at fhcrc.org
http://www.fhcrc.org/labs/trask/
-------------------------------------------------------------------
From bugzilla-daemon at portal.open-bio.org Mon Apr 7 14:52:59 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 7 Apr 2008 14:52:59 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804071852.m37IqxcP025782@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2482
------- Comment #1 from jayoung at fhcrc.org 2008-04-07 14:52 EST -------
Created an attachment (id=897)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=897&action=view)
mlc file, can't parse this one
From bugzilla-daemon at portal.open-bio.org Tue Apr 8 03:26:19 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 03:26:19 -0400
Subject: [Bioperl-guts-l] [Bug 2474] postgres 8.3 - load_seqdatabase.pl /
swissprot
In-Reply-To:
Message-ID: <200804080726.m387QJ4N005423@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2474
------- Comment #2 from Bank.Beszteri at awi.de 2008-04-08 03:26 EST -------
Created an attachment (id=898)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=898&action=view)
Another output from load_seqdatabase.pl illustrating taxonomic conflicts
between Swissprot flat file (v.13.1) & NCBI taxonomy
From bugzilla-daemon at portal.open-bio.org Tue Apr 8 03:32:32 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 03:32:32 -0400
Subject: [Bioperl-guts-l] [Bug 2474] postgres 8.3 - load_seqdatabase.pl /
swissprot
In-Reply-To:
Message-ID: <200804080732.m387WWVk005848@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2474
------- Comment #3 from Bank.Beszteri at awi.de 2008-04-08 03:32 EST -------
(From update of attachment 898)
Forgot to add: MySQL this time (client v.4.0.18, server v.5.0.45)
From cjfields at dev.open-bio.org Tue Apr 8 11:58:20 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 8 Apr 2008 11:58:20 -0400
Subject: [Bioperl-guts-l] [14652] bioperl-live/trunk/Bio/Seq/Meta/Array.pm:
bug 2478
Message-ID: <200804081558.m38FwKOA008026@dev.open-bio.org>
Revision: 14652
Author: cjfields
Date: 2008-04-08 11:58:19 -0400 (Tue, 08 Apr 2008)
Log Message:
-----------
bug 2478
Modified Paths:
--------------
bioperl-live/trunk/Bio/Seq/Meta/Array.pm
Modified: bioperl-live/trunk/Bio/Seq/Meta/Array.pm
===================================================================
--- bioperl-live/trunk/Bio/Seq/Meta/Array.pm 2008-04-07 18:24:42 UTC (rev 14651)
+++ bioperl-live/trunk/Bio/Seq/Meta/Array.pm 2008-04-08 15:58:19 UTC (rev 14652)
@@ -394,7 +394,7 @@
$start =~ /^[+]?\d+$/ and $start > 0 or
$self->throw("Need at least a positive integer start value");
$start--;
-
+ my $meta_len = scalar(@{$self->{_meta}->{$name}});
if (defined $value) {
my $arrayref;
@@ -428,12 +428,17 @@
return $arrayref;
} else {
-
- $end or $end = $self->length;
- $end = $self->length if $end > $self->length;
+ # don't set by seq length; use meta array length instead; bug 2478
+ $end ||= $meta_len;
+ if ($end > $meta_len) {
+ $self->warn("End is longer than meta sequence $name length; resetting to $meta_len");
+ $end = $meta_len;
+ }
+ # warn but don't reset (push use of trunc() instead)
+ $self->warn("End is longer than sequence length; use trunc() \n".
+ "if you want a fully truncated object") if $end > $self->length;
$end--;
return [@{$self->{_meta}->{$name}}[$start..$end]];
-
}
}
@@ -661,15 +666,14 @@
# test arguments
$start =~ /^[+]?\d+$/ and $start > 0 or
- $self->throw("Need at least a positive integer start value as start");
+ $self->throw("Need at least a positive integer start value as start; got [$start]");
$end =~ /^[+]?\d+$/ and $end > 0 or
- $self->throw("Need at least a positive integer start value as end");
+ $self->throw("Need at least a positive integer start value as end; got [$end]");
$end >= $start or
- $self->throw("End position has to be larger or equal to start");
+ $self->throw("End position has to be larger or equal to start; got [$start..$end]");
$end <= $self->length or
- $self->throw("End position can not be larger than sequence length");
+ $self->throw("End position can not be larger than sequence length; got [$end]");
-
my $new = $self->SUPER::trunc($start, $end);
$start--;
$end--;
From bugzilla-daemon at portal.open-bio.org Tue Apr 8 12:03:27 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 12:03:27 -0400
Subject: [Bioperl-guts-l] [Bug 2478] Bio::SeqIO::fastq subqual reassignment
of qualities results in replacement of end characters with '!'
In-Reply-To:
Message-ID: <200804081603.m38G3Ri0013170@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2478
cjfields at uiuc.edu changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #4 from cjfields at uiuc.edu 2008-04-08 12:03 EST -------
subqual() currently checks the passed start/end coordinates against the
sequence coordinates (start=1, end=seq length). When resetting the sequence
and then calling subqual(), this resets the qual end to the newly set seq().
subqual() (and submeta(), by extension) should be independent of the sequence
and checked against the qual array length, then warn if it doesn't match the
seq length. I've added a fix for that as well as a few warnings.
For future reference, the proper (easier) method to retrieve a fully truncated
object of the same class is trunc(). Removing the calls to subseq/subqual and
using trunc directly like so:
$out->write_fastq($seq->trunc($opt_b+1,$seq_length-$opt_e));
gets:
@fake header 1 trimmed by 3 at beginning and 2 at end
gacaatatat
+fake header 1 trimmed by 3 at beginning and 2 at end
sfiojeq%!@
@fake header 2 trimmed by 3 at beginning and 2 at end
ctagagagg
+fake header 2 trimmed by 3 at beginning and 2 at end
2v1cty1f5
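
The range handling described in this comment can be sketched in plain Perl as follows. This is only an illustration of the idea (default and clamp the end to the metadata array, warn if it exceeds the sequence length); submeta_slice and its arguments are invented names, not the Bio::Seq::Meta::Array API.

```perl
use strict;
use warnings;

# Return elements $start..$end (1-based, inclusive) of @$meta.
# $end defaults to, and is clamped to, the length of the metadata
# array itself rather than the length of the sequence.
sub submeta_slice {
    my ($meta, $start, $end, $seq_length) = @_;
    die "Need a positive integer start value\n"
        unless defined $start && $start =~ /^\d+$/ && $start > 0;
    my $meta_len = scalar @$meta;
    $end ||= $meta_len;
    if ($end > $meta_len) {
        warn "End is past the meta array; resetting to $meta_len\n";
        $end = $meta_len;
    }
    # Warn but do not reset: the caller may want trunc() instead.
    warn "End is past the sequence length; consider trunc()\n"
        if defined $seq_length && $end > $seq_length;
    return [ @{$meta}[ $start - 1 .. $end - 1 ] ];
}

my @qual = (30, 31, 32, 33, 34, 35, 36, 37);   # eight quality values
# The sequence has been shortened to 5 residues, but the qualities
# have not been touched yet.
my $slice = submeta_slice(\@qual, 4, undef, 5);
print "@$slice\n";   # prints "33 34 35 36 37" (plus a warning on STDERR)
```

The point of the sketch is that the slice still reaches the end of the quality array even though the sequence is shorter; the mismatch is reported rather than silently truncated.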
From cjfields at dev.open-bio.org Tue Apr 8 12:05:13 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 8 Apr 2008 12:05:13 -0400
Subject: [Bioperl-guts-l] [14653]
bioperl-live/trunk/scripts/Bio-DB-GFF/bulk_load_gff.PLS: bug 2479
Message-ID: <200804081605.m38G5DCI008078@dev.open-bio.org>
Revision: 14653
Author: cjfields
Date: 2008-04-08 12:05:13 -0400 (Tue, 08 Apr 2008)
Log Message:
-----------
bug 2479
Modified Paths:
--------------
bioperl-live/trunk/scripts/Bio-DB-GFF/bulk_load_gff.PLS
Modified: bioperl-live/trunk/scripts/Bio-DB-GFF/bulk_load_gff.PLS
===================================================================
--- bioperl-live/trunk/scripts/Bio-DB-GFF/bulk_load_gff.PLS 2008-04-08 15:58:19 UTC (rev 14652)
+++ bioperl-live/trunk/scripts/Bio-DB-GFF/bulk_load_gff.PLS 2008-04-08 16:05:13 UTC (rev 14653)
@@ -91,7 +91,8 @@
GFF and/or FASTA files
--password Password to use for authentication
(Does not work with Postgres, password must be
- supplied interactively)
+ supplied interactively or be left empty for
+ ident authentication)
--maxbin Set the value of the maximum bin size
--local Flag to indicate that the data source is local
--maxfeature Set the value of the maximum feature size (power of 10)
@@ -207,7 +208,7 @@
# If called as pg_bulk_load_gff.pl behave as that did.
if ($0 =~/pg_bulk_load_gff.pl/){
- $ADAPTOR ||= 'pg';
+ $ADAPTOR ||= 'Pg';
$DSN ||= 'test';
}
$DSN ||= 'dbi:mysql:test';
@@ -227,7 +228,13 @@
die "Aborted\n" unless $f =~ /^[yY]/;
close TTY;
}
+# postgres DBD::Pg allows 'database', but also 'dbname', and 'db':
+# and it must be Pg (not pg)
+$DSN=~s/pg:database=/Pg:/i;
+$DSN=~s/pg:dbname=/Pg:/i;
+$DSN=~s/pg:db=/Pg:/i;
+# leave these lines for mysql
$DSN=~s/database=//i;
$DSN=~s/;host=/:/i; #cater for dsn in the form of "dbi:mysql:database=$dbname;host=$host"
@@ -237,6 +244,12 @@
$ADAPTOR ||= $DBD;
$ADAPTOR ||= 'mysql';
+if ($DBD eq 'Pg') {
+ # rebuild DSN, DBD::Pg requires full dbname= format
+ $DSN = "dbi:Pg:dbname=$DBNAME";
+ if ($HOST) { $DSN .= ";host=$HOST"; }
+}
+
my ($use_mysql,$use_mysqlcmap,$use_pg) = (0,0,0);
if ( $ADAPTOR eq 'mysqlcmap' ) {
$use_mysqlcmap = 1;
@@ -244,7 +257,7 @@
elsif ( $ADAPTOR =~ /^mysql/ ) {
$use_mysql = 1;
}
-elsif ( $ADAPTOR eq "pg" ) {
+elsif ( $ADAPTOR eq "Pg" ) {
$use_pg = 1;
}
else{
@@ -575,8 +588,8 @@
foreach (@files) {
my $file = "$tmpdir/$_.$$";
- $AUTH ? system("psql $AUTH -f $file $DSN")
- : system('psql','-f', $file, $DSN);
+ $AUTH ? system("psql $AUTH -f $file $DBNAME")
+ : system('psql','-f', $file, $DBNAME);
unlink $file;
}
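
The DSN handling in this commit can be summarised with a small standalone sketch. It is not the script's code: normalize_pg_dsn is a hypothetical helper that accepts the 'database', 'dbname' and 'db' spellings mentioned in the patch comment and rebuilds the "dbi:Pg:dbname=..." form, leaving non-Postgres DSNs untouched.

```perl
use strict;
use warnings;

# Illustrative only: normalises a few DSN spellings to the DBD::Pg form
# "dbi:Pg:dbname=NAME[;host=HOST]". normalize_pg_dsn is a hypothetical name.
sub normalize_pg_dsn {
    my ($dsn) = @_;
    return $dsn unless $dsn =~ /^dbi:pg:(.*)$/i;    # leave other drivers alone
    my $rest = $1;

    my ($dbname, $host);
    for my $pair (split /;/, $rest) {
        if    ($pair =~ /^(?:database|dbname|db)=(.+)$/i) { $dbname = $1 }
        elsif ($pair =~ /^host=(.+)$/i)                   { $host   = $1 }
        elsif ($pair !~ /=/ && !defined $dbname)          { $dbname = $pair }
    }
    die "no database name in DSN '$dsn'\n" unless defined $dbname;

    my $out = "dbi:Pg:dbname=$dbname";    # driver name must be 'Pg', not 'pg'
    $out .= ";host=$host" if defined $host;
    return $out;
}

print normalize_pg_dsn($_), "\n"
    for 'dbi:pg:database=test', 'dbi:Pg:db=test;host=localhost', 'dbi:mysql:test';
```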
From bugzilla-daemon at portal.open-bio.org Tue Apr 8 12:05:48 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 12:05:48 -0400
Subject: [Bioperl-guts-l] [Bug 2479] bp_pg_bulk_load_gff.pl postgres GFF
bulk loader broken
In-Reply-To:
Message-ID: <200804081605.m38G5mFM013343@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2479
cjfields at uiuc.edu changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #2 from cjfields at uiuc.edu 2008-04-08 12:05 EST -------
Committed to svn. Thanks!
From bugzilla-daemon at portal.open-bio.org Tue Apr 8 14:34:56 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 14:34:56 -0400
Subject: [Bioperl-guts-l] [Bug 2483] New: request for implementation of
write_assembly
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2483
Summary: request for implementation of write_assembly
Product: BioPerl
Version: 1.5 branch
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Core Components
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: jayoung at fhcrc.org
CC: jayoung at fhcrc.org
Hi,
I would really love it if write_assembly (ace format) could be implemented in
Bio::Assembly::IO::ace
I realise it's probably a low priority thing, but thought I'd just throw it out
there in case anyone is able to do it.
thanks,
Janet
-------------------------------------------------------------------
Dr. Janet Young (Trask lab)
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.
tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung at fhcrc.org
http://www.fhcrc.org/labs/trask/
-------------------------------------------------------------------
From bugzilla-daemon at portal.open-bio.org Tue Apr 8 14:48:39 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 14:48:39 -0400
Subject: [Bioperl-guts-l] [Bug 2483] request for implementation of
write_assembly
In-Reply-To:
Message-ID: <200804081848.m38Imd5W022232@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2483
cjfields at uiuc.edu changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|1.6 release |1.7 release
------- Comment #1 from cjfields at uiuc.edu 2008-04-08 14:48 EST -------
Bio::Assembly issues will be tackled in the next dev release. I agree it would
be nice to have this, though.
From bugzilla-daemon at portal.open-bio.org Tue Apr 8 21:44:41 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 21:44:41 -0400
Subject: [Bioperl-guts-l] [Bug 2350] Bio::Assembly::Scaffold->add_singlet
has a bug
In-Reply-To:
Message-ID: <200804090144.m391ifkD010215@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2350
cjfields at uiuc.edu changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|1.6 release |1.7 release
------- Comment #4 from cjfields at uiuc.edu 2008-04-08 21:44 EST -------
Changing milestone to 1.7, along with the other Bio::Assembly bugs.
From bugzilla-daemon at portal.open-bio.org Tue Apr 8 21:46:38 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 8 Apr 2008 21:46:38 -0400
Subject: [Bioperl-guts-l] [Bug 2370] Bio::Assembly::Scaffold Source
In-Reply-To:
Message-ID: <200804090146.m391kc5S010408@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2370
cjfields at uiuc.edu changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|1.6 release |1.7 release
------- Comment #5 from cjfields at uiuc.edu 2008-04-08 21:46 EST -------
Pushing to 1.7.
From bugzilla-daemon at portal.open-bio.org Wed Apr 9 17:44:36 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 9 Apr 2008 17:44:36 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804092144.m39LiaTX009507@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2482
------- Comment #2 from cjfields at uiuc.edu 2008-04-09 17:44 EST -------
Confirmed in bioperl-live. Anyone familiar with PAML want to comment?
From bugzilla-daemon at portal.open-bio.org Wed Apr 9 18:03:07 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 9 Apr 2008 18:03:07 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804092203.m39M37wZ010482@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2482
------- Comment #3 from jason at bioperl.org 2008-04-09 18:03 EST -------
uh, it sucks that the output format changes way too much between any possible
release so that keeping this up-to-date is a bit of a losing battle....
Someone just has to spend a few minutes figuring out which extra lines are
breaking it or if the order is changing. Stefan had reported problems with the
different order of the sequence header lines in different versions making it
really hard to make one parser that worked.
From bugzilla-daemon at portal.open-bio.org Wed Apr 9 18:35:58 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 9 Apr 2008 18:35:58 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804092235.m39MZw1Y012196@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2482
------- Comment #4 from cjfields at uiuc.edu 2008-04-09 18:35 EST -------
(In reply to comment #3)
> uh, it sucks that the output format changes way too much between any possible
> release so that keeping this up-to-date is a bit of a losing battle....
>
> Someone just has to spend a few minutes figuring out which extra lines are
> breaking it or if the order is changing. Stefan had reported problems with the
> different order of the sequence header lines in different versions making it
> really hard to make one parser that worked.
I remember something about that, maybe from the list. I can try looking into
it when I can, just not too familiar with the code (yet).
From bugzilla-daemon at portal.open-bio.org Thu Apr 10 12:11:38 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 12:11:38 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804101611.m3AGBcbj030260@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2482
------- Comment #5 from kirovs at gmail.com 2008-04-10 12:11 EST -------
Seems like a different problem: in next_result the tree is not parsed, so
%data is empty (but the branch data is read). The reason is that no model line
is detected before the tree data comes. Sorry, I cannot follow this further for
now. It would be useful to have an old mlc file and the parameters with which
codeml was called.
From bugzilla-daemon at portal.open-bio.org Thu Apr 10 13:01:02 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 13:01:02 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804101701.m3AH12hf000491@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2482
------- Comment #6 from jayoung at fhcrc.org 2008-04-10 13:01 EST -------
Thanks for looking into this. I'll add a couple more attachments, as suggested
by Stefan. One is the codeml.ctl file associated with that output. The other is
the mlc file generated using paml 3.15 for the same data, same parameters -
this one parses fine.
From bugzilla-daemon at portal.open-bio.org Thu Apr 10 13:06:48 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 13:06:48 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804101706.m3AH6mZ7000881@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2482
------- Comment #7 from jayoung at fhcrc.org 2008-04-10 13:06 EST -------
Created an attachment (id=899)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=899&action=view)
codeml.ctl file (parameters used)
From bugzilla-daemon at portal.open-bio.org Thu Apr 10 13:53:40 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 13:53:40 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804101753.m3AHreeX003063@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2482
------- Comment #8 from jayoung at fhcrc.org 2008-04-10 13:53 EST -------
Hi again,
the plot thickens... Before uploading the mlc file from PAML v 3.15, I
checked again whether it would parse. It did better than for the v4 mlc file,
but also failed. next_result succeeded (unlike for the PAML v4 output, which
failed at this step) but next_tree failed.
Using an older version of bioperl I had successfully parsed that mlc file (PAML
v 3.15) and got the tree information out, but when the PAML v4 mlc file failed
to parse, I updated bioperl and now I can't parse the file I could parse
before. I still have the output of the first parsing so I know it worked...
I've been trying to figure out what older version of bioperl I was using but am
having some trouble. I only recently finally figured out how to get Build.PL
working on our system so I could do updates myself - before that I was using a
version of Bioperl that the sysadmins installed for me. I also used
uninst when I built bioperl, so I think it removed any older versions of the
modules it could find. Sorry! I know that's not helpful. I'm still not a very
advanced user.
Janet
From bugzilla-daemon at portal.open-bio.org Thu Apr 10 13:55:48 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 13:55:48 -0400
Subject: [Bioperl-guts-l] [Bug 2482] paml4 mlc file fails to parse
In-Reply-To:
Message-ID: <200804101755.m3AHtmVd003174@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2482
------- Comment #9 from jayoung at fhcrc.org 2008-04-10 13:55 EST -------
Created an attachment (id=900)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=900&action=view)
mlc file from PAML v 3.15. next_result succeeds, next_tree fails
From lstein at dev.open-bio.org Thu Apr 10 17:10:39 2008
From: lstein at dev.open-bio.org (Lincoln Stein)
Date: Thu, 10 Apr 2008 17:10:39 -0400
Subject: [Bioperl-guts-l] [14654] bioperl-live/trunk/Bio: corrected case in
which seq_id of Bio::Location::Split could unintentionally be set to undef
Message-ID: <200804102110.m3ALAdhw014037@dev.open-bio.org>
Revision: 14654
Author: lstein
Date: 2008-04-10 17:10:38 -0400 (Thu, 10 Apr 2008)
Log Message:
-----------
corrected case in which seq_id of Bio::Location::Split could unintentionally be set to undef
Modified Paths:
--------------
bioperl-live/trunk/Bio/DB/GFF.pm
bioperl-live/trunk/Bio/DB/SeqFeature/NormalizedFeature.pm
bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm
bioperl-live/trunk/Bio/Graphics/FeatureFile.pm
bioperl-live/trunk/Bio/Graphics/Glyph/cds.pm
bioperl-live/trunk/Bio/Graphics/Glyph/generic.pm
bioperl-live/trunk/Bio/Location/Split.pm
Modified: bioperl-live/trunk/Bio/DB/GFF.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/GFF.pm 2008-04-08 16:05:13 UTC (rev 14653)
+++ bioperl-live/trunk/Bio/DB/GFF.pm 2008-04-10 21:10:38 UTC (rev 14654)
@@ -1031,15 +1031,20 @@
sub features {
my $self = shift;
- my ($types,$automerge,$sparse,$iterator,$other);
+ my ($types,$automerge,$sparse,$iterator,$refseq,$start,$end,$other);
if (defined $_[0] &&
$_[0] =~ /^-/) {
- ($types,$automerge,$sparse,$iterator,$other) = rearrange([
- [qw(TYPE TYPES)],
- [qw(MERGE AUTOMERGE)],
- [qw(RARE SPARSE)],
- 'ITERATOR'
- ],@_);
+ ($types,$automerge,$sparse,$iterator,
+ $refseq,$start,$end,
+ $other) = rearrange([
+ [qw(TYPE TYPES)],
+ [qw(MERGE AUTOMERGE)],
+ [qw(RARE SPARSE)],
+ 'ITERATOR',
+ [qw(REFSEQ SEQ_ID)],
+ 'START',
+ [qw(STOP END)],
+ ],@_);
} else {
$types = \@_;
}
@@ -1048,8 +1053,11 @@
$automerge = $self->automerge unless defined $automerge;
$other ||= {};
$self->_features({
- rangetype => 'contains',
+ rangetype => $refseq ? 'overlaps' : 'contains',
types => $types,
+ refseq => $refseq,
+ start => $start,
+ stop => $end,
},
{ sparse => $sparse,
automerge => $automerge,
@@ -3377,6 +3385,7 @@
my ($search,$options,$parent) = @_;
(@{$search}{qw(start stop)}) = (@{$search}{qw(stop start)})
if defined($search->{start}) && $search->{start} > $search->{stop};
+ $search->{refseq} = $search->{seq_id} if exists $search->{seq_id};
my $types = $self->parse_types($search->{types}); # parse out list of types
my @aggregated_types = @$types; # keep a copy
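
The hunk above switches the range type from 'contains' to 'overlaps' when a reference sequence is supplied. The difference between the two tests can be sketched as follows; the helpers are simplified illustrations on 1-based inclusive coordinates, not the queries that the Bio::DB::GFF adaptors actually build, and the ranges are made up.

```perl
use strict;
use warnings;

# Simplified range predicates; query is ($qs,$qe), feature is ($fs,$fe).
# 'contains': the feature lies entirely inside the query range.
sub contains { my ($qs, $qe, $fs, $fe) = @_; return $fs >= $qs && $fe <= $qe }
# 'overlaps': the feature shares at least one position with the query range.
sub overlaps { my ($qs, $qe, $fs, $fe) = @_; return $fs <= $qe && $fe >= $qs }

my ($qs, $qe) = (100, 200);                         # made-up query range
for my $f ([120, 180], [150, 250], [300, 400]) {    # made-up features
    printf "%d..%d contains=%s overlaps=%s\n", @$f,
        (contains($qs, $qe, @$f) ? 'yes' : 'no'),
        (overlaps($qs, $qe, @$f) ? 'yes' : 'no');
}
```

Every contained feature also overlaps, so 'overlaps' returns a superset of what 'contains' returns for the same range.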
Modified: bioperl-live/trunk/Bio/DB/SeqFeature/NormalizedFeature.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/SeqFeature/NormalizedFeature.pm 2008-04-08 16:05:13 UTC (rev 14653)
+++ bioperl-live/trunk/Bio/DB/SeqFeature/NormalizedFeature.pm 2008-04-10 21:10:38 UTC (rev 14654)
@@ -176,7 +176,7 @@
return Bio::PrimarySeq->new(-seq => $store->fetch_sequence($self->seq_id,$start,$end) || '',
-id => $self->display_name);
} else {
- return $self->SUPER::seq($self->seq_id,$start,$end);
+ return $self->SUPER::seq($self->seq_id,$start,$end);
}
}
Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm 2008-04-08 16:05:13 UTC (rev 14653)
+++ bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm 2008-04-10 21:10:38 UTC (rev 14654)
@@ -1170,7 +1170,8 @@
#
sub fetch_sequence {
my $self = shift;
- my ($seqid,$start,$end,$class,$bioseq) = rearrange([['NAME','SEQID','SEQ_ID'],'START',['END','STOP'],'CLASS','BIOSEQ'],@_);
+ my ($seqid,$start,$end,$class,$bioseq) = rearrange([['NAME','SEQID','SEQ_ID'],
+ 'START',['END','STOP'],'CLASS','BIOSEQ'],@_);
$seqid = "$seqid:$class" if defined $class;
my $seq = $self->_fetch_sequence($seqid,$start,$end);
return $seq unless $bioseq;
Modified: bioperl-live/trunk/Bio/Graphics/FeatureFile.pm
===================================================================
--- bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-04-08 16:05:13 UTC (rev 14653)
+++ bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-04-10 21:10:38 UTC (rev 14654)
@@ -1326,7 +1326,8 @@
require CGI unless defined &CGI::escape;
my $n;
$linkrule ||= ''; # prevent uninit warning
- my $seq_id = $feature->can('location') ? $feature->location->seq_id : $feature->seq_id;
+# my $seq_id = $feature->can('location') ? $feature->location->seq_id : $feature->seq_id;
+ my $seq_id = $feature->can('seq_id') ? $feature->seq_id() : $feature->location->seq_id();
$seq_id ||= $feature->seq_id; #fallback
$linkrule =~ s/\$(\w+)/
CGI::escape(
Modified: bioperl-live/trunk/Bio/Graphics/Glyph/cds.pm
===================================================================
--- bioperl-live/trunk/Bio/Graphics/Glyph/cds.pm 2008-04-08 16:05:13 UTC (rev 14653)
+++ bioperl-live/trunk/Bio/Graphics/Glyph/cds.pm 2008-04-10 21:10:38 UTC (rev 14654)
@@ -139,33 +139,33 @@
$part->{cds_frame} = $frame;
$part->{cds_offset} = $offset;
- if ($fits && $part->feature->seq) {
+ if ($fits && (my $seq = $feature->seq)) {
+ BLOCK: {
+ $seq = $self->get_seq($seq);
- # do in silico splicing in order to find the codon that
- # arises from the splice
- my $seq = $self->get_seq($part->feature->seq);
- my $protein = $seq->translate(undef,undef,$phase,$codon_table)->seq;
- $part->{cds_translation} = $protein;
+ # do in silico splicing in order to find the codon that
+ # arises from the splice
+ my $protein = $seq->translate(undef,undef,$phase,$codon_table)->seq;
+ $part->{cds_translation} = $protein;
- BLOCK: {
- length $protein >= $feature->length/3 and last BLOCK;
- ($feature->length - $phase) % 3 == 0 and last BLOCK;
-
- my $next_part = $parts[$i+1]
- or do {
- $part->{cds_splice_residue} = '?';
- last BLOCK; };
-
- my $next_feature = $next_part->feature or last BLOCK;
- my $next_phase = eval {$next_feature->phase} or last BLOCK;
- my $splice_codon = '';
- my $left_of_splice = substr($self->get_seq($feature->seq), -$next_phase, $next_phase);
- my $right_of_splice = substr($self->get_seq($next_feature->seq),0 , 3-$next_phase);
- $splice_codon = $left_of_splice . $right_of_splice;
- length $splice_codon == 3 or last BLOCK;
- my $amino_acid = $translate_table->translate($splice_codon);
- $part->{cds_splice_residue} = $amino_acid;
- }
+ length $protein >= $feature->length/3 and last BLOCK;
+ ($feature->length - $phase) % 3 == 0 and last BLOCK;
+
+ my $next_part = $parts[$i+1]
+ or do {
+ $part->{cds_splice_residue} = '?';
+ last BLOCK; };
+
+ my $next_feature = $next_part->feature or last BLOCK;
+ my $next_phase = eval {$next_feature->phase} or last BLOCK;
+ my $splice_codon = '';
+ my $left_of_splice = substr($self->get_seq($feature->seq), -$next_phase, $next_phase);
+ my $right_of_splice = substr($self->get_seq($next_feature->seq),0 , 3-$next_phase);
+ $splice_codon = $left_of_splice . $right_of_splice;
+ length $splice_codon == 3 or last BLOCK;
+ my $amino_acid = $translate_table->translate($splice_codon);
+ $part->{cds_splice_residue} = $amino_acid;
+ }
}
}
@@ -184,7 +184,7 @@
my $frame = $self->{cds_frame};
my $linecount = $self->sixframe ? 6 : 3;
- unless ($self->protein_fits) {
+ unless ($self->protein_fits && $self->{cds_translation}) {
my $height = ($y2-$y1)/$linecount;
my $offset = $y1 + $height*$frame;
$offset += ($y2-$y1)/2 if $self->sixframe && $self->strand < 0;
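
The splice-codon assembly in the first cds.pm hunk can be tried on plain strings. This sketch mirrors only the two substr() calls and the length check; splice_codon is a hypothetical name, the input strings are made up, and the translation step is left out.

```perl
use strict;
use warnings;

# Sketch of the splice-codon assembly on plain strings. $next_phase is the
# number of bases of the split codon that sit at the end of the current exon
# (that is how the substr() calls in the diff use it).
sub splice_codon {
    my ($exon_seq, $next_exon_seq, $next_phase) = @_;
    return unless $next_phase;    # phase 0: no codon spans the splice
    my $left  = substr($exon_seq, -$next_phase, $next_phase);
    my $right = substr($next_exon_seq, 0, 3 - $next_phase);
    my $codon = $left . $right;
    return length($codon) == 3 ? $codon : undef;
}

for my $phase (1, 2) {
    my $codon = splice_codon('ATGGCA', 'TGAAA', $phase);
    print "phase $phase: $codon\n";
}
```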
Modified: bioperl-live/trunk/Bio/Graphics/Glyph/generic.pm
===================================================================
--- bioperl-live/trunk/Bio/Graphics/Glyph/generic.pm 2008-04-08 16:05:13 UTC (rev 14653)
+++ bioperl-live/trunk/Bio/Graphics/Glyph/generic.pm 2008-04-10 21:10:38 UTC (rev 14654)
@@ -548,7 +548,8 @@
# hack around changed feature API
sub get_seq {
my $self = shift;
- my $seq = shift;
+ my $seq = shift;
+ return unless $seq;
return $seq if ref $seq && $seq->can('translate');
require Bio::PrimarySeq unless Bio::PrimarySeq->can('new');
return Bio::PrimarySeq->new(-seq=>$seq);
Modified: bioperl-live/trunk/Bio/Location/Split.pm
===================================================================
--- bioperl-live/trunk/Bio/Location/Split.pm 2008-04-08 16:05:13 UTC (rev 14653)
+++ bioperl-live/trunk/Bio/Location/Split.pm 2008-04-10 21:10:38 UTC (rev 14654)
@@ -563,14 +563,14 @@
=cut
sub seq_id {
- my ($self, $seqid) = @_;
+ my $self = shift;
- if(! $self->is_remote()) {
+ if(@_ && !$self->is_remote()) {
foreach my $subloc ($self->sub_Location(0)) {
- $subloc->seq_id($seqid) if ! $subloc->is_remote();
+ $subloc->seq_id(@_) if !$subloc->is_remote();
}
}
- return $self->SUPER::seq_id($seqid);
+ return $self->SUPER::seq_id(@_);
}
=head2 coordinate_policy
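
The seq_id() change comes down to how a combined getter/setter forwards its arguments. The sketch below uses made-up classes (Loc, SplitOld, SplitNew) and assumes the parent accessor treats any passed argument, including undef, as a set request; it is not the Bio::Location code itself.

```perl
use strict;
use warnings;

package Loc;
sub new { my ($class, %args) = @_; return bless { seq_id => $args{seq_id} }, $class }

# Base accessor: any argument, even undef, counts as a "set".
sub seq_id {
    my $self = shift;
    $self->{seq_id} = shift if @_;
    return $self->{seq_id};
}

package SplitOld;
our @ISA = ('Loc');
# Old pattern: always forwards $seqid, so a plain getter call passes undef.
sub seq_id {
    my ($self, $seqid) = @_;
    return $self->SUPER::seq_id($seqid);
}

package SplitNew;
our @ISA = ('Loc');
# Patched pattern: forwards @_ untouched, so a getter call passes nothing.
sub seq_id {
    my $self = shift;
    return $self->SUPER::seq_id(@_);
}

package main;

my $old = SplitOld->new(seq_id => 'chr1');
my $new = SplitNew->new(seq_id => 'chr1');
$old->seq_id;    # a getter call that clobbers the stored value
$new->seq_id;    # a getter call that leaves it alone
printf "old: %s\n", defined $old->{seq_id} ? $old->{seq_id} : 'undef';
printf "new: %s\n", defined $new->{seq_id} ? $new->{seq_id} : 'undef';
```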
From bugzilla-daemon at portal.open-bio.org Thu Apr 10 20:00:51 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 20:00:51 -0400
Subject: [Bioperl-guts-l] [Bug 2485] New:
Bio::SearchIO::Writer::HSPTableWriter - 'frame' column messes
up the output
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2485
Summary: Bio::SearchIO::Writer::HSPTableWriter - 'frame' column
messes up the output
Product: BioPerl
Version: 1.5 branch
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Bio::Search/Bio::SearchIO
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: jayoung at fhcrc.org
CC: jayoung at fhcrc.org
Hi,
Sorry to keep bugging you all - I'm doing a lot of updates to old scripts at
the moment so I keep finding small problems. I would imagine this will be a
fairly easy one to fix.
I'm using Bio::SearchIO and Bio::SearchIO::Writer::HSPTableWriter to parse
blastall output (NCBI). I updated from bioperl-live last week.
It's mostly working fine, except for tblastn outputs when I try to
include frame in the output using HSPTableWriter.
HSPs with frame 0 are OK, but if frame=1 or 2, the output gets messed up. The
frame output includes an extra tab character before the 1 or 2. If there's an
empty column after frame, I see the 1 or 2. If frame is the last column in the
output, then the 1 or 2 is then lost. If other non-empty columns follow frame,
data from one or two other columns seems to get overwritten. If I look at
frame using $hsp->frame() it looks fine (no extra tabs, etc), so it's parsing
OK, just not being output properly.
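
A quick way to confirm a symptom like the one described above is to count the tab-separated fields on each output line. This is a hypothetical diagnostic, not part of BioPerl; ragged_rows and the sample rows are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical diagnostic: report rows of tab-delimited text whose field count
# differs from the expected number of columns.
sub ragged_rows {
    my ($expected, @lines) = @_;
    my @bad;
    for my $i (0 .. $#lines) {
        my $line = $lines[$i];
        chomp $line;
        my @fields = split /\t/, $line, -1;    # -1 keeps trailing empty fields
        push @bad, [ $i + 1, scalar @fields ] if @fields != $expected;
    }
    return @bad;    # list of [line number, field count]
}

# Made-up rows: the second one has an extra tab before the frame value.
my @rows = ("q1\thitA\t0\tdesc", "q1\thitB\t\t1\tdesc");
printf "line %d has %d fields\n", @$_ for ragged_rows(4, @rows);
```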
I'll paste in my script in just a minute, and I'll also attach a sample blast
output.
thanks,
Janet
From bugzilla-daemon at portal.open-bio.org Thu Apr 10 20:01:06 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 20:01:06 -0400
Subject: [Bioperl-guts-l] [Bug 2485] Bio::SearchIO::Writer::HSPTableWriter -
'frame' column messes up the output
In-Reply-To:
Message-ID: <200804110001.m3B016qM022384@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2485
------- Comment #1 from jayoung at fhcrc.org 2008-04-10 20:01 EST -------
#!/usr/bin/perl
use warnings;
use strict;
use Bio::SearchIO;
use Bio::SearchIO::Writer::HSPTableWriter;
#set signif
#my $signif = "1e-3";
my $signif = "1e-5";
#my $signif = "100";
#--------------------------------
foreach my $file (@ARGV){
print "file is $file\n";
my $resultsfile = "$file.procnew.simple";
print "output file is $resultsfile\n";
my $blastObj = Bio::SearchIO->new( -file => $file,
-format => 'blast',
-signif => $signif,
);
#note - the frame column messes things up. Putting it in different
#positions in the column list is a little informative
my $writer = Bio::SearchIO::Writer::HSPTableWriter->new(-columns => [qw(
query_name
query_length
hit_name
hit_length
expect
score
bits
rank
frac_identical_query
frac_conserved_query
length_aln_query
length_aln_hit
gaps_query
gaps_hit
start_query
end_query
start_hit
end_hit
strand_query
strand_hit
hit_description
frame
hit_description
)]
);
my $out = Bio::SearchIO->new( -writer => $writer,
-file => ">$resultsfile" );
while ( my $result = $blastObj->next_result() ) {
while( my $hit = $result->next_hit ) {
while( my $hsp = $hit->next_hsp ) {
my $frame = $hsp->frame();
print "frame $frame blah\n";
}
}
$out -> write_result($result);
}
}
print "done.\n";
From bugzilla-daemon at portal.open-bio.org Thu Apr 10 20:02:56 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 20:02:56 -0400
Subject: [Bioperl-guts-l] [Bug 2485] Bio::SearchIO::Writer::HSPTableWriter -
'frame' column messes up the output
In-Reply-To:
Message-ID: <200804110002.m3B02uwv022592@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2485
------- Comment #2 from jayoung at fhcrc.org 2008-04-10 20:02 EST -------
Created an attachment (id=901)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=901&action=view)
the script
(attaching the script too - formatting is a little messed up in the copy-paste)
From bugzilla-daemon at portal.open-bio.org Thu Apr 10 20:03:57 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 10 Apr 2008 20:03:57 -0400
Subject: [Bioperl-guts-l] [Bug 2485] Bio::SearchIO::Writer::HSPTableWriter -
'frame' column messes up the output
In-Reply-To:
Message-ID: <200804110003.m3B03v8W022644@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2485
------- Comment #3 from jayoung at fhcrc.org 2008-04-10 20:03 EST -------
Created an attachment (id=902)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=902&action=view)
test case
From lapp at dev.open-bio.org Fri Apr 11 19:19:32 2008
From: lapp at dev.open-bio.org (Hilmar Lapp)
Date: Fri, 11 Apr 2008 19:19:32 -0400
Subject: [Bioperl-guts-l] [14655]
bioperl-live/trunk/Bio/Factory/SequenceFactoryI.pm: Applied SYNOPSIS patch
by Adam Sjogre (asjo at koldfront dot dk).
Message-ID: <200804112319.m3BNJWFP016816@dev.open-bio.org>
Revision: 14655
Author: lapp
Date: 2008-04-11 19:19:31 -0400 (Fri, 11 Apr 2008)
Log Message:
-----------
Applied SYNOPSIS patch by Adam Sjogre (asjo at koldfront dot dk).
Modified Paths:
--------------
bioperl-live/trunk/Bio/Factory/SequenceFactoryI.pm
Modified: bioperl-live/trunk/Bio/Factory/SequenceFactoryI.pm
===================================================================
--- bioperl-live/trunk/Bio/Factory/SequenceFactoryI.pm 2008-04-10 21:10:38 UTC (rev 14654)
+++ bioperl-live/trunk/Bio/Factory/SequenceFactoryI.pm 2008-04-11 23:19:31 UTC (rev 14655)
@@ -20,7 +20,7 @@
# get a Bio::Factory::SequenceFactoryI object like
use Bio::Seq::SeqFactory;
- my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq');
+ my $seqbuilder = Bio::Seq::SeqFactory->new('-type' => 'Bio::PrimarySeq');
my $seq = $seqbuilder->create(-seq => 'ACTGAT',
-display_id => 'exampleseq');
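
Why the leading dash matters can be shown with a simplified stand-in for dash-prefixed named parameters. named_args below is a hypothetical helper modelled on that convention and is not Bio::Root::RootI::_rearrange; it assumes that arguments whose first key lacks a dash are treated as positional.

```perl
use strict;
use warnings;

# Simplified stand-in for dash-prefixed named parameters; named_args is a
# hypothetical helper, not the BioPerl implementation.
sub named_args {
    my ($order, @param) = @_;
    # Without a leading dash on the first key, arguments are taken positionally.
    return @param unless defined $param[0] && $param[0] =~ /^-/;
    my %given;
    while (my ($key, $value) = splice @param, 0, 2) {
        (my $name = uc $key) =~ s/^-//;    # '-type' becomes 'TYPE'
        $given{$name} = $value;
    }
    return map { $given{$_} } @$order;
}

my ($with_dash)    = named_args(['TYPE'], -type => 'Bio::PrimarySeq');
my ($without_dash) = named_args(['TYPE'],  type => 'Bio::PrimarySeq');
print "with dash: $with_dash\nwithout dash: $without_dash\n";
```

Under this assumption the undashed call hands back the key name itself as the first value, which is the kind of silent mistake the SYNOPSIS fix avoids.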
From lstein at dev.open-bio.org Mon Apr 14 11:05:38 2008
From: lstein at dev.open-bio.org (Lincoln Stein)
Date: Mon, 14 Apr 2008 11:05:38 -0400
Subject: [Bioperl-guts-l] [14656] bioperl-live/trunk: added a clone() method
to support the (uncommon) case of passing database adaptors across a
fork()
Message-ID: <200804141505.m3EF5ctU029537@dev.open-bio.org>
Revision: 14656
Author: lstein
Date: 2008-04-14 11:05:37 -0400 (Mon, 14 Apr 2008)
Log Message:
-----------
added a clone() method to support the (uncommon) case of passing database adaptors across a fork()
Modified Paths:
--------------
bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi.pm
bioperl-live/trunk/Bio/DB/GFF.pm
bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm
bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm
bioperl-live/trunk/t/BioDBGFF.t
bioperl-live/trunk/t/BioDBSeqFeature.t
Modified: bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm 2008-04-11 23:19:31 UTC (rev 14655)
+++ bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm 2008-04-14 15:05:37 UTC (rev 14656)
@@ -144,6 +144,17 @@
$wrapper;
}
+# The clone method should only be called in child processes after a fork().
+# It does two things: (1) it sets the "real" dbh's InactiveDestroy to 1,
+# thereby preventing the database connection from being destroyed in
+# the parent when the dbh's destructor is called; (2) it replaces the
+# "real" dbh with the result of dbh->clone(), so that we now have an
+# independent handle.
+sub clone {
+ my $self = shift;
+ foreach (@{$self->{dbh}}) { $_->clone };
+}
+
=head2 attribute
Title : attribute
@@ -213,6 +224,18 @@
shift->{dbh}->{ActiveKids};
}
+# The clone method should only be called in child processes after a fork().
+# It does two things: (1) it sets the "real" dbh's InactiveDestroy to 1,
+# thereby preventing the database connection from being destroyed in
+# the parent when the dbh's destructor is called; (2) it replaces the
+# "real" dbh with the result of dbh->clone(), so that we now have an
+# independent handle.
+sub clone {
+ my $self = shift;
+ $self->{dbh}{InactiveDestroy} = 1;
+ $self->{dbh} = $self->{dbh}->clone;
+}
+
sub DESTROY { }
sub AUTOLOAD {
Modified: bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi.pm 2008-04-11 23:19:31 UTC (rev 14655)
+++ bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi.pm 2008-04-14 15:05:37 UTC (rev 14656)
@@ -1147,7 +1147,26 @@
}
}
+=head2 clone
+The clone() method should be used when you want to pass the
+Bio::DB::GFF object to a child process across a fork(). The child must
+call clone() before making any queries.
+
+This method does two things: (1) it sets the underlying database
+handle's InactiveDestroy parameter to 1, thereby preventing the
+database connection from being destroyed in the parent when the dbh's
+destructor is called; (2) it replaces the dbh with the result of
+dbh->clone(), so that we now have an independent handle.
+
+=cut
+
+sub clone {
+ my $self = shift;
+ $self->features_db->clone;
+}
+
+
=head1 QUERIES TO IMPLEMENT
The following astract methods either return DBI statement handles or
Modified: bioperl-live/trunk/Bio/DB/GFF.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/GFF.pm 2008-04-11 23:19:31 UTC (rev 14655)
+++ bioperl-live/trunk/Bio/DB/GFF.pm 2008-04-14 15:05:37 UTC (rev 14656)
@@ -3336,8 +3336,21 @@
return ();
}
+=head2 clone
+The clone() method should be used when you want to pass the
+Bio::DB::GFF object to a child process across a fork(). The child must
+call clone() before making any queries.
+The default behavior is to do nothing, but adaptors that use the DBI
+interface may need to implement this in order to avoid database handle
+errors. See the dbi adaptor for an example.
+
+=cut
+
+sub clone { }
+
+
=head1 Internal Methods
The following methods are internal to Bio::DB::GFF and are not
Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm 2008-04-11 23:19:31 UTC (rev 14655)
+++ bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm 2008-04-14 15:05:37 UTC (rev 14656)
@@ -339,6 +339,13 @@
$d;
}
+sub clone {
+ my $self = shift;
+ $self->{dbh}{InactiveDestroy} = 1;
+ $self->{dbh} = $self->{dbh}->clone
+ unless $self->is_temp;
+}
+
###
# get/set directory for bulk load tables
#
Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm 2008-04-11 23:19:31 UTC (rev 14655)
+++ bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm 2008-04-14 15:05:37 UTC (rev 14656)
@@ -1507,6 +1507,20 @@
$d;
}
+=head2 clone
+
+The clone() method should be used when you want to pass the
+Bio::DB::SeqFeature::Store object to a child process across a
+fork(). The child must call clone() before making any queries.
+
+The default behavior is to do nothing, but adaptors that use the DBI
+interface may need to implement this in order to avoid database handle
+errors. See the dbi adaptor for an example.
+
+=cut
+
+sub clone { }
+
################################# TIE interface ####################
=head1 TIE Interface
Modified: bioperl-live/trunk/t/BioDBGFF.t
===================================================================
--- bioperl-live/trunk/t/BioDBGFF.t 2008-04-11 23:19:31 UTC (rev 14655)
+++ bioperl-live/trunk/t/BioDBGFF.t 2008-04-14 15:05:37 UTC (rev 14656)
@@ -8,7 +8,7 @@
use lib 't/lib';
use BioperlTest;
- test_begin(-tests => 277);
+ test_begin(-tests => 279);
use_ok('Bio::DB::GFF');
}
@@ -419,13 +419,29 @@
}
}
+ # test ability to pass adaptors across a fork
+ if (my $child = open(F,"-|")) { # parent reads from child
+ ok(scalar <F>);
+ close F;
+ }
+ else { # in child
+ $db->clone;
+ my @f = $db->features();
+ print @f>0;
+ exit 0;
+ }
+
ok(!defined eval{$db->delete()});
ok($db->delete(-force=>1));
is(scalar $db->features,0);
ok(!$db->segment('Contig1'));
+
}
+
}
+
+
END {
unlink $fasta_files."/directory.index";
}
Modified: bioperl-live/trunk/t/BioDBSeqFeature.t
===================================================================
--- bioperl-live/trunk/t/BioDBSeqFeature.t 2008-04-11 23:19:31 UTC (rev 14655)
+++ bioperl-live/trunk/t/BioDBSeqFeature.t 2008-04-14 15:05:37 UTC (rev 14656)
@@ -2,7 +2,7 @@
# $Id$
use strict;
-use constant TEST_COUNT => 55;
+use constant TEST_COUNT => 57;
BEGIN {
use lib 't/lib';
@@ -169,4 +169,21 @@
is (@lines, 2);
ok("@lines" !~ /Parent=/s);
ok("@lines" =~ /ID=/s);
+
+if (my $child = open(F,"-|")) { # parent reads from child
+ cmp_ok(scalar <F>,'>',0);
+ close F;
+ # The challenge is to make sure that the handle
+ # still works in the parent!
+ my @f = $db->features();
+ cmp_ok(scalar @f,'>',0);
}
+else { # in child
+ $db->clone;
+ my @f = $db->features();
+ my $feature_count = @f;
+ print $feature_count;
+ exit 0;
+}
+
+}
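The fork-then-clone pattern exercised by the two tests in revision 14656 can be sketched without a database. This is a minimal illustration only: MockStore is a hypothetical stand-in (not a BioPerl class) whose clone() merely pretends to swap in a fresh handle, and count_features_in_child is an invented helper name. The real adaptors set InactiveDestroy and call dbh->clone() as shown in the diff.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a DBI-backed adaptor; NOT a BioPerl class.
package MockStore;

sub new      { my ($class) = @_; return bless { handle_id => 1, features => [qw(f1 f2 f3)] }, $class }
# Mimics the intent of clone(): give this process a handle of its own.
sub clone    { my ($self) = @_; $self->{handle_id}++; return }
sub features { my ($self) = @_; return @{ $self->{features} } }

package main;

# Fork a child that queries $db and return the feature count it reports.
sub count_features_in_child {
    my ($db) = @_;
    my $pid = open(my $from_child, '-|');    # implicit fork; child's STDOUT feeds the pipe
    die "fork failed: $!" unless defined $pid;
    if ($pid) {                               # parent reads from child
        my $count = <$from_child>;
        close $from_child;
        return $count;
    }
    $db->clone;                               # in child: clone before any query
    my @f = $db->features;
    print scalar @f;
    exit 0;
}

my $db          = MockStore->new;
my $child_count = count_features_in_child($db);
my @in_parent   = $db->features;              # the parent's handle must still work
print "child saw $child_count, parent sees ", scalar(@in_parent), "\n";
```

The point of calling clone() only in the child is that the parent's object is left untouched, which is what the BioDBSeqFeature.t test checks when it queries $db again after closing the pipe.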
From bugzilla-daemon at portal.open-bio.org Mon Apr 14 13:31:21 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 14 Apr 2008 13:31:21 -0400
Subject: [Bioperl-guts-l] [Bug 2484] This bug is repeated for several
sequences
In-Reply-To:
Message-ID: <200804141731.m3EHVLQW008772@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2484
cjfields at uiuc.edu changed:
What |Removed |Added
----------------------------------------------------------------------------
AssignedTo|birney at ebi.ac.uk |bioperl-guts-l at bioperl.org
Severity|normal |major
Version|unspecified |main-trunk
------- Comment #1 from cjfields at uiuc.edu 2008-04-14 13:31 EST -------
Confirmed using bioperl-live. Attaching raw EMBL file generating the error
(from dbfetch).
From bugzilla-daemon at portal.open-bio.org Mon Apr 14 13:32:56 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 14 Apr 2008 13:32:56 -0400
Subject: [Bioperl-guts-l] [Bug 2484] This bug is repeated for several
sequences
In-Reply-To:
Message-ID: <200804141732.m3EHWuoS008888@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2484
------- Comment #2 from cjfields at uiuc.edu 2008-04-14 13:32 EST -------
Created an attachment (id=905)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=905&action=view)
EMBL test case; pass through SeqIO to see error.
From bugzilla-daemon at portal.open-bio.org Mon Apr 14 13:42:02 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 14 Apr 2008 13:42:02 -0400
Subject: [Bioperl-guts-l] [Bug 2484] This bug is repeated for several
sequences
In-Reply-To:
Message-ID: <200804141742.m3EHg2Qo009257@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2484
------- Comment #3 from cjfields at uiuc.edu 2008-04-14 13:42 EST -------
using the (experimental) 'embldriver' format uses a different parser which
appears to work (using perl 5.10):
use Bio::SeqIO;
use feature qw(say);
my $in = Bio::SeqIO->new(-format => 'embldriver',
-file => 'input.embl');
my $seq = $in->next_seq;
say $seq->species->scientific_name;
say join(';',$seq->species->classification);
From cjfields at dev.open-bio.org Mon Apr 14 13:56:14 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Mon, 14 Apr 2008 13:56:14 -0400
Subject: [Bioperl-guts-l] [14657] bioperl-live/trunk/Bio/SeqIO/embl.pm: bug
2484
Message-ID: <200804141756.m3EHuEhF029778@dev.open-bio.org>
Revision: 14657
Author: cjfields
Date: 2008-04-14 13:56:14 -0400 (Mon, 14 Apr 2008)
Log Message:
-----------
bug 2484
Modified Paths:
--------------
bioperl-live/trunk/Bio/SeqIO/embl.pm
Modified: bioperl-live/trunk/Bio/SeqIO/embl.pm
===================================================================
--- bioperl-live/trunk/Bio/SeqIO/embl.pm 2008-04-14 15:05:37 UTC (rev 14656)
+++ bioperl-live/trunk/Bio/SeqIO/embl.pm 2008-04-14 17:56:14 UTC (rev 14657)
@@ -1038,7 +1038,7 @@
# only split on ';' or '.' so that classification that is 2 or more words
# will still get matched, use map() to remove trailing/leading/intervening
# spaces
- my @class = map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /[;\.]+/, $class_lines;
+ my @class = map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /(?
Message-ID: <200804141756.m3EHuiLj009980@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2484
cjfields at uiuc.edu changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #4 from cjfields at uiuc.edu 2008-04-14 13:56 EST -------
Fixed in subversion. thanks!
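For reference, the behaviour of the line removed in revision 14657 can be reproduced in isolation. The sketch below wraps that expression in a sub with an invented name (split_classification_old) and feeds it a made-up classification string; whether a period inside a taxon name is the exact case reported in bug 2484 is an assumption, and the replacement pattern is cut off in this archive, so it is not reconstructed here.

```perl
use strict;
use warnings;

# Same expression as the removed line in Bio::SeqIO::embl, wrapped in a sub
# (the sub name is hypothetical).
sub split_classification_old {
    my ($class_lines) = @_;
    return map { s/^\s+//; s/\s+$//; s/\s{2,}/ /g; $_; } split /[;\.]+/, $class_lines;
}

# Invented input: the last taxon contains an internal '.'.
my $class_lines = 'Eukaryota; Fungi;  Example gen. nov.';
my @class = split_classification_old($class_lines);

# Because '.' is a separator, the last taxon is cut into two entries.
print join('|', @class), "\n";
```

Splitting on every period is harmless for the terminating '.' of the classification block, but it also breaks up any name that legitimately contains one.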
From jason at dev.open-bio.org Mon Apr 14 17:06:16 2008
From: jason at dev.open-bio.org (Jason Stajich)
Date: Mon, 14 Apr 2008 17:06:16 -0400
Subject: [Bioperl-guts-l] [14658]
bioperl-live/trunk/Bio/Align/DNAStatistics.pm: typo
Message-ID: <200804142106.m3EL6G6F029990@dev.open-bio.org>
Revision: 14658
Author: jason
Date: 2008-04-14 17:06:16 -0400 (Mon, 14 Apr 2008)
Log Message:
-----------
typo
Modified Paths:
--------------
bioperl-live/trunk/Bio/Align/DNAStatistics.pm
Modified: bioperl-live/trunk/Bio/Align/DNAStatistics.pm
===================================================================
--- bioperl-live/trunk/Bio/Align/DNAStatistics.pm 2008-04-14 17:56:14 UTC (rev 14657)
+++ bioperl-live/trunk/Bio/Align/DNAStatistics.pm 2008-04-14 21:06:16 UTC (rev 14658)
@@ -1548,7 +1548,7 @@
=head2 get_syn_changes
Title : get_syn_changes
- Usage : Bio::Align::DNAStatitics->get_syn_chnages
+ Usage : Bio::Align::DNAStatitics->get_syn_changes
Function: Generate a hashref of all pairwise combinations of codns
differing by 1
Returns : Symetic matrix using hashes
From bugzilla-daemon at portal.open-bio.org Mon Apr 14 19:09:09 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 14 Apr 2008 19:09:09 -0400
Subject: [Bioperl-guts-l] [Bug 2332] Software for analysis of redundant
fragments of affys human mitochip v2
In-Reply-To:
Message-ID: <200804142309.m3EN99E5026902@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2332
marian.thieme at lycos.de changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #704 is|0 |1
obsolete| |
Attachment #706 is|0 |1
obsolete| |
Attachment #772 is|0 |1
obsolete| |
Attachment #773 is|0 |1
obsolete| |
Attachment #774 is|0 |1
obsolete| |
------- Comment #15 from marian.thieme at lycos.de 2008-04-14 19:09 EST -------
Created an attachment (id=906)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=906&action=view)
Updated Version of TestScript
Due to changes of the module ReseqChip.pm a new version of a Testscript is
provided.
From thm09830 at dev.open-bio.org Mon Apr 14 19:42:46 2008
From: thm09830 at dev.open-bio.org (Marian Thieme)
Date: Mon, 14 Apr 2008 19:42:46 -0400
Subject: [Bioperl-guts-l] [14659]
bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm: Minor changes in the
main processing function calc_sequence(), Logmessages are buffered,
hence only one file open/write/ close operation is needed per processed chip,
udpated version of testscript is provided via Bug #2332 in Bioperl bugzilla ,
Documentation updated
Message-ID: <200804142342.m3ENgkgv030357@dev.open-bio.org>
Revision: 14659
Author: thm09830
Date: 2008-04-14 19:42:46 -0400 (Mon, 14 Apr 2008)
Log Message:
-----------
Minor changes in the main processing function calc_sequence(). Log messages are buffered, hence only one file open/write/close operation is needed per processed chip. An updated version of the test script is provided via Bug #2332 in BioPerl Bugzilla. Documentation updated.
Modified Paths:
--------------
bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm
Modified: bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm
===================================================================
--- bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm 2008-04-14 21:06:16 UTC (rev 14658)
+++ bioperl-live/trunk/Bio/Microarray/Tools/ReseqChip.pm 2008-04-14 23:42:46 UTC (rev 14659)
@@ -1,6 +1,6 @@
#------------------------------------------------------------------------------
# PACKAGE : Bio::Microarray::Tools::ReseqChip
-# PURPOSE : Analyse redundant fragments of Affymetrix Resequencing Chip
+# PURPOSE : Analyse additional probe oligonucleotides of Resequencing Chips
# AUTHOR : Marian Thieme
# CREATED : 21.09.2007
# REVISION:
@@ -12,16 +12,15 @@
=head1 NAME
-Bio::Microarray::Tools::ReseqChip - Class for extraction and incorporation of information
- about redundant fragments of Affy Mitochip v2.0
+Bio::Microarray::Tools::ReseqChip - Class for analysing additional probe oligonucleotides of Resequencing Chips (for instance Affy Mitochip v2.0)
=head1 SYNOPSIS
- use RedundantFragments;
+ use ReseqChip;
my %ref_seq_max_ins_hash=(3106 => 1);
- my $reseqfragSample=Bio::Tools::ReseqChipRedundantFragments->new(
+ my $reseqfragSample=Bio::Tools::ReseqChip->new(
$Affy_frags_design_filename,
$format,
\%ref_seq_max_ins_hash,
@@ -30,6 +29,26 @@
my $aln = new Bio::SimpleAlign();
my $in = Bio::SeqIO->new(-file => $Affy_reseq_sample_fasta_file,
-format => 'Fasta');
+
+ my %options_hash=(
+ include_main_sequence => 1,
+ insertions => 1,
+ deletions => 1,
+ depth_ins => 1,
+ depth_del => 9,
+ depth => 1,
+ consider_context => 1,
+ flank_left => 10,
+ flank_right => 10,
+ allowed_n_in_flank => 0,
+ flank_left_ins => 4,
+ flank_right_ins => 4,
+ allowed_n_in_flank_ins => 1,
+ flank_size_weak => 1,
+ call_threshold => 55,
+ ins_threshold => 35,
+ del_threshold => 75,
+ swap_ins => 1);
while ( (my $seq = $in->next_seq())) {
@@ -43,25 +62,41 @@
}
$aln->add_seq($locseq);
}
- my $new_sequence=$reseqfragSample->calc_sequence($aln, $options_hash [,"output_file"]);
+ my $new_sequence=$reseqfragSample->calc_sequence($aln, \%options_hash [,"output_file"]);
=head1 DESCRIPTION
-Process Affy MitoChip v2 Data to create an alignment of the "redundant" fragments to the reference sequence,
-taking account for insertions/deletion which are defined by Affy mtDNA_Design_Annotion.xls file. Based on
-that alignment substitutions, deletions and insertion can be detected and initally not called bases can called
-as well possible falsly called bases can recalled. Moreover insertion and deletion as well as snps lying in highly
-variable regions can be detected. Calls are done depending on the depth at a certain position
-in the alignment and sequence reliability (in terms of certain number of allowed Ns in a k-base-window within
-each redundant fragment, contributing to a certain alignment position).
+This software module aims to infer information from the additional oligonucleotide probes, which cover different known variants.
+Oligonucleotide array based resequencing is done in the local context of a reference sequence. Every position in
+the genomic areas of interest is interrogated using 8 different 25-mer oligonucleotide probes (forward and reverse strand).
+Their middle base varies across the four possible bases, while the flanking regions are identical
+to the reference sequence or its reverse strand, respectively. For genomic regions with known variability across individuals,
+additional probes were added to the chip. They interrogate positions in the neighborhood of polymorphisms not only in the local context
+of the reference sequence but also in the context of its known variants.
+This software (ReseqChip.pm) is tested to work with MitoChip v2.0 data, manufactured by Affymetrix, and the parser (MitoChipV2Parser)
+reads the probe design file (Affy mtDNA_Design_Annotion.xls), which describes the design of the probes.
+The software approaches the problem in the following way:
+1. An alignment of the additional probes to the reference sequence is created (taking insertions/deletions into account).
+2. Based on that alignment, each position that is covered by at least one additional probe is investigated to find a consensus call.
+
+This is done indirectly by excluding those probes that appear to be inadequate for the individual. An indication of
+inadequacy is a local accumulation of N-calls. We investigate calls in neighborhoods of length K around
+each sequence position in all available local context probes and count the number of N-calls in them.
+That means that, in addition to the call obtained using the reference sequence base call, we obtain data from all alternative
+local background probes that were available for the current position. All probes with more than maxN N-calls in the
+K-neighborhood are excluded. Because different candidate bases may occur, we introduce two more parameters, minP and minU.
+If more than minP probes remain after filtering and more than minU percent of them call the base x,
+where x is the most frequently called base, then x is included in the final sequence; otherwise the letter N is included.
+
+
Assumption:
Gaps which are inserted in several fragments and in the reference sequence itself refer to the reference sequence.
The reference sequence is given as input parameter.
-Optionshash, specifying conditions if a call is done is given when calculating the sequence respect to redundant
-fragments (calc_sequence()).
+An options hash, specifying the parameters explained above and some further options, is provided by the user.
+
This module depends on the following modules:
use Bio::Microarray::Tools::MitoChipV2Parser
use Bio::SeqIO;
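The filtering and voting rule described in the new DESCRIPTION text (K, maxN, minP, minU) can be sketched as a standalone function. This is a rough illustration under several assumptions, not the code in ReseqChip.pm: the sub name consensus_call, the representation of each probe as a call string aligned to shared coordinates, the centring of the K-neighborhood on the position, and the strict "more than" comparisons are all choices made here. How these parameters map onto the keys of the options hash is not stated in the source.

```perl
use strict;
use warnings;

# $probes : arrayref of call strings, one per probe, aligned to the same coordinates
# $pos    : 0-based position to call
# $k      : neighborhood length; $max_n, $min_p, $min_u as in the description
sub consensus_call {
    my ($probes, $pos, $k, $max_n, $min_p, $min_u) = @_;
    my $half = int($k / 2);
    my (%votes, $kept);
    $kept = 0;
    for my $calls (@$probes) {
        next if $pos >= length $calls;            # probe does not cover this position
        my $start = $pos - $half;
        $start = 0 if $start < 0;
        my $window  = substr($calls, $start, $pos + $half - $start + 1);
        my $n_count = ($window =~ tr/Nn//);       # N-calls in the neighborhood
        next if $n_count > $max_n;                # probe judged inadequate
        $votes{ uc substr($calls, $pos, 1) }++;
        $kept++;
    }
    return 'N' unless $kept > $min_p;             # too few probes remain
    # Most frequently called base; ties broken alphabetically for determinism.
    my ($top) = sort { $votes{$b} <=> $votes{$a} || $a cmp $b } keys %votes;
    return 100 * $votes{$top} / $kept > $min_u ? $top : 'N';
}

# Invented example data: the third probe is mostly N and gets filtered out.
my @probes = ('ACGTACGTA', 'ACGTTCGTA', 'NNNNTNNNN', 'ACGTACGTA');
print consensus_call(\@probes, 4, 5, 1, 2, 60), "\n";
```

With the example data, three probes survive the N-filter and two of them agree, so the outcome depends on where minU sits relative to two thirds.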
@@ -107,6 +142,7 @@
use base qw(Bio::Root::Root);
+
use Bio::Microarray::Tools::MitoChipV2Parser;
use Bio::SeqIO;
@@ -133,7 +169,7 @@
member variables.
- Returns : Returns a new RedundantFragments object
+ Returns : Returns a new ReseqChip object
Args : $Affy_frags_design_filename (Affymetrix xls design file,
for instance: mtDNA_design_annotation_FINAL.xls for mitochondrial Genome)
@@ -158,7 +194,7 @@
sub new {
- my ($class, $fi