From bugzilla-daemon at portal.open-bio.org Thu Apr 1 08:36:21 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 1 Apr 2010 08:36:21 -0400
Subject: [Bioperl-guts-l] [Bug 3039] New: TreeIO::newick writes root node branch length incorrectly
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=3039

Summary: TreeIO::newick writes root node branch length incorrectly
Product: BioPerl
Version: main-trunk
Platform: All
OS/Version: Mac OS
Status: NEW
Severity: normal
Priority: P2
Component: Core Components
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: online at davemessina.com

It seems that TreeIO::newick might not be properly writing root nodes which have a branch length: it omits the colon.

e.g. If you read in

(a:1,b:2):0.0;

It will be written out as

(a:1,b:2)0.0;

In the latter case, 0.0 now looks like a node id, causing strict interpreters of newick (like PAML) to complain.

I'll attach a patch and a test in a moment. I'm not sure I understand the code that well, though; could another dev or two take a look before I commit?

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
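The behaviour described in the report can be illustrated with a minimal sketch. This is not the Bio::TreeIO::newick code and not the attached patch; `newick_node_suffix` is a hypothetical helper, written only to show why the colon has to be emitted whenever a branch length is present, even when the node (here the root) has no id.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::TreeIO::newick): build the text
# that follows a node's closing parenthesis in Newick output.
sub newick_node_suffix {
    my ($id, $branch_length) = @_;
    my $suffix = defined $id ? $id : '';
    # The colon must precede the branch length even if the id is empty,
    # otherwise "0.0" would be read back as a node id.
    $suffix .= ":$branch_length" if defined $branch_length;
    return $suffix;
}

# Unnamed root with a branch length, as in the bug report.
my $tree = '(a:1,b:2)' . newick_node_suffix(undef, '0.0') . ';';
print "$tree\n";    # prints (a:1,b:2):0.0;
```

With the colon in place, a strict reader can tell a branch length on the root apart from a root label.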
From bugzilla-daemon at portal.open-bio.org Thu Apr 1 08:37:52 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 1 Apr 2010 08:37:52 -0400
Subject: [Bioperl-guts-l] [Bug 3039] TreeIO::newick writes root node branch length incorrectly
In-Reply-To:
Message-ID: <201004011237.o31CbqaO007075@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3039

------- Comment #1 from online at davemessina.com 2010-04-01 08:37 EST -------
Created an attachment (id=1472)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1472&action=view)
modified version of newick.pm (Bio::TreeIO::newick)

From bugzilla-daemon at portal.open-bio.org Thu Apr 1 08:38:38 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 1 Apr 2010 08:38:38 -0400
Subject: [Bioperl-guts-l] [Bug 3039] TreeIO::newick writes root node branch length incorrectly
In-Reply-To:
Message-ID: <201004011238.o31CccST007108@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3039

------- Comment #2 from online at davemessina.com 2010-04-01 08:38 EST -------
Created an attachment (id=1473)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1473&action=view)
modified version of newick.t
From bugzilla-daemon at portal.open-bio.org Thu Apr 1 09:21:02 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 1 Apr 2010 09:21:02 -0400
Subject: [Bioperl-guts-l] [Bug 3039] TreeIO::newick writes root node branch length incorrectly
In-Reply-To:
Message-ID: <201004011321.o31DL2Z1008488@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3039

------- Comment #3 from online at davemessina.com 2010-04-01 09:21 EST -------
Created an attachment (id=1474)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1474&action=view)
modified version of newick.pm (Bio::TreeIO::newick) - no perltidy

This supersedes the previous attachment (1472). This one should be easily diff-able.

From bugzilla-daemon at portal.open-bio.org Thu Apr 1 09:21:27 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 1 Apr 2010 09:21:27 -0400
Subject: [Bioperl-guts-l] [Bug 3039] TreeIO::newick writes root node branch length incorrectly
In-Reply-To:
Message-ID: <201004011321.o31DLRtv008528@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3039

online at davemessina.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #1472 is|0                           |1
           obsolete|                            |
From lstein at dev.open-bio.org Thu Apr 1 14:44:14 2010 From: lstein at dev.open-bio.org (Lincoln Stein) Date: Thu, 1 Apr 2010 14:44:14 -0400 Subject: [Bioperl-guts-l] [16944] bioperl-live/trunk: merged changes Message-ID: <201004011844.o31IiEcT015555@dev.open-bio.org> Revision: 16944 Author: lstein Date: 2010-04-01 14:44:14 -0400 (Thu, 01 Apr 2010) Log Message: ----------- merged changes Modified Paths: -------------- bioperl-live/trunk/Bio/DB/Fasta.pm bioperl-live/trunk/Bio/DB/GFF.pm bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/Pg.pm bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm bioperl-live/trunk/t/LocalDB/SeqFeature.t Modified: bioperl-live/trunk/Bio/DB/Fasta.pm =================================================================== --- bioperl-live/trunk/Bio/DB/Fasta.pm 2010-03-31 20:34:05 UTC (rev 16943) +++ bioperl-live/trunk/Bio/DB/Fasta.pm 2010-04-01 18:44:14 UTC (rev 16944) @@ -630,7 +630,7 @@ sub set_pack_method { my $self = shift; - # Find the maximum file size: + # Find the maximum file size:eq my ($maxsize) = sort { $b <=> $a } map { -s $_ } @_; my $fourGB = (2 ** 32) - 1; @@ -1074,6 +1074,8 @@ },$class; } +sub fetch_sequence { shift->seq(@_) } + sub seq { my $self = shift; return $self->{db}->seq($self->{id},$self->{start},$self->{stop}); Modified: bioperl-live/trunk/Bio/DB/GFF.pm =================================================================== --- bioperl-live/trunk/Bio/DB/GFF.pm 2010-03-31 20:34:05 UTC (rev 16943) +++ bioperl-live/trunk/Bio/DB/GFF.pm 2010-04-01 18:44:14 UTC (rev 16944) @@ -2805,6 +2805,8 @@ $self->get_dna($id,$start,$stop,$class); } +sub fetch_sequence { shift->dna(@_) } + sub features_in_range { my $self = shift; my ($range_type,$refseq,$class,$start,$stop,$types,$parent,$sparse,$automerge,$iterator,$other) = Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/Pg.pm =================================================================== --- 
bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/Pg.pm 2010-03-31 20:34:05 UTC (rev 16943) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/Pg.pm 2010-04-01 18:44:14 UTC (rev 16944) @@ -6,7 +6,7 @@ =head1 NAME -Bio::DB::SeqFeature::Store::DBI::Pg -- Mysql implementation of Bio::DB::SeqFeature::Store +Bio::DB::SeqFeature::Store::DBI::Pg -- PostgreSQL implementation of Bio::DB::SeqFeature::Store =head1 SYNOPSIS @@ -163,8 +163,6 @@ use File::Spec; use constant DEBUG=>0; -# from the MySQL documentation... -# WARNING: if your sequence uses coordinates greater than 2 GB, you are out of luck! use constant MAX_INT => 2_147_483_647; use constant MIN_INT => -2_147_483_648; use constant MAX_BIN => 1_000_000_000; # size of largest feature = 1 Gb Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm 2010-03-31 20:34:05 UTC (rev 16943) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm 2010-04-01 18:44:14 UTC (rev 16944) @@ -159,6 +159,7 @@ use Bio::DB::GFF::Util::Rearrange 'rearrange'; use Bio::SeqFeature::Lite; use File::Spec; +use Carp 'carp','cluck'; use constant DEBUG=>0; # from the MySQL documentation... Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm 2010-03-31 20:34:05 UTC (rev 16943) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm 2010-04-01 18:44:14 UTC (rev 16944) @@ -334,8 +334,19 @@ L for a description of this. Default is the current directory. + -write Make the database writeable (implied by -create) + -fasta Provide an alternative DNA accessor object or path. + +By default the database will store DNA sequences internally. However, +you may override this behavior by passing either a path to a FASTA +file, or any Perl object that recognizes the seq($seqid,$start,$end) +method. 
In the former case, the FASTA path will be passed to
+Bio::DB::Fasta, possibly causing an index to be constructed. Suitable
+examples of the latter type of object include the Bio::DB::Sam and
+Bio::DB::Sam::Fai classes.
+
 =cut

 ###
@@ -343,12 +354,12 @@
 #
 sub new {
   my $self = shift;
-  my ($adaptor,$serializer,$index_subfeatures,$cache,$compress,$debug,$create,$args);
+  my ($adaptor,$serializer,$index_subfeatures,$cache,$compress,$debug,$create,$fasta,$args);
   if (@_ == 1) {
     $args = {DSN => shift}
   }
   else {
-    ($adaptor,$serializer,$index_subfeatures,$cache,$compress,$debug,$create,$args) =
+    ($adaptor,$serializer,$index_subfeatures,$cache,$compress,$debug,$create,$fasta,$args) =
       rearrange(['ADAPTOR',
                  'SERIALIZER',
                  'INDEX_SUBFEATURES',
@@ -356,6 +367,7 @@
                  'COMPRESS',
                  'DEBUG',
                  'CREATE',
+                 'FASTA',
                 ],@_);
   }
   $adaptor ||= 'DBI::mysql';
@@ -373,6 +385,7 @@
   $obj->serializer($serializer) if defined $serializer;
   $obj->index_subfeatures($index_subfeatures) if defined $index_subfeatures;
   $obj->seqfeature_class('Bio::DB::SeqFeature');
+  $obj->set_dna_accessor($fasta) if defined $fasta;
   $obj->post_init($args);
   $obj;
 }
@@ -984,7 +997,7 @@
 Location filters:
   -seq_id        Chromosome, contig or other DNA segment
-  -seqid         Synonym for -seqid
+  -seqid         Synonym for -seq_id
   -ref           Synonym for -seqid
   -start         Start of range
   -end           End of range
@@ -1251,7 +1264,7 @@
   my ($seqid,$start,$end,$class,$bioseq) = rearrange([['NAME','SEQID','SEQ_ID'],
                                                       'START',['END','STOP'],'CLASS','BIOSEQ'],@_);
   $seqid = "$seqid:$class" if defined $class;
-  my $seq = $self->_fetch_sequence($seqid,$start,$end);
+  my $seq = $self->seq($seqid,$start,$end);
   return $seq unless $bioseq;

   require Bio::Seq unless Bio::Seq->can('new');
@@ -1570,6 +1583,54 @@
   $d;
 }

+=head2 dna_accessor
+
+ Title   : dna_accessor
+ Usage   : $dna_accessor = $db->dna_accessor([$new_dna_accessor])
+ Function: get/set the name of the dna_accessor
+ Returns : the current dna_accessor object, if any
+ Args    : (optional) the dna_accessor object
+ Status  : public
+
+You can use this method to request or set the DNA accessor.
+
+=cut
+
+###
+# dna_accessor
+#
+sub dna_accessor {
+  my $self = shift;
+  my $d = $self->{dna_accessor};
+  $self->{dna_accessor} = shift if @_;
+  $d;
+}
+
+sub can_do_seq {
+  my $self = shift;
+  my $obj = shift;
+  return
+    UNIVERSAL::can($obj,'seq') ||
+    UNIVERSAL::can($obj,'fetch_sequence');
+}
+
+sub set_dna_accessor {
+  my $self = shift;
+  my $accessor = shift;
+  if (-e $accessor) { # a file, assume it is a fasta file
+    eval "require Bio::DB::Fasta" unless Bio::DB::Fasta->can('new');
+    my $a = Bio::DB::Fasta->new($accessor)
+      or croak "Can't open FASTA file $accessor: $!";
+    $self->dna_accessor($a);
+  }
+
+  if (ref $accessor && $self->can_do_seq($accessor)) {
+    $self->dna_accessor($accessor); # already built
+  }
+
+  return;
+}
+
 sub do_compress {
   my $self = shift;
   if (@_) {
@@ -1963,6 +2024,19 @@

 sub _fetch_sequence { shift->throw_not_implemented }

+sub seq {
+  my $self = shift;
+  my ($seq_id,$start,$end) = @_;
+  if (my $a = $self->dna_accessor) {
+    return $a->can('seq') ? $a->seq($seq_id,$start,$end)
+          :$a->can('fetch_sequence')? $a->fetch_sequence($seq_id,$start,$end)
+          : undef;
+  }
+  else {
+    return $self->_fetch_sequence($seq_id,$start,$end);
+  }
+}
+
 =head2 _seq_ids

  Title   : _seq_ids

Modified: bioperl-live/trunk/t/LocalDB/SeqFeature.t
===================================================================
--- bioperl-live/trunk/t/LocalDB/SeqFeature.t 2010-03-31 20:34:05 UTC (rev 16943)
+++ bioperl-live/trunk/t/LocalDB/SeqFeature.t 2010-04-01 18:44:14 UTC (rev 16944)
@@ -2,11 +2,10 @@
 # $Id$

 use strict;
-use constant TEST_COUNT => 74;
+use constant TEST_COUNT => 83;

 BEGIN {
-    use lib '/home/lstein/projects/bioperl-live';
-    use lib '.';
+    use lib '.','..';
     use Bio::Root::Test;

     test_begin(-tests => TEST_COUNT);
@@ -15,8 +14,12 @@
     use_ok('Bio::DB::SeqFeature::Store');
     use_ok('Bio::DB::SeqFeature::Store::GFF3Loader');
+    use_ok('Bio::Root::IO');
+    use_ok('File::Copy');
 }

+my $DEBUG = test_debug();
+
 my $gff_file = test_input_file('seqfeaturedb','test.gff3');

 my (@f,$f,@s,$s,$seq1,$seq2);
@@ -191,24 +194,24 @@
 is($c[0]->phase,0);
 is($c[1]->phase,1);

-SKIP: {
-    test_skip(-tests => 2, -excludes_os => 'mswin');
+ SKIP: {
+    test_skip(-tests => 2, -excludes_os => 'mswin');

-    if (my $child = open(F,"-|")) { # parent reads from child
-        cmp_ok(scalar <F>,'>',0);
-        close F;
-        # The challenge is to make sure that the handle
-        # still works in the parent!
-        my @f = $db->features();
-        cmp_ok(scalar @f,'>',0);
-    }
-    else { # in child
-        $db->clone;
-        my @f = $db->features();
-        my $feature_count = @f;
-        print $feature_count;
-        exit 0;
-    }
+    if (my $child = open(F,"-|")) { # parent reads from child
+        cmp_ok(scalar <F>,'>',0);
+        close F;
+        # The challenge is to make sure that the handle
+        # still works in the parent!
+ my @f = $db->features(); + cmp_ok(scalar @f,'>',0); + } + else { # in child + $db->clone; + my @f = $db->features(); + my $feature_count = @f; + print $feature_count; + exit 0; + } } @@ -233,39 +236,78 @@ @results = $db->search_notes('terribly interesting'); is(scalar @results,2,'keyword search; 2 terms'); +# test our ability to substitute a FASTA file for the database +my $fasta_dir = make_fasta_testdir(); +my $dbfa = Bio::DB::Fasta->new($fasta_dir, -reindex => 1); +ok($dbfa); +ok(my $contig1=$dbfa->seq('Contig1')); + +$db = Bio::DB::SeqFeature::Store->new(@args,-fasta=>$dbfa); +$loader = Bio::DB::SeqFeature::Store::GFF3Loader->new(-store=>$db); +ok($loader->load($gff_file)); + +ok($db->dna_accessor); +my $f = $db->segment('Contig1'); +ok($f->dna eq $contig1); + +ok(my $contig2 = $dbfa->seq('Contig2')); +($f) = $db->get_feature_by_name('match4'); +my $length = $f->length; +ok(substr($contig2,0,$length) eq $f->dna); + # testing namespaces for mysql and Pg adaptor -SKIP: { - my $adaptor; + SKIP: { + my $adaptor; - for (my $i=0; $i < @args; $i++) { - if ($args[$i] eq '-adaptor') { - $adaptor = $args[$i+1]; - last; - } - } + for (my $i=0; $i < @args; $i++) { + if ($args[$i] eq '-adaptor') { + $adaptor = $args[$i+1]; + last; + } + } - skip "Namespaces only supported for DBI::mysql and DBI::Pg adaptors", 5, if ($adaptor ne 'DBI::mysql' && $adaptor ne 'DBI::Pg'); + skip "Namespaces only supported for DBI::mysql and DBI::Pg adaptors", 5, if ($adaptor ne 'DBI::mysql' && $adaptor ne 'DBI::Pg'); @@ Diff output truncated at 10000 characters. 
@@

From bugzilla-daemon at portal.open-bio.org Mon Apr 5 05:00:45 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 5 Apr 2010 05:00:45 -0400
Subject: [Bioperl-guts-l] [Bug 3040] New: PAML parser fails on codeml output with 'nan'
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=3040

Summary: PAML parser fails on codeml output with 'nan'
Product: BioPerl
Version: main-trunk
Platform: All
OS/Version: Mac OS
Status: NEW
Severity: normal
Priority: P2
Component: Core Components
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: online at davemessina.com

Bio::Tools::Phylo::PAML fails silently on codeml output with dS=nan.

Noticed in output from codeml 4.4 (January 2010).

From dave_messina at dev.open-bio.org Mon Apr 5 06:15:43 2010
From: dave_messina at dev.open-bio.org (Dave Messina)
Date: Mon, 5 Apr 2010 06:15:43 -0400
Subject: [Bioperl-guts-l] [16945] bioperl-live/trunk: Fix for bug #3040.
Message-ID: <201004051015.o35AFhIi010715@dev.open-bio.org>

Revision: 16945
Author: dave_messina
Date: 2010-04-05 06:15:42 -0400 (Mon, 05 Apr 2010)

Log Message:
-----------
Fix for bug #3040. dS=nan is now parsed. As far as I can tell, allowing this non-numeric value isn't causing any downstream problems. All tests pass.
Modified Paths: -------------- bioperl-live/trunk/Bio/Tools/Phylo/PAML.pm bioperl-live/trunk/t/Tools/Phylo/PAML.t Added Paths: ----------- bioperl-live/trunk/t/data/codeml_nan.mlc Modified: bioperl-live/trunk/Bio/Tools/Phylo/PAML.pm =================================================================== --- bioperl-live/trunk/Bio/Tools/Phylo/PAML.pm 2010-04-01 18:44:14 UTC (rev 16944) +++ bioperl-live/trunk/Bio/Tools/Phylo/PAML.pm 2010-04-05 10:15:42 UTC (rev 16945) @@ -510,7 +510,7 @@ my ($phylip_header) = $self->_readline; $self->_parse_seqs; } - elsif ( ( @lines > 3 ) && ( $self->{'_already_parsed_seqs'} != 1 ) ) + elsif ( ( @lines >= 3 ) && ( $self->{'_already_parsed_seqs'} != 1 ) ) { #No gap $self->_parse_seqs; } @@ -920,23 +920,33 @@ if (/^pairwise comparison, codon frequencies\:\s*(\S+)\./) { $model = $1; } + # 1st line of a pair block, e.g. + # 2 (all_c7259) ... 1 (all_s57600) elsif (/^(\d+)\s+\((\S+)\)\s+\.\.\.\s+(\d+)\s+\((\S+)\)/) { ( $a, $b ) = ( $1, $3 ); } + # 2nd line of a pair block, e.g. + # lnL = -126.880601 elsif (/^lnL\s+\=\s*(\-?\d+(\.\d+)?)/) { $log = $1; if ( defined( $_ = $self->_readline ) ) { + # 3rd line of a pair block, e.g. + # 0.19045 2.92330 0.10941 s/^\s+//; ( $t, $kappa, $omega ) = split; } } + # 5th line of a pair block, e.g. 
+ # t= 0.1904 S= 5.8 N= 135.2 dN/dS= 0.1094 dN= 0.0476 dS= 0.4353 + # OR lines like (note last field; this includes a fix for bug #3040) + # t= 0.0439 S= 0.0 N= 141.0 dN/dS= 0.1626 dN= 0.0146 dS= nan elsif ( m/^t\=\s*(\d+(\.\d+)?)\s+ S\=\s*(\d+(\.\d+)?)\s+ N\=\s*(\d+(\.\d+)?)\s+ dN\/dS\=\s*(\d+(\.\d+)?)\s+ dN\=\s*(\d+(\.\d+)?)\s+ - dS\=\s*(\d+(\.\d+)?)/ox + dS\=\s*(\d+(\.\d+)?|nan)/ox ) { $result[ $b - 1 ]->[ $a - 1 ] = { @@ -950,6 +960,7 @@ 'dS' => $11 }; } + # 4th line of a pair block (which is blank) elsif (/^\s+$/) { next; } Modified: bioperl-live/trunk/t/Tools/Phylo/PAML.t =================================================================== --- bioperl-live/trunk/t/Tools/Phylo/PAML.t 2010-04-01 18:44:14 UTC (rev 16944) +++ bioperl-live/trunk/t/Tools/Phylo/PAML.t 2010-04-05 10:15:42 UTC (rev 16945) @@ -7,7 +7,7 @@ use lib '.'; use Bio::Root::Test; - test_begin(-tests => 247, + test_begin(-tests => 251, -requires_module => 'IO::String'); use_ok('Bio::Tools::Phylo::PAML'); @@ -482,4 +482,19 @@ is( $lastsite->[2], 0.971); is( $lastsite->[3], '*'); is( $lastsite->[4], 6.134); +} + +# bug #3040 +{ + my $parser = Bio::Tools::Phylo::PAML->new + (-file => test_input_file('codeml_nan.mlc')); + ok($parser); + + my $result = $parser->next_result; + ok($result); + + my $MLmatrix = $result->get_MLmatrix(); + ok($MLmatrix); + + is($MLmatrix->[1]->[2]->{'dS'}, 'nan', 'bug 3040'); } \ No newline at end of file Added: bioperl-live/trunk/t/data/codeml_nan.mlc =================================================================== --- bioperl-live/trunk/t/data/codeml_nan.mlc (rev 0) +++ bioperl-live/trunk/t/data/codeml_nan.mlc 2010-04-05 10:15:42 UTC (rev 16945) @@ -0,0 +1,177 @@ + +seed used = 785596301 + + +Data set 1 + 3 141 + +all_s57600 CTC TTC CAC ACC TCC CAC TCC CCC CCC CTC CAC TCC TTC TTC ACC TCC CCT TCC CCC TCC CTC CTT CCC TCC CCC ACC CTC TTC CTT TTC CCC TTC CAC TCC CCC TTC CTC CGC CTC CCC TCC CTC CCC CCC CAC CAC CCC +all_s56012 CTC CTC CTC ACC TCC CAC TCC CCC CCC TTC CAC TCC 
TTC TTC ACC CCC CCC TCC CCC CCC CTC CTC CCC TCC CCC ACC CTC TTC CTC TTC CCC TTC CAC TCC TCC TTC CTC CGC CTC TCC TCC CTC CCC CCC CAC CAC CCC +all_c11513 CTC CTC CTC ACC TCC CAC TCC CCC CCC TTC CAC TCC TTC TTC ACC CCC CCC TCC CCC CCC CTC CTC CCC TCC CCC ACC CCC TTC CTC TTC CCC TTC CAC TCC TCC TTC CTC CGC CTC TCC TCC CTC CCC CCC CAC CAC CCC + + + +Printing out site pattern counts + + + 11 93 P + +all_s57600 ACC CAC CAC CAC CAC CCC CCC CCC CCC CCC CCC CCT CGC CTC CTC CTC CTC CTC CTC CTT CTT TCC TCC TCC TCC TCC TCC TCC TTC TTC TTC +all_s56012 ... ... ... ... .T. ... ... ... ... ... T.. ..C ... ... ... ... ... ... T.. ..C ..C C.. C.. ... ... ... ... ... C.. ... ... +all_c11513 ... ... ... ... .T. ... ... ... ... ... T.. ..C ... .C. ... ... ... ... T.. ..C ..C C.. C.. ... ... ... ... ... C.. ... ... + + 3 3 1 1 1 1 5 1 1 1 2 1 1 1 1 + 2 1 1 1 1 1 1 1 1 1 3 1 1 1 5 + 1 + +CODONML (in paml version 4.4, January 2010) /scratch/davepaml17/cluster45_fake.phy +Model: One dN/dS ratio for branches Global clock +Codon frequency model: F3x4 +ns = 3 ls = 47 + +Codon usage in sequences +-------------------------------------------------------------------------------------------------------------- +Phe TTT 0 0 0 1 0 0 | Ser TCT 0 0 0 0 0 0 | Tyr TAT 0 0 0 0 0 0 | Cys TGT 0 0 0 0 0 0 + TTC 7 7 7 7 9 7 | TCC 9 10 8 10 8 9 | TAC 0 0 0 0 0 0 | TGC 0 0 0 0 0 0 +Leu TTA 0 0 0 0 0 0 | TCA 0 0 0 0 0 0 | *** TAA 0 0 0 0 0 0 | *** TGA 0 0 0 0 0 0 + TTG 0 0 0 0 0 0 | TCG 0 0 0 0 0 0 | TAG 0 0 0 0 0 0 | Trp TGG 0 0 0 0 0 0 +-------------------------------------------------------------------------------------------------------------- +Leu CTT 2 1 1 2 1 0 | Pro CCT 1 0 0 0 0 0 | His CAT 0 0 0 0 0 0 | Arg CGT 0 0 0 0 1 0 + CTC 7 9 4 6 8 10 | CCC 11 11 19 12 13 12 | CAC 6 5 4 5 4 5 | CGC 1 1 1 0 0 1 + CTA 0 0 0 0 0 0 | CCA 0 0 0 0 0 0 | Gln CAA 0 0 0 0 0 0 | CGA 0 0 0 0 0 0 + CTG 0 0 0 0 0 0 | CCG 0 0 0 0 0 0 | CAG 0 0 0 0 0 0 | CGG 0 0 0 0 0 0 
+-------------------------------------------------------------------------------------------------------------- +Ile ATT 0 0 0 0 0 0 | Thr ACT 0 0 0 0 0 0 | Asn AAT 0 0 0 0 0 0 | Ser AGT 0 0 0 0 0 0 + ATC 0 0 0 0 0 0 | ACC 3 3 3 4 3 3 | AAC 0 0 0 0 0 0 | AGC 0 0 0 0 0 0 + ATA 0 0 0 0 0 0 | ACA 0 0 0 0 0 0 | Lys AAA 0 0 0 0 0 0 | Arg AGA 0 0 0 0 0 0 +Met ATG 0 0 0 0 0 0 | ACG 0 0 0 0 0 0 | AAG 0 0 0 0 0 0 | AGG 0 0 0 0 0 0 +-------------------------------------------------------------------------------------------------------------- +Val GTT 0 0 0 0 0 0 | Ala GCT 0 0 0 0 0 0 | Asp GAT 0 0 0 0 0 0 | Gly GGT 0 0 0 0 0 0 + GTC 0 0 0 0 0 0 | GCC 0 0 0 0 0 0 | GAC 0 0 0 0 0 0 | GGC 0 0 0 0 0 0 + GTA 0 0 0 0 0 0 | GCA 0 0 0 0 0 0 | Glu GAA 0 0 0 0 0 0 | GGA 0 0 0 0 0 0 + GTG 0 0 0 0 0 0 | GCG 0 0 0 0 0 0 | GAG 0 0 0 0 0 0 | GGG 0 0 0 0 0 0 +-------------------------------------------------------------------------------------------------------------- + +-------------------------------------------------------------------------------------------------- +Phe TTT 0 0 0 0 0 | Ser TCT 0 0 0 0 0 | Tyr TAT 0 0 0 0 0 | Cys TGT 0 0 0 0 0 + TTC 7 8 9 7 7 | TCC 9 12 9 10 10 | TAC 0 0 0 0 0 | TGC 0 0 0 0 0 +Leu TTA 0 0 0 0 0 | TCA 0 0 0 0 0 | *** TAA 0 0 0 0 0 | *** TGA 0 0 0 0 0 + TTG 0 0 0 0 0 | TCG 0 0 0 0 0 | TAG 0 0 0 0 0 | Trp TGG 0 0 0 0 0 +-------------------------------------------------------------------------------------------------- +Leu CTT 0 1 1 1 1 | Pro CCT 0 0 1 0 0 | His CAT 0 0 1 0 0 | Arg CGT 0 0 0 0 0 + CTC 9 8 6 9 9 | CCC 13 8 11 11 11 | CAC 5 5 5 5 5 | CGC 1 1 0 1 1 + CTA 0 0 0 0 0 | CCA 0 0 0 0 0 | Gln CAA 0 0 0 0 0 | CGA 0 0 0 0 0 + CTG 0 0 0 0 0 | CCG 0 0 0 0 0 | CAG 0 0 0 0 0 | CGG 0 0 0 0 0 +-------------------------------------------------------------------------------------------------- +Ile ATT 0 0 0 0 0 | Thr ACT 0 0 0 0 0 | Asn AAT 0 0 0 0 0 | Ser AGT 0 0 0 0 0 + ATC 0 0 0 0 0 | ACC 3 4 4 3 3 | AAC 0 0 0 0 0 | AGC 0 0 0 0 0 + ATA 0 0 0 0 0 | ACA 0 0 0 0 
0 | Lys AAA 0 0 0 0 0 | Arg AGA 0 0 0 0 0 +Met ATG 0 0 0 0 0 | ACG 0 0 0 0 0 | AAG 0 0 0 0 0 | AGG 0 0 0 0 0 +-------------------------------------------------------------------------------------------------- +Val GTT 0 0 0 0 0 | Ala GCT 0 0 0 0 0 | Asp GAT 0 0 0 0 0 | Gly GGT 0 0 0 0 0 + GTC 0 0 0 0 0 | GCC 0 0 0 0 0 | GAC 0 0 0 0 0 | GGC 0 0 0 0 0 + GTA 0 0 0 0 0 | GCA 0 0 0 0 0 | Glu GAA 0 0 0 0 0 | GGA 0 0 0 0 0 + GTG 0 0 0 0 0 | GCG 0 0 0 0 0 | GAG 0 0 0 0 0 | GGG 0 0 0 0 0 +-------------------------------------------------------------------------------------------------- + +Codon position x base (3x4) table for each sequence. + +#1: all_s57600 +position 1: T:0.34043 C:0.59574 A:0.06383 G:0.00000 +position 2: T:0.34043 C:0.51064 A:0.12766 G:0.02128 +position 3: T:0.06383 C:0.93617 A:0.00000 G:0.00000 +Average T:0.24823 C:0.68085 A:0.06383 G:0.00709 + +#2: all_s56012 +position 1: T:0.34043 C:0.59574 A:0.06383 G:0.00000 +position 2: T:0.36170 C:0.51064 A:0.10638 G:0.02128 +position 3: T:0.00000 C:1.00000 A:0.00000 G:0.00000 +Average T:0.23404 C:0.70213 A:0.05674 G:0.00709 + +#3: all_c11513 @@ Diff output truncated at 10000 characters. @@ From bugzilla-daemon at portal.open-bio.org Mon Apr 5 06:16:55 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 5 Apr 2010 06:16:55 -0400 Subject: [Bioperl-guts-l] [Bug 3040] PAML parser fails on codeml output with 'nan' In-Reply-To: Message-ID: <201004051016.o35AGtxp018012@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3040 online at davemessina.com changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #1 from online at davemessina.com 2010-04-05 06:16 EST ------- Fixed in r16945. Added a test. All tests pass. 
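The change in r16945 widens the final capture of the pairwise summary pattern so that it accepts `nan` as well as a number. A simplified sketch of that idea follows. It is not the Bio::Tools::Phylo::PAML code: `parse_pair_line` and the hash keys are hypothetical names chosen for illustration, and only the two sample lines quoted in the commit's comments are used as input.

```perl
use strict;
use warnings;

# Simplified stand-in for the pairwise summary pattern. Only the last
# field (dS) is allowed to be the literal string "nan".
my $num       = qr/\d+(?:\.\d+)?/;
my $pair_line = qr/^t=\s*($num)\s+S=\s*($num)\s+N=\s*($num)\s+
                   dN\/dS=\s*($num)\s+dN=\s*($num)\s+dS=\s*($num|nan)/x;

# Hypothetical helper: returns a hash ref of the six fields, or undef
# when the line does not look like a pairwise summary line.
sub parse_pair_line {
    my ($line) = @_;
    my @values = $line =~ $pair_line or return;
    my %row;
    @row{qw(t S N omega dN dS)} = @values;
    return \%row;
}

my $numeric = parse_pair_line(
    't= 0.1904 S= 5.8 N= 135.2 dN/dS= 0.1094 dN= 0.0476 dS= 0.4353');
my $with_nan = parse_pair_line(
    't= 0.0439 S= 0.0 N= 141.0 dN/dS= 0.1626 dN= 0.0146 dS= nan');

print "dS: $numeric->{dS}\n";     # prints dS: 0.4353
print "dS: $with_nan->{dS}\n";    # prints dS: nan
```

Callers that do arithmetic on dS would still need to treat the string `nan` specially; the sketch only shows that the line is no longer skipped.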
From bugzilla-daemon at portal.open-bio.org Mon Apr 5 08:46:36 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 5 Apr 2010 08:46:36 -0400
Subject: [Bioperl-guts-l] [Bug 3041] New: BioPerl breaks listing of updateable modules in a CPAN shell
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=3041

Summary: BioPerl breaks listing of updateable modules in a CPAN shell
Product: BioPerl
Version: 1.6 branch
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Core Components
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: fernan at iib.unsam.edu.ar

When using CPAN to install modules, bioperl breaks the listing of modules that can be updated. Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm seems to be the culprit.

How to reproduce:

First, make sure BioPerl is installed:

1) open a CPAN shell and install BioPerl

cpan> install CJFIELDS/BioPerl-1.6.0.tar.gz
# we're installing an older version on purpose so that later CPAN will show there is an update for this package

When BioPerl is installed, you can ask CPAN about modules that need updating

2) in your CPAN shell type 'r'

cpan> r

and notice how BioPerl breaks the listing of modules

Package namespace     installed   latest     in CPAN file
Any::Moose            0.11        0.12       SARTAK/Any-Moose-0.12.tar.gz
JSON                  2.19        2.20       MAKAMAKA/JSON-2.20.tar.gz
...
Bio::Align::AlignI    undef       1.006001   CJFIELDS/BioPerl-1.6.1.tar.gz
Could not eval '
    package ExtUtils::MakeMaker::_version;
    no strict;
    BEGIN { eval {
        # Ensure any version() routine which might have leaked
        # into this package has been deleted. Interferes with
        # version->import()
        undef *version;
        require version;
        "version"->import;
    } }

    local $Graph::VERSION;
    $Graph::VERSION=undef;
    do {
        ( defined $Graph::VERSION && $Graph::VERSION >= 0.5 ) ?
    }; $Graph::VERSION;
' in /usr/local/share/perl/5.10.0/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm: syntax error at (eval 3244) line 17, at EOF
Could not eval '
    package ExtUtils::MakeMaker::_version;
    no strict;
    BEGIN { eval {
        # Ensure any version() routine which might have leaked
        # into this package has been deleted. Interferes with
        # version->import()
        undef *version;
        require version;
        "version"->import;
    } }

    local $Graph::VERSION;
    $Graph::VERSION=undef;
    do {
        ( defined $Graph::VERSION && $Graph::VERSION >= 0.5 ) ?
    }; $Graph::VERSION;
' in /usr/local/share/perl/5.10.0/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm: syntax error at (eval 3244) line 17, at EOF
Could not eval '
    package ExtUtils::MakeMaker::_version;
    no strict;
    BEGIN { eval {
        # Ensure any version() routine which might have leaked
        # into this package has been deleted. Interferes with
        # version->import()
        undef *version;
        require version;
        "version"->import;
    } }

    local $Graph::VERSION;
    $Graph::VERSION=undef;
    do {
        ( defined $Graph::VERSION && $Graph::VERSION >= 0.5 ) ?
    }; $Graph::VERSION;
' in /usr/local/share/perl/5.10.0/Bio/Ontology/SimpleGOEngine/GraphAdaptor.pm: syntax error at (eval 3245) line 17, at EOF
Compress::Zlib        2.020       2.025      PMQS/IO-Compress-2.025.tar.gz
Image::Magick         6.5.1       6.005009   JCRISTY/PerlMagick-6.59.tar.gz
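The error text suggests that a single line of GraphAdaptor.pm was picked up as if it were a `$VERSION` assignment and then evaluated on its own. The sketch below illustrates that reading and is an assumption, not a confirmed diagnosis: both patterns are written for this example and are not copied from any ExtUtils::MakeMaker or CPAN release. The first is a loose "VERSION followed by an equals sign" test, the second also requires the `=` not to be part of a comparison operator.

```perl
use strict;
use warnings;

# Assumed, simplified scanner patterns (illustrative only).
my $loose  = qr/([\$*])(([\w:']*)\bVERSION)\b.*=/;
my $strict = qr/([\$*])(([\w:']*)\bVERSION)\b.*(?<![<>=!])=(?![=~])/;

# The fragment quoted in the "Could not eval" message, and an ordinary
# version assignment for comparison.
my $fragment   = '( defined $Graph::VERSION && $Graph::VERSION >= 0.5 ) ?';
my $assignment = 'our $VERSION = "1.006001";';

for my $line ($fragment, $assignment) {
    printf "%-58s loose=%s strict=%s\n", $line,
        ($line =~ $loose  ? 'yes' : 'no'),
        ($line =~ $strict ? 'yes' : 'no');
}

# Evaluating the fragment by itself is a syntax error, because the
# ternary operator is left unfinished.
my $ok = eval "no strict; $fragment; 1";
print "fragment compiles: ", (defined $ok ? 'yes' : 'no'), "\n";
```

Under these assumptions the loose pattern matches the `>=` comparison, while the stricter one accepts only the real assignment. Another way to avoid the problem would be to keep the comparison off any line that mentions `$Graph::VERSION` together with an equals sign.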
From bugzilla-daemon at portal.open-bio.org Mon Apr 5 10:14:14 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 5 Apr 2010 10:14:14 -0400
Subject: [Bioperl-guts-l] [Bug 3041] BioPerl breaks listing of updateable modules in a CPAN shell
In-Reply-To:
Message-ID: <201004051414.o35EEEDs024577@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3041

------- Comment #1 from cjfields at bioperl.org 2010-04-05 10:14 EST -------
Fernan, this doesn't seem to be very widespread (it's the first time I've heard of it since the 1.6.1 release, and it wasn't one of the fail reasons on CPAN Testers).

We will need to know the (perl version|CPAN version|OS) to try reproducing this.

From bugzilla-daemon at portal.open-bio.org Mon Apr 5 10:48:22 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 5 Apr 2010 10:48:22 -0400
Subject: [Bioperl-guts-l] [Bug 3041] BioPerl breaks listing of updateable modules in a CPAN shell
In-Reply-To:
Message-ID: <201004051448.o35EmMFd025290@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3041

------- Comment #2 from fernan at iib.unsam.edu.ar 2010-04-05 10:48 EST -------
(In reply to comment #1)
> Fernan, this doesn't seem to be very widespread (it's the first time I've heard
> of it since the 1.6.1 release, and it wasn't one of the fail reasons on CPAN
> Testers).

This has been happening since ~ 1.5.x, though I always thought someone would notice and fix it :)

It doesn't cause a failure when installing ... so I would not expect it to be detected by CPAN Testers. It's a cosmetic issue that only affects the listing of updates in the CPAN shell. It doesn't affect looking for information on a specific module, e.g. the following works fine:

cpan> i /BioPerl/

cpan[3]> i /BioPerl/
Bundle       Bundle::BioPerl (CRAFFI/Bundle-BioPerl-2.1.8.tar.gz)
Distribution BIRNEY/bioperl-1.2.2.tar.gz
Distribution BIRNEY/bioperl-1.2.3.tar.gz
Distribution BIRNEY/bioperl-1.2.tar.gz
Distribution BIRNEY/bioperl-1.4.tar.gz
Distribution BIRNEY/bioperl-db-0.1.tar.gz
Distribution BIRNEY/bioperl-ext-1.4.tar.gz
Distribution BIRNEY/bioperl-gui-0.7.tar.gz
Distribution BIRNEY/bioperl-run-1.4.tar.gz
Distribution BOZO/Fry-Lib-BioPerl-0.15.tar.gz
Distribution CJFIELDS/BioPerl-1.6.0.tar.gz
Distribution CJFIELDS/BioPerl-1.6.1.tar.gz
Distribution CJFIELDS/BioPerl-db-1.6.0.tar.gz
Distribution CJFIELDS/BioPerl-network-1.6.0.tar.gz
Distribution CJFIELDS/BioPerl-run-1.6.1.tar.gz
Distribution CRAFFI/Bundle-BioPerl-2.1.8.tar.gz
Module     < Bio::LiveSeq::IO::BioPerl (CJFIELDS/BioPerl-1.6.1.tar.gz)
Module       Fry::Lib::BioPerl (BOZO/Fry-Lib-BioPerl-0.15.tar.gz)
Author       BIOPERLML ("Bioperl-l" )
19 items found

Seems like CPAN-shell is looking inside modules for version information, maybe comparing what's installed to what's available, and failing ... apparently this is only done at this stage (looking for out of date modules) so it's not preventing installation of the Distribution.

> We will need to know the (perl version|CPAN version|OS) to try reproducing
> this.

CPAN 1.9402 (ANDK/CPAN-1.9402.tar.gz)
perl 5.10.0
Ubuntu Linux 9.10
From bugzilla-daemon at portal.open-bio.org Mon Apr 5 11:21:00 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 5 Apr 2010 11:21:00 -0400 Subject: [Bioperl-guts-l] [Bug 3041] BioPerl breaks listing of updateable modules in a CPAN shell In-Reply-To: Message-ID: <201004051521.o35FL0v9026216@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3041 ------- Comment #3 from cjfields at bioperl.org 2010-04-05 11:20 EST ------- (In reply to comment #2) > (In reply to comment #1) > > Fernan, this doesn't seem to be very widespread (it's the first time I've heard > > of it since the 1.6.1 release, and it wasn't one of the fail reasons on CPAN > > Testers). > > This is happening since ~ 1.5.x. Though I always thought someone would notice > and fix it :) Nope, never noticed it. It's the first time I recall it being reported. > Seems like CPAN-shell is looking inside modules for version information, maybe > comparing what's installed to what's available, and failing ... apparently this > is only done at this stage (looking for out of date modules) so it's not > preventing installation of the Distribution. > > > > We will need to know the (perl version|CPAN version|OS) to try reproducing > > this. > > CPAN 1.9402 (ANDK/CPAN-1.9402.tar.gz) > perl 5.10.0 > Ubuntu Linux 9.10 Okay, I'll check it out. I will be installing a local perl 5.10.1 and perl 5.12 on my Ubuntu box for testing later this week, I can probably try replicating this there. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Wed Apr 7 00:00:39 2010 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Wed, 7 Apr 2010 00:00:39 -0400 Subject: [Bioperl-guts-l] [16946] bioperl-live/trunk: [bug 2399] n() should be at least 1, but not an empty string. 
Message-ID: <201004070400.o3740dsJ025174@dev.open-bio.org> Revision: 16946 Author: cjfields Date: 2010-04-07 00:00:39 -0400 (Wed, 07 Apr 2010) Log Message: ----------- [bug 2399] n() should be at least 1, but not an empty string. This required some fixes to old tiling code to ignore n(). Tests added Modified Paths: -------------- bioperl-live/trunk/Bio/Search/HSP/GenericHSP.pm bioperl-live/trunk/Bio/Search/SearchUtils.pm bioperl-live/trunk/t/SearchIO/blast.t Modified: bioperl-live/trunk/Bio/Search/HSP/GenericHSP.pm =================================================================== --- bioperl-live/trunk/Bio/Search/HSP/GenericHSP.pm 2010-04-05 10:15:42 UTC (rev 16945) +++ bioperl-live/trunk/Bio/Search/HSP/GenericHSP.pm 2010-04-07 04:00:39 UTC (rev 16946) @@ -1191,7 +1191,8 @@ sub n { my $self = shift; if(@_) { $self->{'N'} = shift; } - defined $self->{'N'} ? $self->{'N'} : ''; + # note that returning 1 is completely an assumption + defined $self->{'N'} ? $self->{'N'} : 1; } =head2 range Modified: bioperl-live/trunk/Bio/Search/SearchUtils.pm =================================================================== --- bioperl-live/trunk/Bio/Search/SearchUtils.pm 2010-04-05 10:15:42 UTC (rev 16945) +++ bioperl-live/trunk/Bio/Search/SearchUtils.pm 2010-04-07 04:00:39 UTC (rev 16946) @@ -133,30 +133,31 @@ #$sbjct->verbose(1); # to activate debugging $sbjct->tiled_hsps(1); - if( $sbjct->num_hsps == 0 || $sbjct->n == 0 ) { + # changed to not rely on n() (which is unreliable here) --cjfields 4/6/10 + if( $sbjct->num_hsps == 0) { #print STDERR "_tile_hsps(): no hsps, nothing to tile! (", $sbjct->num_hsps, ")\n"; _warn_about_no_hsps($sbjct); return (undef, undef); - } elsif( $sbjct->n == 1 or $sbjct->num_hsps == 1) { + } elsif($sbjct->num_hsps == 1) { ## Simple summation scheme. Valid if there is only one HSP. 
- #print STDERR "_tile_hsps(): single HSP, easy stats.\n"; - my $hsp = $sbjct->hsp; - $sbjct->length_aln('query', $hsp->length('query')); - $sbjct->length_aln('hit', $hsp->length('sbjct')); - $sbjct->length_aln('total', $hsp->length('total')); - $sbjct->matches( $hsp->matches() ); - $sbjct->gaps('query', $hsp->gaps('query')); - $sbjct->gaps('sbjct', $hsp->gaps('sbjct')); + #print STDERR "_tile_hsps(): single HSP, easy stats.\n"; + my $hsp = $sbjct->hsp; + $sbjct->length_aln('query', $hsp->length('query')); + $sbjct->length_aln('hit', $hsp->length('sbjct')); + $sbjct->length_aln('total', $hsp->length('total')); + $sbjct->matches( $hsp->matches() ); + $sbjct->gaps('query', $hsp->gaps('query')); + $sbjct->gaps('sbjct', $hsp->gaps('sbjct')); _adjust_length_aln($sbjct); - return (1, 1); + return (1, 1); } else { - #print STDERR "Sbjct: _tile_hsps: summing multiple HSPs\n"; - $sbjct->length_aln('query', 0); - $sbjct->length_aln('sbjct', 0); - $sbjct->length_aln('total', 0); - $sbjct->matches( 0, 0); + #print STDERR "Sbjct: _tile_hsps: summing multiple HSPs\n"; + $sbjct->length_aln('query', 0); + $sbjct->length_aln('sbjct', 0); + $sbjct->length_aln('total', 0); + $sbjct->matches( 0, 0); $sbjct->gaps('query', 0); $sbjct->gaps('hit', 0); } Modified: bioperl-live/trunk/t/SearchIO/blast.t =================================================================== --- bioperl-live/trunk/t/SearchIO/blast.t 2010-04-05 10:15:42 UTC (rev 16945) +++ bioperl-live/trunk/t/SearchIO/blast.t 2010-04-07 04:00:39 UTC (rev 16946) @@ -7,7 +7,7 @@ use lib '.'; use Bio::Root::Test; - test_begin(-tests => 1093); + test_begin(-tests => 1116); use_ok('Bio::SearchIO'); } @@ -81,6 +81,7 @@ is(sprintf("%.4f",$hsp->frac_identical('query')), 0.9829); is(sprintf("%.4f",$hsp->frac_identical('hit')), 0.9829); is($hsp->gaps, 0); + is($hsp->n, 1); $hsps_left--; } is($hsps_left, 0); @@ -184,6 +185,7 @@ is($hsp->frac_identical('query'), 1.00); is($hsp->frac_identical('hit'), 1.00); is($hsp->gaps, 0); + is($hsp->n, 
1); $hsps_left--; } is($hsps_left, 0); @@ -262,6 +264,7 @@ is($hsp->frac_identical('query'), 1.00); is($hsp->frac_identical('hit'), 1.00); is($hsp->gaps, 0); + is($hsp->n, 1); $hsps_left--; } is($hsps_left, 0); @@ -345,6 +348,7 @@ is(join(' ', $hsp->seq_inds('query', 'nomatch',1)), '1063-1065 1090-1095 1099-1104 1108-1113 1117-1125'); is(join(' ', $hsp->seq_inds('hit', 'nomatch',1)), '5825-5833 5837-5842 5846-5851 5855-5860 5885-5887'); is($hsp->ambiguous_seq_inds, 'query/subject'); + is($hsp->n, 1); $hsps_left--; } is($hsps_left, 0); @@ -435,6 +439,7 @@ my $aln = $hsp->get_aln; is(sprintf("%.2f", $aln->overall_percentage_identity), 96.67); is(sprintf("%.2f",$aln->percentage_identity), 98.31); + is($hsp->n, 1); $hsps_left--; } is($hsps_left, 0); @@ -523,6 +528,7 @@ is(join(' ', $hsp->seq_inds('query', 'gaps',1)), '347 1004'); is(join(' ', $hsp->seq_inds('hit', 'gaps',1)), '100 131 197 362 408'); is($hsp->ambiguous_seq_inds, 'query'); + is($hsp->n, 1); $hsps_left--; } is($hsps_left, 0); @@ -640,6 +646,7 @@ is(join(' ', $hsp->seq_inds('query', 'gaps',1)), '109 328'); is(join(' ', $hsp->seq_inds('hit', 'gaps',1)), '5077 5170 5368 5863 6001'); is($hsp->ambiguous_seq_inds, 'subject'); + is($hsp->n, 1); $hsps_left--; } is($hsps_left, 0); @@ -709,7 +716,8 @@ is($hsp->hit->frame(), 1); is($hsp->gaps('query'), 0); is($hsp->gaps('hit'), 0); - is($hsp->gaps, 0); + is($hsp->gaps, 0); + is($hsp->n, 1); is($hsp->query_string, 'ALDYLLSRGFTKELINEFQIGYALDSWDFITKFLVKRGFSEAQMEKAGLLIRREDGSGY'); is($hsp->hit_string, 'ARQYLEKRGLSHEVIARFAIGFAPPGWDNVLKRFGGNPENRQSLIDAGMLVTNDQGRSY'); is($hsp->homology_string, 'A YL RG + E+I F IG+A WD + K + + AG+L+ + G Y'); @@ -744,6 +752,7 @@ is($hsp->gaps('query'), 0); is($hsp->gaps('hit'), 0); is($hsp->gaps, 0); + is($hsp->n, 1); is($hsp->query_string, 'WLPRALPEKATTAP**SWIGNMTRFLKRSKYPLPSSRLIR'); is($hsp->hit_string, 'WLSRTTVGSSTVSPRTFWITRMKVKLSSSKVTLPSTKSTR'); is($hsp->homology_string, 'WL R +T +P WI M L SK LPS++ R'); @@ -806,6 +815,7 @@ 
is($hsp->frac_identical('query'), 1.00); is($hsp->frac_identical('hit'), 1.00); is($hsp->gaps, 0); + is($hsp->n, 1); $hsps_left--; } is($hsps_left, 0); @@ -969,6 +979,7 @@ is($hsp->hit->start, shift @$d); is($hsp->hit->end, shift @$d); is($hsp->hit->strand, shift @$d); + is($hsp->n, 1); $hits_left--; } is($hits_left, 0); @@ -1070,6 +1081,7 @@ is(sprintf("%.4f",$hsp->frac_identical('query')), 0.4757); is(sprintf("%.3f",$hsp->frac_identical('hit')), 0.482); is($hsp->gaps, 18); + is($hsp->n, 1); $hsps_left--; } is($hsps_left, 0); @@ -1186,6 +1198,7 @@ is($hsp->hit->start, 1); is($hsp->hit->end,94); is($hsp->gaps, 7); +is($hsp->n, 1); # this is blastn bl2seq $searchio = Bio::SearchIO->new(-format => 'blast', @@ -1211,6 +1224,7 @@ is($hsp->hit->start, 86); is($hsp->hit->end,179); is($hsp->gaps, 7); +is($hsp->n, 1); # this is blastp bl2seq $searchio = Bio::SearchIO->new(-format => 'blast', @@ -1237,6 +1251,8 @@ is($hsp->hit->start, 1); is($hsp->hit->end,469); is($hsp->gaps, 120); +is($hsp->n, 1); + ok($hit->next_hsp); # there is more than one HSP here, # make sure it is parsed at least @@ -1271,6 +1287,7 @@ is($hsp->hit->frame,0); is($hsp->query->strand,-1); is($hsp->hit->strand,0); +is($hsp->n, 1); # this is tblastx bl2seq (self against self) $searchio = Bio::SearchIO->new(-format => 'blast', @@ -1301,6 +1318,7 @@ is($hsp->hit->frame,0); is($hsp->query->strand,1); is($hsp->hit->strand,1); +is($hsp->n, 1); # this is NCBI tblastn $searchio = Bio::SearchIO->new(-format => 'blast', @@ -1376,7 +1394,7 @@ is($hsp->query->strand, 1); is($hsp->hit->strand, 1); is($hsp->hsp_group, '1'); - +is($hsp->n, 1); ## Web blast result parsing $searchio = Bio::SearchIO->new(-format => 'blast', @@ -1389,6 +1407,7 @@ ok($hsp = $hit->next_hsp); is($hsp->query->start, 1, 'query start'); is($hsp->query->end, 528, 'query start'); +is($hsp->n, 1); # tests for new BLAST 2.2.13 output $searchio = Bio::SearchIO->new(-format => 'blast', @@ -1454,6 +1473,7 @@ 
is(sprintf("%.4f",$hsp->frac_identical('query')), 0.8522); is(sprintf("%.4f",$hsp->frac_identical('hit')), 0.8522); is($hsp->gaps, 0); + is($hsp->n, 1); $hsps_left--; } is($hsps_left, 0); @@ -1599,7 +1619,7 @@ $total_n += grep{$_->n} $subject->hsps; } } -is($total_n, 10); +is($total_n, 80); # n = at least 1, so this was changed to reflect that sub cmp_evalue ($$) { my ($tval, $aval) = @_; From bugzilla-daemon at portal.open-bio.org Wed Apr 7 00:09:33 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 7 Apr 2010 00:09:33 -0400 Subject: [Bioperl-guts-l] [Bug 2399] BlastHSP::n gives empty values In-Reply-To: Message-ID: <201004070409.o3749XrL026131@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2399 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|REOPENED |RESOLVED Resolution| |FIXED ------- Comment #7 from cjfields at bioperl.org 2010-04-07 00:09 EST ------- This is now implemented for GenericHSP, in r16946. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
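[Editorial sketch] The effect of r16946 on the count at the end of blast.t can be illustrated with a toy class. This is not Bio::Search::HSP::GenericHSP itself: ToyHSP, its constructor, and n_old are hypothetical names introduced only for illustration. Only the body of n() follows the committed diff. The point is that once n() falls back to 1 instead of '', every HSP returns a true value, so a grep-based count includes all of them.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an HSP object; not part of BioPerl.
package ToyHSP;

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    $self->{'N'} = $args{n} if defined $args{n};
    return $self;
}

# Same shape as the patched accessor: fall back to 1 when N was never set.
sub n {
    my $self = shift;
    if (@_) { $self->{'N'} = shift; }
    return defined $self->{'N'} ? $self->{'N'} : 1;
}

# The pre-r16946 fallback (empty string), kept here only for comparison.
sub n_old {
    my $self = shift;
    return defined $self->{'N'} ? $self->{'N'} : '';
}

package main;

# One HSP with an explicit N, two without.
my @hsps = (ToyHSP->new(n => 2), ToyHSP->new(), ToyHSP->new());

# grep in scalar context counts the elements for which the block is true.
my $count_old = grep { $_->n_old } @hsps;   # only HSPs with an explicit N
my $count_new = grep { $_->n } @hsps;       # every HSP, because 1 is true

print "old: $count_old, new: $count_new\n";
```

With the old fallback only the HSP that carries an explicit N is counted. With the new one all three are, which is the same mechanism behind the expected total in blast.t moving from 10 to 80.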
From bugzilla-daemon at portal.open-bio.org Wed Apr 7 00:18:52 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 7 Apr 2010 00:18:52 -0400
Subject: [Bioperl-guts-l] [Bug 2955] Bio::DB::Refseq
In-Reply-To:
Message-ID: <201004070418.o374Iqgo026503@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2955

cjfields at bioperl.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WORKSFORME

------- Comment #2 from cjfields at bioperl.org 2010-04-07 00:18 EST -------
ankit,

The following code works for me:

use 5.010;    # enables say (Perl 5.10 or later)
use Bio::DB::RefSeq;
my $id = 'NM_003955';
my $factory = Bio::DB::RefSeq->new();
my $seq = $factory->get_Seq_by_acc($id);
say $seq->id;

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org Wed Apr 7 00:21:12 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 7 Apr 2010 00:21:12 -0400
Subject: [Bioperl-guts-l] [Bug 2956] ranges in EMBL output
In-Reply-To:
Message-ID: <201004070421.o374LCoB026659@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2956

cjfields at bioperl.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

------- Comment #3 from cjfields at bioperl.org 2010-04-07 00:21 EST -------
Pascal, I'm ruling this one as invalid. Feel free to reopen the report with
some example code if you feel otherwise.

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Wed Apr 7 00:25:41 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 7 Apr 2010 00:25:41 -0400
Subject: [Bioperl-guts-l] [Bug 2960] HOWTO:EUtilities_Cookbook
In-Reply-To:
Message-ID: <201004070425.o374Pf7I026940@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2960

cjfields at bioperl.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WORKSFORME

------- Comment #1 from cjfields at bioperl.org 2010-04-07 00:25 EST -------
Stephane, this works for me (both the example code in the cookbook and the
one-liner). Make sure you are using the latest release and do not have a
conflicting module present.

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From cjfields at dev.open-bio.org Wed Apr 7 01:02:34 2010
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Wed, 7 Apr 2010 01:02:34 -0400
Subject: [Bioperl-guts-l] [16947] bioperl-live/trunk/Build.PL: [bug 2975]
Message-ID: <201004070502.o3752Yek027428@dev.open-bio.org>

Revision: 16947
Author:   cjfields
Date:     2010-04-07 01:02:34 -0400 (Wed, 07 Apr 2010)

Log Message:
-----------
[bug 2975]
Add really nasty obvious warning to Build.PL due to lingering issues with XML::SAX::RTF masquerading as an XML::SAX parser

Modified Paths:
--------------
    bioperl-live/trunk/Build.PL

Modified: bioperl-live/trunk/Build.PL
===================================================================
--- bioperl-live/trunk/Build.PL 2010-04-07 04:00:39 UTC (rev 16946)
+++ bioperl-live/trunk/Build.PL 2010-04-07 05:02:34 UTC (rev 16947)
@@ -13,6 +13,40 @@
 use lib '.';
 use Bio::Root::Build;
+# XML::SAX::RTF doesn't work with BioPerl, at all, nada, zilch.
+# +# Since we're running into this now on CPAN Testers, catch it up front and +# deal with it. +# +# See: https://rt.cpan.org/Ticket/Display.html?id=5943 +# http://bugzilla.open-bio.org/show_bug.cgi?id=2975 + +{ + +eval { + use XML::SAX; + 1; +}; + +unless ($@) { + if (grep {$_->{Name} =~ 'XML::SAX::RTF'} @{XML::SAX->parsers()}) { + warn < Message-ID: <201004070504.o37548xg028479@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2975 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #4 from cjfields at bioperl.org 2010-04-07 01:04 EST ------- (In reply to comment #3) > Very good point! Installing XML::SAX::ExpatXS before building BioPerl > completely solved the problem. Many thanks for your help! The problem in some cases appears to be with XML::SAX::RTF registering itself as an XML::SAX parser (even though it really isn't compliant). I added a warning to check for this to Build.PL, in r16947. Closing out. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Wed Apr 7 18:04:01 2010 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Wed, 7 Apr 2010 18:04:01 -0400 Subject: [Bioperl-guts-l] [16948] bioperl-live/trunk: Add a -string parameter, allows one to pass a string in to be converted to a filehandle. Message-ID: <201004072204.o37M41de002433@dev.open-bio.org> Revision: 16948 Author: cjfields Date: 2010-04-07 18:04:00 -0400 (Wed, 07 Apr 2010) Log Message: ----------- Add a -string parameter, allows one to pass a string in to be converted to a filehandle. Tests added to RootIO.t. 
Modified Paths: -------------- bioperl-live/trunk/Bio/Root/IO.pm bioperl-live/trunk/t/Root/RootIO.t Modified: bioperl-live/trunk/Bio/Root/IO.pm =================================================================== --- bioperl-live/trunk/Bio/Root/IO.pm 2010-04-07 05:02:34 UTC (rev 16947) +++ bioperl-live/trunk/Bio/Root/IO.pm 2010-04-07 22:04:00 UTC (rev 16948) @@ -252,6 +252,7 @@ Currently recognizes the following named parameters: -file name of file to open + -string a string that is to be converted to a filehandle -url name of URL to open -input name of file, or GLOB, or IO::Handle object -fh file handle (mutually exclusive with -file) @@ -276,9 +277,9 @@ $self->_register_for_cleanup(\&_io_cleanup); - my ($input, $noclose, $file, $fh, $flush, $url, + my ($input, $noclose, $file, $fh, $string, $flush, $url, $retries, $ua_parms) = - $self->_rearrange([qw(INPUT NOCLOSE FILE FH FLUSH URL RETRIES UA_PARMS)], + $self->_rearrange([qw(INPUT NOCLOSE FILE FH STRING FLUSH URL RETRIES UA_PARMS)], @args); if($url){ @@ -332,19 +333,24 @@ "not string and not GLOB"); } } + if(defined($file) && defined($fh)) { $self->throw("Providing both a file and a filehandle for reading - ". "only one please!"); } + if ($string) { + if(defined($file) || defined($fh)) { + $self->throw("File or filehandle provided with -string,". + " please unset if you are using -string as a file"); + } + open($fh, "<", \$string) + } + if(defined($file) && ($file ne '')) { $fh = Symbol::gensym(); open ($fh,$file) || $self->throw("Could not open $file: $!"); $self->file($file); - if ($HAS_EOL) { - $self->_load_module('PerlIO::eol'); - binmode $fh, ':raw:eol(LF-Native)'; - } } if (defined $fh) { @@ -359,6 +365,9 @@ $self->throw("file handle $fh doesn't appear to be a handle"); } } + if ($HAS_EOL) { + binmode $fh, ':raw:eol(LF-Native)'; + } $self->_fh($fh) if $fh; # if not provided, defaults to STDIN and STDOUT $self->_flush_on_write(defined $flush ? 
$flush : 1); Modified: bioperl-live/trunk/t/Root/RootIO.t =================================================================== --- bioperl-live/trunk/t/Root/RootIO.t 2010-04-07 05:02:34 UTC (rev 16947) +++ bioperl-live/trunk/t/Root/RootIO.t 2010-04-07 22:04:00 UTC (rev 16948) @@ -8,7 +8,7 @@ use lib '.'; use Bio::Root::Test; - test_begin(-tests => 48); + test_begin(-tests => 53); use_ok('Bio::Root::IO'); } @@ -173,3 +173,22 @@ $Bio::Root::IO::HAS_LWP = 0; lives_ok {$rio = Bio::Root::IO->new(-url=>$TESTURL)}; } + +############################################## +# test -string +############################################## + +my $teststring = "Foo\nBar\nBaz"; +ok $rio = Bio::Root::IO->new(-string =>$teststring), 'default -string method'; + +$line1 = $rio->_readline; +is($line1, "Foo\n"); + +$line2 = $rio->_readline; +is($line2, "Bar\n"); +$rio->_pushback($line2); + +$line3 = $rio->_readline; +is($line3, "Bar\n"); +$line3 = $rio->_readline; +is($line3, "Baz"); From dave_messina at dev.open-bio.org Thu Apr 8 09:54:56 2010 From: dave_messina at dev.open-bio.org (Dave Messina) Date: Thu, 8 Apr 2010 09:54:56 -0400 Subject: [Bioperl-guts-l] [16949] bioperl-run/trunk/lib/Bio/Tools/Run/Phylo/PAML/Evolver.pm: cwd call not used (and can produce error), so I'm removing it. Message-ID: <201004081354.o38Dsuo6009400@dev.open-bio.org> Revision: 16949 Author: dave_messina Date: 2010-04-08 09:54:56 -0400 (Thu, 08 Apr 2010) Log Message: ----------- cwd call not used (and can produce error), so I'm removing it. 
Modified Paths:
--------------
    bioperl-run/trunk/lib/Bio/Tools/Run/Phylo/PAML/Evolver.pm

Modified: bioperl-run/trunk/lib/Bio/Tools/Run/Phylo/PAML/Evolver.pm
===================================================================
--- bioperl-run/trunk/lib/Bio/Tools/Run/Phylo/PAML/Evolver.pm 2010-04-07 22:04:00 UTC (rev 16948)
+++ bioperl-run/trunk/lib/Bio/Tools/Run/Phylo/PAML/Evolver.pm 2010-04-08 13:54:56 UTC (rev 16949)
@@ -411,7 +411,6 @@
     # FIXME: We should look for the stuff we prepared in the prepare method here
     my $rc = (1);
     {
-        my $cwd = cwd();
         my $exit_status;
         my ($tmpdir) = $self->tempdir();
         chdir($tmpdir);
@@ -432,8 +431,6 @@
             my $aln = $in->next_aln();
             $self->alignment($aln);
         }
-        #chdir($cwd);
-        ####
     }
     return $rc;
 }

From bugzilla-daemon at portal.open-bio.org Thu Apr 8 17:11:15 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 8 Apr 2010 17:11:15 -0400
Subject: [Bioperl-guts-l] [Bug 3049] New: incorrect formatting of LOCUS line in genbank output
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=3049

           Summary: incorrect formatting of LOCUS line in genbank output
           Product: BioPerl
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: minor
          Priority: P2
         Component: Bio::SeqIO
        AssignedTo: bioperl-guts-l at bioperl.org
        ReportedBy: wdavis at biology.utah.edu

I have a parser (non-Perl) that is having some trouble with "genbank" formatted
files. The troublesome files are from another source that uses bioperl to
write their files with Bio::SeqIO::genbank

Trouble is that the molecule type (in the LOCUS line) they are writing is free
text, as allowed by the Bioperl Bio::SeqIO::genbank module:

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Thu Apr 8 17:17:22 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 8 Apr 2010 17:17:22 -0400
Subject: [Bioperl-guts-l] [Bug 3050] New: incorrect formatting of LOCUS line in genbank output
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=3050

           Summary: incorrect formatting of LOCUS line in genbank output
           Product: BioPerl
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: minor
          Priority: P2
         Component: Bio::SeqIO
        AssignedTo: bioperl-guts-l at bioperl.org
        ReportedBy: wdavis at biology.utah.edu

I have a parser (non-Perl) that is having some trouble with "genbank" formatted
files. The troublesome files are from another source that uses bioperl to
write their files with Bio::SeqIO::genbank

Trouble is that the molecule type (in the LOCUS line) they are writing is free
text, as allowed by the Bioperl Bio::SeqIO::genbank module:

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org Thu Apr 8 17:29:50 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 8 Apr 2010 17:29:50 -0400
Subject: [Bioperl-guts-l] [Bug 3051] New: incorrect formatting of LOCUS line in genbank output
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=3051

           Summary: incorrect formatting of LOCUS line in genbank output
           Product: BioPerl
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: minor
          Priority: P2
         Component: Bio::SeqIO
        AssignedTo: bioperl-guts-l at bioperl.org
        ReportedBy: wdavis at biology.utah.edu

I have a parser (non-Perl) that is having some trouble with "genbank" formatted
files. The troublesome files are from another source that uses bioperl to
write their files with Bio::SeqIO::genbank

Trouble is that the molecule type (in the LOCUS line) they are writing is free
text, as allowed by the Bioperl Bio::SeqIO::genbank module:

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org Thu Apr 8 17:39:22 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 8 Apr 2010 17:39:22 -0400
Subject: [Bioperl-guts-l] [Bug 3052] New: incorrect formatting of LOCUS line in genbank output
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=3052

           Summary: incorrect formatting of LOCUS line in genbank output
           Product: BioPerl
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Bio::SeqIO
        AssignedTo: bioperl-guts-l at bioperl.org
        ReportedBy: wdavis at biology.utah.edu

I have a parser (non-Perl) that is having some trouble with "genbank" formatted
files. The troublesome files are from another source that uses bioperl to
write their files with Bio::SeqIO::genbank

Trouble is that the molecule type (in the LOCUS line) they are writing is free
text, as allowed by the Bioperl Bio::SeqIO::genbank module:

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Thu Apr 8 17:42:41 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 8 Apr 2010 17:42:41 -0400
Subject: [Bioperl-guts-l] [Bug 3053] New: incorrect formatting of LOCUS line in genbank output
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=3053

           Summary: incorrect formatting of LOCUS line in genbank output
           Product: BioPerl
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Bio::SeqIO
        AssignedTo: bioperl-guts-l at bioperl.org
        ReportedBy: wdavis at biology.utah.edu

I have a parser (non-Perl) that is having some trouble with "genbank" formatted
files. The troublesome files are from another source that uses bioperl to
write their files with Bio::SeqIO::genbank

Trouble is that the molecule type (in the LOCUS line) they are writing is free
text, as allowed by the Bioperl Bio::SeqIO::genbank module.

However, the genbank file definition at
ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt
section 3.4.4 specifies the format for the LOCUS line. In the table of column
positions they specify a limited vocabulary of fixed width:

"48-53 NA, DNA, RNA, tRNA (transfer RNA), rRNA (ribosomal RNA),
mRNA (messenger RNA), uRNA (small nuclear RNA), snRNA, snoRNA.
Left justified."

To me this strongly suggests that the genbank file format requires a fixed
vocabulary for molecule type.

If $mol is not in the fixed list of genbank molecule types it should be set to
the default value of 'DNA', or some other smarter way of forcing the molecule
type into the fixed vocabulary would be a help.

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
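[Editorial sketch] One way the suggestion in this report could look, offered as an illustration only. normalize_mol_type and the lookup tables are hypothetical names, not code from Bio::SeqIO::genbank. The vocabulary is the one the report quotes from gbrel.txt section 3.4.4, and the fallback to 'DNA' is the reporter's proposal rather than established BioPerl behavior. Matching case-insensitively and trimming whitespace are further assumptions.

```perl
use strict;
use warnings;

# Molecule types as quoted in the report from gbrel.txt section 3.4.4
# (LOCUS line, columns 48-53).
my @GENBANK_MOL = qw(NA DNA RNA tRNA rRNA mRNA uRNA snRNA snoRNA);

# Lookup keyed on lowercase so 'dna' or 'MRNA' map to the canonical spelling.
my %CANONICAL = map { lc($_) => $_ } @GENBANK_MOL;

# Hypothetical helper: returns a 6-column, left-justified molecule type,
# falling back to 'DNA' for anything outside the fixed vocabulary.
sub normalize_mol_type {
    my ($mol) = @_;
    my $key = defined $mol ? lc($mol) : '';
    $key =~ s/^\s+|\s+$//g;    # ignore surrounding whitespace
    my $canonical = exists $CANONICAL{$key} ? $CANONICAL{$key} : 'DNA';
    return sprintf('%-6s', $canonical);
}

# Show a few inputs, with brackets making the padding visible.
for my $mol ('mRNA', 'dna', 'genomic DNA', undef) {
    my $shown = defined $mol ? $mol : 'undef';
    printf "%-12s => [%s]\n", $shown, normalize_mol_type($mol);
}
```

Free text such as 'genomic DNA' collapses to the default here, which loses information. A real fix might instead try to recognize a known type inside the free text before falling back.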
From bugzilla-daemon at portal.open-bio.org Thu Apr 8 17:43:32 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 8 Apr 2010 17:43:32 -0400 Subject: [Bioperl-guts-l] [Bug 3049] incorrect formatting of LOCUS line in genbank output In-Reply-To: Message-ID: <201004082143.o38LhWJG011023@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3049 wdavis at biology.utah.edu changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |DUPLICATE ------- Comment #1 from wdavis at biology.utah.edu 2010-04-08 17:43 EST ------- *** This bug has been marked as a duplicate of bug 3053 *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Apr 8 17:43:32 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 8 Apr 2010 17:43:32 -0400 Subject: [Bioperl-guts-l] [Bug 3053] incorrect formatting of LOCUS line in genbank output In-Reply-To: Message-ID: <201004082143.o38LhW8v011030@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3053 ------- Comment #1 from wdavis at biology.utah.edu 2010-04-08 17:43 EST ------- *** Bug 3049 has been marked as a duplicate of this bug. *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Thu Apr 8 17:43:43 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 8 Apr 2010 17:43:43 -0400 Subject: [Bioperl-guts-l] [Bug 3050] incorrect formatting of LOCUS line in genbank output In-Reply-To: Message-ID: <201004082143.o38LhhJw011056@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3050 wdavis at biology.utah.edu changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |DUPLICATE ------- Comment #1 from wdavis at biology.utah.edu 2010-04-08 17:43 EST ------- *** This bug has been marked as a duplicate of bug 3053 *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Apr 8 17:43:44 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 8 Apr 2010 17:43:44 -0400 Subject: [Bioperl-guts-l] [Bug 3053] incorrect formatting of LOCUS line in genbank output In-Reply-To: Message-ID: <201004082143.o38LhiMY011062@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3053 ------- Comment #2 from wdavis at biology.utah.edu 2010-04-08 17:43 EST ------- *** Bug 3050 has been marked as a duplicate of this bug. *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Thu Apr 8 17:43:53 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 8 Apr 2010 17:43:53 -0400 Subject: [Bioperl-guts-l] [Bug 3051] incorrect formatting of LOCUS line in genbank output In-Reply-To: Message-ID: <201004082143.o38Lhr3Y011085@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3051 wdavis at biology.utah.edu changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |DUPLICATE ------- Comment #1 from wdavis at biology.utah.edu 2010-04-08 17:43 EST ------- *** This bug has been marked as a duplicate of bug 3053 *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Apr 8 17:43:53 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 8 Apr 2010 17:43:53 -0400 Subject: [Bioperl-guts-l] [Bug 3053] incorrect formatting of LOCUS line in genbank output In-Reply-To: Message-ID: <201004082143.o38Lhr8D011091@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3053 ------- Comment #3 from wdavis at biology.utah.edu 2010-04-08 17:43 EST ------- *** Bug 3051 has been marked as a duplicate of this bug. *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Thu Apr 8 17:44:14 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 8 Apr 2010 17:44:14 -0400 Subject: [Bioperl-guts-l] [Bug 3052] incorrect formatting of LOCUS line in genbank output In-Reply-To: Message-ID: <201004082144.o38LiEd3011139@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3052 wdavis at biology.utah.edu changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |DUPLICATE ------- Comment #1 from wdavis at biology.utah.edu 2010-04-08 17:44 EST ------- *** This bug has been marked as a duplicate of bug 3053 *** From bugzilla-daemon at portal.open-bio.org Thu Apr 8 17:44:15 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 8 Apr 2010 17:44:15 -0400 Subject: [Bioperl-guts-l] [Bug 3053] incorrect formatting of LOCUS line in genbank output In-Reply-To: Message-ID: <201004082144.o38LiF1N011146@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3053 ------- Comment #4 from wdavis at biology.utah.edu 2010-04-08 17:44 EST ------- *** Bug 3052 has been marked as a duplicate of this bug. *** From David.Messina at sbc.su.se Fri Apr 9 18:25:46 2010 From: David.Messina at sbc.su.se (Dave Messina) Date: Sat, 10 Apr 2010 00:25:46 +0200 Subject: [Bioperl-guts-l] preferred route of OS X installation? 
Message-ID: <5D85E895-2CE4-4D42-B0C4-E027ED7BB0A2@sbc.su.se> Hi everybody, I just noticed that the link for Mac OS X Installation on the front page of the website is: http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink Is fink installation indeed what would be recommended for OS X these days? I ask because I last tried fink a few years ago and could never get it to work properly, so my inclination is to point people to http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix instead. Dave From cjfields at illinois.edu Fri Apr 9 20:59:56 2010 From: cjfields at illinois.edu (Chris Fields) Date: Fri, 9 Apr 2010 19:59:56 -0500 Subject: [Bioperl-guts-l] preferred route of OS X installation? In-Reply-To: <5D85E895-2CE4-4D42-B0C4-E027ED7BB0A2@sbc.su.se> References: <5D85E895-2CE4-4D42-B0C4-E027ED7BB0A2@sbc.su.se> Message-ID: Yes, please change that. fink is woefully out-of-date I believe. chris On Apr 9, 2010, at 5:25 PM, Dave Messina wrote: > Hi everybody, > > I just noticed that the link for Mac OS X Installation on the front page of the website is: > > http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink > > > Is fink installation indeed what would be recommended for OS X these days? > > I ask because I last tried fink a few years ago and could never get it to work properly, so my inclination is to point people to > > http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix > > instead. > > > Dave > > > _______________________________________________ > Bioperl-guts-l mailing list > Bioperl-guts-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-guts-l From David.Messina at sbc.su.se Sat Apr 10 03:17:27 2010 From: David.Messina at sbc.su.se (Dave Messina) Date: Sat, 10 Apr 2010 09:17:27 +0200 Subject: [Bioperl-guts-l] preferred route of OS X installation? 
In-Reply-To: References: <5D85E895-2CE4-4D42-B0C4-E027ED7BB0A2@sbc.su.se> Message-ID: <9469523F-33FB-4A5E-8AD5-559D4CCF8C37@sbc.su.se> I don't have edit permissions on the front page, actually. :) Dave On Apr 10, 2010, at 2:59, Chris Fields wrote: > Yes, please change that. fink is woefully out-of-date I believe. > > chris > > On Apr 9, 2010, at 5:25 PM, Dave Messina wrote: > >> Hi everybody, >> >> I just noticed that the link for Mac OS X Installation on the front >> page of the website is: >> >> http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink >> >> >> Is fink installation indeed what would be recommended for OS X >> these days? >> >> I ask because I last tried fink a few years ago and could never get >> it to work properly, so my inclination is to point people to >> >> http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix >> >> instead. >> >> >> Dave >> >> >> _______________________________________________ >> Bioperl-guts-l mailing list >> Bioperl-guts-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-guts-l > From lstein at dev.open-bio.org Mon Apr 12 09:59:12 2010 From: lstein at dev.open-bio.org (Lincoln Stein) Date: Mon, 12 Apr 2010 09:59:12 -0400 Subject: [Bioperl-guts-l] [16950] bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm: fixed performance of sequence fetching; improvement of roughly 100x Message-ID: <201004121359.o3CDxCJ6012576@dev.open-bio.org> Revision: 16950 Author: lstein Date: 2010-04-12 09:59:12 -0400 (Mon, 12 Apr 2010) Log Message: ----------- fixed performance of sequence fetching; improvement of roughly 100x Modified Paths: -------------- bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm 2010-04-08 13:54:56 UTC (rev 16949) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Store/DBI/mysql.pm 2010-04-12 
13:59:12 UTC (rev 16950) @@ -658,23 +658,24 @@ $start-- if defined $start; $end-- if defined $end; - my $offset1 = $self->_offset_boundary($seqid,$start || 'left'); - my $offset2 = $self->_offset_boundary($seqid,$end || 'right'); + my $id = $self->_locationid($seqid); + my $offset1 = $self->_offset_boundary($id,$start || 'left'); + my $offset2 = $self->_offset_boundary($id,$end || 'right'); my $sequence_table = $self->_sequence_table; - my $locationlist_table = $self->_locationlist_table; - my $sth = $self->_prepare(<= ? - AND offset <= ? - ORDER BY offset + FROM $sequence_table as s + WHERE s.id=? + AND s.offset >= ? + AND s.offset <= ? + ORDER BY s.offset END + my $sth = $self->_prepare($sql); my $seq = ''; - $sth->execute($seqid,$offset1,$offset2) or $self->throw($sth->errstr); + $self->_print_query($sql,$id,$offset1,$offset2) if DEBUG || $self->debug; + $sth->execute($id,$offset1,$offset2) or $self->throw($sth->errstr); while (my($frag,$offset) = $sth->fetchrow_array) { substr($frag,0,$start-$offset) = '' if defined $start && $start > $offset; @@ -697,11 +698,12 @@ my $locationlist_table = $self->_locationlist_table; my $sql; - $sql = $position eq 'left' ? "SELECT min(offset) FROM $sequence_table as s,$locationlist_table as ll WHERE s.id=ll.id AND ll.seqname=?" - :$position eq 'right' ? "SELECT max(offset) FROM $sequence_table as s,$locationlist_table as ll WHERE s.id=ll.id AND ll.seqname=?" - :"SELECT max(offset) FROM $sequence_table as s,$locationlist_table as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?"; + $sql = $position eq 'left' ? "SELECT min(offset) FROM $sequence_table as s WHERE s.id=?" + :$position eq 'right' ? "SELECT max(offset) FROM $sequence_table as s WHERE s.id=?" + :"SELECT max(offset) FROM $sequence_table as s WHERE s.id=? AND offset<=?"; my $sth = $self->_prepare($sql); my @args = $position =~ /^-?\d+$/ ? 
($seqid,$position) : ($seqid); + $self->_print_query($sql,@args) if DEBUG || $self->debug; $sth->execute(@args) or $self->throw($sth->errstr); my $boundary = $sth->fetchall_arrayref->[0][0]; $sth->finish; From bugzilla-daemon at portal.open-bio.org Mon Apr 12 12:38:28 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 12 Apr 2010 12:38:28 -0400 Subject: [Bioperl-guts-l] [Bug 3055] New: GFF3Loader with allow_whitespace(1) fails to load valid GFF3 with spaces in 9th column Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=3055 Summary: GFF3Loader with allow_whitespace(1) fails to load valid GFF3 with spaces in 9th column Product: BioPerl Version: main-trunk Platform: All OS/Version: All Status: NEW Severity: normal Priority: P2 Component: Core Components AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: nathan.weeks at ars.usda.gov When invoked with "allow_whitespace(1)", the Bio::DB::SeqFeature::Store::GFF3Loader in bioperl-1.6.1 doesn't handle whitespace in the 9th column. The following patch seems to fix the issue for our GFF: ################################################################## --- ./Bio/DB/SeqFeature/Store/GFF3Loader.pm.orig 2010-04-12 11:07:34.000000000 -0500 +++ ./Bio/DB/SeqFeature/Store/GFF3Loader.pm 2010-04-12 11:33:13.000000000 -0500 @@ -497,7 +497,6 @@ my @columns = map {$_ eq '.' ? undef : $_ } split /\t/,$gff_line; $self->invalid_gff($gff_line) if @columns < 4; - $self->invalid_gff($gff_line) if @columns > 9 && $allow_whitespace; { local $^W = 0; ################################################################## To save space/time, I've neglected to mention the "symptoms" -- I can provide this information if requested -- but this issue was found by uploading a GFF containing features with the "Target=" attribute into GBrowse 1.70, which sets allow_whitespace(1) for all uploaded GFFs. 
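The failure mode described in bug 3055 can be illustrated with a small standalone sketch. It does not use GFF3Loader: the helper name split_gff3_line and the sample line are made up for illustration, and it assumes that "allowing whitespace" means treating runs of whitespace as column separators. Capping the split at nine fields keeps spaces inside the attributes column from producing extra columns.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split a GFF3 line into at most
# nine columns. When $allow_whitespace is true, any run of whitespace is
# treated as a column separator, but the limit of 9 keeps spaces inside the
# attributes column (column 9) intact.
sub split_gff3_line {
    my ($line, $allow_whitespace) = @_;
    chomp $line;
    my $sep = $allow_whitespace ? qr/\s+/ : qr/\t/;
    my @columns = split $sep, $line, 9;
    # Same convention as the loader code quoted in the report: '.' becomes undef
    return map { $_ eq '.' ? undef : $_ } @columns;
}

# Made-up example line with a space-containing Target attribute
my $line = "chr1 blat match 100 200 . + . ID=m1;Target=EST23 1 100 +\n";
my @cols = split_gff3_line($line, 1);
print scalar(@cols), " columns; attributes: $cols[8]\n";
```

The design choice here is to rely on the LIMIT argument of split rather than rejecting lines with more than nine whitespace-separated tokens; whether that matches the loader's intent is an open question for the maintainers.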
From bugzilla-daemon at portal.open-bio.org Mon Apr 12 16:25:10 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 12 Apr 2010 16:25:10 -0400 Subject: [Bioperl-guts-l] [Bug 3056] New: Bio::Tools::Run::Primer3 has not updated for Primer3 versions 2 and above Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=3056 Summary: Bio::Tools::Run::Primer3 has not updated for Primer3 versions 2 and above Product: BioPerl Version: main-trunk Platform: All OS/Version: Windows XP Status: NEW Severity: normal Priority: P2 Component: bioperl-run AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: manchunjohn-ma at uiowa.edu Bio::Tools::Run::Primer3 relies on the list of Primer3 1.X arguments hardcoded at $self->@PRIMER3_PARAMS for the sanity check in add_targets. However, Primer3 versions 2.X, released from December 2008 onward, use a totally different set of arguments, many of them absent from @PRIMER3_PARAMS. As a result, if these new arguments are passed to add_targets, they are simply ignored, per line 367 of primer3.pm. Of course, experienced users can set $self->{'no_param_checks'}=1 to bypass the sanity check, but this is not something known to newcomers. There are three possible ways to solve the problem: 1. Expand @PRIMER3_PARAMS to include the 2.X arguments; 2. Remove line 367 of primer3.pm so that "invalid" arguments trigger a warning but are still processed; or 3. Add a setter for no_param_checks.
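Options 2 and 3 from the report above can be sketched roughly as follows. This is a standalone illustration, not the actual Bio::Tools::Run::Primer3 code: the package name, the two-entry parameter list, and the behaviour of add_targets here are assumptions; only the no_param_checks hash key comes from the report.

```perl
use strict;
use warnings;

package Primer3ParamSketch;    # hypothetical package, for illustration only

# Tiny stand-in for the hardcoded list of known Primer3 1.x arguments
my %KNOWN = map { $_ => 1 } qw(PRIMER_SEQUENCE_ID SEQUENCE);

sub new { return bless { no_param_checks => 0, targets => {} }, shift }

# Option 3: a getter/setter so callers need not touch the hash directly
sub no_param_checks {
    my ($self, $value) = @_;
    $self->{no_param_checks} = $value ? 1 : 0 if defined $value;
    return $self->{no_param_checks};
}

# Option 2: warn about unrecognized arguments but keep them anyway.
# Returns the number of arguments stored so far.
sub add_targets {
    my ($self, %args) = @_;
    for my $key (sort keys %args) {
        if (!$self->no_param_checks && !$KNOWN{$key}) {
            warn "Parameter $key is not a recognized Primer3 1.x argument\n";
        }
        $self->{targets}{$key} = $args{$key};
    }
    return scalar keys %{ $self->{targets} };
}

package main;

my $p3 = Primer3ParamSketch->new;
$p3->no_param_checks(1);    # skip the sanity check entirely
$p3->add_targets(SEQUENCE_TEMPLATE => 'ACGTACGTACGT');
print join(',', sort keys %{ $p3->{targets} }), "\n";
```

With the check left enabled, the same call would emit a warning for SEQUENCE_TEMPLATE and still store it, which is the behaviour option 2 asks for.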
From bugzilla-daemon at portal.open-bio.org Mon Apr 12 16:42:50 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 12 Apr 2010 16:42:50 -0400 Subject: [Bioperl-guts-l] [Bug 3056] Bio::Tools::Run::Primer3 has not updated for Primer3 versions 2 and above In-Reply-To: Message-ID: <201004122042.o3CKgoLO009203@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3056 ------- Comment #1 from cjfields at bioperl.org 2010-04-12 16:42 EST ------- Won't fix. We have an experimental Primer3 reimplementation in bioperl-dev that you are welcome to try, called Bio::Tools::Primer3Redux (it has a different API, hence the name). You can check out the latest code using svn. I'll leave this open in the meantime, as it still needs tests. From bugzilla-daemon at portal.open-bio.org Wed Apr 14 09:55:57 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 14 Apr 2010 09:55:57 -0400 Subject: [Bioperl-guts-l] [Bug 3058] New: Bio::SearchIO is unable to parse fasta35 output Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=3058 Summary: Bio::SearchIO is unable to parse fasta35 output Product: BioPerl Version: 1.6 branch Platform: Other OS/Version: Linux Status: NEW Severity: normal Priority: P1 Component: Bio::Search/Bio::SearchIO AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: michael.watson at bbsrc.ac.uk I have two versions of Bioperl, 1.6.0 and 1.6.1. I have a very large fasta35 output file (10,000s of searches). 
When I try a simple parsing of this file using Bio::SearchIO, I get the following: 1.6.0 ------- Multiple warning messages of the kind: --------------------- WARNING --------------------- MSG: Unrecognized alignment line (1) ' /usr/local/fasta3/bin/fasta35 -n -U -Q -H -A -E 2.0 -C 19 -m 0 -m 9i test.fa ../other_mirs.fa -O test.fasta35' --------------------------------------------------- 1.6.1 ------- A single stack trace: ------------- EXCEPTION ------------- MSG: Unrecognized alignment line (1) ' /usr/local/fasta3/bin/fasta35 -n -U -Q -H -A -E 2.0 -C 19 -m 0 -m 9i test.fa ../other_mirs.fa -O test.fasta35' STACK Bio::SearchIO::fasta::next_result /usr/local/BioPerl-1.6.1/Bio/SearchIO/fasta.pm:1333 STACK toplevel test_6.1.pl:9 ------------------------------------- Further analysis seems to suggest that whenever the warning message is flashed up, that particular SearchIO object is not created. From bugzilla-daemon at portal.open-bio.org Wed Apr 14 10:00:35 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 14 Apr 2010 10:00:35 -0400 Subject: [Bioperl-guts-l] [Bug 3058] Bio::SearchIO is unable to parse fasta35 output In-Reply-To: Message-ID: <201004141400.o3EE0ZCq017853@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3058 ------- Comment #1 from michael.watson at bbsrc.ac.uk 2010-04-14 10:00 EST ------- Created an attachment (id=1479) --> (http://bugzilla.open-bio.org/attachment.cgi?id=1479&action=view) Test fasta35 output file
From bugzilla-daemon at portal.open-bio.org Wed Apr 14 10:01:24 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 14 Apr 2010 10:01:24 -0400 Subject: [Bioperl-guts-l] [Bug 3058] Bio::SearchIO is unable to parse fasta35 output In-Reply-To: Message-ID: <201004141401.o3EE1Ogn017956@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3058 ------- Comment #2 from michael.watson at bbsrc.ac.uk 2010-04-14 10:01 EST ------- Created an attachment (id=1480) --> (http://bugzilla.open-bio.org/attachment.cgi?id=1480&action=view) A test script for Bioperl 6.0 From bugzilla-daemon at portal.open-bio.org Wed Apr 14 10:01:46 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 14 Apr 2010 10:01:46 -0400 Subject: [Bioperl-guts-l] [Bug 3058] Bio::SearchIO is unable to parse fasta35 output In-Reply-To: Message-ID: <201004141401.o3EE1k58018000@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3058 ------- Comment #3 from michael.watson at bbsrc.ac.uk 2010-04-14 10:01 EST ------- Created an attachment (id=1481) --> (http://bugzilla.open-bio.org/attachment.cgi?id=1481&action=view) A test script for Bioperl 6.1
From bugzilla-daemon at portal.open-bio.org Thu Apr 15 00:08:47 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 15 Apr 2010 00:08:47 -0400 Subject: [Bioperl-guts-l] [Bug 3031] Unable to parse algorithm_reference from BLAST reports using Bio::SearchIO In-Reply-To: Message-ID: <201004150408.o3F48lRP016898@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3031 razi.khaja at gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |ASSIGNED From bugzilla-daemon at portal.open-bio.org Thu Apr 15 00:11:36 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 15 Apr 2010 00:11:36 -0400 Subject: [Bioperl-guts-l] [Bug 3031] Unable to parse algorithm_reference from BLAST reports using Bio::SearchIO In-Reply-To: Message-ID: <201004150411.o3F4Baqb017021@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3031 razi.khaja at gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |FIXED ------- Comment #3 from razi.khaja at gmail.com 2010-04-15 00:11 EST ------- Patches were submitted when this Bug/Enhancement was opened. Please patch and run tests.
From cjfields at dev.open-bio.org Thu Apr 15 00:21:18 2010 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Thu, 15 Apr 2010 00:21:18 -0400 Subject: [Bioperl-guts-l] [16951] bioperl-live/trunk: [bug 3031] Message-ID: <201004150421.o3F4LI3Z016242@dev.open-bio.org> Revision: 16951 Author: cjfields Date: 2010-04-15 00:21:17 -0400 (Thu, 15 Apr 2010) Log Message: ----------- [bug 3031] patches for catching algorithm ref, courtesy Razi Khaja. Modified Paths: -------------- bioperl-live/trunk/Bio/SearchIO/blast.pm bioperl-live/trunk/t/SearchIO/blast.t Modified: bioperl-live/trunk/Bio/SearchIO/blast.pm =================================================================== --- bioperl-live/trunk/Bio/SearchIO/blast.pm 2010-04-12 13:59:12 UTC (rev 16950) +++ bioperl-live/trunk/Bio/SearchIO/blast.pm 2010-04-15 04:21:17 UTC (rev 16951) @@ -209,6 +209,7 @@ 'BlastOutput_program' => 'RESULT-algorithm_name', 'BlastOutput_version' => 'RESULT-algorithm_version', + 'BlastOutput_algorithm-reference' => 'RESULT-algorithm_reference', 'BlastOutput_query-def' => 'RESULT-query_name', 'BlastOutput_query-len' => 'RESULT-query_length', 'BlastOutput_query-acc' => 'RESULT-query_accession', @@ -504,6 +505,26 @@ } ); } + # parse the BLAST algorithm reference + elsif(/^Reference:\s+(.*)$/) { + # want to preserve newlines for the BLAST algorithm reference + my $algorithm_reference = "$1\n"; + $_ = $self->_readline; + # while the current line, does not match an empty line, a RID:, or a Database:, we are still looking at the + # algorithm_reference, append it to what we parsed so far + while($_ !~ /^$/ && $_ !~ /^RID:/ && $_ !~ /^Database:/) { + $algorithm_reference .= "$_"; + $_ = $self->_readline; + } + # if we exited the while loop, we saw an empty line, a RID:, or a Database:, so push it back + $self->_pushback($_); + $self->element( + { + 'Name' => 'BlastOutput_algorithm-reference', + 'Data' => $algorithm_reference + } + ); + } # added Windows workaround for bug 1985 elsif 
(/^(Searching|Results from round)/) { next unless $1 =~ /Results from round/; Modified: bioperl-live/trunk/t/SearchIO/blast.t =================================================================== --- bioperl-live/trunk/t/SearchIO/blast.t 2010-04-12 13:59:12 UTC (rev 16950) +++ bioperl-live/trunk/t/SearchIO/blast.t 2010-04-15 04:21:17 UTC (rev 16951) @@ -7,7 +7,7 @@ use lib '.'; use Bio::Root::Test; - test_begin(-tests => 1116); + test_begin(-tests => 1142); use_ok('Bio::SearchIO'); } @@ -19,6 +19,12 @@ $result = $searchio->next_result; +is($result->algorithm_reference, 'Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, +Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), +"Gapped BLAST and PSI-BLAST: a new generation of protein database search +programs", Nucleic Acids Res. 25:3389-3402. +'); + is($result->database_name, 'ecoli.aa', 'database_name()'); is($result->database_entries, 4289); is($result->database_letters, 1358990); @@ -95,6 +101,9 @@ $result = $searchio->next_result; +is($result->algorithm_reference, 'Gish, W. (1996-2000) http://blast.wustl.edu +'); + is($result->database_name, 'ecoli.aa'); is($result->database_letters, 1358990); is($result->database_entries, 4289); @@ -205,7 +214,8 @@ '-file' => test_input_file('ecolitst.noseqs.wublastp')); $result = $searchio->next_result; - +is($result->algorithm_reference, 'Gish, W. (1996-2004) http://blast.wustl.edu +'); is($result->database_name, 'ecoli.aa'); is($result->database_letters, 1358990); is($result->database_entries, 4289); @@ -278,6 +288,11 @@ '-file' => test_input_file('HUMBETGLOA.tblastx')); $result = $searchio->next_result; +is($result->algorithm_reference, 'Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, +Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), +"Gapped BLAST and PSI-BLAST: a new generation of protein database search +programs", Nucleic Acids Res. 25:3389-3402. 
+'); is($result->database_name, 'ecoli.nt'); is($result->database_letters, 4662239); is($result->database_entries, 400); @@ -364,6 +379,11 @@ $result = $searchio->next_result; +is($result->algorithm_reference, 'Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, +Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), +"Gapped BLAST and PSI-BLAST: a new generation of protein database search +programs", Nucleic Acids Res. 25:3389-3402. +'); is($result->database_name, 'All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS,or phase 0, 1 or 2 HTGS sequences) '); is($result->database_letters, 4677375331); is($result->database_entries, 1083200); @@ -454,6 +474,10 @@ '-file' => test_input_file('dnaEbsub_ecoli.wublastx')); $result = $searchio->next_result; +is($result->algorithm_reference, 'Gish, W. (1996-2000) http://blast.wustl.edu +Gish, Warren and David J. States (1993). Identification of protein coding +regions by database similarity search. Nat. Genet. 3:266-72. +'); is($result->database_name, 'ecoli.aa'); is($result->database_letters, 1358990); is($result->database_entries, 4289); @@ -586,6 +610,8 @@ '-file' => test_input_file('dnaEbsub_ecoli.wutblastn')); $result = $searchio->next_result; +is($result->algorithm_reference, 'Gish, W. (1996-2000) http://blast.wustl.edu +'); is($result->database_name, 'ecoli.nt'); is($result->database_letters, 4662239); is($result->database_entries, 400); @@ -660,6 +686,8 @@ '-file' => test_input_file('dnaEbsub_ecoli.wutblastx')); $result = $searchio->next_result; +is($result->algorithm_reference, 'Gish, W. (1996-2000) http://blast.wustl.edu +'); is($result->database_name, 'ecoli.nt'); is($result->database_letters, 4662239); is($result->database_entries, 400); @@ -770,7 +798,8 @@ '-file' => test_input_file('echofilter.wublastn')); $result = $searchio->next_result; - +is($result->algorithm_reference, 'Gish, W. 
(1996-2006) http://blast.wustl.edu +'); is($result->database_name, 'NM_003201.fa'); is($result->database_letters, 1936); is($result->database_entries, 1); @@ -832,6 +861,11 @@ @expected = qw(CATH_RAT CATL_HUMAN CATL_RAT PAPA_CARPA); my $results_left = 4; while( my $result = $searchio->next_result ) { + is($result->algorithm_reference, 'Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, +Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), +"Gapped BLAST and PSI-BLAST: a new generation of protein database search +programs", Nucleic Acids Res. 25:3389-3402. +'); is($result->query_name, shift @expected, "Multiblast query test"); $results_left--; } @@ -842,7 +876,11 @@ $searchio = Bio::SearchIO->new('-format' => 'blast', '-file' => test_input_file('test.gcgblast')); $result = $searchio->next_result(); - +is($result->algorithm_reference, 'Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, +Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), +"Gapped BLAST and PSI-BLAST: a new generation of protein database search +programs", Nucleic Acids Res. 25:3389-3402. +'); is($result->query_name, '/v0/people/staji002/test.gcg'); is($result->algorithm, 'BLASTP'); is($result->algorithm_version, '2.2.1 [Apr-13-2001]'); @@ -882,6 +920,11 @@ $searchio = Bio::SearchIO->new(-format => 'blast', -file => test_input_file('testdbaccnums.out')); $result = $searchio->next_result; +is($result->algorithm_reference, 'Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, +Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), +"Gapped BLAST and PSI-BLAST: a new generation of protein database search +programs", Nucleic Acids Res. 25:3389-3402. 
+'); @valid = ( ['pir||T14789','T14789','T14789','CAB53709','AAH01726'], ['gb|NP_065733.1|CYT19', 'NP_065733','CYT19'], @@ -950,6 +993,10 @@ 6406, 6620, 1, 1691, 1905, 1] ); +is($r->algorithm_reference, 'Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller (2000), +"A greedy algorithm for aligning DNA sequences", +J Comput Biol 2000; 7(1-2):203-14. +'); is($r->algorithm, 'MEGABLAST'); is($r->query_name, '503384'); is($r->query_description, '11337 bp 2 contigs'); @@ -990,6 +1037,7 @@ -file => test_input_file('ecoli_domains.rpsblast')); $r = $parser->next_result; +is($r->algorithm_reference, undef); is($r->query_name, 'gi|1786183|gb|AAC73113.1|'); is($r->query_gi, 1786183); is($r->num_hits, 7); @@ -1011,7 +1059,11 @@ '-file' => test_input_file('psiblastreport.out')); $result = $searchio->next_result; - +is($result->algorithm_reference, 'Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, +Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), +"Gapped BLAST and PSI-BLAST: a new generation of protein database search +programs", Nucleic Acids Res. 25:3389-3402. 
+'); is($result->database_name, '/home/peter/blast/data/swissprot.pr'); is($result->database_entries, 88780); is($result->database_letters, 31984247); @@ -1159,6 +1211,7 @@ isa_ok($result,'Bio::Search::Result::ResultI'); is($result->query_name, ''); is($result->algorithm, 'BLASTP'); +is($result->algorithm_reference, undef); $hit = $result->next_hit; is($hit->name, 'ALEU_HORVU'); is($hit->length, 362); @@ -1181,6 +1234,7 @@ isa_ok($result,'Bio::Search::Result::ResultI'); is($result->query_name, ''); is($result->algorithm, 'BLASTN'); +is($result->algorithm_reference, undef); is($result->query_length, 180); $hit = $result->next_hit; is($hit->length, 179); @@ -1208,6 +1262,7 @@ is($result->query_name, ''); is($result->query_length, 180); is($result->algorithm, 'BLASTN'); +is($result->algorithm_reference, undef); $hit = $result->next_hit; is($hit->name, 'human'); is($hit->length, 179); @@ -1267,6 +1322,7 @@ is($result->query_name, 'AE000111.1'); is($result->query_description, 'Escherichia coli K-12 MG1655 section 1 of 400 of the complete genome'); is($result->algorithm, 'BLASTX'); +is($result->algorithm_reference, undef); is($result->query_length, 720); $hit = $result->next_hit; is($hit->name, 'AK1H_ECOLI'); @@ -1296,6 +1352,7 @@ isa_ok($result,'Bio::Search::Result::ResultI'); is($result->query_name, 'Escherichia'); is($result->algorithm, 'TBLASTX'); +is($result->algorithm_reference, undef); @@ Diff output truncated at 10000 characters. @@ From bugzilla-daemon at portal.open-bio.org Thu Apr 15 00:22:51 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 15 Apr 2010 00:22:51 -0400 Subject: [Bioperl-guts-l] [Bug 3031] Unable to parse algorithm_reference from BLAST reports using Bio::SearchIO In-Reply-To: Message-ID: <201004150422.o3F4MpIa017367@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3031 ------- Comment #4 from cjfields at bioperl.org 2010-04-15 00:22 EST ------- Committed in r16951. 
However, for future reference you don't want to close the bug report out until after the patch is applied (we're all extremely busy people, but we'll get to it). From bugzilla-daemon at portal.open-bio.org Thu Apr 15 11:11:56 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 15 Apr 2010 11:11:56 -0400 Subject: [Bioperl-guts-l] [Bug 3031] Unable to parse algorithm_reference from BLAST reports using Bio::SearchIO In-Reply-To: Message-ID: <201004151511.o3FFBuDC006301@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3031 razi.khaja at gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |razi.khaja at gmail.com Target Milestone|1.6.3 point release |1.6.2 point release ------- Comment #5 from razi.khaja at gmail.com 2010-04-15 11:11 EST ------- Thanks for committing Chris! I'm new at submitting patches and using bugzilla, so I apologize for my mistakes, but I'd like to learn from them. The next time I submit a bug/enhancement and patches, I will try to follow the state changes in the bug life cycle: http://bugzilla.open-bio.org/page.cgi?id=fields.html At the moment the Status=RESOLVED and Resolution=FIXED. Should the Status of the bug now be marked as VERIFIED? Does the Status of the bug get changed to CLOSED when bioperl-1.6.2 is released? Thanks, Razi
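The Reference: handling added in r16951 for bug 3031 can be illustrated outside the parser. The sketch below mirrors the loop from the patch but reads from an in-memory list of lines rather than through _readline and _pushback; the function name extract_reference and the shape of the sample report are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical standalone helper: collect a multi-line "Reference:" block
# from a list of report lines, preserving newlines. Collection stops at an
# empty line, a "RID:" line, or a "Database:" line, as in the patch.
sub extract_reference {
    my @lines = @_;
    while (defined(my $line = shift @lines)) {
        next unless $line =~ /^Reference:\s+(.*)$/;
        my $reference = "$1\n";
        while (@lines && $lines[0] !~ /^$/ && $lines[0] !~ /^(?:RID|Database):/) {
            $reference .= shift @lines;
        }
        return $reference;
    }
    return;    # no Reference: line found
}

# Sample lines modelled on the expected values in the blast.t changes
my @report = (
    "BLASTP 2.2.1 [Apr-13-2001]\n",
    "\n",
    "Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,\n",
    "Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),\n",
    "\"Gapped BLAST and PSI-BLAST: a new generation of protein database search\n",
    "programs\", Nucleic Acids Res. 25:3389-3402.\n",
    "\n",
    "Database: ecoli.aa\n",
);
print extract_reference(@report);
```

Because the terminating line is only inspected, never consumed, the caller still sees it afterwards, which is the role _pushback plays in the committed code.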
From dave_messina at dev.open-bio.org Mon Apr 19 05:47:06 2010 From: dave_messina at dev.open-bio.org (Dave Messina) Date: Mon, 19 Apr 2010 05:47:06 -0400 Subject: [Bioperl-guts-l] [16952] bioperl-live/trunk/Bio/SearchIO/fasta.pm: typo in new() docs - default ID length is 6, not 7. Message-ID: <201004190947.o3J9l6fu028187@dev.open-bio.org> Revision: 16952 Author: dave_messina Date: 2010-04-19 05:47:05 -0400 (Mon, 19 Apr 2010) Log Message: ----------- typo in new() docs - default ID length is 6, not 7. Modified Paths: -------------- bioperl-live/trunk/Bio/SearchIO/fasta.pm Modified: bioperl-live/trunk/Bio/SearchIO/fasta.pm =================================================================== --- bioperl-live/trunk/Bio/SearchIO/fasta.pm 2010-04-15 04:21:17 UTC (rev 16951) +++ bioperl-live/trunk/Bio/SearchIO/fasta.pm 2010-04-19 09:47:05 UTC (rev 16952) @@ -173,7 +173,7 @@ Function: Builds a new Bio::SearchIO::fasta object Returns : Bio::SearchIO::fasta Args : -idlength - set ID length to something other - than the default (7), this is only + than the default (6), this is only necessary if you have compiled FASTA with a new default id length to display in the HSP alignment blocks From kortsch at dev.open-bio.org Wed Apr 21 06:45:58 2010 From: kortsch at dev.open-bio.org (Dan Kortschak) Date: Wed, 21 Apr 2010 06:45:58 -0400 Subject: [Bioperl-guts-l] [16953] bioperl-live/trunk/Bio/Tools/Run/WrapperBase/CommandExts.pm: Add last_execution attribute and accessor Message-ID: <201004211045.o3LAjwjf031778@dev.open-bio.org> Revision: 16953 Author: kortsch Date: 2010-04-21 06:45:57 -0400 (Wed, 21 Apr 2010) Log Message: ----------- Add last_execution attribute and accessor Modified Paths: -------------- bioperl-live/trunk/Bio/Tools/Run/WrapperBase/CommandExts.pm Modified: bioperl-live/trunk/Bio/Tools/Run/WrapperBase/CommandExts.pm =================================================================== --- 
bioperl-live/trunk/Bio/Tools/Run/WrapperBase/CommandExts.pm 2010-04-19 09:47:05 UTC (rev 16952) +++ bioperl-live/trunk/Bio/Tools/Run/WrapperBase/CommandExts.pm 2010-04-21 10:45:57 UTC (rev 16953) @@ -996,6 +996,7 @@ } @files; @files = map { defined $_ ? $_ : () } @files; # squish undefs my @ipc_args = ( $exe, @$options, @files ); + $self->{_last_execution} = join( $self->{'_options'}->{'_join'}, @ipc_args ); eval { IPC::Run::run(\@ipc_args, $in, $out, $err) or die ("There was a problem running $exe : ".$$err); @@ -1027,6 +1028,21 @@ return $self->{'_no_throw'}; } +=head2 last_execution() + + Title : last_execution + Usage : + Function: return the last executed command with options + Returns : string of command line sent to IPC::Run + Args : + +=cut + +sub last_execution { + my $self = shift; + return $self->{'_last_execution'}; +} + =head2 _dash_switch() Title : _dash_switch From bugzilla-daemon at portal.open-bio.org Thu Apr 22 07:55:11 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 22 Apr 2010 07:55:11 -0400 Subject: [Bioperl-guts-l] [Bug 3061] New: AlignIO hash sequence storage Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=3061 Summary: AlignIO hash sequence storage Product: BioPerl Version: unspecified Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Core Components AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: bernd at bio.vu.nl Hi Something I stumble on from time to time: the storage of sequences in AlignIO is based on SeqIDs. This complicates reading alignments with duplicate IDs, which actually do occur quite a lot (e.g. CDD of NCBI). Usually I try to "uniqfy" IDs but this is not straightforward for all alignment formats. 
Actually this is where BioPerl is really useful ;-) I'd propose to store the sequences in a hash in AlignIO using unique keys, possibly optionally, to be able to read all sequences in the alignment, even when they all have the same ID. This would solve the replacing warnings too. -------------------- WARNING --------------------- MSG: Replacing one sequence [10/1-214] Possibly this can be taken in with the AlignIO refactoring. Regards, Bernd From bugzilla-daemon at portal.open-bio.org Fri Apr 23 15:59:26 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 23 Apr 2010 15:59:26 -0400 Subject: [Bioperl-guts-l] [Bug 3058] Bio::SearchIO is unable to parse fasta35 output In-Reply-To: Message-ID: <201004231959.o3NJxQQs000626@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3058 ------- Comment #4 from online at davemessina.com 2010-04-23 15:59 EST ------- Mick, did you see this post? http://groups.google.com/group/bioperl-l/msg/25c17748d1ac6ef4 Could you try rerunning your fasta without the -O flag and redirect output with a > instead? And see whether the output changes such that BioPerl can successfully parse the records? In case it doesn't, I've been working on a fix for this bug using the output file you've attached here. Dave 
From bugzilla-daemon at portal.open-bio.org Sat Apr 24 06:16:15 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 24 Apr 2010 06:16:15 -0400 Subject: [Bioperl-guts-l] [Bug 3063] Unable to parse BLAST RID (Request ID) from BLAST reports using Bio::SearchIO In-Reply-To: Message-ID: <201004241016.o3OAGFed019327@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3063 razi.khaja at gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- AssignedTo|razi.khaja at gmail.com |bioperl-guts-l at bioperl.org Status|ASSIGNED |NEW From bugzilla-daemon at portal.open-bio.org Sat Apr 24 06:17:35 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 24 Apr 2010 06:17:35 -0400 Subject: [Bioperl-guts-l] [Bug 3063] Unable to parse BLAST RID (Request ID) from BLAST reports using Bio::SearchIO In-Reply-To: Message-ID: <201004241017.o3OAHZL1019362@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3063 razi.khaja at gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |ASSIGNED 
From bugzilla-daemon at portal.open-bio.org Mon Apr 26 12:37:41 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 26 Apr 2010 12:37:41 -0400 Subject: [Bioperl-guts-l] [Bug 3061] AlignIO hash sequence storage In-Reply-To: Message-ID: <201004261637.o3QGbfca025366@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3061 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Severity|normal |enhancement Target Milestone|1.6.3 point release |1.7.0 release ------- Comment #1 from cjfields at bioperl.org 2010-04-26 12:37 EST ------- Bernd, this will likely be handled with the scheduled Align refactor, but it may break API so I'm pushing it to 1.7 and listing it as an enhancement. 
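Until the Align refactor lands, the "uniqfy" step Bernd describes in bug 3061 can be applied to the ID strings before they reach an alignment object. The following is only a minimal sketch in plain Perl, assuming the IDs are already available as a list; the helper name `uniquify_ids` and the `_N` suffix scheme are hypothetical and are not part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): return the IDs in their
# original order, appending "_2", "_3", ... to any ID that was already used.
sub uniquify_ids {
    my @ids = @_;
    my (%taken, @unique);
    for my $id (@ids) {
        my $candidate = $id;
        my $n         = 1;
        # Keep bumping the suffix until the candidate name is unused.
        while ($taken{$candidate}) {
            $n++;
            $candidate = "${id}_$n";
        }
        $taken{$candidate} = 1;
        push @unique, $candidate;
    }
    return @unique;
}

# The duplicated ID "10" is borrowed from the warning quoted in the report.
my @ids = uniquify_ids(qw(10 10 seqA 10));
print join(',', @ids), "\n";   # prints 10,10_2,seqA,10_3
```

Renaming changes what downstream code sees, so a mapping back to the original IDs would probably be needed in practice; that is left out here.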
From cjfields at dev.open-bio.org Mon Apr 26 12:44:21 2010 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Mon, 26 Apr 2010 12:44:21 -0400 Subject: [Bioperl-guts-l] [16954] bioperl-live/trunk: [bug 3063] Message-ID: <201004261644.o3QGiL5u026350@dev.open-bio.org> Revision: 16954 Author: cjfields Date: 2010-04-26 12:44:21 -0400 (Mon, 26 Apr 2010) Log Message: ----------- [bug 3063] patch for catching RID (courtesy Razi Khaja] Modified Paths: -------------- bioperl-live/trunk/Bio/Search/Result/GenericResult.pm bioperl-live/trunk/Bio/Search/Result/ResultI.pm bioperl-live/trunk/Bio/SearchIO/blast.pm bioperl-live/trunk/t/SearchIO/blast.t Modified: bioperl-live/trunk/Bio/Search/Result/GenericResult.pm =================================================================== --- bioperl-live/trunk/Bio/Search/Result/GenericResult.pm 2010-04-21 10:45:57 UTC (rev 16953) +++ bioperl-live/trunk/Bio/Search/Result/GenericResult.pm 2010-04-26 16:44:21 UTC (rev 16954) @@ -167,6 +167,7 @@ -algorithm => program name (blastx) -algorithm_version => version of the algorithm (2.1.2) -algorithm_reference => literature reference string for this algorithm + -rid => value of the BLAST Request ID (eg. 
RID: ZABJ4EA7014) -hit_factory => Bio::Factory::ObjectFactoryI capable of making Bio::Search::Hit::HitI objects @@ -185,7 +186,7 @@ my ($qname,$qacc,$qdesc,$qlen, $qgi, $dbname,$dblet,$dbent,$params, $stats, $hits, $algo, $algo_v, - $prog_ref, $algo_r, $hit_factory) = $self->_rearrange([qw(QUERY_NAME + $prog_ref, $algo_r, $rid, $hit_factory) = $self->_rearrange([qw(QUERY_NAME QUERY_ACCESSION QUERY_DESCRIPTION QUERY_LENGTH @@ -200,6 +201,7 @@ ALGORITHM_VERSION PROGRAM_REFERENCE ALGORITHM_REFERENCE + RID HIT_FACTORY )],@args); @@ -208,6 +210,8 @@ defined $algo_v && $self->algorithm_version($algo_v); defined $algo_r && $self->algorithm_reference($algo_r); + defined $rid && $self->rid($rid); + defined $qname && $self->query_name($qname); defined $qacc && $self->query_accession($qacc); defined $qdesc && $self->query_description($qdesc); @@ -743,7 +747,28 @@ sub program_reference { shift->algorithm_reference(@_); } +=head2 rid + Title : rid + Usage : $obj->rid($newval) + Function: + Returns : value of the BLAST Request ID (eg. RID: ZABJ4EA7014) + Args : newvalue (optional) + Comments: The default implementation in ResultI returns an empty string + rather than throwing a NotImplemented exception, since + the RID may not always be available and is not critical. 
+ See: (1) http://www.ncbi.nlm.nih.gov/Class/MLACourse/Modules/BLAST/rid.html + (2) http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/new/node63.html +=cut + +sub rid{ + my ($self,$value) = @_; + if( defined $value) { + $self->{'rid'} = $value; + } + return $self->{'rid'}; +} + =head2 no_hits_found See documentation in L Modified: bioperl-live/trunk/Bio/Search/Result/ResultI.pm =================================================================== --- bioperl-live/trunk/Bio/Search/Result/ResultI.pm 2010-04-21 10:45:57 UTC (rev 16953) +++ bioperl-live/trunk/Bio/Search/Result/ResultI.pm 2010-04-26 16:44:21 UTC (rev 16954) @@ -433,6 +433,25 @@ return ''; } +=head2 rid + + Title : rid + Usage : $obj->rid($newval) + Function: + Returns : value of the BLAST Request ID (eg. RID: ZABJ4EA7014) + Args : newvalue (optional) + Comments: The default implementation in ResultI returns an empty string + rather than throwing a NotImplemented exception, since + the RID may not always be available and is not critical. 
+ See: (1) http://www.ncbi.nlm.nih.gov/Class/MLACourse/Modules/BLAST/rid.html + (2) http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/new/node63.html +=cut + +sub rid{ + my ($self) = @_; + return ''; +} + =head2 num_hits Title : num_hits Modified: bioperl-live/trunk/Bio/SearchIO/blast.pm =================================================================== --- bioperl-live/trunk/Bio/SearchIO/blast.pm 2010-04-21 10:45:57 UTC (rev 16953) +++ bioperl-live/trunk/Bio/SearchIO/blast.pm 2010-04-26 16:44:21 UTC (rev 16954) @@ -210,6 +210,7 @@ 'BlastOutput_program' => 'RESULT-algorithm_name', 'BlastOutput_version' => 'RESULT-algorithm_version', 'BlastOutput_algorithm-reference' => 'RESULT-algorithm_reference', + 'BlastOutput_rid' => 'RESULT-rid', 'BlastOutput_query-def' => 'RESULT-query_name', 'BlastOutput_query-len' => 'RESULT-query_length', 'BlastOutput_query-acc' => 'RESULT-query_accession', @@ -525,6 +526,16 @@ } ); } + # parse BLAST RID (Request ID) + elsif(/^RID:\s+(.*)$/) { + my $rid = $1; + $self->element( + { + 'Name' => 'BlastOutput_rid', + 'Data' => $rid + } + ); + } # added Windows workaround for bug 1985 elsif (/^(Searching|Results from round)/) { next unless $1 =~ /Results from round/; Modified: bioperl-live/trunk/t/SearchIO/blast.t =================================================================== --- bioperl-live/trunk/t/SearchIO/blast.t 2010-04-21 10:45:57 UTC (rev 16953) +++ bioperl-live/trunk/t/SearchIO/blast.t 2010-04-26 16:44:21 UTC (rev 16954) @@ -7,7 +7,7 @@ use lib '.'; use Bio::Root::Test; - test_begin(-tests => 1142); + test_begin(-tests => 1147); use_ok('Bio::SearchIO'); } @@ -384,6 +384,7 @@ "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402. 
'); +is($result->rid, '1012577175-3730-28291'); is($result->database_name, 'All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS,or phase 0, 1 or 2 HTGS sequences) '); is($result->database_letters, 4677375331); is($result->database_entries, 1083200); @@ -925,6 +926,7 @@ "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402. '); +is($result->rid, '1036160600-011802-21377'); @valid = ( ['pir||T14789','T14789','T14789','CAB53709','AAH01726'], ['gb|NP_065733.1|CYT19', 'NP_065733','CYT19'], @@ -1464,6 +1466,7 @@ $searchio = Bio::SearchIO->new(-format => 'blast', -file => test_input_file('catalase-webblast.BLASTP')); ok($result = $searchio->next_result); +is($result->rid, '1118324516-16598-103707467515.BLASTQ1'); ok($hit = $result->next_hit); is($hit->name, 'gi|40747822|gb|EAA66978.1|', 'full hit name'); is($hit->accession, 'EAA66978', 'hit accession'); @@ -1488,6 +1491,7 @@ (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402. 
'); +is($result->rid, '1141079027-8324-8848328247.BLASTQ4'); is($result->query_name, 'pyrR,'); is($result->query_length, 558); is($result->get_statistic('kappa'), '0.711'); @@ -1560,6 +1564,7 @@ is($result->database_letters, 1533424333); is($result->algorithm, 'BLASTP'); is($result->algorithm_version, '2.2.15 [Oct-15-2006]'); +is($result->rid, '1169055516-21385-22799250964.BLASTQ4'); is($result->query_name, 'gi|15608519|ref|NP_215895.1|'); is($result->query_gi, 15608519); is($result->query_length, 193); From bugzilla-daemon at portal.open-bio.org Mon Apr 26 12:44:50 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 26 Apr 2010 12:44:50 -0400 Subject: [Bioperl-guts-l] [Bug 3063] Unable to parse BLAST RID (Request ID) from BLAST reports using Bio::SearchIO In-Reply-To: Message-ID: <201004261644.o3QGiols025631@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3063 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |FIXED ------- Comment #2 from cjfields at bioperl.org 2010-04-26 12:44 EST ------- Committed in r16954. Thanks! 
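The parser branch added in r16954 recognises report lines that begin with "RID:" and stores the rest of the line as the request ID. The sketch below applies the same pattern in plain Perl, without BioPerl. The helper name `extract_rid` and the three sample report lines are assumptions made up for illustration; only the pattern and the RID value come from the commit and its tests.

```perl
use strict;
use warnings;

# Hypothetical helper: scan report lines and return the BLAST Request ID,
# using the same pattern as the new elsif branch in Bio/SearchIO/blast.pm.
sub extract_rid {
    my @lines = @_;
    for my $line (@lines) {
        return $1 if $line =~ /^RID:\s+(.*)$/;
    }
    # Mirrors the ResultI default, which returns an empty string when no
    # RID is available.
    return '';
}

# Illustrative lines only, not copied from a real report file.
my @report = (
    'BLASTP 2.2.15 [Oct-15-2006]',
    'RID: 1169055516-21385-22799250964.BLASTQ4',
    'Query= gi|15608519|ref|NP_215895.1|',
);

print extract_rid(@report), "\n";   # prints 1169055516-21385-22799250964.BLASTQ4
```

Within BioPerl itself, the tests in the commit read the value through `$result->rid` on the object returned by `$searchio->next_result`.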
From bugzilla-daemon at portal.open-bio.org Mon Apr 26 18:06:06 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 26 Apr 2010 18:06:06 -0400 Subject: [Bioperl-guts-l] [Bug 3064] New: unusual blastp output causes SearchIO to crash Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=3064 Summary: unusual blastp output causes SearchIO to crash Product: BioPerl Version: unspecified Platform: Other OS/Version: Mac OS Status: NEW Severity: normal Priority: P2 Component: Bio::Search/Bio::SearchIO AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: jayoung at fhcrc.org CC: jayoung at fhcrc.org Hi, I'm using NCBI's BLASTP 2.2.23+ and have found an unusual output file that makes SearchIO's next_result() crash with the following error: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: no data for midline Query ------------------------------------------------------------ STACK: Error::throw STACK: Bio::Root::Root::throw /home/jayoung/traskdata/perl/bioperl-live/Bio/Root/Root.pm:368 STACK: Bio::SearchIO::blast::next_result /home/jayoung/traskdata/perl/bioperl-live/Bio/SearchIO/blast.pm:1842 STACK: /home/jayoung/bin/blastparsenew_really_simple.bioperl:14 ----------------------------------------------------------- It's the weird-looking line in the middle with no aligned query residues that's causing the issue: Query ------------------------------------------------------------ If I edit that line to include positions, parsing proceeds fine: Query 63 ------------------------------------------------------------ 63 I can't tell if NCBI have recently changed their output specs, or if this issue could have arisen in the past. On an older versions of blast, I tried this particular query and subject pair produced a slightly different alignment, where the long gap wasn't quite as long and didn't stretch over a full output line. I updated my bioperl via svn today (revision 16954). 
I'll attach the blast result file (as well as the edited version that parses OK) and the query and subject seqs each in fasta format. Hopefully it'll be easy to fix. thanks very much, Janet Young ------------------------------------------------------------------- Dr. Janet Young (Trask lab) Fred Hutchinson Cancer Research Center 1100 Fairview Avenue N., C3-168, P.O. Box 19024, Seattle, WA 98109-1024, USA. tel: (206) 667 1471 fax: (206) 667 6524 email: jayoung ...at... fhcrc.org http://www.fhcrc.org/labs/trask/ ------------------------------------------------------------------- From bugzilla-daemon at portal.open-bio.org Mon Apr 26 18:07:05 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 26 Apr 2010 18:07:05 -0400 Subject: [Bioperl-guts-l] [Bug 3064] unusual blastp output causes SearchIO to crash In-Reply-To: Message-ID: <201004262207.o3QM75MK001309@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3064 ------- Comment #1 from jayoung at fhcrc.org 2010-04-26 18:07 EST ------- Created an attachment (id=1485) --> (http://bugzilla.open-bio.org/attachment.cgi?id=1485&action=view) blast output 
From bugzilla-daemon at portal.open-bio.org Mon Apr 26 18:07:21 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 26 Apr 2010 18:07:21 -0400 Subject: [Bioperl-guts-l] [Bug 3064] unusual blastp output causes SearchIO to crash In-Reply-To: Message-ID: <201004262207.o3QM7LqF001332@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3064 ------- Comment #2 from jayoung at fhcrc.org 2010-04-26 18:07 EST ------- Created an attachment (id=1486) --> (http://bugzilla.open-bio.org/attachment.cgi?id=1486&action=view) query seq From bugzilla-daemon at portal.open-bio.org Mon Apr 26 18:07:47 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 26 Apr 2010 18:07:47 -0400 Subject: [Bioperl-guts-l] [Bug 3064] unusual blastp output causes SearchIO to crash In-Reply-To: Message-ID: <201004262207.o3QM7ldS001355@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3064 ------- Comment #3 from jayoung at fhcrc.org 2010-04-26 18:07 EST ------- Created an attachment (id=1487) --> (http://bugzilla.open-bio.org/attachment.cgi?id=1487&action=view) subject seq 
From bugzilla-daemon at portal.open-bio.org Mon Apr 26 18:08:08 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 26 Apr 2010 18:08:08 -0400 Subject: [Bioperl-guts-l] [Bug 3064] unusual blastp output causes SearchIO to crash In-Reply-To: Message-ID: <201004262208.o3QM88wm001378@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3064 ------- Comment #4 from jayoung at fhcrc.org 2010-04-26 18:08 EST ------- Created an attachment (id=1488) --> (http://bugzilla.open-bio.org/attachment.cgi?id=1488&action=view) edited blast output - parses fine -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Apr 26 18:13:28 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 26 Apr 2010 18:13:28 -0400 Subject: [Bioperl-guts-l] [Bug 3064] unusual blastp output causes SearchIO to crash In-Reply-To: Message-ID: <201004262213.o3QMDSfp001497@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3064 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |ASSIGNED ------- Comment #5 from cjfields at bioperl.org 2010-04-26 18:13 EST ------- (In reply to comment #0) > Hi, > > I'm using NCBI's BLASTP 2.2.23+ and have found an unusual output file that > makes SearchIO's next_result() crash with the following error: > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: no data for midline Query > ------------------------------------------------------------ > STACK: Error::throw > STACK: Bio::Root::Root::throw > /home/jayoung/traskdata/perl/bioperl-live/Bio/Root/Root.pm:368 > STACK: Bio::SearchIO::blast::next_result > 
/home/jayoung/traskdata/perl/bioperl-live/Bio/SearchIO/blast.pm:1842 > STACK: /home/jayoung/bin/blastparsenew_really_simple.bioperl:14 > ----------------------------------------------------------- > > It's the weird-looking line in the middle with no aligned query residues that's > causing the issue: > Query ------------------------------------------------------------ > > If I edit that line to include positions, parsing proceeds fine: > Query 63 ------------------------------------------------------------ 63 > > I can't tell if NCBI have recently changed their output specs, or if this issue > could have arisen in the past. On an older versions of blast, I tried this > particular query and subject pair produced a slightly different alignment, > where the long gap wasn't quite as long and didn't stretch over a full output > line. NCBI has recently updated their specs for this, we just hadn't encountered it yet (so this is a good test case). We'll work on fixing it; thanks for the bug submission. 
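The failure in bug 3064 can be reproduced outside BioPerl by testing the alignment-line pattern directly. The sketch below compares the pattern that r16956 (later in this archive) removes from blast.pm with the one it adds, which makes the start and end coordinates optional. The sample lines and the helper `matches` are assumptions for illustration. In particular, the unnumbered sample keeps trailing padding, because the relaxed pattern still requires whitespace after the sequence string; whether real BLAST+ output always pads the line that way is not confirmed in this thread.

```perl
use strict;
use warnings;

# Patterns copied from the r16956 diff to Bio/SearchIO/blast.pm:
# $strict is the removed line, $relaxed is the added line.
my $strict  = qr/^((Query|Sbjct):?\s+(\-?\d+)\s*)(\S+)\s+(\-?\d+)/;
my $relaxed = qr/^((Query|Sbjct):?\s+(\-?\d+)?\s*)(\S+)\s+(\-?\d+)?/;

# Hypothetical helper: 1 if the line matches the pattern, 0 otherwise.
sub matches {
    my ($pattern, $line) = @_;
    return $line =~ $pattern ? 1 : 0;
}

my $gap = '-' x 60;

# Illustrative lines modelled on the two forms quoted in the report.
my @samples = (
    [ numbered   => "Query  63  $gap  63" ],
    [ unnumbered => "Query      $gap  " ],   # assumed trailing padding
);

for my $sample (@samples) {
    my ($label, $line) = @$sample;
    printf "%-10s strict=%d relaxed=%d\n",
        $label, matches($strict, $line), matches($relaxed, $line);
}
```

With these samples the numbered line matches both patterns, while the all-gap line matches only the relaxed one, leaving its start and end captures undefined.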
From fangly at dev.open-bio.org Tue Apr 27 00:28:55 2010 From: fangly at dev.open-bio.org (Florent E Angly) Date: Tue, 27 Apr 2010 00:28:55 -0400 Subject: [Bioperl-guts-l] [16955] bioperl-live/trunk: Partial redesign to simplify/ clarify the internal code of B::A::T::ContigSpectrum Message-ID: <201004270428.o3R4St07022483@dev.open-bio.org> Revision: 16955 Author: fangly Date: 2010-04-27 00:28:55 -0400 (Tue, 27 Apr 2010) Log Message: ----------- Partial redesign to simplify/clarify the internal code of B::A::T::ContigSpectrum Modified Paths: -------------- bioperl-live/trunk/Bio/Assembly/Tools/ContigSpectrum.pm bioperl-live/trunk/t/Assembly/ContigSpectrum.t Modified: bioperl-live/trunk/Bio/Assembly/Tools/ContigSpectrum.pm =================================================================== --- bioperl-live/trunk/Bio/Assembly/Tools/ContigSpectrum.pm 2010-04-26 16:44:21 UTC (rev 16954) +++ bioperl-live/trunk/Bio/Assembly/Tools/ContigSpectrum.pm 2010-04-27 04:28:55 UTC (rev 16955) @@ -141,8 +141,7 @@ to_string create a string representation of the spectrum spectrum import a hash contig spectrum - contig determine a contig spectrum from a contig - assembly determine a contig spectrum from an assembly + assembly determine a contig spectrum from an assembly, contig or singlet dissolve calculate a dissolved contig spectrum (depends on assembly) cross produce a cross contig spectrum (depends on assembly) add add a contig spectrum to an existing one @@ -553,39 +552,17 @@ return $spectrum; } -=head2 contig - Title : contig - Usage : my @obj_list = $csp->contig(); - Function: Update the contig spectrum object by adding a contig or singlet - object / get a reference to the list of assembly, contig and singlet - objects used in the contig spectrum. 
- Returns : array reference of Bio::Assembly::Scaffold, Bio::Assembly::Contig and - Bio::Assembly::Singlet objects - Args : Bio::Assembly::Contig or Bio::Assembly::Singlet object - -=cut - -sub contig { - my ($self, $contig) = @_; - if (defined $contig) { - $self->_import_contig($contig); - } - my @obj_list = @{$self->{'_assembly'}} if defined $self->{'_assembly'}; - return \@obj_list; -} - - =head2 assembly Title : assembly Usage : my @obj_list = $csp->assembly(); - Function: Update the contig spectrum object by adding an assembly object / get - a reference to the list of assembly, contig and singlet objects used - in the contig spectrum object. - Returns : array reference of Bio::Assembly::Scaffold, Bio::Assembly::Contig and - Bio::Assembly::Singlet objects - Args : Bio::Assembly::Scaffold object + Function: Update the contig spectrum object by adding an assembly, contig or + singlet object to it + Returns : arrayref of assembly, contig and singlet objects used in the contig + spectrum object (Bio::Assembly::Scaffold, Bio::Assembly::Contig and + Bio::Assembly::Singlet objects) + Args : Bio::Assembly::Scaffold, Contig or Singlet object =cut @@ -594,8 +571,24 @@ if (defined $assembly) { $self->_import_assembly($assembly); } + return $self->get_assembly(); +} + + +=head2 get_assembly + + Title : get_assembly + Usage : $csp->get_assembly(); + Function: Get all assembly objects associated with a contig spectrum. 
+ Returns : array reference of Bio::Assembly::Scaffold, Contig and Singlet objects + Args : none + +=cut + +sub get_assembly { + my ($self) = @_; my @obj_list = @{$self->{'_assembly'}} if defined $self->{'_assembly'}; - return \@obj_list; + return @obj_list; } @@ -959,9 +952,9 @@ Title : _new_from_assembly Usage : Function: Creates a new contig spectrum object based solely on the result of - an assembly - Returns : Bio::Assembly::Tools::ContigSpectrum - Args : Bio::Assembly::Scaffold + an assembly, contig or singlet + Returns : Bio::Assembly::Tools::ContigSpectrum object + Args : Bio::Assembly::Scaffold, Contig or Singlet object =cut @@ -985,7 +978,7 @@ # 3: Set sequence statistics: nof_seq and avg_seq_len ($csp->{'_avg_seq_len'}, $csp->{'_nof_seq'}) = $self->_get_assembly_seq_stats($assemblyobj); # 4: Set the spectrum: spectrum and max_size - for my $contigobj ($assemblyobj->all_contigs) { + for my $contigobj ( $self->_get_contig_like($assemblyobj) ) { my $size = $contigobj->num_sequences; if (defined $csp->{'_spectrum'}{$size}) { $csp->{'_spectrum'}{$size}++; @@ -994,11 +987,6 @@ } $csp->{'_max_size'} = $size if $size > $csp->{'_max_size'}; } - my $nof_singlets = $assemblyobj->get_nof_singlets(); - if (defined $nof_singlets) { - $csp->{'_spectrum'}{1} += $nof_singlets; - $csp->{'_max_size'} = 1 if $nof_singlets >= 1 && $csp->{'_max_size'} < 1; - } # 5: Set list of assembly objects used push @{$csp->{'_assembly'}}, $assemblyobj; # 6: Set number of repetitions @@ -1007,48 +995,6 @@ } -=head2 _new_from_contig - - Title : _new_from_contig - Usage : - Function: Creates a new contig spectrum object based solely on a contig or - singlet - Returns : Bio::Assembly::Tools::ContigSpectrum - Args : Bio::Assembly::Contig or Bio::Assembly::Singlet - -=cut - -sub _new_from_contig { - # Create new contig spectrum object based purely on what we can get from a - # contig object - my ($self, $contigobj) = @_; - my $csp = Bio::Assembly::Tools::ContigSpectrum->new(); - # 1: Set id - 
$csp->{'_id'} = $contigobj->id; - # 2: Set overlap statistics: nof_overlaps, min_overlap, avg_overlap, - # min_identity and avg_identity - $csp->{'_eff_asm_params'} = $self->{'_eff_asm_params'}; - $csp->{'_min_overlap'} = $self->{'_min_overlap'}; - $csp->{'_min_identity'} = $self->{'_min_identity'}; - if ($csp->{'_eff_asm_params'} > 0) { - ( $csp->{'_avg_overlap'}, $csp->{'_avg_identity'}, $csp->{'_min_overlap'}, - $csp->{'_min_identity'}, $csp->{'_nof_overlaps'} ) - = $csp->_get_contig_overlap_stats($contigobj); - } - # 3: Set sequence statistics: nof_seq and avg_seq_len - ($csp->{'_avg_seq_len'}, $csp->{'_nof_seq'}) = $csp->_get_contig_seq_stats($contigobj); - # 4: Set the spectrum: spectrum and max_size - my $size = $contigobj->num_sequences; - $csp->{'_spectrum'}{$size} = 1; - $csp->{'_max_size'} = $size; - # 5: Set list of assembly objects used - push @{$csp->{'_assembly'}}, $contigobj; - # 6: Set number of repetitions - $csp->{'_nof_rep'} = 1; - return $csp; -} - - =head2 _new_dissolved_csp Title : @@ -1109,42 +1055,28 @@ my $asm_spectrum = { 1 => 0 }; my $good_seqs = {}; for my $obj (@{$mixed_csp->{'_assembly'}}) { + # Dissolve this assembly/contig/singlet for the given sequences - if ($obj->isa('Bio::Assembly::Scaffold')) { - my $assembly = $obj; - # For each contig/singlet - for my $contig ($assembly->all_contigs, $assembly->all_singlets) { - ($asm_spectrum, $good_seqs) = $self->_dissolve_contig($dissolved, $contig, $seq_header, $asm_spectrum, $good_seqs); - } - } elsif ($obj->isa('Bio::Assembly::Contig')) { - # a contig or singlet - my $contig = $obj; - ($asm_spectrum, $good_seqs) = $self->_dissolve_contig($dissolved, $contig, $seq_header, $asm_spectrum, $good_seqs); + for my $contig ( $self->_get_contig_like($obj) ) { + ($asm_spectrum, $good_seqs) = $self->_dissolve_contig($dissolved, $contig, + $seq_header, $asm_spectrum, $good_seqs); } # Update spectrum $dissolved->_import_spectrum($asm_spectrum); + # Update nof_rep $dissolved->{'_nof_rep'}--; 
$dissolved->{'_nof_rep'} += $mixed_csp->{'_nof_rep'}; # Get sequence and overlap stats - if ($obj->isa('Bio::Assembly::Scaffold')) { - ($dissolved->{'_avg_seq_len'}, $dissolved->{'_nof_seq'}) = - $dissolved->_get_assembly_seq_stats($obj, $good_seqs); - if ($dissolved->{'_eff_asm_params'} > 0) { - ( $dissolved->{'_avg_overlap'}, $dissolved->{'_avg_identity'}, $dissolved->{'_min_overlap'}, - $dissolved->{'_min_identity'}, $dissolved->{'_nof_overlaps'} ) - = $dissolved->_get_assembly_overlap_stats($obj, $good_seqs); - } - } elsif ($obj->isa('Bio::Assembly::Contig')) { - ($dissolved->{'_avg_seq_len'}, $dissolved->{'_nof_seq'}) = - $dissolved->_get_contig_seq_stats($obj, $good_seqs); - if ($dissolved->{'_eff_asm_params'} > 0) { - ( $dissolved->{'_avg_overlap'}, $dissolved->{'_avg_identity'}, $dissolved->{'_min_overlap'}, - $dissolved->{'_min_identity'}, $dissolved->{'_nof_overlaps'} ) - = $dissolved->_get_contig_overlap_stats($obj, $good_seqs); - } + ($dissolved->{'_avg_seq_len'}, $dissolved->{'_nof_seq'}) = + $dissolved->_get_assembly_seq_stats($obj, $good_seqs); + if ($dissolved->{'_eff_asm_params'} > 0) { + ( $dissolved->{'_avg_overlap'}, $dissolved->{'_avg_identity'}, + $dissolved->{'_min_overlap'}, $dissolved->{'_min_identity'}, + $dissolved->{'_nof_overlaps'} ) + = $dissolved->_get_assembly_overlap_stats($obj, $good_seqs); } } @@ -1175,7 +1107,9 @@ # Update spectrum my $size = scalar @contig_seqs; - if ($size == 1) { + if ($size == 0) { + # do nothing + } elsif ($size == 1) { $$asm_spectrum{1}++; } elsif ($size > 1) { # Reassemble good sequences @@ -1186,7 +1120,9 @@ for my $qsize (keys %$contig_spectrum) { $$asm_spectrum{$qsize} += $$contig_spectrum{$qsize}; } - } + } else { + $self->throw("The size is not valid... 
how could that happen?"); + } return $asm_spectrum, $good_seqs; } @@ -1237,35 +1173,23 @@ my $spectrum = {1 => 0}; my $good_seqs = {}; for my $obj (@{$mixed_csp->{'_assembly'}}) { - if ($obj->isa('Bio::Assembly::Scaffold')) { - # Go through contigs and skip the pure ones - my $assembly = $obj; - for my $contig ($assembly->all_contigs) { - ($spectrum, $good_seqs) = $self->_cross_contig($cross, $contig, $spectrum, $good_seqs); - } - # Get sequence stats - ($cross->{'_avg_seq_len'}, $cross->{'_nof_seq'}) = $cross->_get_assembly_seq_stats($assembly, $good_seqs); - # Get eff_asm_param for these sequences - if ($cross->{'_eff_asm_params'} > 0) { - ( $cross->{'_avg_overlap'}, $cross->{'_avg_identity'}, $cross->{'_min_overlap'}, - $cross->{'_min_identity'}, $cross->{'_nof_overlaps'} ) - = $cross->_get_assembly_overlap_stats($assembly, $good_seqs); - } - } elsif ($obj->isa('Bio::Assembly::Contig')) { - my $contig = $obj; - ($spectrum, $good_seqs) = $self->_cross_contig($cross, $contig, $spectrum, $good_seqs); - # Get sequence stats @@ Diff output truncated at 10000 characters. 
@@ From cjfields at dev.open-bio.org Tue Apr 27 17:09:19 2010 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Tue, 27 Apr 2010 17:09:19 -0400 Subject: [Bioperl-guts-l] [16956] bioperl-live/trunk: [bug 3064] Message-ID: <201004272109.o3RL9JxF000661@dev.open-bio.org> Revision: 16956 Author: cjfields Date: 2010-04-27 17:09:18 -0400 (Tue, 27 Apr 2010) Log Message: ----------- [bug 3064] Parse BLAST+ output with unnumbered gaps for query/subject Modified Paths: -------------- bioperl-live/trunk/Bio/SearchIO/blast.pm bioperl-live/trunk/t/SearchIO/blast.t Added Paths: ----------- bioperl-live/trunk/t/data/blast_plus.blastp Modified: bioperl-live/trunk/Bio/SearchIO/blast.pm =================================================================== --- bioperl-live/trunk/Bio/SearchIO/blast.pm 2010-04-27 04:28:55 UTC (rev 16955) +++ bioperl-live/trunk/Bio/SearchIO/blast.pm 2010-04-27 21:09:18 UTC (rev 16956) @@ -1820,7 +1820,7 @@ last; } chomp; - if (/^((Query|Sbjct):?\s+(\-?\d+)\s*)(\S+)\s+(\-?\d+)/) { + if (/^((Query|Sbjct):?\s+(\-?\d+)?\s*)(\S+)\s+(\-?\d+)?/) { my ( $full, $type, $start, $str, $end ) = ( $1, $2, $3, $4, $5 ); Modified: bioperl-live/trunk/t/SearchIO/blast.t =================================================================== --- bioperl-live/trunk/t/SearchIO/blast.t 2010-04-27 04:28:55 UTC (rev 16955) +++ bioperl-live/trunk/t/SearchIO/blast.t 2010-04-27 21:09:18 UTC (rev 16956) @@ -7,7 +7,7 @@ use lib '.'; use Bio::Root::Test; - test_begin(-tests => 1147); + test_begin(-tests => 1153); use_ok('Bio::SearchIO'); } @@ -1699,3 +1699,29 @@ my ($tval, $aval) = @_; is(sprintf("%g",$tval), sprintf("%g",$aval)); } + +# bug 3064 - All-gap Query/Subject lines for BLAST+ do not have numbering + +$file = test_input_file('blast_plus.blastp'); + +$searchio = Bio::SearchIO->new(-format => 'blast', + -file => $file); + +my $total_hsps = 0; +while(my $query = $searchio->next_result) { + while(my $subject = $query->next_hit) { + while (my $hsp = 
$subject->next_hsp) { + $total_hsps++; + if ($total_hsps == 1) { + is($hsp->start('query'), 5); + is($hsp->start('hit'), 3); + is($hsp->end('query'), 220); + is($hsp->end('hit'), 308); + is(length($hsp->query_string), length($hsp->hit_string)); + } + } + } +} + +is($total_hsps, 2); + Added: bioperl-live/trunk/t/data/blast_plus.blastp =================================================================== --- bioperl-live/trunk/t/data/blast_plus.blastp (rev 0) +++ bioperl-live/trunk/t/data/blast_plus.blastp 2010-04-27 21:09:18 UTC (rev 16956) @@ -0,0 +1,93 @@ +BLASTP 2.2.23+ + + +Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. +Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. +Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of +protein database search programs", Nucleic Acids Res. 25:3389-3402. + + + +Reference for composition-based statistics: Alejandro A. Schaffer, +L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri +I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), +"Improving the accuracy of PSI-BLAST protein database searches with +composition-based statistics and other refinements", Nucleic Acids +Res. 29:2994-3005. + + + +Database: subject.fa + 1 sequences; 311 total letters + + + +Query= query +Length=220 + Score E +Sequences producing significant alignments: (Bits) Value + + subject 162 4e-45 + + +> subject +Length=311 + + Score = 162 bits (411), Expect = 4e-45, Method: Compositional matrix adjust. 
+ Identities = 110/308 (35%), Positives = 145/308 (47%), Gaps = 94/308 (30%) + +Query 5 NNTQISGFLLMGLSNKPELQLPIFGLFLSMYLITVFGNLLIILDISSDSHLHTPMYFF-- 62 + N + IS F L G+S PE Q +FG+FL MYL+T+ GNLLIIL I SD HLHTPMYFF +Sbjct 3 NQSSISEFFLRGISAPPEQQQSLFGIFLCMYLVTLTGNLLIILAIGSDLHLHTPMYFFLA 62 + +Query ------------------------------------------------------------ + +Sbjct 63 NLSFVDMGLTSSTVTKMLVNIQTRHHTISYTGCLTQMYFFLMFGDLDSFFLAAMAYDRYV 122 + +Query 63 --------------------------LANLXV--QSLMLLQLSFCSEVEIPHFFCELHQM 94 + L N+ + ++ +LSFC EI HFFC++ + +Sbjct 123 AICHPLCYSTVMRPQVCALMLALCWVLTNIVALTHTFLMARLSFCVTGEIAHFFCDITPV 182 + +Query 95 IQLACSDTFLNDTVIYV--STVLLACGPLTGILYSYSKIVSSICRISSAQGKYKAFSTCA 152 + ++L+CSDT +N+ +++V TVL+ P I+ SY IV +I R+ + G KAFSTC+ +Sbjct 183 LKLSCSDTHINEMMVFVLGGTVLIV--PFLCIVTSYIHIVPAILRVRTRGGVGKAFSTCS 240 + +Query 153 SHLSVVSLFYCTVLGVYLSCAATQSSHGSAVASVMYTVVTPMLNPFIYSLRNKDIKEALI 212 + SHL VV +FY T+ YL + S A+ MYT+VTPMLNPFIYSLRNKD+K AL +Sbjct 241 SHLCVVCVFYGTLFSAYLCPPSIASEEKDIAAAAMYTIVTPMLNPFIYSLRNKDMKGALK 300 + +Query 213 RFLRRVTI 220 + R +I +Sbjct 301 RLFSHRSI 308 + + + Score = 15.0 bits (27), Expect = 1.9, Method: Compositional matrix adjust. 
+ Identities = 6/10 (60%), Positives = 8/10 (80%), Gaps = 0/10 (0%) + +Query 125 LYSYSKIVSS 134 + L+S+ IVSS +Sbjct 302 LFSHRSIVSS 311 + + + +Lambda K H + 0.328 0.139 0.412 + +Gapped +Lambda K H + 0.267 0.0410 0.140 + +Effective search space used: 55770 + + + Database: subject.fa + Posted date: Apr 26, 2010 2:49 PM + Number of letters in database: 311 + Number of sequences in database: 1 + + + +Matrix: BLOSUM62 +Gap Penalties: Existence: 11, Extension: 1 +Neighboring words threshold: 11 +Window for multiple hits: 40 From bugzilla-daemon at portal.open-bio.org Tue Apr 27 17:10:30 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 27 Apr 2010 17:10:30 -0400 Subject: [Bioperl-guts-l] [Bug 3064] unusual blastp output causes SearchIO to crash In-Reply-To: Message-ID: <201004272110.o3RLAUnT008923@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3064 cjfields at bioperl.org changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |FIXED ------- Comment #6 from cjfields at bioperl.org 2010-04-27 17:10 EST ------- Janet, this is now fixed (along with tests) in r16956. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Apr 27 19:30:08 2010 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 27 Apr 2010 19:30:08 -0400 Subject: [Bioperl-guts-l] [Bug 3064] unusual blastp output causes SearchIO to crash In-Reply-To: Message-ID: <201004272330.o3RNU87n012161@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=3064 ------- Comment #7 from jayoung at fhcrc.org 2010-04-27 19:30 EST ------- Thanks, Chris - so fast! Works on my end too. 
(In reply to comment #6)
> Janet, this is now fixed (along with tests) in r16956.

From bugzilla-daemon at portal.open-bio.org Wed Apr 28 08:04:56 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 28 Apr 2010 08:04:56 -0400
Subject: [Bioperl-guts-l] [Bug 3061] AlignIO hash sequence storage
In-Reply-To:
Message-ID: <201004281204.o3SC4u6O001596@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3061

------- Comment #2 from bernd at bio.vu.nl 2010-04-28 08:04 EST -------
Sure. Just one thing to think about: the interleaved formats (e.g. clustalw, msf, stockholm, selex) also use hashes to concatenate sequences. It would be great if the readers could handle duplicate IDs too. E.g. phylip.pm uses $hash{$count}

From bugzilla-daemon at portal.open-bio.org Wed Apr 28 08:36:46 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 28 Apr 2010 08:36:46 -0400
Subject: [Bioperl-guts-l] [Bug 3061] AlignIO hash sequence storage
In-Reply-To:
Message-ID: <201004281236.o3SCakIr002463@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3061

------- Comment #3 from cjfields at bioperl.org 2010-04-28 08:36 EST -------
Actually, I think the current sequence storage is indexed by NSE instead of simple seq_id (NSE takes into account seq_id, version, start, end, strand). For example, one can parse Rfam output via Bio::AlignIO::stockholm; Rfam contains multiple sequences with the same ID but different locations, therefore different NSE.
From SimpleAlign::add_seq:

    $name = $seq->get_nse;
    if( $self->{'_seq'}->{$name} ) {
        $self->warn("Replacing one sequence [$name]\n") unless $self->verbose < 0;
    }

I would consider the ability to catch possibly redundant seqs (e.g. same NSE) to be a feature, not a bug, so we would need some reasonable explanation as to why this is necessary, and why the solution you suggest (i.e. modifying the seq_id, version, etc.) wouldn't be a more appropriate solution.

(In reply to comment #2)
> Sure. Just one thing to think about: the interleaved formats (e.g. clustalw,
> msf, stockholm,selex) also use hashed to concatenate sequences. It would be
> great if the readers could handle duplicate IDs too. E.g phylip.pm uses
> $hash{$count}

From bugzilla-daemon at portal.open-bio.org Wed Apr 28 08:52:21 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 28 Apr 2010 08:52:21 -0400
Subject: [Bioperl-guts-l] [Bug 3065] New: Bio::FeatureIO::gff _handle_feature (BioPerl 1.6.1) doesn't cope with ; delimited attributes
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=3065

Summary: Bio::FeatureIO::gff _handle_feature (BioPerl 1.6.1) doesn't cope with ; delimited attributes
Product: BioPerl
Version: 1.6 branch
Platform: PC
OS/Version: Linux
Status: NEW
Keywords: Bioperl
Severity: normal
Priority: P2
Component: Core Components
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: pkensche at cmbi.ru.nl

A GFF line in which attributes are separated not by ';' but by '; ', i.e. with an additional space, is not correctly parsed by _handle_feature.

Example:
10 protein_coding gene 1085848 1095110 . - .
gene_id="ENSG00000067064"; ID="ENST00000381344"

Furthermore, the failure is not reported but a feature without any of the attributes in the attributes column is returned -- the tags whose names start with a space are silently skipped by the "grep {/^[a-z]/} keys %attr;" at the end of the function.

From bugzilla-daemon at portal.open-bio.org Wed Apr 28 17:14:22 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 28 Apr 2010 17:14:22 -0400
Subject: [Bioperl-guts-l] [Bug 3068] New: SeqIO::fastq parser fails to include single 0 with quality scores
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=3068

Summary: SeqIO::fastq parser fails to include single 0 with quality scores
Product: BioPerl
Version: unspecified
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Bio::SeqIO
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: grok.gene at gmail.com

I was trying to write a simple fastq to fasta converter when I encountered a problem with specific input.

Code:

    #! /usr/bin/perl
    use strict;
    use Bio::SeqIO;
    my $file = shift;
    my $fq = Bio::SeqIO->newFh(-file => $file, -format => "fastq");
    my $fa = Bio::SeqIO->newFh(-format => "fasta");
    print $fa $_ while <$fq>;

When presented with a fastq sequence like this:

    @G001760|T001760|NODE_32595_length_1986_cov_16.734138
    tGTGAAAAACGACAGATCTTCTCATCCTGAAACGTCTGCTGCTGTAGATGGTttatgttg
    c
    +
    6EEEEEEEEEEEEEEEEEEEEEE8EEEEEEEEEEEEEEEEEEEEEEEEEEEE64555654
    0

The SeqIO::fastq subroutine next_dataset fails to associate the trailing '0' with the quality string. It seems it has to be a lone 0. I've tried replacing it with various other characters in the legal range for quality scores (not extensively). Two 0s are caught correctly as well.
The result of this bug is an exception thrown by qc measures implemented in next_dataset:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Quality string [6EEEEEEEEEEEEEEEEEEEEEE8EEEEEEEEEEEEEEEEEEEEEEEEEEEE64555654] of length [60]
doesn't match length of sequence tGTGAAAAACGACAGATCTTCTCATCCTGAAACGTCTGCTGCTGTAGATGGTttatgttgc [61], line: 6
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/local/lib/perl5/site_perl/5.10.0/Bio/Root/Root.pm:368
STACK: Bio::SeqIO::fastq::next_dataset /usr/local/lib/perl5/site_perl/5.10.0/Bio/SeqIO/fastq.pm:102
STACK: Bio::SeqIO::fastq::next_seq /usr/local/lib/perl5/site_perl/5.10.0/Bio/SeqIO/fastq.pm:29
STACK: /home/grokgene/scripts/fastq2fasta.pl:11
-----------------------------------------------------------

I've traced briefly through next_dataset but I'm not sure what is causing this.
From dave_messina at dev.open-bio.org Wed Apr 28 18:46:39 2010
From: dave_messina at dev.open-bio.org (Dave Messina)
Date: Wed, 28 Apr 2010 18:46:39 -0400
Subject: [Bioperl-guts-l] [16957] bioperl-live/trunk/Bio/SeqIO/fastq.pm: fix for bug #3068
Message-ID: <201004282246.o3SMkdxL003113@dev.open-bio.org>

Revision: 16957
Author: dave_messina
Date: 2010-04-28 18:46:38 -0400 (Wed, 28 Apr 2010)

Log Message:
-----------
fix for bug #3068

Modified Paths:
--------------
bioperl-live/trunk/Bio/SeqIO/fastq.pm

Modified: bioperl-live/trunk/Bio/SeqIO/fastq.pm
===================================================================
--- bioperl-live/trunk/Bio/SeqIO/fastq.pm 2010-04-27 21:09:18 UTC (rev 16956)
+++ bioperl-live/trunk/Bio/SeqIO/fastq.pm 2010-04-28 22:46:38 UTC (rev 16957)
@@ -80,7 +80,7 @@
             last FASTQ
         }
         chomp $line;
-        if (!$line) {
+        if ($line =~ /^$/) {
             delete $self->{lastline};
             last FASTQ;
         }

From bugzilla-daemon at portal.open-bio.org Wed Apr 28 18:48:12 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 28 Apr 2010 18:48:12 -0400
Subject: [Bioperl-guts-l] [Bug 3068] SeqIO::fastq parser fails to include single 0 with quality scores
In-Reply-To:
Message-ID: <201004282248.o3SMmCq6021827@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3068

online at davemessina.com changed:

What       |Removed |Added
----------------------------------------------------------------------------
Status     |NEW     |RESOLVED
Resolution |        |FIXED

------- Comment #1 from online at davemessina.com 2010-04-28 18:48 EST -------
Fix committed in r16957. All tests pass.

Thanks for the report!
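The one-line change in r16957 appears to hinge on a Perl truthiness rule: the string "0" evaluates as false, so `!$line` treats a quality line holding a lone 0 the same way as an empty line, while "00" is true. That matches the report that two 0s were parsed correctly. A minimal sketch of the difference follows; the helpers `is_blank_old` and `is_blank_new` are hypothetical names used only for illustration and are not part of Bio::SeqIO::fastq.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; they are not part of
# Bio::SeqIO::fastq. Each one mimics one version of the blank-line test
# applied to an already chomped line.
sub is_blank_old { my ($line) = @_; return !$line ? 1 : 0; }         # test before r16957
sub is_blank_new { my ($line) = @_; return $line =~ /^$/ ? 1 : 0; }  # test after r16957

# Compare both checks on an empty line, a lone 0, two 0s, and a normal
# quality character.
for my $line ('', '0', '00', 'E') {
    printf "%-5s old=%d new=%d\n", "[$line]",
        is_blank_old($line), is_blank_new($line);
}
```

Under the old check the lone 0 is reported as blank, which would end the record one character early; the regex check only matches a genuinely empty line.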
From bugzilla-daemon at portal.open-bio.org Wed Apr 28 21:36:47 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 28 Apr 2010 21:36:47 -0400
Subject: [Bioperl-guts-l] [Bug 3070] New: get_tiled_alns() skips first HSP
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=3070

Summary: get_tiled_alns() skips first HSP
Product: BioPerl
Version: 1.6 branch
Platform: Macintosh
OS/Version: Mac OS
Status: NEW
Severity: normal
Priority: P2
Component: Bio::Search/Bio::SearchIO
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: jnmaloof at gmail.com

Bio::Search::Tiling::MapTiling get_tiled_alns() leaves out one HSP from the alignment if there is more than one HSP per hit. As far as I can tell it is the first HSP that gets omitted. If there is only one HSP in a hit then it is included.

Steps to reproduce: run the attached script "demo_get_tiled_alns.pl" with the file "demoBlastout.xml" in the same directory.

Actual results: The blast file has a single result, single hit, with two HSPs. The tiling that is produced contains both HSPs but only the second one (second one in the blast file) is included in the alignment.

Expected results: both HSPs should be included in the alignment.

Version: from bio-live. Tiling module updated on March 04, 2010.
From bugzilla-daemon at portal.open-bio.org Wed Apr 28 21:38:51 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 28 Apr 2010 21:38:51 -0400
Subject: [Bioperl-guts-l] [Bug 3070] get_tiled_alns() skips first HSP
In-Reply-To:
Message-ID: <201004290138.o3T1cptW025894@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3070

------- Comment #1 from jnmaloof at gmail.com 2010-04-28 21:38 EST -------
Created an attachment (id=1494)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1494&action=view)
script to demonstrate bug

use the attached script and data file to demonstrate the bug

From bugzilla-daemon at portal.open-bio.org Wed Apr 28 21:39:54 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 28 Apr 2010 21:39:54 -0400
Subject: [Bioperl-guts-l] [Bug 3070] get_tiled_alns() skips first HSP
In-Reply-To:
Message-ID: <201004290139.o3T1dsNW025935@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3070

------- Comment #2 from jnmaloof at gmail.com 2010-04-28 21:39 EST -------
Created an attachment (id=1495)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1495&action=view)
blast file for demonstrating the bug

use this in conjunction with attachment 1494
From bugzilla-daemon at portal.open-bio.org Wed Apr 28 21:40:56 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 28 Apr 2010 21:40:56 -0400
Subject: [Bioperl-guts-l] [Bug 3070] get_tiled_alns() skips first HSP
In-Reply-To:
Message-ID: <201004290140.o3T1eurk025970@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3070

jnmaloof at gmail.com changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |maj at fortinbras.us

From bugzilla-daemon at portal.open-bio.org Wed Apr 28 23:22:15 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 28 Apr 2010 23:22:15 -0400
Subject: [Bioperl-guts-l] [Bug 3070] get_tiled_alns() skips first HSP
In-Reply-To:
Message-ID: <201004290322.o3T3MF3E028037@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3070

------- Comment #3 from maj at fortinbras.us 2010-04-28 23:22 EST -------
Ahh...I think I see what's happening. I don't think this is a bug per se, but maybe the behavior Julin wants is something that can be added. In the module, a tiling is defined as a minimal set of overlapping hsps, so if you have

qry ****************
1 --------
2 ---------
3 ----------

then the minimal set of overlapping hsps covering qry is {1,3}, and 2 is left out. There can be other tilings possible; these can be stepped through and alignments derived from them by doing

    while ( $t = $tiling->next_tiling ) {
        push @alns, $tiling->get_tiled_alns($type, $context, $t);
    }

However, what Julin really wants is an alignment containing {1,2,3}, correctly lined up. Is that right?
MAJ

From bugzilla-daemon at portal.open-bio.org Thu Apr 29 01:03:09 2010
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 29 Apr 2010 01:03:09 -0400
Subject: [Bioperl-guts-l] [Bug 3070] get_tiled_alns() skips first HSP
In-Reply-To:
Message-ID: <201004290503.o3T539q1030386@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=3070

------- Comment #4 from jnmaloof at gmail.com 2010-04-29 01:03 EST -------
I don't think that is what is happening; I would be OK in the situation that you describe. I don't care about all of the HSPs, just having alignments of all of the sequence space that is covered by an HSP. In your example the alignment containing 1 and 3 would be great.

The behavior that I am seeing can be described as follows:

qry ********************
1---
2 ----

Only 1, but not 2 is returned in the alignment array. I would expect two alignments to be returned.

Another example (not included in my demo but I can send it if you would like).

qry ***********************************************
1---
2 ----
3 -----
4 ------
5 -----
6 ----

Here I would expect four alignments to be returned, containing HSPs {1}, {2,3,4 (tiled)}, {5}, and {6}, but only 3 alignments are currently returned; no alignment with HSP 1 would be returned.

Does that make sense?

(In reply to comment #3)
> Ahh...I think I see what's happening. I don't think this is a bug per se, but
> maybe the behavior Julin wants is something that can be added. In the module, a
> tiling is defined as a minimal set of overlapping hsps, so if you have
>
> qry ****************
> 1 --------
> 2 ---------
> 3 ----------
>
> then the minimal set of overlapping hsps covering qry is {1,3}, and 2 is left
> out.
There can be other tilings possible; these can be stepped through and
> alignments derived from them by doing
>
> while ( $t = $tiling->next_tiling ) {
> push @alns, $tiling->get_tiled_alns($type, $context, $t);
> }
>
> However, what Julin really wants is an alignment containing {1,2,3}, correctly
> lined up. Is that right?
> MAJ
>

From scain at dev.open-bio.org Fri Apr 30 09:36:44 2010
From: scain at dev.open-bio.org (Scott Cain)
Date: Fri, 30 Apr 2010 09:36:44 -0400
Subject: [Bioperl-guts-l] [16958] bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi/oracle.pm: so how long has there been a syntax error in this module
Message-ID: <201004301336.o3UDaipe023322@dev.open-bio.org>

Revision: 16958
Author: scain
Date: 2010-04-30 09:36:43 -0400 (Fri, 30 Apr 2010)

Log Message:
-----------
so how long has there been a syntax error in this module

Modified Paths:
--------------
bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi/oracle.pm

Modified: bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi/oracle.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi/oracle.pm 2010-04-28 22:46:38 UTC (rev 16957)
+++ bioperl-live/trunk/Bio/DB/GFF/Adaptor/dbi/oracle.pm 2010-04-30 13:36:43 UTC (rev 16958)
@@ -885,7 +885,7 @@
   my $sth = $self->dbh->do_query($query);
   my @results;
-  while (my ($class,$name,$note) = $sth->fetchrow_array) {
+  while (my ($class,$name,$note,$method,$source) = $sth->fetchrow_array) {
     next unless $class && $name; # sorry, ignore NULL objects
     my @matches = $note =~ /($regex)/g;
     my $relevance = 10*@matches;