From bugzilla-daemon at portal.open-bio.org Fri Feb 1 09:02:33 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 1 Feb 2008 09:02:33 -0500
Subject: [Bioperl-guts-l] [Bug 2438] New: no letters warning of SeqIO
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2438

           Summary: no letters warning of SeqIO
           Product: BioPerl
           Version: 1.5 branch
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Bio::SeqIO
        AssignedTo: bioperl-guts-l at bioperl.org
        ReportedBy: bernd at bio.vu.nl


Hi,

I was slightly puzzled by the following warning from Bio::SeqIO (using BioPerl
1.005002102):

-------------------- WARNING ---------------------
MSG: Got a sequence with no letters in it cannot guess alphabet []
---------------------------------------------------

This turned out to be due to sequences with only "X" residues such as:

>1bcc_I mol:protein length:33 UBIQUINOL CYTOCHROME C OXIDOREDUCTASE
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Although it is a detail, the warning could be adapted to reflect this fact,
e.g. "with no letters or only "X"s". Based on the current wording I was
expecting a sequence without any letters at all.

regards,
bernd

-- 
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
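The distinction asked for in this report can be sketched in a few lines of Perl. This is only an illustration under stated assumptions: describe_unguessable() is a hypothetical helper, not a BioPerl function, and the idea that the alphabet guesser discards "X" together with gap symbols before counting letters is inferred from the report, not taken from the BioPerl source.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: say why no alphabet can be
# guessed for a sequence string, or return undef if guessing can proceed.
sub describe_unguessable {
    my ($seq) = @_;
    return 'sequence is empty' if !defined $seq || $seq eq '';

    # Assumption: gap and ambiguity symbols, including "X", carry no
    # information for alphabet guessing and are stripped first.
    (my $informative = $seq) =~ s/[-.?xX]//g;
    return 'sequence contains only gap or ambiguity symbols such as "X"'
        if $informative eq '';

    return;    # undef in scalar context: nothing to complain about
}

for my $seq ('', 'XXXXXXXXXX', 'MKVLAAGXX') {
    my $why = describe_unguessable($seq);
    print defined $why
        ? "[$seq] cannot guess alphabet: $why\n"
        : "[$seq] ok\n";
}
```

Keeping the two cases apart lets the warning text name the actual cause, which is what the reporter found missing.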
From bugzilla-daemon at portal.open-bio.org Fri Feb 1 10:27:12 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 1 Feb 2008 10:27:12 -0500
Subject: [Bioperl-guts-l] [Bug 2439] New: multiple results HTMLResultWriter.pm and non-redundant entries in SearchIO
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2439

           Summary: multiple results HTMLResultWriter.pm and non-redundant
                    entries in SearchIO
           Product: BioPerl
           Version: 1.5 branch
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Bio::Search/Bio::SearchIO
        AssignedTo: bioperl-guts-l at bioperl.org
        ReportedBy: bernd at bio.vu.nl


Hi,

Attached are code and input with 2 BLAST results. I run this as:

perl -w HTMLWriter.pl < blast2.txt > out.html

I noticed the following issues with SearchIO and HTMLWriter:

1) After the first BLAST result an end_report is called, which should not be
the case. Also, the Search Parameters section is empty here:

   Search Parameters
   Parameter  Value

   Search Statistics
   Statistic  Value

BTW: this is also the case for a single FastA result.

2) The links to the alignments on the E-values are not report specific, but
are the hit IDs. This is actually also the case on the NCBI BLAST server and
causes the same problem there: if an ID occurs in several reports, the link
sends us to the first occurrence, which in the attached example is always in
the first report.

3) The (SearchIO) parsing of the hit ID and description could be improved for
hits with concatenated descriptions. For example, the following description
from the original attached BLAST report is changed into a single long
description line.

>pdb|1JTM|A Chain A, Alternative Structures Of A Sequence Extended T4 Lysozyme Show That The Highly Conserved Beta-Sheet Has Weak Intrinsic Folding Propensity
 pdb|1JTN|A Chain A, Alternative Structures Of A Sequence Extended T4 Lysozyme Show That The Highly Conserved Beta-Sheet Region Has Weak Intrinsic Folding Propensity
 pdb|1JTN|B Chain B, Alternative Structures Of A Sequence Extended T4 Lysozyme Show That The Highly Conserved Beta-Sheet Region Has Weak Intrinsic Folding Propensity

However, based on the indentation we can see what the new IDs and descriptions
are. I'd propose storing this info by concatenating the other hits to the
first description with as is done in the NCBI BLAST DB. This way, SearchIO
will contain the first ID as ID and the rest as description, where new hit IDs
are separated by . (This is a SearchIO issue.)

Now in HTMLWriter, we can use these tags to print the hit description as in
the original BLAST: each new ID indented and starting on a new line. This
would make the HTML result from NR databases much more readable. We could
print the IDs in bold or link them.

Kind regards,
Bernd
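The splitting step proposed in point 3 could look roughly like the sketch below. It is an assumption-laden illustration, not SearchIO code: split_merged_hits() is a hypothetical helper, and the pattern for recognising an embedded hit ID (a short database tag followed by "|"-separated fields) is a guess that covers the pdb example in the report but not every NCBI defline style.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::SearchIO: split a merged hit
# description back into [id, description] pairs.
sub split_merged_hits {
    my ($first_id, $desc) = @_;

    # Assumption: secondary IDs look like "db|accession|chain".
    my $id_re = qr/\b(?:pdb|gb|emb|dbj|sp|ref|pir|prf)\|[^|\s]*\|\S*/;

    # Split on whitespace that is directly followed by such an ID.
    my @parts = split /\s+(?=$id_re)/, $desc;

    # The first chunk belongs to the ID that SearchIO already parsed.
    my @hits = ([ $first_id, shift @parts ]);
    for my $part (@parts) {
        my ($id, $text) = $part =~ /^(\S+)\s*(.*)$/s;
        push @hits, [ $id, $text ];
    }
    return @hits;
}

my @hits = split_merged_hits(
    'pdb|1JTM|A',
    'Chain A, T4 Lysozyme pdb|1JTN|A Chain A, T4 Lysozyme '
        . 'pdb|1JTN|B Chain B, T4 Lysozyme',
);

# Print the way the original BLAST report does: secondary IDs indented.
my $first = 1;
for my $hit (@hits) {
    print $first ? '>' : ' ', "$hit->[0] $hit->[1]\n";
    $first = 0;
}
```

An HTML writer could walk the same list of pairs and emit each secondary ID on its own line, in bold or as a link.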
From bugzilla-daemon at portal.open-bio.org Fri Feb 1 10:27:45 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 1 Feb 2008 10:27:45 -0500
Subject: [Bioperl-guts-l] [Bug 2439] multiple results HTMLResultWriter.pm and non-redundant entries in SearchIO
In-Reply-To:
Message-ID: <200802011527.m11FRjn1016690@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2439


------- Comment #1 from bernd at bio.vu.nl  2008-02-01 10:27 EST -------
Created an attachment (id=849)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=849&action=view)
HTMLWriter example script

From bugzilla-daemon at portal.open-bio.org Fri Feb 1 10:28:28 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 1 Feb 2008 10:28:28 -0500
Subject: [Bioperl-guts-l] [Bug 2439] multiple results HTMLResultWriter.pm and non-redundant entries in SearchIO
In-Reply-To:
Message-ID: <200802011528.m11FSSYs016782@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2439


------- Comment #2 from bernd at bio.vu.nl  2008-02-01 10:28 EST -------
Created an attachment (id=850)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=850&action=view)
example NCBI BLAST output, to be used as input for HTMLWriter.pl
From bosborne at dev.open-bio.org Fri Feb 1 12:29:36 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Fri, 1 Feb 2008 12:29:36 -0500 Subject: [Bioperl-guts-l] [14461] bioperl-network/trunk/Bio/Network: Minor edits Message-ID: <200802011729.m11HTacs020901@dev.open-bio.org> Revision: 14461 Author: bosborne Date: 2008-02-01 12:29:35 -0500 (Fri, 01 Feb 2008) Log Message: ----------- Minor edits Modified Paths: -------------- bioperl-network/trunk/Bio/Network/Edge.pm bioperl-network/trunk/Bio/Network/IO/dip_tab.pm bioperl-network/trunk/Bio/Network/IO/psi10.pm bioperl-network/trunk/Bio/Network/IO.pm bioperl-network/trunk/Bio/Network/Interaction.pm bioperl-network/trunk/Bio/Network/ProteinNet.pm Modified: bioperl-network/trunk/Bio/Network/Edge.pm =================================================================== --- bioperl-network/trunk/Bio/Network/Edge.pm 2008-01-31 23:02:03 UTC (rev 14460) +++ bioperl-network/trunk/Bio/Network/Edge.pm 2008-02-01 17:29:35 UTC (rev 14461) @@ -52,8 +52,8 @@ =head1 AUTHORS -Richard Adams richard.adams at ed.ac.uk Brian Osborne bosborne at alum.mit.edu +Richard Adams richard.adams at ed.ac.uk Maintained by Brian Osborne @@ -131,5 +131,4 @@ sub next_node { my $self = shift; - } Modified: bioperl-network/trunk/Bio/Network/IO/dip_tab.pm =================================================================== --- bioperl-network/trunk/Bio/Network/IO/dip_tab.pm 2008-01-31 23:02:03 UTC (rev 14460) +++ bioperl-network/trunk/Bio/Network/IO/dip_tab.pm 2008-02-01 17:29:35 UTC (rev 14461) @@ -80,8 +80,8 @@ =head1 AUTHORS +Brian Osborne bosborne at alum.mit.edu Richard Adams richard.adams at ed.ac.uk -Brian Osborne bosborne at alum.mit.edu =cut Modified: bioperl-network/trunk/Bio/Network/IO/psi10.pm =================================================================== --- bioperl-network/trunk/Bio/Network/IO/psi10.pm 2008-01-31 23:02:03 UTC (rev 14460) +++ bioperl-network/trunk/Bio/Network/IO/psi10.pm 2008-02-01 17:29:35 UTC (rev 14461) @@ 
-228,8 +228,8 @@ =head1 AUTHORS +Brian Osborne bosborne at alum.mit.edu Richard Adams richard.adams at ed.ac.uk -Brian Osborne bosborne at alum.mit.edu =cut Modified: bioperl-network/trunk/Bio/Network/IO.pm =================================================================== --- bioperl-network/trunk/Bio/Network/IO.pm 2008-01-31 23:02:03 UTC (rev 14460) +++ bioperl-network/trunk/Bio/Network/IO.pm 2008-02-01 17:29:35 UTC (rev 14461) @@ -71,8 +71,8 @@ =head1 AUTHORS +Brian Osborne bosborne at alum.mit.edu Richard Adams richard.adams at ed.ac.uk -Brian Osborne bosborne at alum.mit.edu =cut @@ -217,3 +217,5 @@ } 1; + +__END__ Modified: bioperl-network/trunk/Bio/Network/Interaction.pm =================================================================== --- bioperl-network/trunk/Bio/Network/Interaction.pm 2008-01-31 23:02:03 UTC (rev 14460) +++ bioperl-network/trunk/Bio/Network/Interaction.pm 2008-02-01 17:29:35 UTC (rev 14461) @@ -54,8 +54,8 @@ =head1 AUTHORS +Brian Osborne bosborne at alum.mit.edu Richard Adams richard.adams at ed.ac.uk -Brian Osborne bosborne at alum.mit.edu Maintained by Brian Osborne @@ -66,8 +66,6 @@ use Bio::Root::Root; use Bio::AnnotatableI; use Bio::Annotation::Collection; -#use Bio::IdentifiableI; -#use Bio::DescribableI; use vars qw(@ISA); @@ -201,4 +199,6 @@ $id ? 
$self->primary_id($id) : return $self->primary_id; } +1; + __END__ Modified: bioperl-network/trunk/Bio/Network/ProteinNet.pm =================================================================== --- bioperl-network/trunk/Bio/Network/ProteinNet.pm 2008-01-31 23:02:03 UTC (rev 14460) +++ bioperl-network/trunk/Bio/Network/ProteinNet.pm 2008-02-01 17:29:35 UTC (rev 14461) @@ -25,7 +25,6 @@ print $protein->display_id," "; } } - print "\n"; } =head1 Perl Graph module @@ -334,8 +333,8 @@ =head1 AUTHORS -Richard Adams richard.adams at ed.ac.uk Brian Osborne bosborne at alum.mit.edu +Richard Adams richard.adams at ed.ac.uk Maintained by Brian Osborne @@ -1178,5 +1177,4 @@ sub next_node { - } From bosborne at dev.open-bio.org Fri Feb 1 14:11:42 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Fri, 1 Feb 2008 14:11:42 -0500 Subject: [Bioperl-guts-l] [14462] bioperl-network/trunk/Bio/Network/IO/psi10.pm: URL changes Message-ID: <200802011911.m11JBgqN021046@dev.open-bio.org> Revision: 14462 Author: bosborne Date: 2008-02-01 14:11:42 -0500 (Fri, 01 Feb 2008) Log Message: ----------- URL changes Modified Paths: -------------- bioperl-network/trunk/Bio/Network/IO/psi10.pm Modified: bioperl-network/trunk/Bio/Network/IO/psi10.pm =================================================================== --- bioperl-network/trunk/Bio/Network/IO/psi10.pm 2008-02-01 17:29:35 UTC (rev 14461) +++ bioperl-network/trunk/Bio/Network/IO/psi10.pm 2008-02-01 19:11:42 UTC (rev 14462) @@ -29,20 +29,20 @@ The following databases provide their data as PSI MI XML: BIND L -DIP L -HPRD L +DIP L +HPRD L IntAct L -MINT L +MINT L Each of these databases will call PSI format by some different name. For example, PSI MI from DIP comes in files with the suffix "mif". -Documentation for PSI XML can be found at L. +Documentation for PSI XML can be found at L. =head2 Version This module supports a subset of the fields described in PSI MI version 1.0 -(L). The NODE DATA section below +(L). 
The NODE DATA section below describes which fields are currently parsed into ProteinNet networks. =head2 Notes From bugzilla-daemon at portal.open-bio.org Fri Feb 1 17:33:26 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 1 Feb 2008 17:33:26 -0500 Subject: [Bioperl-guts-l] [Bug 2440] New: GenericHSP::seq_inds() reports wrong indices when parsing some reports Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2440 Summary: GenericHSP::seq_inds() reports wrong indices when parsing some reports Product: BioPerl Version: main-trunk Platform: All OS/Version: All Status: NEW Severity: normal Priority: P2 Component: Bio::Search/Bio::SearchIO AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: cjfields at uiuc.edu While working on a recent bug (http://bugzilla.open-bio.org/show_bug.cgi?id=2436) I found a pretty significant bug. It appears that seq_inds() is giving some bad return values tied specifically to reports where either or both query or hit are translated. Attached is an archive file of the script and BLAST reports. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Fri Feb 1 17:36:39 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 1 Feb 2008 17:36:39 -0500
Subject: [Bioperl-guts-l] [Bug 2440] GenericHSP::seq_inds() reports wrong indices when parsing some reports
In-Reply-To:
Message-ID: <200802012236.m11Madtw013946@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2440


------- Comment #1 from cjfields at uiuc.edu  2008-02-01 17:36 EST -------
Created an attachment (id=851)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=851&action=view)
archive file of script and sample BLAST reports

run script as:

perl error.pl

From bugzilla-daemon at portal.open-bio.org Sun Feb 3 12:15:35 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 3 Feb 2008 12:15:35 -0500
Subject: [Bioperl-guts-l] [Bug 2342] blastall crash & StandAloneBlast (originally described by Matthew Laird)
In-Reply-To:
Message-ID: <200802031715.m13HFZNP010635@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2342


------- Comment #2 from harmn at bioinformatics.nl  2008-02-03 12:15 EST -------
We had a similar problem and could trace it to a general problem with system
calls caused by the Perl $SIG{CHLD} variable being set to "IGNORE" somewhere in
the BioPerl code that we used, actually in a get_Seq_by_id call as the code
below demonstrates (the first system call returns 0, the second returns -1):

   use Bio::DB::SwissProt;
   $sp = new Bio::DB::SwissProt;
   print system("echo before"),"\n";
   $sp->get_Seq_by_id('ORYZ_ASPFL');
   print system("echo after"),"\n";

If $SIG{CHLD} is set to "IGNORE", on some platforms all system calls will
return -1. See http://perldoc.perl.org/perlipc.html for details.

We solve it by clearing the $SIG{CHLD} variable after the _get_Seq_by_id call
and before running the blastall method:
   undef($SIG{CHLD});
It probably would be better to solve it in the BioPerl code that sets the
variable.

From bugzilla-daemon at portal.open-bio.org Sun Feb 3 15:15:02 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 3 Feb 2008 15:15:02 -0500
Subject: [Bioperl-guts-l] [Bug 2342] blastall crash & StandAloneBlast (originally described by Matthew Laird)
In-Reply-To:
Message-ID: <200802032015.m13KF2Vm025084@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2342


------- Comment #3 from cjfields at uiuc.edu  2008-02-03 15:15 EST -------
(In reply to comment #2)
> We had a similar problem and could trace it to a general problem with system
> calls caused by the Perl $SIG{CHLD} variable being set to "IGNORE" somewhere in
> the BioPerl code that we used, actually in a get_Seq_by_id call as the code
> below demonstrates (the first system call returns 0, the second returns -1):
>
>    use Bio::DB::SwissProt;
>    $sp = new Bio::DB::SwissProt;
>    print system("echo before"),"\n";
>    $sp->get_Seq_by_id('ORYZ_ASPFL');
>    print system("echo after"),"\n";
>
> If $SIG{CHLD} is set to "IGNORE", on some platforms all system calls will
> return -1. See http://perldoc.perl.org/perlipc.html for details.
>
> We solve it by clearing the $SIG{CHLD} variable after the _get_Seq_by_id call
> and before running the blastall method:
>    undef($SIG{CHLD});
> It probably would be better to solve it in the BioPerl code that sets the
> variable.

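The workaround quoted above can also be scoped to a single call, so that a handler set elsewhere is left untouched afterwards. The following is a minimal sketch, assuming a POSIX-like platform; system_with_default_sigchld() is a hypothetical wrapper, not a BioPerl function, and the 'IGNORE' assignment only simulates what the report attributes to the retrieval code.

```perl
use strict;
use warnings;

# Hypothetical wrapper, not a BioPerl function: run an external command
# with the default SIGCHLD disposition, whatever the caller has set.
sub system_with_default_sigchld {
    my @cmd = @_;

    # local() restores the previous disposition when this sub returns.
    local $SIG{CHLD} = 'DEFAULT';
    return system(@cmd);
}

# Simulate a library that has switched child reaping to 'IGNORE'.
$SIG{CHLD} = 'IGNORE';

my $status = system_with_default_sigchld($^X, '-e', 'print "after\n"');
print "wait status: $status\n";
print "handler afterwards: $SIG{CHLD}\n";
```

Compared with undef($SIG{CHLD}), this keeps the change local to the system call, which matters if other code relies on the 'IGNORE' setting.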
The only Bioperl module where $SIG{CHLD} is set is in Bio::DB::WebDBSeqI (the backend for many retrieval services) and is for piping returned sequences to a sequence stream; it is set as the default behavior. However, though this is worth investigating further as a separate issue and makes sense in your case, I can't see it being related to this bug report as the example code above uses Bio::Seq directly to construct a sequence object for BLAST'ing. No Bio::DB-related modules were apparently loaded. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bosborne at dev.open-bio.org Mon Feb 4 00:10:47 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 00:10:47 -0500 Subject: [Bioperl-guts-l] [14464] bioperl-network/trunk/t/IO_psi25.t: Update, all tests pass Message-ID: <200802040510.m145AlBB026763@dev.open-bio.org> Revision: 14464 Author: bosborne Date: 2008-02-04 00:10:47 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Update, all tests pass Modified Paths: -------------- bioperl-network/trunk/t/IO_psi25.t Modified: bioperl-network/trunk/t/IO_psi25.t =================================================================== --- bioperl-network/trunk/t/IO_psi25.t 2008-02-04 05:06:38 UTC (rev 14463) +++ bioperl-network/trunk/t/IO_psi25.t 2008-02-04 05:10:47 UTC (rev 14464) @@ -1,6 +1,6 @@ # This is -*-Perl-*- code# # Bioperl Test Harness Script for Modules# -# $Id: protgraph.t,v 1.1 2004/03/13 23:45:32 radams Exp +# $Id$ use vars qw($NUMTESTS $DEBUG $ERROR); use strict; @@ -46,7 +46,7 @@ ok 1; # -# PSI XML from HPRD +# PSI XML from IntAct # ok my $io = Bio::Network::IO->new (-format => 'psi25', @@ -55,6 +55,8 @@ __END__ +ok $g1->edge_count, 3; +ok $g1->node_count, 4; # # PSI XML from DIP From bosborne at dev.open-bio.org Mon Feb 4 00:13:33 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 
4 Feb 2008 00:13:33 -0500 Subject: [Bioperl-guts-l] [14465] bioperl-network/trunk/Bio/Network/IO/psi25.pm: Add, all tests pass Message-ID: <200802040513.m145DXRX026788@dev.open-bio.org> Revision: 14465 Author: bosborne Date: 2008-02-04 00:13:33 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Add, all tests pass Added Paths: ----------- bioperl-network/trunk/Bio/Network/IO/psi25.pm Added: bioperl-network/trunk/Bio/Network/IO/psi25.pm =================================================================== --- bioperl-network/trunk/Bio/Network/IO/psi25.pm (rev 0) +++ bioperl-network/trunk/Bio/Network/IO/psi25.pm 2008-02-04 05:13:33 UTC (rev 14465) @@ -0,0 +1,436 @@ +# $Id: psi10.pm 14461 2008-02-01 17:29:35Z bosborne $ +# +# BioPerl module for Bio::Network::IO::psi25 +# +# You may distribute this module under the same terms as perl itself +# POD documentation - main docs before the code + +=head1 NAME + +Bio::Network::IO::psi25 + +=head1 SYNOPSIS + +Do not use this module directly, use Bio::Network::IO: + + my $io = Bio::Network::IO->new(-format => 'psi25', + -file => 'data.xml'); + + my $network = $io->next_network; + +=head1 DESCRIPTION + +PSI MI (Protein Standards Initiative Molecular Interaction) XML is a format +to describe protein-protein interactions and interaction networks. +This module parses version 2.5 of PSI MI. + +=head2 Databases + +The following databases provide their data as PSI MI XML: + +BIND L +DIP L +HPRD L +IntAct L +MINT L + +Each of these databases will call PSI format by some different name. For +example, PSI MI from DIP comes in files with the suffix "mif". + +Documentation for PSI XML can be found at L. + +=head2 Version + +This module supports a subset of the fields described in PSI MI version 2.5. +(L). The NODE DATA section below +describes which fields are currently parsed into ProteinNet networks. + +=head2 Notes + +See the Bio::Network::IO::psi_xml page in the Bioperl Wiki +(L) +for notes on PSI XML from various databases. 
+ +When using this parser recall that some PSI MI fields, or classes, +are populated by values taken from an ontology created for the PSI MI +format. This ontology is an OBO ontology and can be browsed at +L. + +=head1 METHODS + +The naming system is analagous to the SeqIO system, although usually +next_network() will be called only once per file. + +=head1 DATA IN THE NODE + +The Node (protein or protein complex) is roughly equivalent to the PSI MI +B (entrySet/entry/interactorList/interactor). The following are +subclasses of B whose values are accessible through the Node +object. + +=head2 interactor/names/shortLabel + +Annotation::SimpleValue + +=head2 interactor/names/fullName + +Annotation::SimpleValue + +=head2 interactor/xref/primaryRef + +Annotation::DBLink + +=head2 interactor/xref/secondaryRef + +Annotation::DBLink + +Bio::Species object + +=head2 interactor/organism/names/alias + +Bio::Species object + +=head2 interactor/organism/names/fullName + +Bio::Species object + +=head2 interactor/organism/names/shortLabel + +Bio::Species object + +=head1 DATA NOT YET AVAILABLE + +The following are subclasses of B whose values are currently not +accessible through the Node object. + +=head2 interactor/names/alias + +Annotation::SimpleValue + +=head2 interactor/sequence + +=head2 interactor/interactorType/names + +Controlled vocabulary maintained by PSI MI +L. +Example: "protein". + +OntologyTerm + +=head2 interactor/interactorType/xref + +Annotation::DBLink + +=head2 interactor/organism/cellType + +Annotation::OntologyTerm + +=head2 interactor/organism/compartment + +Annotation::OntologyTerm + +=head2 interactor/organism/tissue + +Annotation::OntologyTerm + + +=head1 INTERACTION DATA + +The Interaction object is roughly equivalent to the PSI MI B +(entrySet/entry/interactionList/interaction) and B +(entrySet/entry/experimentList/experimentDescription). The following are +subclasses of B and B whose values are +NOT yet accessible through the Interaction object. 
+ +=head2 interaction/xref/primaryRef + +Annotation::DBLink + +=head2 interaction/xref/secondaryRef + +Annotation::DBLink + +=head2 interaction/organism/names/shortLabel + +Bio::Species object + +=head2 interaction/organism/names/alias + +Bio::Species object + +=head2 interaction/organism/names/fullName + +Bio::Species object + +=head2 interaction/modelled + +Annotation::SimpleValue + +=head2 interaction/intraMolecular + +Annotation::SimpleValue + +=head2 interaction/negative + +Annotation::SimpleValue + +=head2 interaction/interactionType + +Controlled vocabulary maintained by PSI MI +L. +Example: "phosphorylation reaction". + +OntologyTerm + +=head2 interaction/confidenceList + +Annotation::SimpleValue + +=head2 experimentDescription/confidenceList + +Annotation::SimpleValue + +=head2 experimentDescription/interactionDetectionMethod + +Controlled vocabulary maintained by PSI MI +L. +Example: "two hybrid array". + +Annotation::OntologyTerm + +=head2 featureElementType/featureType + +Controlled vocabulary maintained by PSI MI +L. +The featureType includes data on post-translational modification. +Example: "phospho-histidine". + +Annotation::OntologyTerm + +=head1 FEEDBACK + +=head2 Mailing Lists + +User feedback is an integral part of the evolution of this and other +Bioperl modules. Send your comments and suggestions preferably to one +of the Bioperl mailing lists. Your participation is much appreciated. + + bioperl-l at bioperl.org - General discussion + http://bioperl.org/wiki/Mailing_lists - About the mailing lists + +=head2 Reporting Bugs + +Report bugs to the Bioperl bug tracking system to help us keep track +the bugs and their resolution. 
Bug reports can be submitted via the +web: + + http://bugzilla.open-bio.org/ + +=head1 AUTHORS + +Brian Osborne bosborne at alum.mit.edu +Richard Adams richard.adams at ed.ac.uk + +=cut + +package Bio::Network::IO::psi25; +use strict; +use XML::Twig; +use Bio::Root::Root; +use Bio::Seq::SeqFactory; +use Bio::Network::ProteinNet; +use Bio::Network::Interaction; +use Bio::Network::IO; +use Bio::Network::Node; +use Bio::Species; +use Bio::Annotation::DBLink; +use Bio::Annotation::OntologyTerm; +use Bio::Annotation::Collection; +use Bio::Annotation::Comment; +use Bio::Annotation::Reference; +use Bio::Annotation::SimpleValue; +# use Bio::Network::IO::psi::intact; + +use vars qw( @ISA %species $net $fac ); + at ISA = qw(Bio::Network::IO Bio::Root::Root ); + +BEGIN { + $fac = Bio::Seq::SeqFactory->new(-type => 'Bio::Seq::RichSeq'); +} + +=head2 next_network + + Name : next_network + Purpose : Constructs a protein interaction graph from PSI XML data + Usage : my $net = $io->next_network() + Arguments : + Returns : A Bio::Network::ProteinNet object + +=cut + +sub next_network { + my $self = shift; + $net = Bio::Network::ProteinNet->new(refvertexed => 1); + + # the tag in the handler is an XML field, the value is + # the function called when that field is encountered + my $t = XML::Twig->new(TwigHandlers => { + interactor => \&_addInteractor, + interaction => \&_addInteraction + }); + $t->parsefile($self->file); + $net; +} + +=head2 _addInteractor + + Name : _addInteractor + Purpose : Parses protein information into Bio::Seq::RichSeq objects + Returns : + Usage : Internally called by next_network() + Arguments : None + Notes : Interactors without organism data get their Bio::Species + fields set to -1 +=cut + +sub _addInteractor { + my ($twig, $pi) = @_; + + my ($prot, $acc, $sp, $desc, $sp_obj, $taxid, $common, $full); + my $nullVal = "-1"; + + my $org = $pi->first_child('organism'); + + eval { $taxid = $org->att('ncbiTaxId'); }; + if ( $@ ){ + print "No organism for 
interactor " . + $pi->first_child('names')->first_child('fullName')->text . "\n"; + $common = $full = $taxid = $nullVal; + } elsif ( !exists($species{$taxid}) ) { + # Make new species object if doesn't already exist + $common = $org->first_child('names')->first_child('shortLabel')->text; + + # some PSI MI files have entries with species lacking "fullName" + eval { + $full = $org->first_child('names')->first_child('fullName')->text; + }; + $full = $common if $@; + + eval { + $sp_obj = Bio::Species->new(-ncbi_taxid => $taxid, + -name => $full, + -common_name => $common + ); }; + $species{$taxid} = $sp_obj; + } + + # Extract sequence identifiers + my @ids = $pi->first_child('xref')->children(); + my %ids = map {$_->att('db'), $_->att('id')} @ids; + $ids{'psixml'} = $pi->att('id'); + + my $prim_id = defined ($ids{'GI'}) ? $ids{'GI'} : ''; + # needs to be done by reference to an actual ontology: + $acc = $ids{'RefSeq'} || + $ids{'SWP'} || # DIP's name for Swissprot + $ids{'Swiss-Prot'} || # db name from HPRD + $ids{'Ref-Seq'} || # db name from HPRD + $ids{'uniprotkb'} || # db name from MINT + $ids{'GI'} || + $ids{'PIR'} || + $ids{'intact'} || # db name from IntAct + $ids{'psi-mi'} || # db name from IntAct + $ids{'DIP'} || # DIP node name + $ids{'ensembl'} || # db name from MINT + $ids{'flybase'} || # db name from MINT + $ids{'wormbase'} || # db name from MINT + $ids{'sgd'} || # db name from MINT + $ids{'ddbj/embl/genbank'} || # db name from MINT + $ids{'mint'}; # db name from MINT + + # Get description line - certain files, like PSI XML from HPRD, + # have "shortLabel" but no "fullName" + eval { + $desc = $pi->first_child('names')->first_child('fullName')->text; + }; + if ($@) { + print "No fullName for interactor " . + $pi->first_child('names')->first_child('shortLabel')->text . 
"\n"; + $desc = $pi->first_child('names')->first_child('shortLabel')->text; + } + + # Use ids other than accession_no or primary_id for DBLink annotations + my $ac = Bio::Annotation::Collection->new(); + for my $db (keys %ids) { + next if $ids{$db} eq $acc; + next if $ids{$db} eq $prim_id; + my $an = Bio::Annotation::DBLink->new( -database => $db, + -primary_id => $ids{$db}, + ); + $ac->add_Annotation('dblink',$an); + } + + # Make sequence object + eval { + $prot = $fac->create( + -accession_number => $acc, + -desc => $desc, + -display_id => $acc, + -primary_id => $prim_id, + -species => $species{$taxid}, + -annotation => $ac); + }; + + # Add node to network + my $node = Bio::Network::Node->new(-protein => [($prot)]); + $net->add_node($node); + + # Add primary identifier and accession to internal id<->node mapping hash + $net->add_id_to_node($ids{'psixml'},$node); + $net->add_id_to_node($prot->primary_id,$node); + $net->add_id_to_node($prot->accession_number,$node); + + # Add secondary identifiers to internal id <-> node mapping hash + $ac = $prot->annotation(); + for my $an ($ac->get_Annotations('dblink')) { + $net->add_id_to_node($an->primary_id,$node); + } + + $twig->purge(); +} + +=head2 _addInteraction + + Name : _addInteraction + Purpose : Adds a new Interaction to a graph + Usage : Do not call, called internally by next_network() + Returns : + Notes : All interactions are made of 2 nodes + +=cut + +sub _addInteraction { + my ($twig, $i) = @_; + + my @ints = $i->first_child('participantList')->children; + + # 2 nodes are required + if ( scalar @ints == 2 ) { + my @nodeids = map {$_->first_child('interactorRef')->text} @ints; + my $interx_id = $i->first_child('xref')->first_child('primaryRef')->att('id'); + + my $node1 = $net->get_nodes_by_id($nodeids[0]); + my $node2 = $net->get_nodes_by_id($nodeids[1]); + + my $interx = Bio::Network::Interaction->new(-id => $interx_id); + $net->add_interaction(-nodes => [($node1,$node2)], + -interaction => $interx ); + 
$net->add_id_to_interaction($interx_id,$interx); + + $twig->purge(); + } +} + +1; + +__END__ From bosborne at dev.open-bio.org Mon Feb 4 00:15:58 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 00:15:58 -0500 Subject: [Bioperl-guts-l] [14466] bioperl-network/trunk/t/Node.t: Fix header Message-ID: <200802040515.m145Fwqc026813@dev.open-bio.org> Revision: 14466 Author: bosborne Date: 2008-02-04 00:15:58 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Fix header Modified Paths: -------------- bioperl-network/trunk/t/Node.t Modified: bioperl-network/trunk/t/Node.t =================================================================== --- bioperl-network/trunk/t/Node.t 2008-02-04 05:13:33 UTC (rev 14465) +++ bioperl-network/trunk/t/Node.t 2008-02-04 05:15:58 UTC (rev 14466) @@ -1,6 +1,6 @@ # This is -*-Perl-*- code# # Bioperl Test Harness Script for Modules -# $Id: protgraph.t,v 1.1 2004/03/13 23:45:32 radams Exp +# $Id$ use vars qw($NUMTESTS $DEBUG $ERROR); use strict; From bosborne at dev.open-bio.org Mon Feb 4 00:20:11 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 00:20:11 -0500 Subject: [Bioperl-guts-l] [14467] bioperl-network/trunk/t: Fix headers Message-ID: <200802040520.m145KB6p026838@dev.open-bio.org> Revision: 14467 Author: bosborne Date: 2008-02-04 00:20:11 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Fix headers Modified Paths: -------------- bioperl-network/trunk/t/Edge.t bioperl-network/trunk/t/Graph-MD5.t bioperl-network/trunk/t/Graph-Seq.t bioperl-network/trunk/t/IO_dip_tab.t bioperl-network/trunk/t/IO_psi10.t bioperl-network/trunk/t/Interaction.t bioperl-network/trunk/t/ProteinNet.t Modified: bioperl-network/trunk/t/Edge.t =================================================================== --- bioperl-network/trunk/t/Edge.t 2008-02-04 05:15:58 UTC (rev 14466) +++ bioperl-network/trunk/t/Edge.t 2008-02-04 05:20:11 UTC (rev 14467) @@ -1,6 +1,6 @@ # This is -*-Perl-*- code# # Bioperl Test 
Harness Script for Modules -# $Id: protgraph.t,v 1.1 2004/03/13 23:45:32 radams Exp +# $Id$ use vars qw($NUMTESTS $DEBUG $ERROR); use strict; Modified: bioperl-network/trunk/t/Graph-MD5.t =================================================================== --- bioperl-network/trunk/t/Graph-MD5.t 2008-02-04 05:15:58 UTC (rev 14466) +++ bioperl-network/trunk/t/Graph-MD5.t 2008-02-04 05:20:11 UTC (rev 14467) @@ -1,6 +1,6 @@ # This is -*-Perl-*- code# # Bioperl Test Harness Script for Modules# -# $Id: protgraph.t,v 1.1 2004/03/13 23:45:32 radams Exp +# $Id$ use vars qw($NUMTESTS $DEBUG $ERROR); use strict; Modified: bioperl-network/trunk/t/Graph-Seq.t =================================================================== --- bioperl-network/trunk/t/Graph-Seq.t 2008-02-04 05:15:58 UTC (rev 14466) +++ bioperl-network/trunk/t/Graph-Seq.t 2008-02-04 05:20:11 UTC (rev 14467) @@ -1,6 +1,6 @@ # This is -*-Perl-*- code# # Bioperl Test Harness Script for Modules# -# $Id: protgraph.t,v 1.1 2004/03/13 23:45:32 radams Exp +# $Id$ use vars qw($NUMTESTS $DEBUG $ERROR); use strict; Modified: bioperl-network/trunk/t/IO_dip_tab.t =================================================================== --- bioperl-network/trunk/t/IO_dip_tab.t 2008-02-04 05:15:58 UTC (rev 14466) +++ bioperl-network/trunk/t/IO_dip_tab.t 2008-02-04 05:20:11 UTC (rev 14467) @@ -1,6 +1,6 @@ # This is -*-Perl-*- code# # Bioperl Test Harness Script for Modules -# $Id: protgraph.t,v 1.1 2004/03/13 23:45:32 radams Exp +# $Id$ use vars qw($NUMTESTS $DEBUG $ERROR); use strict; Modified: bioperl-network/trunk/t/IO_psi10.t =================================================================== --- bioperl-network/trunk/t/IO_psi10.t 2008-02-04 05:15:58 UTC (rev 14466) +++ bioperl-network/trunk/t/IO_psi10.t 2008-02-04 05:20:11 UTC (rev 14467) @@ -1,6 +1,6 @@ # This is -*-Perl-*- code# # Bioperl Test Harness Script for Modules# -# $Id: protgraph.t,v 1.1 2004/03/13 23:45:32 radams Exp +# $Id$ use vars qw($NUMTESTS $DEBUG $ERROR); 
use strict; Modified: bioperl-network/trunk/t/Interaction.t =================================================================== --- bioperl-network/trunk/t/Interaction.t 2008-02-04 05:15:58 UTC (rev 14466) +++ bioperl-network/trunk/t/Interaction.t 2008-02-04 05:20:11 UTC (rev 14467) @@ -1,6 +1,6 @@ # This is -*-Perl-*- code# # Bioperl Test Harness Script for Modules -# $Id: protgraph.t,v 1.1 2004/03/13 23:45:32 radams Exp +# $Id$ use vars qw($NUMTESTS $DEBUG $ERROR); use strict; Modified: bioperl-network/trunk/t/ProteinNet.t =================================================================== --- bioperl-network/trunk/t/ProteinNet.t 2008-02-04 05:15:58 UTC (rev 14466) +++ bioperl-network/trunk/t/ProteinNet.t 2008-02-04 05:20:11 UTC (rev 14467) @@ -1,6 +1,6 @@ # This is -*-Perl-*- code# # Bioperl Test Harness Script for Modules -# $Id: protgraph.t,v 1.1 2004/03/13 23:45:32 radams Exp +# $Id$ use vars qw($NUMTESTS $DEBUG $ERROR); use strict; From bosborne at dev.open-bio.org Mon Feb 4 00:22:00 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 00:22:00 -0500 Subject: [Bioperl-guts-l] [14468] bioperl-network/trunk/t/Graph-Articulation.x: Fix headers Message-ID: <200802040522.m145M0uO026863@dev.open-bio.org> Revision: 14468 Author: bosborne Date: 2008-02-04 00:22:00 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Fix headers Modified Paths: -------------- bioperl-network/trunk/t/Graph-Articulation.x Modified: bioperl-network/trunk/t/Graph-Articulation.x =================================================================== --- bioperl-network/trunk/t/Graph-Articulation.x 2008-02-04 05:20:11 UTC (rev 14467) +++ bioperl-network/trunk/t/Graph-Articulation.x 2008-02-04 05:22:00 UTC (rev 14468) @@ -1,6 +1,6 @@ # This is -*-Perl-*- code# # Bioperl Test Harness Script for Modules -# $Id: protgraph.t,v 1.1 2004/03/13 23:45:32 radams Exp +# $Id: Node.t 14466 2008-02-04 05:15:58Z bosborne $ use vars qw($NUMTESTS $DEBUG $ERROR); use strict; From bosborne 
at dev.open-bio.org Mon Feb 4 00:33:29 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 00:33:29 -0500 Subject: [Bioperl-guts-l] [14469] bioperl-network/trunk/AUTHORS: Update Message-ID: <200802040533.m145XT9n026916@dev.open-bio.org> Revision: 14469 Author: bosborne Date: 2008-02-04 00:33:29 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Update Modified Paths: -------------- bioperl-network/trunk/AUTHORS Modified: bioperl-network/trunk/AUTHORS =================================================================== --- bioperl-network/trunk/AUTHORS 2008-02-04 05:22:00 UTC (rev 14468) +++ bioperl-network/trunk/AUTHORS 2008-02-04 05:33:29 UTC (rev 14469) @@ -1,11 +1,14 @@ $Id: AUTHORS,v 1.3 2006-11-08 10:59:13 sendu Exp $ +Maintained by Brian Osborne + bioperl-network Authors +* Brian Osborne + * Richard Adams * Sendu Bala * Nat Goodman -* Brian Osborne From bosborne at dev.open-bio.org Mon Feb 4 00:38:11 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 00:38:11 -0500 Subject: [Bioperl-guts-l] [14470] bioperl-network/trunk/README: Both versions of PSI MI now Message-ID: <200802040538.m145cB3a026943@dev.open-bio.org> Revision: 14470 Author: bosborne Date: 2008-02-04 00:38:11 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Both versions of PSI MI now Modified Paths: -------------- bioperl-network/trunk/README Modified: bioperl-network/trunk/README =================================================================== --- bioperl-network/trunk/README 2008-02-04 05:33:29 UTC (rev 14469) +++ bioperl-network/trunk/README 2008-02-04 05:38:11 UTC (rev 14470) @@ -35,13 +35,14 @@ An interaction can be thought of as one experiment or one experimental observation. -The formats that can be parsed are DIP (tab-delimited) and PSI MI (XML). 
-Capabilities include the ability to merge networks, select nodes and -interactions by identifier, add and delete components (nodes, -interactions, and edges), count all components of a certain type, get -all components of a certain type, and get subgraphs. Then you have all -the functionality of Perl's Graph in addition such as traversal using -different algorithms, getting interior and exterior nodes, and getting all +The formats that can be parsed are DIP (tab-delimited) and PSI MI +(XML), either version 1 or version 2.5. Capabilities include the +ability to merge networks, select nodes and interactions by +identifier, add and delete components (nodes, interactions, and +edges), count all components of a certain type, get all components of +a certain type, and get subgraphs. Then you have all the functionality +of Perl's Graph in addition such as traversal using different +algorithms, getting interior and exterior nodes, and getting all connected subgraphs. Graph is quite rich in functionality, this list is only a small subset of available methods, see the documentation for Graph for more detail (http://search.cpan.org/~jhi/Graph/lib/Graph.pod). 
From bosborne at dev.open-bio.org Mon Feb 4 10:04:04 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 10:04:04 -0500 Subject: [Bioperl-guts-l] [14471] bioperl-network/trunk/Bio/Network: use base pragma Message-ID: <200802041504.m14F44lq028981@dev.open-bio.org> Revision: 14471 Author: bosborne Date: 2008-02-04 10:04:03 -0500 (Mon, 04 Feb 2008) Log Message: ----------- use base pragma Modified Paths: -------------- bioperl-network/trunk/Bio/Network/Edge.pm bioperl-network/trunk/Bio/Network/IO.pm bioperl-network/trunk/Bio/Network/Interaction.pm bioperl-network/trunk/Bio/Network/Node.pm bioperl-network/trunk/Bio/Network/ProteinNet.pm Modified: bioperl-network/trunk/Bio/Network/Edge.pm =================================================================== --- bioperl-network/trunk/Bio/Network/Edge.pm 2008-02-04 05:38:11 UTC (rev 14470) +++ bioperl-network/trunk/Bio/Network/Edge.pm 2008-02-04 15:04:03 UTC (rev 14471) @@ -61,9 +61,7 @@ use strict; package Bio::Network::Edge; -use Bio::Root::Root; -use vars qw(@ISA); -@ISA = qw(Bio::Root::Root); +use base 'Bio::Root::Root'; =head2 new Modified: bioperl-network/trunk/Bio/Network/IO.pm =================================================================== --- bioperl-network/trunk/Bio/Network/IO.pm 2008-02-04 05:38:11 UTC (rev 14470) +++ bioperl-network/trunk/Bio/Network/IO.pm 2008-02-04 15:04:03 UTC (rev 14471) @@ -78,11 +78,9 @@ package Bio::Network::IO; use strict; -use vars qw(@ISA %DBNAMES); -use Bio::Root::IO; +use base 'Bio::Root::IO'; +use vars qw(%DBNAMES); -@ISA = qw(Bio::Root::IO); - # these values are used to standardize database names %DBNAMES = ( DIP => "DIP", # found in DIP files Modified: bioperl-network/trunk/Bio/Network/Interaction.pm =================================================================== --- bioperl-network/trunk/Bio/Network/Interaction.pm 2008-02-04 05:38:11 UTC (rev 14470) +++ bioperl-network/trunk/Bio/Network/Interaction.pm 2008-02-04 15:04:03 UTC (rev 14471) @@
-63,14 +63,9 @@ use strict; package Bio::Network::Interaction; -use Bio::Root::Root; -use Bio::AnnotatableI; -use Bio::Annotation::Collection; +use base qw(Bio::Root::Root Bio::AnnotatableI Bio::Annotation::Collection); -use vars qw(@ISA); -@ISA = qw( Bio::Root::Root Bio::AnnotatableI); - =head2 new Name : new Modified: bioperl-network/trunk/Bio/Network/Node.pm =================================================================== --- bioperl-network/trunk/Bio/Network/Node.pm 2008-02-04 05:38:11 UTC (rev 14470) +++ bioperl-network/trunk/Bio/Network/Node.pm 2008-02-04 15:04:03 UTC (rev 14471) @@ -60,9 +60,7 @@ use strict; package Bio::Network::Node; -use Bio::Root::Root; -use vars qw(@ISA); -@ISA = qw(Bio::Root::Root); +use base 'Bio::Root::Root'; =head2 new Modified: bioperl-network/trunk/Bio/Network/ProteinNet.pm =================================================================== --- bioperl-network/trunk/Bio/Network/ProteinNet.pm 2008-02-04 05:38:11 UTC (rev 14470) +++ bioperl-network/trunk/Bio/Network/ProteinNet.pm 2008-02-04 15:04:03 UTC (rev 14471) @@ -345,13 +345,10 @@ package Bio::Network::ProteinNet; use strict; -use Bio::Root::Root; use Graph 0.80; use Bio::Network::Interaction; use Bio::Root::Root; - use vars qw($GRAPH_ARRAY_INDEX @ISA); - @ISA = qw( Graph::Undirected Bio::Root::Root ); # A Graph object is an array reference, therefore we need @@ -913,7 +910,7 @@ Returns : An array or a count of the array of nodes that will fragment the graph if deleted. Notes : This method is currently broken due to bugs in Graph v.
.69 - + and later =cut sub articulation_points { From bosborne at dev.open-bio.org Mon Feb 4 12:33:10 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 12:33:10 -0500 Subject: [Bioperl-guts-l] [14472] bioperl-network/trunk/t/IO_psi25.t: Add node and interaction counts Message-ID: <200802041733.m14HXAY6029766@dev.open-bio.org> Revision: 14472 Author: bosborne Date: 2008-02-04 12:33:09 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Add node and interaction counts Modified Paths: -------------- bioperl-network/trunk/t/IO_psi25.t Modified: bioperl-network/trunk/t/IO_psi25.t =================================================================== --- bioperl-network/trunk/t/IO_psi25.t 2008-02-04 15:04:03 UTC (rev 14471) +++ bioperl-network/trunk/t/IO_psi25.t 2008-02-04 17:33:09 UTC (rev 14472) @@ -16,7 +16,7 @@ use lib 't'; } use Test; - $NUMTESTS = 3; + $NUMTESTS = 5; plan tests => $NUMTESTS; eval { require Graph; }; if ( $@ ) { @@ -52,62 +52,9 @@ (-format => 'psi25', -file => Bio::Root::IO->catfile("t", "data", "human_small-01.xml")); ok my $g1 = $io->next_network(); +ok $g1->node_count, 646; +# remember that interactions are only formed of pairs of nodes +ok $g1->interactions, 439; -__END__ -ok $g1->edge_count, 3; -ok $g1->node_count, 4; - -# -# PSI XML from DIP -# -ok my $io = Bio::Network::IO->new - (-format => 'psi10', - -file => Bio::Root::IO->catfile("t", "data", "psi_xml.dat")); -ok my $g1 = $io->next_network(); -ok $g1->edge_count, 3; -ok $g1->node_count, 4; -ok $g1->is_connected,1; -my $n = $g1->get_nodes_by_id('O24853'); -my @proteins = $n->proteins; -ok $proteins[0]->species->binomial('FULL'),"Helicobacter pylori 26695"; -ok $proteins[0]->primary_seq->desc,"hypothetical HP0001"; -my @rts = $g1->articulation_points; -ok scalar @rts,1; # correct, by inspection in Cytoscape -@proteins = $rts[0]->proteins; -my $seq = $proteins[0]; -ok $seq->desc,"hypothetical HP0001"; # correct, by inspection in Cytoscape - -# -# PSI XML from IntAct
-# -ok $io = Bio::Network::IO->new - (-format => 'psi10', - -file => Bio::Root::IO->catfile("t", "data", "sv40_small.xml")); -ok $g1 = $io->next_network(); -ok $g1->edge_count, 3; -ok $g1->node_count, 5; -ok $g1->is_connected, ""; - -$n = $g1->get_nodes_by_id("P03070"); -@proteins = $n->proteins; -ok $proteins[0]->species->binomial('FULL'),"Simian virus 40"; -ok $proteins[0]->primary_seq->desc,"Large T antigen"; - -my @components = $g1->connected_components; -ok scalar @components, 2; - -# seems there's an intermittent bug in articulation_points() here -# but not in the invocation above -# @rts = $g1->articulation_points; -# ok scalar @rts, 1; # OK, inspected in Cytoscape -# @proteins = $rts[0]->proteins; -# $seq = $proteins[0]; -# ok $seq->desc,"Erythropoietin receptor precursor"; # OK, inspected in Cytoscape - -# -# GO terms -# -$n = $g1->get_nodes_by_id("EBI-474016"); -@proteins = $n->proteins; - +__END__ From bosborne at dev.open-bio.org Mon Feb 4 13:46:34 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 13:46:34 -0500 Subject: [Bioperl-guts-l] [14473] bioperl-network/trunk/t/ProteinNet.t: Sweep Message-ID: <200802041846.m14IkYCe029838@dev.open-bio.org> Revision: 14473 Author: bosborne Date: 2008-02-04 13:46:33 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Sweep Modified Paths: -------------- bioperl-network/trunk/t/ProteinNet.t Modified: bioperl-network/trunk/t/ProteinNet.t =================================================================== --- bioperl-network/trunk/t/ProteinNet.t 2008-02-04 17:33:09 UTC (rev 14472) +++ bioperl-network/trunk/t/ProteinNet.t 2008-02-04 18:46:33 UTC (rev 14473) @@ -167,7 +167,7 @@ # test subgraph # $io = Bio::Network::IO->new -(-format => 'psi', +(-format => 'psi10', -file => Bio::Root::IO->catfile("t","data","bovin_small_intact.xml")); my $g = $io->next_network(); ok $g->edges, 15; @@ -251,7 +251,7 @@ # test that removing a node removes its edges correctly # ok $io = Bio::Network::IO->new -
(-format => 'psi', + (-format => 'psi10', -file => Bio::Root::IO->catfile("t", "data", "sv40_small.xml")); ok $g1 = $io->next_network(); ok $g1->edge_count, 3; From bosborne at dev.open-bio.org Mon Feb 4 13:47:04 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 13:47:04 -0500 Subject: [Bioperl-guts-l] [14474] bioperl-network/trunk/Bio/Network: Add get and set methods, add verbosity Message-ID: <200802041847.m14Il4UZ029863@dev.open-bio.org> Revision: 14474 Author: bosborne Date: 2008-02-04 13:47:04 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Add get and set methods, add verbosity Modified Paths: -------------- bioperl-network/trunk/Bio/Network/IO/dip_tab.pm bioperl-network/trunk/Bio/Network/IO/psi25.pm bioperl-network/trunk/Bio/Network/IO.pm bioperl-network/trunk/Bio/Network/ProteinNet.pm Modified: bioperl-network/trunk/Bio/Network/IO/dip_tab.pm =================================================================== --- bioperl-network/trunk/Bio/Network/IO/dip_tab.pm 2008-02-04 18:46:33 UTC (rev 14473) +++ bioperl-network/trunk/Bio/Network/IO/dip_tab.pm 2008-02-04 18:47:04 UTC (rev 14474) @@ -131,8 +131,8 @@ # ($node_id1,$node_id2) = $self->_fix_id("DIP",$node_id1,$node_id2); ## skip if score is below threshold - if (defined($self->{'_th'}) && defined($score)) { - next unless $score >= $self->{'_th'}; + if ($self->threshold && defined($score)) { + next unless $score >= $self->threshold; } ## build node object if it's a new node, use DIP id Modified: bioperl-network/trunk/Bio/Network/IO/psi25.pm =================================================================== --- bioperl-network/trunk/Bio/Network/IO/psi25.pm 2008-02-04 18:46:33 UTC (rev 14473) +++ bioperl-network/trunk/Bio/Network/IO/psi25.pm 2008-02-04 18:47:04 UTC (rev 14474) @@ -248,10 +248,10 @@ use Bio::Annotation::Collection; use Bio::Annotation::Comment; use Bio::Annotation::Reference; -use Bio::Annotation::SimpleValue; +# use Bio::Annotation::SimpleValue; # use 
Bio::Network::IO::psi::intact; -use vars qw( @ISA %species $net $fac ); +use vars qw( @ISA %species $net $fac $verbose ); @ISA = qw(Bio::Network::IO Bio::Root::Root ); BEGIN { @@ -271,7 +271,7 @@ sub next_network { my $self = shift; $net = Bio::Network::ProteinNet->new(refvertexed => 1); - + $verbose = $self->verbose; # the tag in the handler is an XML field, the value is # the function called when that field is encountered my $t = XML::Twig->new(TwigHandlers => { @@ -302,9 +302,9 @@ my $org = $pi->first_child('organism'); eval { $taxid = $org->att('ncbiTaxId'); }; - if ( $@ ){ + if ($@) { print "No organism for interactor " . - $pi->first_child('names')->first_child('fullName')->text . "\n"; + $pi->first_child('names')->first_child('fullName')->text . "\n" if $verbose; $common = $full = $taxid = $nullVal; } elsif ( !exists($species{$taxid}) ) { # Make new species object if doesn't already exist @@ -355,10 +355,10 @@ }; if ($@) { print "No fullName for interactor " . - $pi->first_child('names')->first_child('shortLabel')->text . "\n"; + $pi->first_child('names')->first_child('shortLabel')->text . 
"\n" if $verbose; $desc = $pi->first_child('names')->first_child('shortLabel')->text; } - + # Use ids other than accession_no or primary_id for DBLink annotations my $ac = Bio::Annotation::Collection->new(); for my $db (keys %ids) { @@ -385,7 +385,7 @@ my $node = Bio::Network::Node->new(-protein => [($prot)]); $net->add_node($node); - # Add primary identifier and accession to internal id<->node mapping hash + # Add primary identifier and acc to internal id <-> node mapping hash $net->add_id_to_node($ids{'psixml'},$node); $net->add_id_to_node($prot->primary_id,$node); $net->add_id_to_node($prot->accession_number,$node); @@ -405,14 +405,16 @@ Purpose : Adds a new Interaction to a graph Usage : Do not call, called internally by next_network() Returns : - Notes : All interactions are made of 2 nodes - + Notes : All interactions are made of 2 nodes - if there are more + or less than 2 then no Interaction object is created =cut sub _addInteraction { my ($twig, $i) = @_; my @ints = $i->first_child('participantList')->children; + print "Interaction " . $i->first_child('xref')->first_child('primaryRef')->att('id') . + " has " . scalar @ints . " interactors\n" if $verbose; # 2 nodes are required if ( scalar @ints == 2 ) { Modified: bioperl-network/trunk/Bio/Network/IO.pm =================================================================== --- bioperl-network/trunk/Bio/Network/IO.pm 2008-02-04 18:46:33 UTC (rev 14473) +++ bioperl-network/trunk/Bio/Network/IO.pm 2008-02-04 18:47:04 UTC (rev 14474) @@ -16,7 +16,7 @@ # Read protein interaction data in some format my $io = Bio::Network::IO->new(-file => 'bovine.xml', - -format => 'psi' ); + -format => 'psi25' ); my $network = $io->next_network; =head1 DESCRIPTION @@ -45,7 +45,7 @@ =head1 REQUIREMENTS -To read or write from PSI XML you will need the XML::Twig module, +To read from PSI XML you will need the XML::Twig module, available from CPAN. 
=head1 FEEDBACK @@ -100,6 +100,7 @@ -format => format -threshold => a confidence score for the interaction, optional -source => optional database name (e.g. "intact") + -verbose => optional, set to 1 to get commentary =cut sub new { @@ -150,11 +151,40 @@ $self->throw("Sorry, you can't write from a generic Bio::NetworkIO object."); } +=head2 threshold + Name : get or set a threshold + Usage : $io->threshold($val) + Returns : The threshold + Args : A number or none + +=cut + +sub threshold { + my $self = shift; + $self->{_th} = @_ if @_; + return $self->{_th}; +} + +=head2 verbose + + Name : get or set verbosity + Usage : $io->verbose(1) + Returns : The verbosity setting + Args : 1 or none + +=cut + +sub verbose { + my $self = shift; + $self->{_verbose} = @_ if @_; + return $self->{_verbose}; +} + =head2 _load_format_module Title : _load_format_module - Usage : *INTERNAL Bio::Network::IO stuff* + Usage : INTERNAL Bio::Network::IO stuff Function: Loads up (like use) a module at run time on demand Returns : Args : @@ -193,8 +223,9 @@ sub _initialize_io { my ($self, @args) = @_; $self->SUPER::_initialize_io(@args); - my ($th) = $self->_rearrange( [qw(THRESHOLD)], @args); + my ($th,$verbose) = $self->_rearrange( [qw(THRESHOLD VERBOSE)], @args); $self->{'_th'} = $th; + $self->{'_verbose'} = $verbose; return $self; } Modified: bioperl-network/trunk/Bio/Network/ProteinNet.pm =================================================================== --- bioperl-network/trunk/Bio/Network/ProteinNet.pm 2008-02-04 18:46:33 UTC (rev 14473) +++ bioperl-network/trunk/Bio/Network/ProteinNet.pm 2008-02-04 18:47:04 UTC (rev 14474) @@ -13,7 +13,7 @@ # Read in from file my $graphio = Bio::Network::IO->new(-file => 'human.xml', - -format => 'psi'); + -format => 'psi25'); my $graph = $graphio->next_network(); my @edges = $gr->edges; From bosborne at dev.open-bio.org Mon Feb 4 13:49:38 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 13:49:38 -0500 Subject: 
[Bioperl-guts-l] [14475] bioperl-network/trunk/Bio/Network/IO/psi25.pm: Add get and set methods, add verbosity Message-ID: <200802041849.m14IncHF029888@dev.open-bio.org> Revision: 14475 Author: bosborne Date: 2008-02-04 13:49:38 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Add get and set methods, add verbosity Modified Paths: -------------- bioperl-network/trunk/Bio/Network/IO/psi25.pm Modified: bioperl-network/trunk/Bio/Network/IO/psi25.pm =================================================================== --- bioperl-network/trunk/Bio/Network/IO/psi25.pm 2008-02-04 18:47:04 UTC (rev 14474) +++ bioperl-network/trunk/Bio/Network/IO/psi25.pm 2008-02-04 18:49:38 UTC (rev 14475) @@ -1,4 +1,4 @@ -# $Id: psi10.pm 14461 2008-02-01 17:29:35Z bosborne $ +# $Id: psi25.pm 14461 2008-02-01 17:29:35Z bosborne $ # # BioPerl module for Bio::Network::IO::psi25 # @@ -20,9 +20,9 @@ =head1 DESCRIPTION -PSI MI (Protein Standards Initiative Molecular Interaction) XML is a format -to describe protein-protein interactions and interaction networks. -This module parses version 2.5 of PSI MI. +PSI MI (Protein Standards Initiative Molecular Interaction) XML is a +format to describe protein-protein interactions and interaction +networks. This module parses version 2.5 of PSI MI. 
=head2 Databases From cjfields at dev.open-bio.org Mon Feb 4 14:49:27 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Mon, 4 Feb 2008 14:49:27 -0500 Subject: [Bioperl-guts-l] [14476] bioperl-ext/trunk/Bio/SeqIO/staden/read.xs: Add cast to correct pointer type (get rid of warnings) Message-ID: <200802041949.m14JnRc7030017@dev.open-bio.org> Revision: 14476 Author: cjfields Date: 2008-02-04 14:49:26 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Add cast to correct pointer type (get rid of warnings) Modified Paths: -------------- bioperl-ext/trunk/Bio/SeqIO/staden/read.xs Modified: bioperl-ext/trunk/Bio/SeqIO/staden/read.xs =================================================================== --- bioperl-ext/trunk/Bio/SeqIO/staden/read.xs 2008-02-04 18:49:38 UTC (rev 14475) +++ bioperl-ext/trunk/Bio/SeqIO/staden/read.xs 2008-02-04 19:49:26 UTC (rev 14476) @@ -197,10 +197,10 @@ av_push(baseLocs, baseLoc); } - aRef = newRV_inc(aTrace); - cRef = newRV_inc(cTrace); - gRef = newRV_inc(gTrace); - tRef = newRV_inc(tTrace); + aRef = newRV_inc((SV *) aTrace); + cRef = newRV_inc((SV *) cTrace); + gRef = newRV_inc((SV *) gTrace); + tRef = newRV_inc((SV *) tTrace); baseRef = newRV_inc(baseLocs); sp = mark; From bosborne at dev.open-bio.org Mon Feb 4 15:40:29 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 15:40:29 -0500 Subject: [Bioperl-guts-l] [14477] bioperl-network/trunk/Bio/Network/IO.pm: Typo Message-ID: <200802042040.m14KeTCI030202@dev.open-bio.org> Revision: 14477 Author: bosborne Date: 2008-02-04 15:40:29 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Typo Modified Paths: -------------- bioperl-network/trunk/Bio/Network/IO.pm Modified: bioperl-network/trunk/Bio/Network/IO.pm =================================================================== --- bioperl-network/trunk/Bio/Network/IO.pm 2008-02-04 19:49:26 UTC (rev 14476) +++ bioperl-network/trunk/Bio/Network/IO.pm 2008-02-04 20:40:29 UTC (rev 14477) @@ -93,7 +93,7 
@@ Name : new Usage : $io = Bio::Network::IO->new(-file => 'myfile.xml', - -format => 'psi'); + -format => 'psi25'); Returns : A Bio::Network::IO stream initialised to the appropriate format. Args : Named parameters: -file => $filename From bugzilla-daemon at portal.open-bio.org Mon Feb 4 15:52:50 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 4 Feb 2008 15:52:50 -0500 Subject: [Bioperl-guts-l] [Bug 2441] New: Bio::Assembly treatment of singletons is inconsistent Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2441 Summary: Bio::Assembly treatment of singletons is inconsistent Product: BioPerl Version: main-trunk Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P3 Component: Core Components AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: rmb32 at cornell.edu Bio::Assembly::Scaffold's interface represents singletons as Bio::PrimarySeqI objects. However, there exists a Bio::Assembly::Singlet class. Of the Bio::Assembly::IO::* classes, two of them (ace and tigr) represent singletons as Bio::Assembly::Singlet objects, and one of them (phrap) uses just PrimarySeq objects. This confusion probably arose because Bio::Assembly::ScaffoldI refers to Bio::Assembly::Singlet objects in its interface specification, while Bio::Assembly::Scaffold, its implementation, says singletons are represented as PrimarySeqI objects. So, something needs to be done to bring all these modules into agreement on how to represent singletons.
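A note on the `threshold` and `verbose` accessors added to Bio/Network/IO.pm in revision 14474: as written in that diff, `$self->{_th} = @_ if @_;` evaluates `@_` in scalar context, so the stored value would be the number of arguments rather than the argument itself. What follows is only a minimal sketch of that behavior. It uses a hypothetical stand-in class `Demo::IO` (not part of bioperl-network), and the `shift` idiom shown is one common alternative, not necessarily what the module later adopted.

```perl
use strict;
use warnings;

# Hypothetical stand-in class for illustration only; it is not part of
# bioperl-network.
package Demo::IO;

sub new { return bless {}, shift }

# Accessor written as in the r14474 diff: assigning an array to a hash
# element puts @_ in scalar context, so the argument count is stored.
sub threshold_as_committed {
    my $self = shift;
    $self->{_th} = @_ if @_;
    return $self->{_th};
}

# Common get/set idiom: store the first remaining argument itself.
sub threshold {
    my $self = shift;
    $self->{_th} = shift if @_;
    return $self->{_th};
}

package main;

my $io = Demo::IO->new;
print "as committed: ", $io->threshold_as_committed(0.5), "\n";   # prints 1
print "with shift:   ", $io->threshold(0.5), "\n";                # prints 0.5
```

A call such as `$io->verbose(1)` would appear unaffected, because the argument count for a single argument happens to be 1. Separately, the dip_tab.pm check `$self->threshold && defined($score)` treats a threshold of 0 as unset, which may or may not be intended.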
From bosborne at dev.open-bio.org Mon Feb 4 16:44:27 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 16:44:27 -0500 Subject: [Bioperl-guts-l] [14478] bioperl-network/trunk/t/Graph-Articulation.x: Update, but Graph's articulation_points() is still unreliable Message-ID: <200802042144.m14LiRw1030389@dev.open-bio.org> Revision: 14478 Author: bosborne Date: 2008-02-04 16:44:27 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Update, but Graph's articulation_points() is still unreliable Modified Paths: -------------- bioperl-network/trunk/t/Graph-Articulation.x Modified: bioperl-network/trunk/t/Graph-Articulation.x =================================================================== --- bioperl-network/trunk/t/Graph-Articulation.x 2008-02-04 20:40:29 UTC (rev 14477) +++ bioperl-network/trunk/t/Graph-Articulation.x 2008-02-04 21:44:27 UTC (rev 14478) @@ -16,7 +16,7 @@ use lib 't'; } use Test; - $NUMTESTS = 47; + $NUMTESTS = 51; plan tests => $NUMTESTS; eval { require Graph; }; if ($@) { @@ -31,16 +31,17 @@ } } -exit 0 if $ERROR == 1; +exit 0 if $ERROR == 1; -require Bio::Network::ProteinNet; require Bio::Network::IO; -require Bio::Network::Interaction; my $verbose = 0; $verbose = 1 if $DEBUG; # tests for Graph's problematic articulation_points() +# As of 2/2008 this test suite is still not reliably passing - +# I run it 5 times and I'll get an error 1 out of 5: +# Can't locate object method "proteins" via package "Bio::Network::Node... ok 1; @@ -60,17 +61,18 @@ ok $nodes, 13; # # test articulation_points, but first check that each Node -# in network can load...
+# in network exists as an object # $io = Bio::Network::IO->new -(-format => 'psi', +(-format => 'psi10', -file => Bio::Root::IO->catfile("t","data","bovin_small_intact.xml")); my $g = $io->next_network(); @nodes = $g->nodes; ok scalar @nodes, 23; + foreach my $node (@nodes) { - my @seqs = $nodes[0]->proteins; + my @seqs = $node->proteins; ok $seqs[0]->display_id; } @@ -79,39 +81,29 @@ @nodes = $g->articulation_points; ok scalar @nodes, 4; # OK, inspected in Cytoscape -my @eids = qw(EBI-307814 EBI-79764 EBI-620432 EBI-620400); -my @seqs = $nodes[0]->proteins; # Node not always loaded -my $id = $seqs[0]->display_id; -ok grep /$id/, @eids; -@seqs = $nodes[1]->proteins; # Node not always loaded -$id = $seqs[0]->display_id; -ok grep /$id/, @eids; -@seqs = $nodes[2]->proteins; # Node not always loaded -$id = $seqs[0]->display_id; -ok grep /$id/, @eids; -@seqs = $nodes[3]->proteins; # Node not always loaded -$id = $seqs[0]->display_id; -ok grep /$id/, @eids; +my @eids = qw(Q29462 P16106 Q27954 P53619); +foreach my $node (@nodes) { + my @seqs = $node->proteins; + ok my $id = $seqs[0]->display_id; + ok grep /$id/, @eids; +} # # additional articulation_points tests # arath_small-02.xml is PSI MI version 1.0 # ok $io = Bio::Network::IO->new - (-format => 'psi', + (-format => 'psi10', -file => Bio::Root::IO->catfile("t", "data", "arath_small-02.xml")); ok $g1 = $io->next_network(); ok $g1->nodes, 73; ok $g1->interactions, 516; @nodes = $g1->articulation_points; ok scalar @nodes, 8; -my @ids = qw(EBI-621930 EBI-622235 EBI-622281 EBI-622140 - EBI-622382 EBI-622306 EBI-622264 EBI-622203 ); + for my $node (@nodes) { for my $prot ($node->proteins) { - my $id = $prot->display_id; - ok grep /$id/, @ids; + ok $prot->display_id; } } __END__ - From bosborne at dev.open-bio.org Mon Feb 4 16:50:47 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Mon, 4 Feb 2008 16:50:47 -0500 Subject: [Bioperl-guts-l] [14479] bioperl-network/trunk/t/Graph-MD5.t: Update and check
Message-ID: <200802042150.m14LolmB030414@dev.open-bio.org> Revision: 14479 Author: bosborne Date: 2008-02-04 16:50:47 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Update and check Modified Paths: -------------- bioperl-network/trunk/t/Graph-MD5.t Modified: bioperl-network/trunk/t/Graph-MD5.t =================================================================== --- bioperl-network/trunk/t/Graph-MD5.t 2008-02-04 21:44:27 UTC (rev 14478) +++ bioperl-network/trunk/t/Graph-MD5.t 2008-02-04 21:50:47 UTC (rev 14479) @@ -16,7 +16,7 @@ use lib 't'; } use Test; - $NUMTESTS = 19; + $NUMTESTS = 20; plan tests => $NUMTESTS; eval { require Graph::Undirected; }; if ( $@ ) { @@ -76,19 +76,18 @@ my $seq = $g->random_vertex; # OK ok $seq->add($str); -my @rts = $g->articulation_points; -ok @rts; - my $t = Graph::Traversal::DFS->new($g); $t->dfs; @vs = $t->seen; +ok scalar @vs, 4; for my $seq (@vs) { ok $seq->add($str); # NOT OK in version .73 } @vs = $g->articulation_points; -ok $vs[0]->add($str); # OK in version .70 ok scalar @vs, 2; +ok $vs[0]->add($str); # OK in version .70 +ok $vs[1]->add($str); my @cc = $g->connected_components; for my $ref (@cc) { From cjfields at dev.open-bio.org Mon Feb 4 18:41:47 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Mon, 4 Feb 2008 18:41:47 -0500 Subject: [Bioperl-guts-l] [14480] bioperl-live/trunk/Bio/SearchIO/Writer/TextResultWriter.pm: Fixed bits/raw_score issue related to GenericHit changes to raw_score/bits Message-ID: <200802042341.m14NflwZ030582@dev.open-bio.org> Revision: 14480 Author: cjfields Date: 2008-02-04 18:41:47 -0500 (Mon, 04 Feb 2008) Log Message: ----------- Fixed bits/raw_score issue related to GenericHit changes to raw_score/bits Modified Paths: -------------- bioperl-live/trunk/Bio/SearchIO/Writer/TextResultWriter.pm Modified: bioperl-live/trunk/Bio/SearchIO/Writer/TextResultWriter.pm =================================================================== --- 
bioperl-live/trunk/Bio/SearchIO/Writer/TextResultWriter.pm 2008-02-04 21:50:47 UTC (rev 14479) +++ bioperl-live/trunk/Bio/SearchIO/Writer/TextResultWriter.pm 2008-02-04 23:41:47 UTC (rev 14480) @@ -196,10 +196,12 @@ $self->filter('HIT'), $self->filter('HSP') ); return '' if( defined $resultfilter && ! &{$resultfilter}($result) ); - + my ($qtype,$dbtype,$dbseqtype,$type); my $alg = $result->algorithm; - + + my $wublast = ($result->algorithm_version =~ /WashU/) ? 1 : 0; + # This is actually wrong for the FASTAs I think if( $alg =~ /T(FAST|BLAST)([XY])/i ) { $qtype = $dbtype = 'translated'; @@ -264,11 +266,14 @@ } else { $descsub = sprintf($p,$desc); } - - $str .= sprintf("%s %-4s %s\n", + $str .= $wublast ? sprintf("%s %-4s %s\n", $descsub, defined $hit->raw_score ? $hit->raw_score : ' ', - defined $hit->significance ? $hit->significance : '?'); + defined $hit->significance ? $hit->significance : '?') : + sprintf("%s %-4s %s\n", + $descsub, + defined $hit->bits ? $hit->bits: ' ', + defined $hit->significance ? $hit->significance : '?'); my @hsps = $hit->hsps; if( @hsps ) { $hspstr .= sprintf(">%s %s\n%9sLength = %d\n\n", From bugzilla-daemon at portal.open-bio.org Mon Feb 4 18:52:13 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 4 Feb 2008 18:52:13 -0500 Subject: [Bioperl-guts-l] [Bug 2442] New: SeqFeature::Annotated constructor -feature arg breaks with another BSFA Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2442 Summary: SeqFeature::Annotated constructor -feature arg breaks with another BSFA Product: BioPerl Version: main-trunk Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Core Components AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: rmb32 at cornell.edu If you make a Bio::SeqFeature::Annotated feature and try to clone it into another Bio::SeqFeature::Annotated feature, it breaks. 
Here's a patch against t/SeqFeatAnnotated that shows this: Index: SeqFeatAnnotated.t =================================================================== RCS file: /home/repository/bioperl/bioperl-live/t/SeqFeatAnnotated.t,v retrieving revision 1.3 diff -r1.3 SeqFeatAnnotated.t 18a19 > -type => 'nucleotide_motif', 20a22 > -source => 'program_b', 53a56,65 > my $sfaa = Bio::SeqFeature::Annotated->new(-feature => $sfa); > is $sfaa->type->name,'nucleotide_motif'; > is $sfaa->primary_tag, 'nucleotide_motif'; > is $sfaa->source->display_text,'program_b'; > is $sfaa->source_tag,'program_b'; > is $sfaa->strand,1; > is $sfaa->start,1; > is $sfaa->end,5; > is $sfaa->score,12; > -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Feb 4 18:52:53 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 4 Feb 2008 18:52:53 -0500 Subject: [Bioperl-guts-l] [Bug 2442] SeqFeature::Annotated constructor -feature arg breaks with another BSFA In-Reply-To: Message-ID: <200802042352.m14Nqrm6017970@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2442 ------- Comment #1 from rmb32 at cornell.edu 2008-02-04 18:52 EST ------- Created an attachment (id=852) --> (http://bugzilla.open-bio.org/attachment.cgi?id=852&action=view) test patch -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Mon Feb 4 20:36:15 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 4 Feb 2008 20:36:15 -0500 Subject: [Bioperl-guts-l] [Bug 2442] SeqFeature::Annotated constructor -feature arg breaks with another BSFA In-Reply-To: Message-ID: <200802050136.m151aFlB021876@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2442 ------- Comment #2 from cjfields at uiuc.edu 2008-02-04 20:36 EST ------- Confirmed this using bioperl-live; the error I get is: Can't locate object method "all_tags" via package "Bio::SeqFeature::Annotated" at Bio/SeqFeature/AnnotationAdaptor.pm line 247. Fixing that to 'get_all_tags' leads to the following error: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Object Bio::Annotation::SimpleValue=HASH(0x8d5480) was not valid with key type. If you were adding new keys in, perhaps you want to make use of the archetype method to allow registration to a more basic type STACK: Error::throw STACK: Bio::Root::Root::throw Bio/Root/Root.pm:357 STACK: Bio::Annotation::Collection::add_Annotation Bio/Annotation/Collection.pm:273 STACK: Bio::SeqFeature::Annotated::add_Annotation Bio/SeqFeature/Annotated.pm:594 STACK: Bio::SeqFeature::Annotated::from_feature Bio/SeqFeature/Annotated.pm:298 STACK: Bio::SeqFeature::Annotated::_initialize Bio/SeqFeature/Annotated.pm:247 STACK: Bio::SeqFeature::Annotated::new Bio/SeqFeature/Annotated.pm:210 STACK: t/SeqFeatAnnotated.t:55 ----------------------------------------------------------- I'll try looking into it. The problem is this class may be deprecated or refactored significantly in the next release (it has too many inherent problems). -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Mon Feb 4 20:47:44 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 4 Feb 2008 20:47:44 -0500 Subject: [Bioperl-guts-l] [Bug 2442] SeqFeature::Annotated constructor -feature arg breaks with another BSFA In-Reply-To: Message-ID: <200802050147.m151li5V022469@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2442 ------- Comment #3 from rmb32 at cornell.edu 2008-02-04 20:47 EST ------- (In reply to comment #2) I've already done some looking, and I think the problem lies in the BSF::Annotated::from_feature() method where the AnnotationAdaptor is used to copy over all the annotations. Some of the BSFA accessors (qw/seq_id source type frame phase score/) store their values as Annotations, and these get sort of copied over twice, with this code. Try this code for the copying part, it works pretty well, if you make sure OntologyTerm is listed properly as the type for 'type' in the Bio::Annotation::TypeManager: # now pick up the annotations/tags of the other feature # We'll use AnnotationAdaptor to convert everything over my %no_copy = map {$_ => 1} qw/seq_id source type frame phase score/; my $adaptor = Bio::SeqFeature::AnnotationAdaptor->new(-feature => $feat); for my $key ( $adaptor->get_all_annotation_keys() ) { next if $no_copy{$key}; my @values = $adaptor->get_Annotations($key); @values = _aggregate_scalar_annotations(\%opts,$key,@values); foreach my $val (@values) { $self->add_Annotation($key,$val) } } With this, it just skips copying the annotations with 'reserved' names. But yes, the BSFA stuff really needs a big cleaning-out. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
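The skip-list idea in comment #3 can be tried without BioPerl installed. What follows is a minimal sketch in core Perl, not the code in Bio::SeqFeature::Annotated: the %source_annotations hash and the copy_annotations() helper are hypothetical stand-ins for the AnnotationAdaptor and add_Annotation() calls, and only the list of reserved keys is taken from the comment.

```perl
use strict;
use warnings;

# Reserved keys, as listed in comment #3: their values are already set
# through the accessors, so copying them again would duplicate them.
my %no_copy = map { $_ => 1 } qw/seq_id source type frame phase score/;

# Hypothetical stand-in for what the adaptor would expose:
# annotation key => list of values.
my %source_annotations = (
    type  => ['nucleotide_motif'],
    score => [12],
    silly => [20],
    new   => [1],
);

# Copy every annotation whose key is not in the skip list.
sub copy_annotations {
    my ($src, $skip) = @_;
    my %copied;
    for my $key (sort keys %$src) {
        next if $skip->{$key};
        push @{ $copied{$key} }, @{ $src->{$key} };
    }
    return \%copied;
}

my $copied = copy_annotations(\%source_annotations, \%no_copy);
print join(',', sort keys %$copied), "\n";
```

Only the non-reserved keys ('new' and 'silly' here) are copied; 'type' and 'score' are left to the accessors.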
From cjfields at dev.open-bio.org Mon Feb 4 21:16:21 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Mon, 4 Feb 2008 21:16:21 -0500 Subject: [Bioperl-guts-l] [14481] bioperl-live/trunk: bug 2442 Message-ID: <200802050216.m152GL1F030771@dev.open-bio.org> Revision: 14481 Author: cjfields Date: 2008-02-04 21:16:21 -0500 (Mon, 04 Feb 2008) Log Message: ----------- bug 2442 Modified Paths: -------------- bioperl-live/trunk/Bio/SeqFeature/Annotated.pm bioperl-live/trunk/Bio/SeqFeature/AnnotationAdaptor.pm bioperl-live/trunk/t/SeqFeatAnnotated.t Modified: bioperl-live/trunk/Bio/SeqFeature/Annotated.pm =================================================================== --- bioperl-live/trunk/Bio/SeqFeature/Annotated.pm 2008-02-04 23:41:47 UTC (rev 14480) +++ bioperl-live/trunk/Bio/SeqFeature/Annotated.pm 2008-02-05 02:16:21 UTC (rev 14481) @@ -271,32 +271,34 @@ =cut sub from_feature { - my ($self,$feat,%opts) = @_; + my ($self,$feat,%opts) = @_; + + # should deal with any SeqFeatureI implementation (i.e. we don't want to + # automatically force a OO-heavy implementation on all classes) + ref($feat) && ($feat->isa('Bio::SeqFeatureI')) + or $self->throw('invalid arguments to from_feature'); + + #TODO: add overrides in opts for these values, so people don't have to screw up their feature object + #if they don't want to + + ### set most of the data + foreach my $fieldname (qw/ start end strand frame score location seq_id source_tag primary_tag/) { + #no strict 'refs'; #using symbolic refs, yes, but using them for methods is allowed now + $self->$fieldname( $feat->$fieldname ); + } - # should deal with any SeqFeatureI implementation (i.e. 
we don't want to - # automatically force a OO-heavy implementation on all classes) - ref($feat) && ($feat->isa('Bio::SeqFeatureI')) - or $self->throw('invalid arguments to from_feature'); + # now pick up the annotations/tags of the other feature + # We'll use AnnotationAdaptor to convert everything over - #TODO: add overrides in opts for these values, so people don't have to screw up their feature object - #if they don't want to - - ### set most of the data - foreach my $fieldname (qw/ start end strand frame score location seq_id source_tag primary_tag/) { - #no strict 'refs'; #using symbolic refs, yes, but using them for methods is allowed now - $self->$fieldname( $feat->$fieldname ); - } - - # now pick up the annotations/tags of the other feature - # We'll use AnnotationAdaptor to convert everything over - my $anncoll = Bio::SeqFeature::AnnotationAdaptor->new(-feature => $feat); - - for my $key ( $anncoll->get_all_annotation_keys() ) { - my @values = $anncoll->get_Annotations($key); - @values = _aggregate_scalar_annotations(\%opts,$key,@values); - foreach my $val (@values) { - $self->add_Annotation($key,$val) - } + my %no_copy = map {$_ => 1} qw/seq_id source type frame phase score/; + my $adaptor = Bio::SeqFeature::AnnotationAdaptor->new(-feature => $feat); + for my $key ( $adaptor->get_all_annotation_keys() ) { + next if $no_copy{$key}; + my @values = $adaptor->get_Annotations($key); + @values = _aggregate_scalar_annotations(\%opts,$key,@values); + foreach my $val (@values) { + $self->add_Annotation($key,$val) + } } } #given a key and its values, make the values into Modified: bioperl-live/trunk/Bio/SeqFeature/AnnotationAdaptor.pm =================================================================== --- bioperl-live/trunk/Bio/SeqFeature/AnnotationAdaptor.pm 2008-02-04 23:41:47 UTC (rev 14480) +++ bioperl-live/trunk/Bio/SeqFeature/AnnotationAdaptor.pm 2008-02-05 02:16:21 UTC (rev 14481) @@ -244,7 +244,11 @@ my @keys = (); # get the tags from the feature 
object - push(@keys, $self->feature()->all_tags()); + if ($self->feature()->can('get_all_tags')) { + push(@keys, $self->feature()->get_all_tags()); + } else { + push(@keys, $self->feature()->all_tags()); + } # ask the annotation implementation in addition, while avoiding duplicates if($self->annotation()) { push(@keys, Modified: bioperl-live/trunk/t/SeqFeatAnnotated.t =================================================================== --- bioperl-live/trunk/t/SeqFeatAnnotated.t 2008-02-04 23:41:47 UTC (rev 14480) +++ bioperl-live/trunk/t/SeqFeatAnnotated.t 2008-02-05 02:16:21 UTC (rev 14481) @@ -1,5 +1,5 @@ # -*-Perl-*- Test Harness script for Bioperl -# $Id: SeqFeatAnnotated.t,v 1.50 2007/06/27 10:16:37 sendu Exp $ +# $Id$ use strict; @@ -7,7 +7,7 @@ use lib 't/lib'; use BioperlTest; - test_begin(-tests => 26, -requires_module => 'URI::Escape'); + test_begin(-tests => 34, -requires_module => 'URI::Escape'); use_ok('Bio::SeqFeature::Generic'); use_ok('Bio::SeqFeature::Annotated'); @@ -17,8 +17,10 @@ -end => 5, -strand => "+", -frame => 2, + -type => 'nucleotide_motif', -phase => 2, -score => 12, + -source => 'program_b', -display_name => 'test.annot', -seq_id => 'test.displayname' ); @@ -50,6 +52,15 @@ is $sfa2->end,440; is $sfa2->get_Annotations('silly')->value,20; is $sfa2->get_Annotations('new')->value,1; +my $sfaa = Bio::SeqFeature::Annotated->new(-feature => $sfa); +is $sfaa->type->name,'nucleotide_motif'; +is $sfaa->primary_tag, 'nucleotide_motif'; +is $sfaa->source->display_text,'program_b'; +is $sfaa->source_tag,'program_b'; +is $sfaa->strand,1; +is $sfaa->start,1; +is $sfaa->end,5; +is $sfaa->score,12; my $sfa3 = Bio::SeqFeature::Annotated->new( -start => 1, -end => 5, From bugzilla-daemon at portal.open-bio.org Mon Feb 4 21:18:33 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 4 Feb 2008 21:18:33 -0500 Subject: [Bioperl-guts-l] [Bug 2442] SeqFeature::Annotated constructor -feature arg breaks with 
another BSFA In-Reply-To: Message-ID: <200802050218.m152IXP2024467@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2442 cjfields at uiuc.edu changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #4 from cjfields at uiuc.edu 2008-02-04 21:18 EST ------- Okay, works now. I've added your tests as well. Closing this out; thanks! -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From cjfields at dev.open-bio.org Mon Feb 4 21:37:57 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Mon, 4 Feb 2008 21:37:57 -0500 Subject: [Bioperl-guts-l] [14482] bioperl-live/trunk/Bio/Tree/Draw/Cladogram.pm: bug 2415 Message-ID: <200802050237.m152bvCJ030825@dev.open-bio.org> Revision: 14482 Author: cjfields Date: 2008-02-04 21:37:57 -0500 (Mon, 04 Feb 2008) Log Message: ----------- bug 2415 Modified Paths: -------------- bioperl-live/trunk/Bio/Tree/Draw/Cladogram.pm Modified: bioperl-live/trunk/Bio/Tree/Draw/Cladogram.pm =================================================================== --- bioperl-live/trunk/Bio/Tree/Draw/Cladogram.pm 2008-02-05 02:16:21 UTC (rev 14481) +++ bioperl-live/trunk/Bio/Tree/Draw/Cladogram.pm 2008-02-05 02:37:57 UTC (rev 14482) @@ -447,8 +447,8 @@ # print $xx{$node}, " ", $yy{$node}, " lineto\n"; if ($colors) { print $INFO "stroke\n"; - print $INFO $Rcolor{$node->ancestor}, " ", $Gcolor{$node->ancestor}, " ", - $Bcolor{$node->ancestor}, " setrgbcolor\n"; + print $INFO $Rcolor{$node}, " ", $Gcolor{$node}, " ", + $Bcolor{$node}, " setrgbcolor\n"; } print $INFO $xx{$node}, " ", $yy{$node}, " moveto\n"; print $INFO $xx{$node->ancestor}, " ", $yy{$node}, " lineto\n"; From bugzilla-daemon at portal.open-bio.org Mon Feb 4 21:38:05 2008 From: bugzilla-daemon at portal.open-bio.org 
(bugzilla-daemon at portal.open-bio.org) Date: Mon, 4 Feb 2008 21:38:05 -0500 Subject: [Bioperl-guts-l] [Bug 2415] Wrong tree coloring by Tree::Draw::Cladogram In-Reply-To: Message-ID: <200802050238.m152c5KW025723@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2415 cjfields at uiuc.edu changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #3 from cjfields at uiuc.edu 2008-02-04 21:38 EST ------- Committing this; doesn't seem to cause any harm. This might be reverted if there are any complaints, but makes sense to have the results match what the docs state. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Feb 4 21:44:05 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 4 Feb 2008 21:44:05 -0500 Subject: [Bioperl-guts-l] [Bug 2427] error while using Bio::Search::cross_match In-Reply-To: Message-ID: <200802050244.m152i5GK026093@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2427 cjfields at uiuc.edu changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #3 from cjfields at uiuc.edu 2008-02-04 21:44 EST ------- Some remedial tests set up (t/cross_match.t). The parser will probably need some more work eventually, but works for now. Closing out. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From cjfields at dev.open-bio.org Mon Feb 4 22:11:28 2008 From: cjfields at dev.open-bio.org (Christopher John Fields) Date: Mon, 4 Feb 2008 22:11:28 -0500 Subject: [Bioperl-guts-l] [14483] bioperl-live/trunk/Bio/PrimarySeq.pm: bug 2438; 'X' no longer completely ambiguous Message-ID: <200802050311.m153BS0T030892@dev.open-bio.org> Revision: 14483 Author: cjfields Date: 2008-02-04 22:11:28 -0500 (Mon, 04 Feb 2008) Log Message: ----------- bug 2438; 'X' no longer completely ambiguous Modified Paths: -------------- bioperl-live/trunk/Bio/PrimarySeq.pm Modified: bioperl-live/trunk/Bio/PrimarySeq.pm =================================================================== --- bioperl-live/trunk/Bio/PrimarySeq.pm 2008-02-05 02:37:57 UTC (rev 14482) +++ bioperl-live/trunk/Bio/PrimarySeq.pm 2008-02-05 03:11:28 UTC (rev 14483) @@ -827,7 +827,7 @@ my $str = $self->seq(); # Remove char's that clearly denote ambiguity - $str =~ s/[-.?x]//gi; + $str =~ s/[-.?]//gi; my $total = CORE::length($str); if( $total == 0 ) { From bugzilla-daemon at portal.open-bio.org Mon Feb 4 22:12:22 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 4 Feb 2008 22:12:22 -0500 Subject: [Bioperl-guts-l] [Bug 2438] no letters warning of SeqIO In-Reply-To: Message-ID: <200802050312.m153CMfG027957@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2438 cjfields at uiuc.edu changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #1 from cjfields at uiuc.edu 2008-02-04 22:12 EST ------- It was a problem in Bio::PrimarySeq::_guess_alphabet(); it considers 'X' too ambiguous. I have removed it since 'X' almost always means 'any amino acid' as opposed to 'anything at all'. The sequence is still present in the object; try passing it to write_seq(). In the meantime, I'll go ahead and close this out. 
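The effect of the change can be reproduced with core Perl alone. The sketch below is not the body of Bio::PrimarySeq::_guess_alphabet(); it only applies the two substitution patterns shown in revision 14483 to an all-X sequence like the one in the report (33 residues, per the FASTA header). The variable names are made up for the illustration.

```perl
use strict;
use warnings;

# An all-X protein sequence, as in the 1bcc_I entry from the report.
my $seq = 'X' x 33;

# Pattern before revision 14483: 'X' is stripped as an ambiguity character.
(my $before = $seq) =~ s/[-.?x]//gi;

# Pattern after revision 14483: 'X' is kept.
(my $after = $seq) =~ s/[-.?]//gi;

printf "old pattern leaves %d letters, new pattern leaves %d\n",
    length($before), length($after);
```

With the old pattern nothing is left to inspect, which is the situation the "no letters" warning describes; with the new pattern all 33 residues remain.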
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From lstein at dev.open-bio.org Mon Feb 4 22:46:35 2008 From: lstein at dev.open-bio.org (Lincoln Stein) Date: Mon, 4 Feb 2008 22:46:35 -0500 Subject: [Bioperl-guts-l] [14484] bioperl-live/trunk/Bio/DB/SeqFeature: fixed a logic error which caused undefined end coordinates when creating a feature with abs =>1, a start and an undefined end Message-ID: <200802050346.m153kZpk030954@dev.open-bio.org> Revision: 14484 Author: lstein Date: 2008-02-04 22:46:34 -0500 (Mon, 04 Feb 2008) Log Message: ----------- fixed a logic error which caused undefined end coordinates when creating a feature with abs=>1, a start and an undefined end Modified Paths: -------------- bioperl-live/trunk/Bio/DB/SeqFeature/Segment.pm bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Segment.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/Segment.pm 2008-02-05 03:11:28 UTC (rev 14483) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Segment.pm 2008-02-05 03:46:34 UTC (rev 14484) @@ -405,9 +405,9 @@ sub strand { shift->{strand} } sub ref { shift->seq_id } -sub length { - my $self = shift; - return abs($self->end - $self->start) +1; +sub length { + my $self = shift; + return abs($self->end - $self->start) +1; } sub primary_tag { 'region' } Modified: bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm =================================================================== --- bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm 2008-02-05 03:11:28 UTC (rev 14483) +++ bioperl-live/trunk/Bio/DB/SeqFeature/Store.pm 2008-02-05 03:46:34 UTC (rev 14484) @@ -1220,7 +1220,7 @@ my ($start,$end); if ($abs) { $start = $rel_start; - $end = $rel_end; + $end = defined $rel_end ? 
$rel_end : $start + $f->length - 1; } else { my $re = defined $rel_end ? $rel_end : $f->end - $f->start + 1; From lstein at dev.open-bio.org Tue Feb 5 11:35:25 2008 From: lstein at dev.open-bio.org (Lincoln Stein) Date: Tue, 5 Feb 2008 11:35:25 -0500 Subject: [Bioperl-guts-l] [14485] bioperl-live/trunk/Bio/Graphics/FeatureFile.pm: fixed bug in featurefile processing that inappropriately grouped features with no name Message-ID: <200802051635.m15GZPtT001783@dev.open-bio.org> Revision: 14485 Author: lstein Date: 2008-02-05 11:35:24 -0500 (Tue, 05 Feb 2008) Log Message: ----------- fixed bug in featurefile processing that inappropriately grouped features with no name Modified Paths: -------------- bioperl-live/trunk/Bio/Graphics/FeatureFile.pm Modified: bioperl-live/trunk/Bio/Graphics/FeatureFile.pm =================================================================== --- bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-02-05 03:46:34 UTC (rev 14484) +++ bioperl-live/trunk/Bio/Graphics/FeatureFile.pm 2008-02-05 16:35:24 UTC (rev 14485) @@ -597,7 +597,7 @@ } # either create a new feature or add a segment to it - if (my $feature = $self->{seenit}{$type,$name}) { + if (length $name && (my $feature = $self->{seenit}{$type,$name})) { # create a new segment to hold the parts if (!$feature->segments) { From bugzilla-daemon at portal.open-bio.org Tue Feb 5 12:43:17 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 12:43:17 -0500 Subject: [Bioperl-guts-l] [Bug 2444] New: Bio::SearchIO blast signifance contains comma Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2444 Summary: Bio::SearchIO blast signifance contains comma Product: BioPerl Version: unspecified Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Bio::Search/Bio::SearchIO AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: bernd at bio.vu.nl BLAST reports parsed with Bio::SearchIO still have the "," in 
$hsp->significance. E.g. Score = 32.7 bits (73), Expect = 29, Method: Compositional matrix adjust. gives: "29," for $hsp->significance -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Feb 5 12:43:50 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 12:43:50 -0500 Subject: [Bioperl-guts-l] [Bug 2444] Bio::SearchIO blast signifance contains comma In-Reply-To: Message-ID: <200802051743.m15Hhokx014891@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2444 ------- Comment #1 from bernd at bio.vu.nl 2008-02-05 12:43 EST ------- Created an attachment (id=854) --> (http://bugzilla.open-bio.org/attachment.cgi?id=854&action=view) BLAST report -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Feb 5 12:44:23 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 12:44:23 -0500 Subject: [Bioperl-guts-l] [Bug 2444] Bio::SearchIO blast signifance contains comma In-Reply-To: Message-ID: <200802051744.m15HiNk6014946@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2444 ------- Comment #2 from bernd at bio.vu.nl 2008-02-05 12:44 EST ------- Created an attachment (id=855) --> (http://bugzilla.open-bio.org/attachment.cgi?id=855&action=view) BLAST report -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
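One way a trailing comma could end up in a parsed value is a whitespace-delimited capture. The sketch below is only an illustration on the line quoted in the report; it is not taken from Bio::SearchIO::blast, and both patterns are assumptions made for the example.

```perl
use strict;
use warnings;

# HSP header line quoted in bug 2444.
my $line = 'Score = 32.7 bits (73), Expect = 29, Method: Compositional matrix adjust.';

# Capturing "everything up to the next space" keeps the comma.
my ($naive) = $line =~ /Expect\s*=\s*(\S+)/;

# Restricting the capture to characters that can occur in a number drops it.
my ($clean) = $line =~ /Expect\s*=\s*([\d.eE+\-]+)/;

print "naive: '$naive'  clean: '$clean'\n";
```

The second pattern also accepts exponent notation such as 2e-95, since 'e', '+' and '-' are in the character class.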
From bugzilla-daemon at portal.open-bio.org Tue Feb 5 12:54:34 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 12:54:34 -0500 Subject: [Bioperl-guts-l] [Bug 2445] New: Bio::SearchIO BLAST parser gets wrong score and evalue Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2445 Summary: Bio::SearchIO BLAST parser gets wrong score and evalue Product: BioPerl Version: unspecified Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Bio::Search/Bio::SearchIO AssignedTo: bioperl-guts-l at bioperl.org ReportedBy: bernd at bio.vu.nl For large BLAST reports with E-values > 10 SearchIO gets strange score/evalues for the hits. E.g. hit name, score, and significance for attached report: gb|ABY47600.1| ... 27.7 where the description line is: gb|ABY47600.1| apolipoprotein A-I [Gobiocypris rarus] 33.9 15 -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Feb 5 12:55:08 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 12:55:08 -0500 Subject: [Bioperl-guts-l] [Bug 2445] Bio::SearchIO BLAST parser gets wrong score and evalue In-Reply-To: Message-ID: <200802051755.m15Ht87j015855@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2445 ------- Comment #1 from bernd at bio.vu.nl 2008-02-05 12:55 EST ------- Created an attachment (id=856) --> (http://bugzilla.open-bio.org/attachment.cgi?id=856&action=view) BLAST parser -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Tue Feb 5 12:55:37 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 12:55:37 -0500 Subject: [Bioperl-guts-l] [Bug 2445] Bio::SearchIO BLAST parser gets wrong score and evalue In-Reply-To: Message-ID: <200802051755.m15Htb1I015911@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2445 ------- Comment #2 from bernd at bio.vu.nl 2008-02-05 12:55 EST ------- Created an attachment (id=857) --> (http://bugzilla.open-bio.org/attachment.cgi?id=857&action=view) BLAST report -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bosborne at dev.open-bio.org Tue Feb 5 12:59:55 2008 From: bosborne at dev.open-bio.org (Brian Osborne) Date: Tue, 5 Feb 2008 12:59:55 -0500 Subject: [Bioperl-guts-l] [14487] bioperl-network/trunk/Bio/Network/IO/psi25.pm: use base, minor edits Message-ID: <200802051759.m15Hxtjm002058@dev.open-bio.org> Revision: 14487 Author: bosborne Date: 2008-02-05 12:59:55 -0500 (Tue, 05 Feb 2008) Log Message: ----------- use base, minor edits Modified Paths: -------------- bioperl-network/trunk/Bio/Network/IO/psi25.pm Modified: bioperl-network/trunk/Bio/Network/IO/psi25.pm =================================================================== --- bioperl-network/trunk/Bio/Network/IO/psi25.pm 2008-02-05 17:09:50 UTC (rev 14486) +++ bioperl-network/trunk/Bio/Network/IO/psi25.pm 2008-02-05 17:59:55 UTC (rev 14487) @@ -34,16 +34,17 @@ IntAct L MINT L -Each of these databases will call PSI format by some different name. For -example, PSI MI from DIP comes in files with the suffix "mif". +Each of these databases will call PSI format by some different name. +for example, PSI MI from DIP comes in files with the suffix "mif". Documentation for PSI XML can be found at L. 
=head2 Version -This module supports a subset of the fields described in PSI MI version 2.5. -(L). The NODE DATA section below -describes which fields are currently parsed into ProteinNet networks. +This module supports a subset of the fields described in PSI MI version +2.5. (L). The DATA IN THE NODE +section below describes which fields are currently parsed into +ProteinNet networks. =head2 Notes @@ -229,12 +230,12 @@ =head1 AUTHORS Brian Osborne bosborne at alum.mit.edu -Richard Adams richard.adams at ed.ac.uk =cut package Bio::Network::IO::psi25; use strict; +use base qw(Bio::Network::IO Bio::Root::Root); use XML::Twig; use Bio::Root::Root; use Bio::Seq::SeqFactory; @@ -244,15 +245,14 @@ use Bio::Network::Node; use Bio::Species; use Bio::Annotation::DBLink; -use Bio::Annotation::OntologyTerm; use Bio::Annotation::Collection; -use Bio::Annotation::Comment; -use Bio::Annotation::Reference; -# use Bio::Annotation::SimpleValue; -# use Bio::Network::IO::psi::intact; +#use Bio::Annotation::Comment; +#use Bio::Annotation::Reference; +#use Bio::Annotation::SimpleValue; +#use Bio::Network::IO::psi::intact; +#use Bio::Annotation::OntologyTerm; -use vars qw( @ISA %species $net $fac $verbose ); -@ISA = qw(Bio::Network::IO Bio::Root::Root ); +use vars qw( %species $net $fac $verbose ); BEGIN { $fac = Bio::Seq::SeqFactory->new(-type => 'Bio::Seq::RichSeq'); From bugzilla-daemon at portal.open-bio.org Tue Feb 5 13:38:45 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 13:38:45 -0500 Subject: [Bioperl-guts-l] [Bug 2444] Bio::SearchIO blast signifance contains comma In-Reply-To: Message-ID: <200802051838.m15IcjKL019395@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2444 ------- Comment #3 from cjfields at uiuc.edu 2008-02-05 13:38 EST ------- I don't see the extra comma using bioperl-live, but it did uncover a significant bug, in that significance is set to the same value for all HSPs. 
evalue() works fine, though. You should probably use evalue() over significance() until I figure out what's going on. I have a feeling it's related to recent changes I made to GenericHit methods. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Feb 5 14:01:03 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 14:01:03 -0500 Subject: [Bioperl-guts-l] [Bug 2445] Bio::SearchIO BLAST parser gets wrong score and evalue In-Reply-To: Message-ID: <200802051901.m15J13Dq020592@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2445 ------- Comment #3 from bernd at bio.vu.nl 2008-02-05 14:01 EST ------- Just found the cause in blast.pm. line 628: elsif (/([\d\.\+\-eE]+)\s+([\d\.\+\-eE]+)(\s+\d+)?\s*$/) { This RE matches the following line wrong: ref|YP_170868.1| hypothetical protein syc0158_c [Synechococcu... 34.3 10 now ($1, $2, $3) = ('...',34.3, 10 ), while: my ( $score, $evalue ) = ( $1, $2 ); Deleting the last (\s+\d+)? solves the problem for the BLAST report. Why is (\s+\d+) present? I suppose it's there for a reason. The problem is that for E-values that match \d, such as 10, 12 etc the RegExp shifts one slot to the left. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
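The shift described in comment #3 can be reproduced outside the parser. The sketch below uses the regular expression exactly as quoted in that comment, plus a variant with the optional trailing integer group removed; the score_and_evalue() helper is hypothetical, and the column spacing of the sample line is approximate. The optional group presumably exists for hit tables with an extra integer column, such as the N column in WU-BLAST output, but that is an assumption, not something the thread confirms.

```perl
use strict;
use warnings;

# Hit-table line quoted in comment #3 (spacing is approximate).
my $line = 'ref|YP_170868.1| hypothetical protein syc0158_c [Synechococcu...  34.3    10';

# Pattern as quoted from blast.pm in comment #3.
my $with_optional = qr/([\d\.\+\-eE]+)\s+([\d\.\+\-eE]+)(\s+\d+)?\s*$/;

# Same pattern without the optional trailing integer group.
my $without_optional = qr/([\d\.\+\-eE]+)\s+([\d\.\+\-eE]+)\s*$/;

# Hypothetical helper: return the first two captures, as the parser
# does with "my ( $score, $evalue ) = ( $1, $2 );".
sub score_and_evalue {
    my ($re, $text) = @_;
    return $text =~ $re ? ($1, $2) : ();
}

my @shifted = score_and_evalue($with_optional,    $line);
my @fixed   = score_and_evalue($without_optional, $line);

printf "with optional group:    score=%s evalue=%s\n", @shifted;
printf "without optional group: score=%s evalue=%s\n", @fixed;
```

Because the truncation dots of the description satisfy the first character class, the leftmost match starts at '...' whenever the E-value is a bare integer: the integer is absorbed by the optional group and the real score slides into the E-value slot.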
From bugzilla-daemon at portal.open-bio.org Tue Feb 5 14:12:27 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 14:12:27 -0500 Subject: [Bioperl-guts-l] [Bug 2445] Bio::SearchIO BLAST parser gets wrong score and evalue In-Reply-To: Message-ID: <200802051912.m15JCRkA021042@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2445 ------- Comment #4 from cjfields at uiuc.edu 2008-02-05 14:12 EST ------- We recently changed the way bits/score are used for specific BLAST reports. There were some inconsistencies in BLAST report parsing and HSP/Hit methods which Steve C and I discussed on the mail list in the answer to a question you raised: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/16273/focus=16299 In HSPs, for instance, the NCBI BLAST hit table reports the bit score only, (or bits()), with the description giving both the bits and the score. WUBLAST flips that and reports the raw score in the hit table and both bits and score in the description. We simply changed that to be more consistent, so when you ask for bits() from an HSP, you get the bit score, and when you ask for score, you get the (raw) score. Similarly, in hits using significance should get you the best evalue/pvalue, using score() or raw_score() gets the best score, and bits() the best bit score. As mentioned in bug 2444, there appears to be a bug with HSP::significance which I'm still digging up. NCBI hit table: Score E Sequences producing significant alignments: (bits) Value gb|J00265.1|HUMINS01 Human insulin gene, complete cds 355 2e-95 .... NCBI description: >gb|J00265.1|HUMINS01 Human insulin gene, complete cds Length = 4044 Score = 355 bits (179), Expect = 2e-95 Identities = 179/179 (100%) Strand = Plus / Plus WUBLAST hit table: Smallest Sum High Probability Sequences producing High-scoring Segment Pairs: Score P(N) N gb|AAC73113.1| (AE000111) aspartokinase I, homoserine deh... 4141 0. 1 .... 
WUBLAST description: >gb|AAC73113.1| (AE000111) aspartokinase I, homoserine dehydrogenase I [Escherichia coli] Length = 820 Score = 4141 (1462.8 bits), Expect = 0., P = 0. Identities = 820/820 (100%), Positives = 820/820 (100%) -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Feb 5 14:16:06 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 14:16:06 -0500 Subject: [Bioperl-guts-l] [Bug 2445] Bio::SearchIO BLAST parser gets wrong score and evalue In-Reply-To: Message-ID: <200802051916.m15JG6Il021287@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2445 ------- Comment #5 from cjfields at uiuc.edu 2008-02-05 14:16 EST ------- (In reply to comment #3) > Just found the cause in blast.pm. > line 628: > elsif (/([\d\.\+\-eE]+)\s+([\d\.\+\-eE]+)(\s+\d+)?\s*$/) { > > This RE matches the following line wrong: > ref|YP_170868.1| hypothetical protein syc0158_c [Synechococcu... 34.3 > 10 > > now ($1, $2, $3) = ('...',34.3, 10 ), while: > my ( $score, $evalue ) = ( $1, $2 ); > > Deleting the last (\s+\d+)? solves the problem for the BLAST report. > Why is (\s+\d+) present? I suppose it's there for a reason. > The problem is that for E-values that match \d, such as 10, 12 etc the RegExp > shifts one slot to the left. This isn't present in bioperl-live. Can you try the latest code (download using Subversion) and check it out? -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
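The two description-line layouts contrasted in comment #4 put the bit score and the raw score in opposite positions. As a rough illustration of why a parser has to tell them apart, here is a small standalone sketch. It is not the BioPerl parser; parse_score_line is a hypothetical helper and both patterns are assumptions written against the two sample lines in that comment only.

```perl
use strict;
use warnings;

# Return (bits, raw score) from a "Score = ..." description line, or an
# empty list if neither layout matches. Hypothetical helper, not BioPerl API.
sub parse_score_line {
    my ($text) = @_;

    # NCBI layout: Score = 355 bits (179)
    if ($text =~ /Score\s*=\s*([\d.eE+-]+)\s+bits\s+\((\d+)\)/) {
        return ($1, $2);
    }

    # WU-BLAST layout: Score = 4141 (1462.8 bits)
    if ($text =~ /Score\s*=\s*(\d+)\s+\(([\d.eE+-]+)\s+bits\)/) {
        return ($2, $1);
    }

    return;
}

my @ncbi = parse_score_line('Score = 355 bits (179), Expect = 2e-95');
my @wu   = parse_score_line('Score = 4141 (1462.8 bits), Expect = 0., P = 0.');

print "NCBI:    bits=$ncbi[0] raw=$ncbi[1]\n";
print "WUBLAST: bits=$wu[0] raw=$wu[1]\n";
```

Returning the pair in a fixed (bits, raw) order regardless of the input layout mirrors the consistency goal stated in the comment: a caller asking for bits gets the bit score whichever program produced the report.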
From bugzilla-daemon at portal.open-bio.org Tue Feb 5 14:39:05 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 5 Feb 2008 14:39:05 -0500
Subject: [Bioperl-guts-l] [Bug 2445] Bio::SearchIO BLAST parser gets wrong score and evalue
In-Reply-To:
Message-ID: <200802051939.m15Jd5Uc022198@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2445

------- Comment #6 from bernd at bio.vu.nl 2008-02-05 14:39 EST -------
With the new SearchIO.pm and blast.pm from SVN the parsing is different.
However $hit->score and $hit->raw_score do not return values.
$hit->bits does work.

while (my $result = $in->next_result ) {
    while ( my $hit = $result->next_hit) {
        print $hit->name, "\t";
        print $hit->bits, "\t";
        print $hit->significance, "\n";
    }
}

From bugzilla-daemon at portal.open-bio.org Tue Feb 5 14:52:18 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 5 Feb 2008 14:52:18 -0500
Subject: [Bioperl-guts-l] [Bug 2445] Bio::SearchIO BLAST parser gets wrong score and evalue
In-Reply-To:
Message-ID: <200802051952.m15JqIVt022690@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2445

------- Comment #7 from cjfields at uiuc.edu 2008-02-05 14:52 EST -------
(In reply to comment #6)
> With the new SearchIO.pm and blast.pm from SVN the parsing is different.
> However $hit->score and $hit->raw_score do not return values.
> $hit->bits does work.
>
> while (my $result = $in->next_result ) {
>     while ( my $hit = $result->next_hit) {
>         print $hit->name, "\t";
>         print $hit->bits, "\t";
>         print $hit->significance, "\n";
>     }
> }

In general with complex fixes like this it's better to grab the whole distribution just in case. You'll have to at least replace Bio::Search::HSP::GenericHSP, Bio::Search::Hit::GenericHit, and Bio::SearchIO::SearchResultEventBuilder on top of what you already have, though there may be others.

From cjfields at dev.open-bio.org Tue Feb 5 15:04:18 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 5 Feb 2008 15:04:18 -0500
Subject: [Bioperl-guts-l] [14488] bioperl-live/trunk/Bio/Search/HSP/GenericHSP.pm: bug 2444 (HSP::significance returning wrong value)
Message-ID: <200802052004.m15K4I1C002217@dev.open-bio.org>

Revision: 14488
Author: cjfields
Date: 2008-02-05 15:04:18 -0500 (Tue, 05 Feb 2008)

Log Message:
-----------
bug 2444 (HSP::significance returning wrong value)

Modified Paths:
--------------
bioperl-live/trunk/Bio/Search/HSP/GenericHSP.pm

Modified: bioperl-live/trunk/Bio/Search/HSP/GenericHSP.pm
===================================================================
--- bioperl-live/trunk/Bio/Search/HSP/GenericHSP.pm 2008-02-05 17:59:55 UTC (rev 14487)
+++ bioperl-live/trunk/Bio/Search/HSP/GenericHSP.pm 2008-02-05 20:04:18 UTC (rev 14488)
@@ -900,12 +900,12 @@
 # Override significance to return the e-value or, if this is
 # not defined (WU-BLAST), return the p-value.
 sub significance {
-    my $self = shift;
-    my $signif = $self->query->significance(@_) if @_;
-    if (!defined $signif) {
-        $signif = $self->pvalue || $self->query->significance;
+    my ($self, $val) = @_;
+    $self->query->significance($val) if $val;
+    if (!defined $self->{SIGNIFICANCE} || $val) {
+        $self->{SIGNIFICANCE} = $val || $self->pvalue || $self->evalue;
     }
-    return $signif;
+    return $self->{SIGNIFICANCE};
 }

 =head2 score

From bugzilla-daemon at portal.open-bio.org Tue Feb 5 15:26:09 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 5 Feb 2008 15:26:09 -0500
Subject: [Bioperl-guts-l] [Bug 2445] Bio::SearchIO BLAST parser gets wrong score and evalue
In-Reply-To:
Message-ID: <200802052026.m15KQ9xK025120@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2445

------- Comment #8 from bernd at bio.vu.nl 2008-02-05 15:26 EST -------
hi chris,

With newest bioperl-live parsing looks OK.
However, I uncovered another related "score" issue for HTMLWriter:

In HTMLResultWriter the raw_score or score is used as bit score (line 283-286 in HTMLResultWriter.pm):

($hit->raw_score ? $hit->raw_score :
(defined $hsps[0] ? $hsps[0]->score : ' ')),

can be replaced by $hit->bits.
From cjfields at dev.open-bio.org Tue Feb 5 15:35:32 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 5 Feb 2008 15:35:32 -0500
Subject: [Bioperl-guts-l] [14489] bioperl-live/trunk: * small rnamotif fix
Message-ID: <200802052035.m15KZW7n002337@dev.open-bio.org>

Revision: 14489
Author: cjfields
Date: 2008-02-05 15:35:32 -0500 (Tue, 05 Feb 2008)

Log Message:
-----------
* small rnamotif fix
* significance conforms to spec (evalue, then pvalue)
* update tests

Modified Paths:
--------------
bioperl-live/trunk/Bio/Search/HSP/GenericHSP.pm
bioperl-live/trunk/Bio/SearchIO/rnamotif.pm
bioperl-live/trunk/t/RNA_SearchIO.t

Modified: bioperl-live/trunk/Bio/Search/HSP/GenericHSP.pm
===================================================================
--- bioperl-live/trunk/Bio/Search/HSP/GenericHSP.pm 2008-02-05 20:04:18 UTC (rev 14488)
+++ bioperl-live/trunk/Bio/Search/HSP/GenericHSP.pm 2008-02-05 20:35:32 UTC (rev 14489)
@@ -901,9 +901,9 @@
 # not defined (WU-BLAST), return the p-value.
 sub significance {
     my ($self, $val) = @_;
-    $self->query->significance($val) if $val;
     if (!defined $self->{SIGNIFICANCE} || $val) {
-        $self->{SIGNIFICANCE} = $val || $self->pvalue || $self->evalue;
+        $self->query->significance($val) if $val;
+        $self->{SIGNIFICANCE} = $val || $self->evalue || $self->pvalue;
     }
     return $self->{SIGNIFICANCE};
 }

Modified: bioperl-live/trunk/Bio/SearchIO/rnamotif.pm
===================================================================
--- bioperl-live/trunk/Bio/SearchIO/rnamotif.pm 2008-02-05 20:04:18 UTC (rev 14488)
+++ bioperl-live/trunk/Bio/SearchIO/rnamotif.pm 2008-02-05 20:35:32 UTC (rev 14489)
@@ -280,6 +280,7 @@
         chomp $line;
         my $hspid = $1;
         my ($score, $strand, $start, $length , $seq) = ($2, $3, $4, $5, $6);
+        $score *= 1; # implicitly cast any odd '0.000' to float
         # sanity check ids
         unless ($hitid eq $hspid) {
             $self->throw("IDs do not match!");

Modified: bioperl-live/trunk/t/RNA_SearchIO.t
===================================================================
--- bioperl-live/trunk/t/RNA_SearchIO.t 2008-02-05 20:04:18 UTC (rev 14488)
+++ bioperl-live/trunk/t/RNA_SearchIO.t 2008-02-05 20:35:32 UTC (rev 14489)
@@ -662,8 +662,8 @@
 is($hit->num_hsps, 8, "Hit num_hsps");
 is($hit->overlap, 0, "Hit overlap");
 is($hit->rank, 1, "Hit rank");
-is($hit->raw_score, '', "Hit raw_score");
-is($hit->score, '', "Hit score");
+is($hit->raw_score, 0, "Hit raw_score");
+is($hit->score, 0, "Hit score");
 is($hit->significance, undef, "Hit significance");

 $hsp = $hit->next_hsp;

From bugzilla-daemon at portal.open-bio.org Tue Feb 5 15:37:14 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 5 Feb 2008 15:37:14 -0500
Subject: [Bioperl-guts-l] [Bug 2444] Bio::SearchIO blast signifance contains comma
In-Reply-To:
Message-ID: <200802052037.m15KbEi4025798@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2444

------- Comment #4 from cjfields at uiuc.edu 2008-02-05 15:37 EST -------
Added fixes for HSP::significance() to Subversion. Bernd, if this works could you close this out as well as bug 2445 if it also solves the issue there?

From cjfields at dev.open-bio.org Tue Feb 5 15:39:14 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 5 Feb 2008 15:39:14 -0500
Subject: [Bioperl-guts-l] [14490] bioperl-live/trunk/Bio/SeqIO/genbank.pm: handles Bio::Taxon as well as Bio::Species now (might be worth propagating to other SeqIO modules?)
Message-ID: <200802052039.m15KdEMf002367@dev.open-bio.org>

Revision: 14490
Author: cjfields
Date: 2008-02-05 15:39:14 -0500 (Tue, 05 Feb 2008)

Log Message:
-----------
handles Bio::Taxon as well as Bio::Species now (might be worth propagating to other SeqIO modules?)

Modified Paths:
--------------
bioperl-live/trunk/Bio/SeqIO/genbank.pm

Modified: bioperl-live/trunk/Bio/SeqIO/genbank.pm
===================================================================
--- bioperl-live/trunk/Bio/SeqIO/genbank.pm 2008-02-05 20:35:32 UTC (rev 14489)
+++ bioperl-live/trunk/Bio/SeqIO/genbank.pm 2008-02-05 20:39:14 UTC (rev 14490)
@@ -845,10 +845,24 @@
     # Organism lines
     if (my $spec = $seq->species) {
-        my ($on, $sn, $cn) = ($spec->organelle,
+        my ($on, $sn, $cn) = ($spec->can('organelle') ? $spec->organelle : '',
                               $spec->scientific_name,
                               $spec->common_name);
-
+        my @classification;
+        if ($spec->isa('Bio::Species')) {
+            @classification = $spec->classification;
+            shift(@classification);
+        } else {
+            # Bio::Taxon should have a DB handle of some type attached, so
+            # derive the classification from that
+            my $node = $spec;
+            while ($node) {
+                $node = $node->ancestor || last;
+                unshift(@classification, $node->node_name);
+                #$node eq $root && last;
+            }
+            @classification = reverse @classification;
+        }
         my $abname = $spec->name('abbreviated') ? # from genbank file
                      $spec->name('abbreviated')->[0] : $sn;
         my $sl = $on ? "$on " : '';
@@ -856,8 +870,6 @@
         $self->_write_line_GenBank_regex("SOURCE ", ' 'x12, $sl, "\\s\+\|\$",80);
         $self->_print(" ORGANISM ", $spec->scientific_name, "\n");
-        my @classification = $spec->classification;
-        shift(@classification);
         my $OC = join('; ', (reverse(@classification))) .'.';
         $self->_write_line_GenBank_regex(' 'x12,' 'x12, $OC,"\\s\+\|\$",80);

From bugzilla-daemon at portal.open-bio.org Tue Feb 5 16:01:10 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 5 Feb 2008 16:01:10 -0500
Subject: [Bioperl-guts-l] [Bug 2445] Bio::SearchIO BLAST parser gets wrong score and evalue
In-Reply-To:
Message-ID: <200802052101.m15L1AOc026876@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2445

------- Comment #9 from cjfields at uiuc.edu 2008-02-05 16:01 EST -------
(In reply to comment #8)
> hi chris,
>
> With newest bioperl-live parsing looks OK.
> However, I uncovered another related "score" issue for HTMLWriter:
>
> In HTMLResultWriter the raw_score or score is used as bit score (line 283-286
> in HTMLResultWriter.pm):
>
> ($hit->raw_score ? $hit->raw_score :
> (defined $hsps[0] ? $hsps[0]->score : ' ')),
>
> can be replaced by $hit->bits.

All the various writers will need updating to accommodate the latest fixes. I have already added one to TextResultWriter; I'll work on HTMLResultWriter next. I may not bother with the others as they're rarely used AFAIK.
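The ancestor walk that revision 14490 adds to genbank.pm for objects that are not Bio::Species can be pictured with a toy tree. The sketch below is a minimal illustration under stated assumptions: ToyTaxon is an invented stand-in that only mimics the ancestor and node_name calls seen in the diff, the four-node lineage is made up, and lineage_line is a hypothetical helper that builds the semicolon-separated string in one pass rather than reproducing the committed code line for line.

```perl
use strict;
use warnings;

# Invented stand-in for a taxon node; not a BioPerl class.
package ToyTaxon;
sub new       { my ($class, %args) = @_; return bless {%args}, $class }
sub node_name { $_[0]{name} }
sub ancestor  { $_[0]{parent} }

package main;

# Made-up lineage, root first.
my $root    = ToyTaxon->new(name => 'Eukaryota');
my $family  = ToyTaxon->new(name => 'Hominidae',    parent => $root);
my $genus   = ToyTaxon->new(name => 'Homo',         parent => $family);
my $species = ToyTaxon->new(name => 'Homo sapiens', parent => $genus);

# Climb from the species towards the root, collecting ancestor names, then
# emit them root-first. The species itself is left out, as in the diff.
sub lineage_line {
    my ($node) = @_;
    my @classification;
    while (my $parent = $node->ancestor) {
        push @classification, $parent->node_name;
        $node = $parent;
    }
    return join('; ', reverse @classification) . '.';
}

my $oc = lineage_line($species);
print "$oc\n";
```

For this toy lineage the result is "Eukaryota; Hominidae; Homo.", the same root-first ordering that the join over the reversed classification list produces in the diff.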
From cjfields at dev.open-bio.org Tue Feb 5 16:39:18 2008
From: cjfields at dev.open-bio.org (Christopher John Fields)
Date: Tue, 5 Feb 2008 16:39:18 -0500
Subject: [Bioperl-guts-l] [14491] bioperl-live/trunk/Bio/DB/SeqHound.pm: changed URL
Message-ID: <200802052139.m15LdIHB002515@dev.open-bio.org>

Revision: 14491
Author: cjfields
Date: 2008-02-05 16:39:18 -0500 (Tue, 05 Feb 2008)

Log Message:
-----------
changed URL

Modified Paths:
--------------
bioperl-live/trunk/Bio/DB/SeqHound.pm

Modified: bioperl-live/trunk/Bio/DB/SeqHound.pm
===================================================================
--- bioperl-live/trunk/Bio/DB/SeqHound.pm 2008-02-05 20:39:14 UTC (rev 14490)
+++ bioperl-live/trunk/Bio/DB/SeqHound.pm 2008-02-05 21:39:18 UTC (rev 14491)
@@ -86,8 +86,9 @@
 use POSIX qw(strftime);

 use base qw(Bio::DB::WebDBSeqI Bio::Root::Root);
+
 BEGIN {
-    $HOSTBASE = 'http://seqhound.blueprint.org';
+    $HOSTBASE = 'http://dogboxonline.unleashedinformatics.com';
     $CGILOCATION = '/cgi-bin/seqrem?fnct=';
     $LOGFILENAME = 'shoundlog';
 }

From bugzilla-daemon at portal.open-bio.org Wed Feb 6 07:52:44 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 6 Feb 2008 07:52:44 -0500
Subject: [Bioperl-guts-l] [Bug 2445] Bio::SearchIO BLAST parser gets wrong score and evalue
In-Reply-To:
Message-ID: <200802061252.m16Cqi1h009982@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2445

bernd at bio.vu.nl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #10 from bernd at bio.vu.nl 2008-02-06 07:52 EST -------
HTMLResultWriter would need changing of the indicated lines (line 283-286).

With respect to BLAST reports: E-value is now OK, i.e. without the comma.
However, the BLAST reports have hits as:

>gb|AAA40756.1| ORF2
Length=336

 Score = 597 bits (1540), Expect = 2e-169, Method: Compositional matrix adjust.

The ", Method: Compositional matrix adjust." is lost in the HTMLResultWriter output.

I'll close this bug.

From bugzilla-daemon at portal.open-bio.org Wed Feb 6 07:53:09 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 6 Feb 2008 07:53:09 -0500
Subject: [Bioperl-guts-l] [Bug 2445] Bio::SearchIO BLAST parser gets wrong score and evalue
In-Reply-To:
Message-ID: <200802061253.m16Cr9ST010009@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2445

bernd at bio.vu.nl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

From bugzilla-daemon at portal.open-bio.org Wed Feb 6 07:58:53 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 6 Feb 2008 07:58:53 -0500
Subject: [Bioperl-guts-l] [Bug 2444] Bio::SearchIO blast signifance contains comma
In-Reply-To:
Message-ID: <200802061258.m16CwrWl010266@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2444

------- Comment #5 from bernd at bio.vu.nl 2008-02-06 07:58 EST -------
With the newest code (06-02-08; revision 14492).
Significance is OK now:
$hsp->evalue, $hsp->expect, $hsp->significance give the same "Evalue" now

$hit->significance and $hit->expect give same output ($hit->evalue does not exist)
$hit->score now is the same as $hit->raw_score

Remark: was $hsp->significance intended to return the $hit->significance or the highest $hsp->evalue?

From bugzilla-daemon at portal.open-bio.org Wed Feb 6 08:45:22 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 6 Feb 2008 08:45:22 -0500
Subject: [Bioperl-guts-l] [Bug 2444] Bio::SearchIO blast signifance contains comma
In-Reply-To:
Message-ID: <200802061345.m16DjMf2012905@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2444

------- Comment #6 from cjfields at uiuc.edu 2008-02-06 08:45 EST -------
(In reply to comment #5)
> With the newest code (06-02-08; revision 14492).
> Significance is OK now:
> $hsp->evalue, $hsp->expect, $hsp->significance give the same "Evalue" now
>
> $hit->significance and $hit->expect give same output ($hit->evalue does not
> exist)
> $hit->score now is the same as $hit->raw_score
>
> Remark: was $hsp->significance intended to return the $hit->significance or the
> highest $hsp->evalue?

$hit->significance is supposed to give the best HSP pvalue or evalue, with pvalue taking precedence; if both exist (WUBLAST I think), it reports evalue.

$hsp->significance just gives the best pvalue or evalue for that HSP.

I think 'significance' in this context is used to denote how significant the hit/HSP is for a particular report using one method (as opposed to doing '$val = $hsp->pvalue || $hsp->evalue'). Some reports only use pvalue, others use evalue, and some use both. I never use it personally (I use evalue or pvalue directly as it is self-documenting and not as vague).

Is this fixed now? Should I close it out?
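The idea of a single significance accessor that hides whether a report carries an E-value, a P-value, or both can be sketched with a toy class. This is only an illustration under assumptions: ToyHSP and its fields are invented, the evalue-then-pvalue order follows the log message of revision 14489, and the defined-or operator // is used here in place of the || seen in the committed code. That swap is this sketch's own choice, made so that an E-value of exactly 0 is returned rather than skipped.

```perl
use strict;
use warnings;
use 5.010;    # for say and the defined-or operator //

# Toy stand-in for an HSP; the class and field names are invented.
package ToyHSP;
sub new    { my ($class, %args) = @_; return bless {%args}, $class }
sub evalue { $_[0]{evalue} }
sub pvalue { $_[0]{pvalue} }

# Prefer the E-value and fall back to the P-value when no E-value is defined.
sub significance {
    my ($self) = @_;
    return $self->evalue // $self->pvalue;
}

package main;

my $e_only = ToyHSP->new(evalue => 2e-95);
my $p_only = ToyHSP->new(pvalue => 0.003);
my $zero_e = ToyHSP->new(evalue => 0, pvalue => 0.5);

say $e_only->significance;
say $p_only->significance;
say $zero_e->significance;
```

With || the third object would report 0.5, because a numeric 0 is false in Perl; with // it reports 0. Calling evalue or pvalue directly, as suggested in the comment above, avoids the question entirely.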
From bugzilla-daemon at portal.open-bio.org Wed Feb 6 08:55:50 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 6 Feb 2008 08:55:50 -0500
Subject: [Bioperl-guts-l] [Bug 2444] Bio::SearchIO blast signifance contains comma
In-Reply-To:
Message-ID: <200802061355.m16Dtoxo013321@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2444

bernd at bio.vu.nl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #7 from bernd at bio.vu.nl 2008-02-06 08:55 EST -------
it is fixed, you can close.

From bugzilla-daemon at portal.open-bio.org Wed Feb 6 11:59:44 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 6 Feb 2008 11:59:44 -0500
Subject: [Bioperl-guts-l] [Bug 2445] Bio::SearchIO BLAST parser gets wrong score and evalue
In-Reply-To:
Message-ID: <200802061659.m16GxiMF022607@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2445

bernd at bio.vu.nl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|CLOSED                      |REOPENED
         Resolution|FIXED                       |

------- Comment #11 from bernd at bio.vu.nl 2008-02-06 11:59 EST -------
With the new code the parser still gets wrong values for some entries.
Example BLAST output will be uploaded.
Problem could be mainly apparent when the entries in the description line do not have a corresponding alignment (and values to parse from there).

Code line with problem:
blast.pm line 643:
if ($descline =~ /(?

Message-ID: <200802061717.m16HHIG4023830@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2445

------- Comment #12 from bernd at bio.vu.nl 2008-02-06 12:17 EST -------
Created an attachment (id=862)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=862&action=view)
NCBI BLAST output

Note output for e.g. 1DGH
pdb|1DGH|B bits: 4-... sign:23.1
description line is:
pdb|1DGH|A Chain A, Human Erythrocyte Catalase 3-Amino-1,2,4-... 23.1 64

From bugzilla-daemon at portal.open-bio.org Thu Feb 7 06:10:23 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 7 Feb 2008 06:10:23 -0500
Subject: [Bioperl-guts-l] [Bug 2445] Bio::SearchIO BLAST parser