From cjfields at uiuc.edu Sun Oct 1 13:05:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 12:05:25 -0500
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
References: <000001c6e3e6$81630010$15327e82@pyrimidine> <6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net> <79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu> <451E3707.4090400@sendu.me.uk> <0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu> <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
Message-ID: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>

On Sep 30, 2006, at 4:43 PM, Hilmar Lapp wrote:

>
> On Sep 30, 2006, at 10:57 AM, Chris Fields wrote:
>
>> There should be a failed test to let us know of the problem. As
>> currently set up, the XEMBL server failure doesn't show up in
>> Test::Harness test summaries. Biblio_biofetch.t had the similar
>> problems before Brian's fixes.
>
> Just keep in mind that you may not want somebody's CPAN installation
> to fail (or require a 'forced' install) just because some server
> happens to be down for maintenance.
>
> -hilmar

I don't think this would be a problem unless users specifically set
BIOPERLDEBUG to 1, which is something most people don't bother with
before installation (and probably not something we should promote for
normal installation anyway). So, for CPAN installation we would suggest
that BIOPERLDEBUG be 0 or not set at all, and outline the reasons why.

The idea is to retain current behavior (remote DB access will not be
run unless BIOPERLDEBUG is set to 1) and apply it to all tests
requiring such access. Otherwise, just those tests are skipped (and
not the rest of the tests, which occurs currently). If BIOPERLDEBUG
is set, the next tests would check the URL, which passes/fails (based
on the specific value of $@), and runs/skips tests based on the mere
presence of $@, which indicates some URL issue.
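
A minimal sketch of the scheme described above, assuming Test::More. Note that fetch_remote() is a hypothetical stand-in for whatever remote call a given test script makes (it is not a BioPerl function), and the test counts are only illustrative:

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical stand-in for a real remote-database call. Here it always
# dies, to mimic a server that is down; a real test would query the server.
sub fetch_remote { die "server unreachable\n" }

# A local test that never needs the network and is always run.
ok(1, 'local test');

SKIP: {
    # Default (e.g. CPAN install): skip only the three remote tests.
    skip('BIOPERLDEBUG not set, remote tests not run', 3)
        unless $ENV{BIOPERLDEBUG};

    # Check the URL first; copy $@ at once so nothing can clobber it.
    my $result = eval { fetch_remote() };
    my $err    = $@;

    # The connection check is itself a test, so a dead server shows up
    # as a failure in the Test::Harness summary.
    ok(!$err, 'remote server reachable') or diag($err);

    # The remaining remote tests are skipped whenever $@ was set.
    skip('remote server unavailable', 2) if $err;

    ok(defined $result, 'got a result back');
    is($result, 'expected value', 'result is what we expect');
}
```

With BIOPERLDEBUG unset, the three remote tests are reported as skipped and the local test still runs. With BIOPERLDEBUG=1, the connection check is counted as a pass or a fail, and only the tests that depend on the connection are skipped when it fails.
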
You can do this with Test::More, but I'm not sure this can be done with Test.pm or Test::Simple. The current behavior just skips all tests based on a single failed URL. Then, Test::Harness, as currently set, shows skipped tests as passed. The last run I posted previously where XEMBL_DB.t remote DB tests failed, I also ran all tests (make test) and get this, which doesn't tell us that the remote URL failed: ----------------------------------------- ... t/WABA.......................ok t/XEMBL_DB...................ok t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ztr.t tests ok All tests successful, 5 subtests skipped. ----------------------------------------- Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Oct 1 13:17:24 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 1 Oct 2006 12:17:24 -0500 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: References: <7A592EAB-A869-4A6C-BFA8-F73F3DFD8F5B@gmx.net> <09FB1EB0-2C1C-4FCF-8339-E78556EFEFF2@uiuc.edu> <8D75FE6D-C02D-4A86-93FA-B7256050AF11@uiuc.edu> <40155903-555A-4662-BCCE-38E5E3784118@uiuc.edu> <54E79A5F-5446-4D8E-AD26-B70894048D60@gmx.net> <1D69005A-DF0E-4F37-93FE-7577A32CC625@gmx.net> Message-ID: The '-w' flag on the shebang line is the source of those errors. I never set it anymore on Windows due to this; I just use the 'use warnings' pragma. If you use 'perl -I. t/test.t' you can normally get around the '-w' assumed by using 'make test'. I will try running tests on bioperl-db and bioperl tomorrow on WinXP to confirm these. Chris On Sep 30, 2006, at 6:10 PM, Seth Johnson wrote: > How do I get rid of all of the warnings for "redefined subroutines" > during > the test?? It clutters the output and I can't see the errors. 
> > On 9/30/06, Hilmar Lapp wrote: >> >> It doesn't shed more light but it does raise an alert flag. All tests >> are supposed to pass. The fact that they don't means the problems you >> are seeing have nothing to do with your specific data or script. >> >> First off - can anyone else confirm those errors using the latest >> Bioperl-db and Bioperl? >> >> Second - Seth could you run those tests individually, e.g., using >> >> $ make test test_02species TEST_VERBOSE=1 >> >> and similarly for the other tests that have failures and post the >> output. Let's start with 02species and 03simpleseq. >> >> -hilmar >> >> On Sep 30, 2006, at 5:44 PM, Seth Johnson wrote: >> >>> There are errors during the test. Here's their summary: >>> ____________________________ >>> Failed Test Stat Wstat Total Fail Failed List of Failed >>> ------------------------------------------------------------- >>> t\02species.t 65 2 3.08% 63 65 >>> t\03simpleseq.t 1 256 59 106 179.66% 7-59 >>> t\04swiss.t 52 14 26.92% 25 27-34 38-42 >>> t\12ontology.t 2 512 738 1471 199.32% 3-738 >>> t\16obda.t 12 3 25.00% 10-12 >>> ____________________________ >>> >>> May be that can shed some light on the problem?!?! >>> >>> On 9/29/06, Hilmar Lapp < hlapp at gmx.net> wrote:This may in fact be >>> a knock-on effect of the fixes? >>> >>> Seth, did you run the test suite that comes with bioperl-db, and did >>> you get any errors? >>> >>> -hilmar >>> >>> On Sep 28, 2006, at 2:26 PM, Chris Fields wrote: >>> >>>> Seth, >>>> >>>> The organism issue is a bug and has been reported, though I thought >>>> it was fixed. >>>> >>>> The lack of the date and the version is a bit odd, but there have >>>> been a lot of changes lately to bioperl-live (core bioperl in CVS), >>>> and a few to bioperl-db. How old is your bioperl and bioperl-db >>>> installation. Hilmar, any additional thoughts? >>>> >>>> Chris >>>> >>>> On Sep 28, 2006, at 11:10 AM, Seth Johnson wrote: >>>> >>>>> Thank you. 
That takes care of that, however, I do have another >>>>> gripe. When >>>>> running my script, quoted before, with "my $out = >>>>> Bio::SeqIO->newFh('-format' => 'genbank');", I have several key >>>>> pieces of >>>>> information missing. The most important one is the version >>>>> number. There's >>>>> also a date missing, and source organism name is corrupted. >>>>> Here's what I >>>>> get: >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> LOCUS NM_014580 2145 bp dna linear >>>>> UNK >>>>> DEFINITION Homo sapiens solute carrier family 2, (facilitated >>>>> glucose >>>>> transporter) member 8 (SLC2A8), mRNA. >>>>> ACCESSION NM_014580 >>>>> SOURCE sapiens. >>>>> ORGANISM sapiens >>>>> Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; >>>>> Bilateria; >>>>> Coelomata; Deuterostomia; Chordata; Craniata; >>> Vertebrata; >>>>> Gnathostomata; Teleostomi; Euteleostomi; >>>>> Sarcopterygii; >>>>> Tetrapoda; >>>>> Amniota; Mammalia; Theria; Eutheria; Euarchontoglires; >>>>> Primates; >>>>> Haplorrhini; Simiiformes; Catarrhini; Hominoidea; >>>>> Hominidae; >>>>> Homo/Pan/Gorilla group; Homo. >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> All of the missing information is stored in BioSQL and >>>>> theoretically should >>>>> be in the outpu. Here's how NCBI genbank file looks: >>>>> >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> LOCUS NM_014580 2145 bp mRNA linear >>>>> PRI 17-OCT-2005 >>>>> DEFINITION Homo sapiens solute carrier family 2, (facilitated >>>>> glucose >>>>> transporter) member 8 (SLC2A8), mRNA. >>>>> ACCESSION NM_014580 >>>>> VERSION NM_014580.3 GI:51870928 >>>>> KEYWORDS . >>>>> SOURCE Homo sapiens (human) >>>>> ORGANISM Homo sapiens >>>>> >>>>> Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; >>>>> Euteleostomi; >>>>> Mammalia; Eutheria; Euarchontoglires; Primates; >>>>> Haplorrhini; >>>>> Catarrhini; Hominidae; Homo. 
>>>>> >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> >>>>> On 9/28/06, Chris Fields wrote: >>>>>> >>>>>> Those are from the excessively paranoid '-w' flag on the shebang >>>>>> line. If you remove the flag but add the 'use warnings' pragma >>> the >>>>>> 'subroutine x redefined' warnings go away. This, BTW, is one >>> of the >>>>>> quirks of the ActivePerl distribution; other OSs don't have the >>> same >>>>>> problem. >>>>>> >>>>>> The 'solution' described on that page is actually a workaround, >>>>>> not a >>>>>> bugfix. It causes problems with stack traces with error handling >>>>>> but >>>>>> seems harmless beyond that. I haven't been able to find a >>>>>> satisfactory fix which works on all OS's. >>>>>> >>>>>> Chris >>>>>> >>>>>> >>>>>> On Sep 28, 2006, at 10:42 AM, Seth Johnson wrote: >>>>>> >>>>>>> This is under Windows, but using ActiveState Komodo 3.5 and >>>>>>> their >>>>>>> latest Perl for Windows and latest BioPerl & BioPerl-db from >>>>>>> CVS. >>>>>>> >>>>>>> I actually just stumbled upon a solution. It's described in the >>>>>>> "Installing Bioperl on Windows" by adding a comma after >>> $class: in >>>>>>> Bio::Root::Root throw() subroutine. Thanks for hinting me about >>>>>>> what I run it on. >>>>>>> >>>>>>> The code works now, BUT it spews whole bunch of warnings about >>>>>>> "Subroutine .... redefined": >>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\BioEntry >>>>>>> .pm line 88. >>>>>>> Subroutine object_id redefined at c:/Perl/site/lib/Bio >>> \BioEntry.pm >>>>>>> line 128. >>>>>>> Subroutine version redefined at c:/Perl/site/lib/Bio\BioEntry.pm >>>>>>> line 150. >>>>>>> Subroutine authority redefined at c:/Perl/site/lib/Bio >>> \BioEntry.pm >>>>>>> line 171. >>>>>>> Subroutine namespace redefined at c:/Perl/site/lib/Bio >>> \BioEntry.pm >>>>>>> line 192. >>>>>>> Subroutine display_name redefined at c:/Perl/site/lib/Bio >>>>>>> \BioEntry.pm line 217. 
>>>>>>> Subroutine description redefined at c:/Perl/site/lib/Bio >>>>>>> \BioEntry.pm line 241. >>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\Root.pm >>> line >>>>>>> 201. >>>>>>> Subroutine verbose redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \Root.pm >>>>>>> line 234. >>>>>>> Subroutine _register_for_cleanup redefined at c:/Perl/site/lib/ >>> Bio >>>>>>> \Root\Root.pm line 246. >>>>>>> Subroutine _unregister_for_cleanup redefined at c:/Perl/site/ >>>>>>> lib/ >>>>>>> Bio >>>>>>> \Root\Root.pm line 256. >>>>>>> Subroutine _cleanup_methods redefined at c:/Perl/site/lib/Bio >>> \Root >>>>>>> \Root.pm line 263. >>>>>>> Subroutine throw redefined at c:/Perl/site/lib/Bio\Root\Root.pm >>>>>>> line 316. >>>>>>> Subroutine debug redefined at c:/Perl/site/lib/Bio\Root\Root.pm >>>>>>> line 379. >>>>>>> Subroutine _load_module redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \Root.pm line 398. >>>>>>> Subroutine DESTROY redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \Root.pm >>>>>>> line 426. >>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\RootI.pm >>> line >>>>>>> 117. >>>>>>> Subroutine _initialize redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \RootI.pm line 128. >>>>>>> ... >>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>>>> >>>>>>> >>>>>>> On 9/28/06, Chris Fields wrote: I had >>> problems >>>>>>> with bioperl-db on native WinXP (not cygwin), but I >>>>>>> did manage to get it running in cygwin with some effort. The >>> issue >>>>>>> on native WinXP was related to Bio::Root::Root::throw(), though. >>>>>>> >>>>>>> There is a bug and workaround filed on Bugzilla, but I haven't >>>>>>> worked >>>>>>> on it in a while (and the workaround has some problems as >>> well). I >>>>>>> may try running it again to see what happens. >>>>>>> >>>>>>> http://bugzilla.open-bio.org/show_bug.cgi?id=1938 >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On Sep 28, 2006, at 9:04 AM, Hilmar Lapp wrote: >>>>>>> >>>>>>>> Very odd. This is under Windows, presumably using Cygwin? 
>>>>>>>> >>>>>>>> The method Bio::Root::Root::throw() clearly exists, and >>>>>>>> PersistentObject inherits from it. The exception it was >>> trying to >>>>>>>> throw has nothing to do with failure or success to find the >>>>>>>> database >>>>>>>> row (actually it did succeed since otherwise it wouldn't >>> construct >>>>>>>> the object) but with dynamically loading a class, presumably >>>>>>>> Bio::DB::Persistent::Seq. >>>>>>>> >>>>>>>> Are you using the 1.5.x release of bioperl? >>>>>>>> >>>>>>>> Does anyone on the list have any experience with these sorts of >>>>>>>> things on Windows? >>>>>>>> >>>>>>>> (Seth, I've moved this thread to the bioperl list, since >>>>>>>> this is >>>>>>> what >>>>>>>> the problem is about.) >>>>>>>> >>>>>>>> -hilmar >>>>>>>> >>>>>>>> On Sep 27, 2006, at 1:39 PM, Seth Johnson wrote: >>>>>>>> >>>>>>>>> Hello guys, >>>>>>>>> >>>>>>>>> I successfully populated the biosql database, thanks to you. >>>>>>>>> Now, >>>>>>>>> I'm >>>>>>>>> trying to retrieve a sequence from it following the example >>> from >>>>>>>>> BOSC2003 >>>>>>>>> slides and ran into uninformative error (at least to me it >>>>>>>>> doesn't >>>>>>>>> mean >>>>>>>>> anyting). I suspect that I'm missing something and hope you >>> can >>>>>>>>> point me in >>>>>>>>> the right direction. 
Here's my source code: >>>>>>>>> >>>>>>> >>> ------------------------------------------------------------------- >>>>>>> -- >>>>>>>>> - >>>>>>>>> --- >>>>>>>>> #!/usr/bin/perl -w >>>>>>>>> use strict; >>>>>>>>> use warnings; >>>>>>>>> >>>>>>>>> use Bio::Seq; >>>>>>>>> use Bio::Seq::SeqFactory; >>>>>>>>> use Bio::DB::SimpleDBContext; >>>>>>>>> use Bio::DB::BioDB; >>>>>>>>> >>>>>>>>> my $dbc = Bio::DB::SimpleDBContext->new( >>>>>>>>> -driver => 'mysql', >>>>>>>>> -dbname => 'BioSQL_1', >>>>>>>>> -host => ' 192.168.1.3', >>>>>>>>> -user => 'xxxxx', >>>>>>>>> -pass => 'xxxxxx' >>>>>>>>> ); >>>>>>>>> >>>>>>>>> my $db = Bio::DB::BioDB->new(-database => 'biosql', >>>>>>>>> -dbcontext => $dbc); >>>>>>>>> >>>>>>>>> my $seq = Bio::Seq->new(-accession_number => 'NM_014580', - >>>>>>>>> namespace => >>>>>>>>> 'refseq_H_sapiens'); >>>>>>>>> my $seqfact = Bio::Seq::SeqFactory->new(-type => 'Bio::Seq'); >>>>>>>>> my $adp = $db->get_object_adaptor($seq); >>>>>>>>> my $dbseq = $adp->find_by_unique_key($seq, -obj_factory => >>>>>>> $seqfact); >>>>>>>>> >>>>>>>>> my $out = Bio::SeqIO->newFh('-format' => 'EMBL'); >>>>>>>>> print $out $dbseq; >>>>>>>>> >>>>>>>>> exit; >>>>>>>>> >>> ----------------------------------------------------------------- >>>>>>>>> >>>>>>>>> Just when the "find_by_unique_key" function is executed I >>> get the >>>>>>>>> following >>>>>>>>> error: >>>>>>>>> >>>>>>>>> ================================ >>>>>>>>> Undefined subroutine &Bio::Root::Root::throw called at >>>>>>>>> c:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm line >>> 199. >>>>>>>>> ================================ >>>>>>>>> >>>>>>>>> The sequence does exist in the database. I checked that. Any >>>>>>>>> ideas??? 
>>>>>>>>> >>>>>>>>> -- >>>>>>>>> Best Regards, >>>>>>>>> >>>>>>>>> >>>>>>>>> Seth Johnson >>>>>>>>> Senior Bioinformatics Associate >>>>>>>>> _______________________________________________ >>>>>>>>> BioSQL-l mailing list >>>>>>>>> BioSQL-l at lists.open-bio.org >>>>>>>>> http://lists.open-bio.org/mailman/listinfo/biosql-l >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> =========================================================== >>>>>>>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>>>>>>> =========================================================== >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Bioperl-l mailing list >>>>>>>> Bioperl-l at lists.open-bio.org >>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>>>> >>>>>>> Christopher Fields >>>>>>> Postdoctoral Researcher >>>>>>> Lab of Dr. Robert Switzer >>>>>>> Dept of Biochemistry >>>>>>> University of Illinois Urbana-Champaign >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Best Regards, >>>>>>> >>>>>>> >>>>>>> Seth Johnson >>>>>>> Senior Bioinformatics Associate >>>>>>> >>>>>>> Ph: (202) 470-0900 >>>>>>> Fx: (775) 251-0358 >>>>>> >>>>>> Christopher Fields >>>>>> Postdoctoral Researcher >>>>>> Lab of Dr. Robert Switzer >>>>>> Dept of Biochemistry >>>>>> University of Illinois Urbana-Champaign >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Best Regards, >>>>> >>>>> >>>>> Seth Johnson >>>>> Senior Bioinformatics Associate >>>>> >>>>> Ph: (202) 470-0900 >>>>> Fx: (775) 251-0358 >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> Christopher Fields >>>> Postdoctoral Researcher >>>> Lab of Dr. 
Robert Switzer >>>> Dept of Biochemistry >>>> University of Illinois Urbana-Champaign >>>> >>>> >>>> >>> >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >>> >>> >>> >>> -- >>> Best Regards, >>> >>> >>> Seth Johnson >>> Senior Bioinformatics Associate >>> >>> Ph: (202) 470-0900 >>> Fx: (775) 251-0358 >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> > > > -- > Best Regards, > > > Seth Johnson > Senior Bioinformatics Associate > > Ph: (202) 470-0900 > Fx: (775) 251-0358 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From osborne1 at optonline.net Sun Oct 1 17:49:47 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Sun, 01 Oct 2006 17:49:47 -0400 Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset In-Reply-To: <20061001183214.GB12075@iucha.net> Message-ID: Florin, This is fixed in CVS now. What had happened is that the DIP file had some minimal protein (node) entries where the only id available was DIP's internal identifier. Not ideal to have to use these as accessions but there's no other choice. Thank you for the note, and in the future write to bioperl-l since there may be others who are interested in hearing about what you've encountered. Brian O. On 10/1/06 2:32 PM, "Florin Iucha" wrote: > Hello, > > I have downloaded a CVS snapshot [1] of your module, bioperl-network, and > I am using it to read the 20060402 edition release of the DIP [2] dataset. 
> > Starting with the simple program you show in the man page: > > my $io = Bio::Network::IO->new(-format => 'psi', > -file => $ARGV[0]); > > my $network = $io->next_network; > > I get 772 instances of: > > Use of uninitialized value in string eq at > /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 326. > > I don't know if it is just an annoyance or something bad, so you might > want to take a look at it. > > Thank you for your work, > florin > > [1] http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-network/ > [2] http://dip.doe-mbi.ucla.edu/ From osborne1 at optonline.net Sun Oct 1 17:56:39 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Sun, 01 Oct 2006 17:56:39 -0400 Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset In-Reply-To: <20061001211844.GC12075@iucha.net> Message-ID: Florin, I'm not seeing any segmentation fault using the same file you're using as input (dip20060402.mif). I'm assuming you don't see this error when you use smaller files as input, like those in the t/data directory. When I watch the script in top I see Perl using about 135Mb (RSIZE) right before the script exits. How much memory do you use? Thank you for the note, and in the future write to bioperl-l since there may be others who are interested in hearing about what you've encountered. Brian O. On 10/1/06 5:18 PM, "Florin Iucha" wrote: > On Sun, Oct 01, 2006 at 01:32:14PM -0500, Florin Iucha wrote: >> I have downloaded a CVS snapshot [1] of your module, bioperl-network, and >> I am using it to read the 20060402 edition release of the DIP [2] dataset. > > Using the attached script, I am getting a segmentation fault at the > end, right after printing "That's all, Folks!" Maybe some cleanup is > going off in a wrong direction. 
> > florin From florin at iucha.net Sun Oct 1 20:24:03 2006 From: florin at iucha.net (Florin Iucha) Date: Sun, 1 Oct 2006 19:24:03 -0500 Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset In-Reply-To: References: <20061001211844.GC12075@iucha.net> Message-ID: <20061002002403.GD12075@iucha.net> On Sun, Oct 01, 2006 at 05:56:39PM -0400, Brian Osborne wrote: > I'm not seeing any segmentation fault using the same file you're using as > input (dip20060402.mif). I'm assuming you don't see this error when you use > smaller files as input, like those in the t/data directory. The t/data files are fine. Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the MINT [1] database does not produce the crash. It has a new warning, however: Can't call method "text" on an undefined value at /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290. > When I watch the script in top I see Perl using about 135Mb (RSIZE) right > before the script exits. How much memory do you use? "ps ux" tells me VSZ = 272788 and RSZ = 254992. This is on x86-64 with 64 bit perl. The box has 2 GB of physical memory so these numbers don't seem to be a concern. > Thank you for the note, and in the future write to bioperl-l since there may > be others who are interested in hearing about what you've encountered. Do'h! You have the list address loud and clear in three places, but I got your contact info from the AUTHORS. Will use the proper channel from now on! Thanks, florin [1] ftp://mint.bio.uniroma2.it/pub/release/psi1/ -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061001/901e447e/attachment.bin

From cjfields at uiuc.edu Mon Oct 2 00:35:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 23:35:22 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To:
Message-ID: <000001c6e5dc$2eceabe0$15327e82@pyrimidine>

Seth,

What version of MySQL and perl are you using? I'm using MySQL 5.0.18
(but am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819. I ran
into a few problems with bioperl-db tests which were unrelated to the
ones below, but I'm wondering if it is a difference in MySQL versions.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Seth Johnson
> Sent: Saturday, September 30, 2006 6:35 PM
> To: Hilmar Lapp
> Cc: Chris Fields; Bioperl List
> Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
>
> Here're complete test details:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
> FAILED tests 10-12 > Failed 3/12 tests, 75.00% okay > Failed Test Stat Wstat Total Fail Failed List of Failed > -------------------------------------------------------------------------- > ----- > t\02species.t 65 2 3.08% 63 65 > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > t\12ontology.t 2 512 738 1471 199.32% 3-738 > t\16obda.t 12 3 25.00% 10-12 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From torsten.seemann at infotech.monash.edu.au Mon Oct 2 02:06:50 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Mon, 02 Oct 2006 16:06:50 +1000 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net> References: <451C8ED8.2060003@infotech.monash.edu.au> <451CC40D.2030401@sendu.me.uk> <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net> Message-ID: <4520AC7A.1050009@infotech.monash.edu.au> >>> I have removed all use/@ISA Bio::Root::Object references from >>> bioperl-live, except for those in Bio::Root::* itself: >> So I'd say they're both relics that can be removed. In fact I was >> planning on getting rid off all references to both of these modules >> before you did, so thanks! :) > I think they can go. It's probably a pre-1.0 deprecation that somehow > was never followed through on. Today I did a fresh CVS checkout of bioperl-live, and deleted the following modules and tests, and all tests passed with BIOPERLDEBUG=0 * Bio::Root::Err * Bio::Root::Global * Bio::Root::IOManager * Bio::Root::Object * Bio::Root::Storable * Bio::Root::Utilities # may be used by third parties? * Bio::Root::Vector * Bio::Root::Xref * t/Root-Utilities.t # need to keep if we keep Utilities.pm * t/RootStorable.t Should we schedule for deprecation, or deprecate immediately as Hilmar suggested they were meant to be deprecated long ago ? 
-- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From bix at sendu.me.uk Mon Oct 2 05:40:02 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 02 Oct 2006 10:40:02 +0100 Subject: [Bioperl-l] Tests involving remote databases In-Reply-To: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu> References: <000001c6e3e6$81630010$15327e82@pyrimidine> <6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net> <79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu> <451E3707.4090400@sendu.me.uk> <0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu> <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net> <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu> Message-ID: <4520DE72.4000603@sendu.me.uk> Chris Fields wrote: > > The idea is to retain current behavior (remote DB access will not be > run unless BIOPERLDEBUG is set to 1) and apply it to all tests > requiring such access. Otherwise, just those tests are skipped (and > not the rest of the tests, which occurs currently). If BIOPERLDEBUG > is set, the next tests would check the URL, which passes/fails (based > on the specific value of $@), and runs/skips tests based on the mere > presence of $@, which indicates some URL issue. You can do this with > Test::More, but I'm not sure this can be done with Test.pm or > Test::Simple. Firstly, BIOPERLDEBUG should not be abused; it should be used only when you want to see extra debugging messages. There should be another variable that you can set to choose if network-requiring tests are run, and it should also be a configurable choice when you run perl Makefile.PL. (But changing this isn't going to happen for 1.5.2) When the server problem is ambiguous we should not fail the test. Just make the skip message visible and pass all ok... > The current behavior just skips all tests based on a single failed > URL. Then, Test::Harness, as currently set, shows skipped tests as > passed. 
The last run I posted previously where XEMBL_DB.t remote DB
> tests failed, I also ran all tests (make test) and get this, which
> doesn't tell us that the remote URL failed:
>
> -----------------------------------------
>
> ...
> t/WABA.......................ok
> t/XEMBL_DB...................ok
> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext
> is not installed or is installed incorrectly - skipping ztr.t tests
> ok
> All tests successful, 5 subtests skipped.

All you have to do to make it visible is start the skip message with
the word 'Skip':

skip('Skip server may be down',1);

...
t/WABA.......................ok
t/XEMBL_DB...................ok
        1/9 skipped: server may be down
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is
not installed or is installed incorrectly - skipping ztr.t tests
t/ztr........................ok

It's nicer when using Test::More.

From bix at sendu.me.uk Mon Oct 2 05:55:27 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 10:55:27 +0100
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au>
References: <451C8ED8.2060003@infotech.monash.edu.au> <451CC40D.2030401@sendu.me.uk> <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net> <4520AC7A.1050009@infotech.monash.edu.au>
Message-ID: <4520E20F.6040406@sendu.me.uk>

Torsten Seemann wrote:
> >>> I have removed all use/@ISA Bio::Root::Object references from
> >>> bioperl-live, except for those in Bio::Root::* itself:
>
> >> So I'd say they're both relics that can be removed. In fact I was
> >> planning on getting rid off all references to both of these modules
> >> before you did, so thanks! :)
>
>> I think they can go. It's probably a pre-1.0 deprecation that somehow
>> was never followed through on.
> > Today I did a fresh CVS checkout of bioperl-live, and deleted the > following modules and tests, and all tests passed with BIOPERLDEBUG=0 > > * Bio::Root::Err > * Bio::Root::Global > * Bio::Root::IOManager > * Bio::Root::Object > * Bio::Root::Storable > * Bio::Root::Utilities # may be used by third parties? > * Bio::Root::Vector > * Bio::Root::Xref > * t/Root-Utilities.t # need to keep if we keep Utilities.pm > * t/RootStorable.t > > Should we schedule for deprecation, or deprecate immediately as Hilmar > suggested they were meant to be deprecated long ago ? I'm happy to get rid of them all straight away. Does anyone object? From florin at iucha.net Sun Oct 1 21:40:07 2006 From: florin at iucha.net (Florin Iucha) Date: Sun, 1 Oct 2006 20:40:07 -0500 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 Message-ID: <20061002014007.GG12075@iucha.net> Hello, I am trying to install bioperl-network from CVS. I found this to require bioperl from CVS, which requires bioperl-ext from CVS. I have compiled and installed io_lib 1.10.1. 
After running "perl Makefile.PL; make test" in bioperl-ext I see a lot sources being compiled, then: cc -c -I./libs -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -DVERSION=\"1.5.1\" -DXS_VERSION=\"1.5.1\" -fPIC "-I/usr/lib/perl/5.8/CORE" -DPOSIX -DNOERROR Align.c Running Mkbootstrap for Bio::Ext::Align () chmod 644 Align.bs rm -f ../blib/arch/auto/Bio/Ext/Align/Align.so cc -shared -L/usr/local/lib Align.o -o ../blib/arch/auto/Bio/Ext/Align/Align.so libs/libsw.a \ -lm \ /usr/bin/ld: libs/libsw.a(aln.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC libs/libsw.a: could not read symbols: Bad value collect2: ld returned 1 exit status make[1]: *** [../blib/arch/auto/Bio/Ext/Align/Align.so] Error 1 make[1]: Leaving directory `/scratch/dmbio/tools/bioperl-ext/Bio/Ext/Align' make: *** [subdirs] Error 2 This is on a Debian AMD64 box: florin at zeus $ gcc -v Using built-in specs. 
Target: x86_64-linux-gnu Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release x86_64-linux-gnu Thread model: posix gcc version 4.1.2 20060901 (prerelease) (Debian 4.1.1-13) florin at zeus $ perl -V Summary of my perl5 (revision 5 version 8 subversion 8) configuration: Platform: osname=linux, osvers=2.6.16-1-vserver-amd64-k8, archname=x86_64-linux-gnu-thread-multi uname='linux excelsior 2.6.16-1-vserver-amd64-k8 #2 smp tue apr 4 03:40:49 utc 2006 x86_64 gnulinux ' config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.8 -Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=define use64bitall=define uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2', cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe 
-I/usr/local/include' ccversion='', gccversion='4.1.2 20060729 (prerelease) (Debian 4.1.1-10)', gccosandvers='' intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16 ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='cc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt perllibs=-ldl -lm -lpthread -lc -lcrypt libc=/lib/libc-2.3.6.so, so=so, useshrplib=true, libperl=libperl.so.5.8.8 gnulibc_version='2.3.6' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_64_BIT_ALL USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES USE_PERLIO USE_REENTRANT_API The compiler command line for aln.o is lacking -fPIC: cc -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DPOSIX -DNOERROR -c -o aln.o aln.c Adding -fPIC to the CCFLAGS variable in Bio/Ext/Align/Makefile and Makefile seems to take build further, but it fails with a similar error in Bio/SeqIO/staden/_Inline/build/Bio/SeqIO/staden/read. That Makefile seems to be regenerated every time I run 'make test' in the top level directory. 
The error in ../staden/read is: rm -f blib/arch/auto/Bio/SeqIO/staden/read/read.so cc -shared -L/usr/local/lib read.o -o blib/arch/auto/Bio/SeqIO/staden/read/read.so \ -L/usr/local/lib -lread -lz \ /usr/bin/ld: /usr/local/lib/libread.a(libread_a-Read.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC /usr/local/lib/libread.a: could not read symbols: Bad value collect2: ld returned 1 exit status make: *** [blib/arch/auto/Bio/SeqIO/staden/read/read.so] Error 1 So, the questions appears to be: - should "-fPIC" be appended to CFLAGS in the generated Makefiles? - is there anything wrong with io_lib flags? - has anybody built bioperl-ext on AMD64? I can help with debugging or testing if given a gentle nudge in the right direction, but I have little experience with the interactions between perl and static libraries on 64 bit. Thanks, florin -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061001/bc134c7e/attachment.bin From bix at sendu.me.uk Mon Oct 2 06:52:47 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 02 Oct 2006 11:52:47 +0100 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 In-Reply-To: <20061002014007.GG12075@iucha.net> References: <20061002014007.GG12075@iucha.net> Message-ID: <4520EF7F.40908@sendu.me.uk> Florin Iucha wrote: > Hello, > > I am trying to install bioperl-network from CVS. I found this to > require bioperl from CVS, which requires bioperl-ext from CVS. I can't help with the compile problems you encountered (other than to say I also have problems under AMD64), but from where did you get the idea that bioperl (live/core) requires bioperl-ext? 
It doesn't, though recent changes to Makefile.PL may give that impression... From cjfields at uiuc.edu Mon Oct 2 08:26:57 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 07:26:57 -0500 Subject: [Bioperl-l] Tests involving remote databases In-Reply-To: <4520DE72.4000603@sendu.me.uk> References: <000001c6e3e6$81630010$15327e82@pyrimidine> <6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net> <79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu> <451E3707.4090400@sendu.me.uk> <0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu> <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net> <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu> <4520DE72.4000603@sendu.me.uk> Message-ID: On Oct 2, 2006, at 4:40 AM, Sendu Bala wrote: > Chris Fields wrote: >> >> The idea is to retain current behavior (remote DB access will not be >> run unless BIOPERLDEBUG is set to 1) and apply it to all tests >> requiring such access. Otherwise, just those tests are skipped (and >> not the rest of the tests, which occurs currently). If BIOPERLDEBUG >> is set, the next tests would check the URL, which passes/fails (based >> on the specific value of $@), and runs/skips tests based on the mere >> presence of $@, which indicates some URL issue. You can do this with >> Test::More, but I'm not sure this can be done with Test.pm or >> Test::Simple. > > Firstly, BIOPERLDEBUG should not be abused; it should be used only > when > you want to see extra debugging messages. There should be another > variable that you can set to choose if network-requiring tests are > run, > and it should also be a configurable choice when you run perl > Makefile.PL. > > (But changing this isn't going to happen for 1.5.2) > > When the server problem is ambiguous we should not fail the test. Just > make the skip message visible and pass all ok... I agree, as well as with your assessment of BIOPERLDEBUG (which I alluded to in a previous post). Torsten suggested creating a new env. variable for network tests. 
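A minimal sketch of what a separate switch for network tests might look like with Test::More. The variable name BIOPERL_NETWORK_TESTS and the fetch_remote_record() stand-in are assumptions made up for illustration; nothing in this thread says what the real variable would be called. Only the remote tests are skipped, and the local test still runs:

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical switch: BIOPERL_NETWORK_TESTS is an illustrative name only,
# not an existing BioPerl environment variable.
sub network_tests_enabled {
    return $ENV{BIOPERL_NETWORK_TESTS} ? 1 : 0;
}

# Stand-in for a real remote query; it dies the way a failed request might.
sub fetch_remote_record {
    die "server may be down\n";
}

# A local test that never depends on the network.
ok(1, 'local test always runs');

SKIP: {
    # Skip only the two remote tests when network access was not requested.
    skip('network tests not requested', 2) unless network_tests_enabled();

    my $record = eval { fetch_remote_record() };
    if (my $err = $@) {
        chomp $err;
        # An ambiguous server problem is reported as a skip, not a failure.
        skip($err, 2);
    }

    ok(defined $record, 'got a record back');
    is($record, 'expected', 'record has expected content');
}
```

Either way the plan of three tests is met: one real test plus two skips when the variable is unset or the stand-in request fails.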
It's obvious this won't be done before 1.5.2, but we can make plans towards the next release. >> The current behavior just skips all tests based on a single failed >> URL. Then, Test::Harness, as currently set, shows skipped tests as >> passed. The last run I posted previously where XEMBL_DB.t remote DB >> tests failed, I also ran all tests (make test) and get this, which >> doesn't tell us that the remote URL failed: >> >> ----------------------------------------- >> >> ... >> t/WABA.......................ok >> t/XEMBL_DB...................ok >> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext >> is not installed or is installed incorrectly - skipping ztr.t tests >> ok >> All tests successful, 5 subtests skipped. > > All you have to do to make it visible is start the skip message > with the > work 'Skip': > > skip('Skip server may be down',1); > > ... > t/WABA.......................ok > > t/XEMBL_DB...................ok > > 1/9 skipped: server may be down > t/ztr........................Bio::SeqIO::staden::read of bioperl- > ext is > not installed or is installed incorrectly - skipping ztr.t tests > t/ztr........................ok > > > It's nicer when using Test::More. Okay, if Test::Harness picks that up it would be okay. We could use skip blocks to skip subsets of tests that require remote access (like SeqFeature.t) as opposed to skipping all tests. I think we want to avoid promoting running tests with BIOPERLDEBUG (or similar) upon installation for everyday installation anyway (such as from CPAN, which Hilmar points out). It's not something everybody installing a new BioPerl should be running unless they run into problems. Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From florin at iucha.net Mon Oct 2 08:15:06 2006 From: florin at iucha.net (Florin Iucha) Date: Mon, 2 Oct 2006 07:15:06 -0500 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 In-Reply-To: <4520EF7F.40908@sendu.me.uk> References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk> Message-ID: <20061002121506.GB14409@iucha.net> On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote: > Florin Iucha wrote: > > I am trying to install bioperl-network from CVS. I found this to > > require bioperl from CVS, which requires bioperl-ext from CVS. > > I can't help with the compile problems you encountered (other than to > say I also have problems under AMD64), but from where did you get the > idea that bioperl (live/core) requires bioperl-ext? It doesn't, though > recent changes to Makefile.PL may give that impression... Running the tests for bioperl-live mention in some places that 'this test has been skipped since $foo is not available' and I found the 'foos' in bioperl-ext. florin -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061002/8fc9df03/attachment.bin From bix at sendu.me.uk Mon Oct 2 10:05:11 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 02 Oct 2006 15:05:11 +0100 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 In-Reply-To: <20061002121506.GB14409@iucha.net> References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk> <20061002121506.GB14409@iucha.net> Message-ID: <45211C97.2060800@sendu.me.uk> Florin Iucha wrote: > On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote: >> Florin Iucha wrote: >>> I am trying to install bioperl-network from CVS. I found this to >>> require bioperl from CVS, which requires bioperl-ext from CVS. >> I can't help with the compile problems you encountered (other than to >> say I also have problems under AMD64), but from where did you get the >> idea that bioperl (live/core) requires bioperl-ext? It doesn't, though >> recent changes to Makefile.PL may give that impression... > > Running the tests for bioperl-live mention in some places that 'this > test has been skipped since $foo is not available' and I found the > 'foos' in bioperl-ext. Right, yes. The idea is, you'd only need to install bioperl-ext if you wanted to use the modules that the complaining tests test. So if none of the things that were skipped matter to you, don't install ext. I guess this needs to be clarified in documentation somewhere. From cjfields at uiuc.edu Mon Oct 2 10:13:56 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 09:13:56 -0500 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? 
In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au>
Message-ID: <001801c6e62d$02c883d0$15327e82@pyrimidine>

> >>> I have removed all use/@ISA Bio::Root::Object references from
> >>> bioperl-live, except for those in Bio::Root::* itself:
>
> >> So I'd say they're both relics that can be removed. In fact I was
> >> planning on getting rid of all references to both of these modules
> >> before you did, so thanks! :)
>
> > I think they can go. It's probably a pre-1.0 deprecation that somehow
> > was never followed through on.
>
> Today I did a fresh CVS checkout of bioperl-live, and deleted the
> following modules and tests, and all tests passed with BIOPERLDEBUG=0
>
> * Bio::Root::Err
> * Bio::Root::Global
> * Bio::Root::IOManager
> * Bio::Root::Object
> * Bio::Root::Storable
> * Bio::Root::Utilities # may be used by third parties?
> * Bio::Root::Vector
> * Bio::Root::Xref
> * t/Root-Utilities.t # need to keep if we keep Utilities.pm
> * t/RootStorable.t
>
> Should we schedule for deprecation, or deprecate immediately as Hilmar
> suggested they were meant to be deprecated long ago ?

I vote for quick deprecation; I had also noticed that these were superfluous
and added them as possible deprecations to the wiki page. However, we need to
be careful about that 'third-party use' caveat you have for
Bio::Root::Utilities; there's another one with Bio::Root::Storable and Ensembl:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2924/focus=2924

and it seems to have its users:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/8242/focus=8242

The others (including Bio::Root::Utilities) haven't had any major threads on
the mail lists in a very long time.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign > -- > Dr Torsten Seemann http://www.vicbioinformatics.com > Victorian Bioinformatics Consortium, Monash University, Australia > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Mon Oct 2 10:16:31 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 09:16:31 -0500 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-exton AMD64 In-Reply-To: <20061002121506.GB14409@iucha.net> Message-ID: <001901c6e62d$5c4fac80$15327e82@pyrimidine> They're not absolutely necessary; the tests are skipped w/o failure because bioperl-ext is optional. These are only necessary if you want the ability to read sequence trace files. BTW, you might have a rough time on trying to install bioperl-ext depending on your platform. Note the following bug report: http://bugzilla.open-bio.org/show_bug.cgi?id=2074 Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Florin Iucha > Sent: Monday, October 02, 2006 7:15 AM > To: bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Failure to compile the CVS snapshot of bioperl- > exton AMD64 > > On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote: > > Florin Iucha wrote: > > > I am trying to install bioperl-network from CVS. I found this to > > > require bioperl from CVS, which requires bioperl-ext from CVS. > > > > I can't help with the compile problems you encountered (other than to > > say I also have problems under AMD64), but from where did you get the > > idea that bioperl (live/core) requires bioperl-ext? It doesn't, though > > recent changes to Makefile.PL may give that impression... 
> > Running the tests for bioperl-live mention in some places that 'this > test has been skipped since $foo is not available' and I found the > 'foos' in bioperl-ext. > > florin > > -- > If we wish to count lines of code, we should not regard them as lines > produced but as lines spent. -- Edsger Dijkstra From osborne1 at optonline.net Mon Oct 2 10:14:13 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Mon, 02 Oct 2006 10:14:13 -0400 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: <4520E20F.6040406@sendu.me.uk> Message-ID: Sendu, No objection but someone should check the scripts in examples/root to make sure that they are not used there. Brian O. On 10/2/06 5:55 AM, "Sendu Bala" wrote: > Torsten Seemann wrote: >>>>> I have removed all use/@ISA Bio::Root::Object references from >>>>> bioperl-live, except for those in Bio::Root::* itself: >> >>>> So I'd say they're both relics that can be removed. In fact I was >>>> planning on getting rid off all references to both of these modules >>>> before you did, so thanks! :) >> >>> I think they can go. It's probably a pre-1.0 deprecation that somehow >>> was never followed through on. >> >> Today I did a fresh CVS checkout of bioperl-live, and deleted the >> following modules and tests, and all tests passed with BIOPERLDEBUG=0 >> >> * Bio::Root::Err >> * Bio::Root::Global >> * Bio::Root::IOManager >> * Bio::Root::Object >> * Bio::Root::Storable >> * Bio::Root::Utilities # may be used by third parties? >> * Bio::Root::Vector >> * Bio::Root::Xref >> * t/Root-Utilities.t # need to keep if we keep Utilities.pm >> * t/RootStorable.t >> >> Should we schedule for deprecation, or deprecate immediately as Hilmar >> suggested they were meant to be deprecated long ago ? > > I'm happy to get rid of them all straight away. Does anyone object? 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From johnson.biotech at gmail.com Mon Oct 2 10:21:50 2006 From: johnson.biotech at gmail.com (Seth Johnson) Date: Mon, 2 Oct 2006 10:21:50 -0400 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: <000001c6e5dc$2eceabe0$15327e82@pyrimidine> References: <000001c6e5dc$2eceabe0$15327e82@pyrimidine> Message-ID: I'm using MySQL 5.0.19 and Perl v5.8.7 [MSWin32-x86-multi-thread] On 10/2/06, Chris Fields wrote: > > Seth, > > What version of MySQL and perl are you using? I'm using MySQL 5.0.18 (but > am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819. > > I ran into a few problems with bioperl-db tests which were unrelated the > ones below, but I'm wondering if it is a difference in MySQL versions. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > bounces at lists.open-bio.org] On Behalf Of Seth Johnson > > Sent: Saturday, September 30, 2006 6:35 PM > > To: Hilmar Lapp > > Cc: Chris Fields; Bioperl List > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > Here're complete test details: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > ... 
> > > FAILED tests 10-12 > > Failed 3/12 tests, 75.00% okay > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > -------------------------------------------------------------------------- > > ----- > > t\02species.t 65 2 3.08% 63 65 > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > t\16obda.t 12 3 25.00% 10-12 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- Best Regards, Seth Johnson Senior Bioinformatics Associate Ph: (202) 470-0900 Fx: (775) 251-0358 From osborne1 at optonline.net Mon Oct 2 10:08:50 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Mon, 02 Oct 2006 10:08:50 -0400 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 In-Reply-To: <20061002014007.GG12075@iucha.net> Message-ID: Florian, Minor correction here, the Bioperl package does not require bioperl-ext. However we see there is a problem compiling bioperl-ext... Brian O. On 10/1/06 9:40 PM, "Florin Iucha" wrote: > I am trying to install bioperl-network from CVS. I found this to > require bioperl from CVS, which requires bioperl-ext from CVS. From JK at novozymes.com Mon Oct 2 10:05:34 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Mon, 2 Oct 2006 16:05:34 +0200 Subject: [Bioperl-l] Blast parser. Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net> Hi. I've tried to use the blast-parser but I cannot get the original alignment out of the parser. Is it possible to get that out of the Bio::Search::HSP::BlastHSP in some way. It seems quite odd to get a clustalw alignment out when it isn't that type of alignment people are used to get from blast. 
Thanks

Jesper

From cjfields at uiuc.edu Mon Oct 2 10:36:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:36:31 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To:
Message-ID: <001d01c6e630$27792fb0$15327e82@pyrimidine>

> Sendu,
>
> No objection but someone should check the scripts in examples/root to make
> sure that they are not used there.
>
> Brian O.

I suppose it's also possible that the other bioperl distributions (like
bioperl-run) could use them as well. If they do we can take care of them as
they pop up. These are really old and haven't been revised in a long time.

The only one I worry about is Bio::Root::Storable b/c of Ensembl. Does anyone
know where Will Spooner is? He's the maintainer for Bio::Root::Storable.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From cjfields at uiuc.edu Mon Oct 2 11:01:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 10:01:44 -0500
Subject: [Bioperl-l] Blast parser.
In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net>
Message-ID: <000001c6e633$ad0a6ce0$15327e82@pyrimidine>

The alignment that you get should come from GenericHSP, not BlastHSP. Either
way, the HSP alignment that is retrieved using $hsp->get_aln() should be a
Bio::SimpleAlign object. You can then output that to the proper AlignIO format
using an AlignIO stream object or use the Bio::SimpleAlign methods for further
analysis.

    my $aln = $hsp->get_aln();
    my $alnout = Bio::AlignIO->new(-format => 'msf', -fh => \*STDOUT);
    $alnout->write_aln($aln);

Quick note: not all AlignIO formats have write_aln() support at this time, but
most do.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of JK (Jesper Agerbo Krogh) > Sent: Monday, October 02, 2006 9:06 AM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Blast parser. > > > Hi. > > I've tried to use the blast-parser but I cannot get the original alignment > out of the parser. Is it possible to get that out of the > Bio::Search::HSP::BlastHSP in some way. It seems quite odd to get a > clustalw alignment out when it isn't that type of alignment people are > used to get from blast. > > Thanks > > Jesper > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From whs at ebi.ac.uk Mon Oct 2 12:00:19 2006 From: whs at ebi.ac.uk (Will Spooner) Date: Mon, 2 Oct 2006 17:00:19 +0100 (BST) Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: <001d01c6e630$27792fb0$15327e82@pyrimidine> References: <001d01c6e630$27792fb0$15327e82@pyrimidine> Message-ID: On Mon, 2 Oct 2006, Chris Fields wrote: >> Sendu, >> >> No objection but someone should check the scripts in examples/root to make >> sure that they are not used there. >> >> Brian O. > > I suppose it's also possible that the other bioperl distributions (like > bioperl-run) could use them as well. > > If they do we can take care of them as they pop up. These are really old > and haven't been revised in a long time. > > The only one I worry about is Bio::Root::Storable b/c of Ensembl. Does > anyone know where Will Spooner is? He's the maintainer for > Bio::Root::Storable. > Hi Chris, I'm still lurking... If the tests for Bio::Root::Storable still pass (I assume that they do), then the module is working as advertised. 
The idea behind Storable is very simple; object instances of any inheriting
class can be serialised to, and retrieved from, disk. BioPerl objects will
probably not want this functionality by default, but it is trivial to
implement if needed.

Will

From cjfields at uiuc.edu Mon Oct 2 13:58:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 12:58:15 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To:
Message-ID: <000601c6e64c$5746f990$15327e82@pyrimidine>

> On Mon, 2 Oct 2006, Chris Fields wrote:
>
> >> Sendu,
> >>
> >> No objection but someone should check the scripts in examples/root to
> >> make sure that they are not used there.
> >>
> >> Brian O.
> >
> > I suppose it's also possible that the other bioperl distributions (like
> > bioperl-run) could use them as well.
> >
> > If they do we can take care of them as they pop up. These are really
> > old and haven't been revised in a long time.
> >
> > The only one I worry about is Bio::Root::Storable b/c of Ensembl. Does
> > anyone know where Will Spooner is? He's the maintainer for
> > Bio::Root::Storable.
> >
>
> Hi Chris,
>
> I'm still lurking...
>
> If the tests for Bio::Root::Storable still pass (I assume that they do),
> then the module is working as advertised.
>
> The idea behind Storable is very simple; object instances of any
> inheriting class can be serialised to, and retrieved from, disk. BioPerl
> objects will probably not want this functionality by default, but it is
> trivial to implement if needed.
>
> Will

Okay, nice to know you're listening in! Based on that we should keep it in.
The rest that Torsten mentioned could probably be removed right away.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign From osborne1 at optonline.net Mon Oct 2 13:59:58 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Mon, 02 Oct 2006 13:59:58 -0400 Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset In-Reply-To: <20061002002403.GD12075@iucha.net> Message-ID: Florin, OK, this is fixed in CVS now. The problem is that there's some variability in how the PSI MI "standard" is used. In this case there was a species that was not given a value for its scientific name ("fullName"), I had to use common name in its place. Fortunately there's an NCBI taxon id behind all this. Thanks again, Brian O. On 10/1/06 8:24 PM, "Florin Iucha" wrote: > Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the > MINT [1] database does not produce the crash. It has a new warning, however: > > Can't call method "text" on an undefined value at > /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290. From mmacho at gmail.com Mon Oct 2 13:43:13 2006 From: mmacho at gmail.com (ende) Date: Mon, 2 Oct 2006 19:43:13 +0200 Subject: [Bioperl-l] Variable scope Message-ID: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> Hi this may be a typical perl topic and then out of this list center topic. My apologize for any inconvenience. It is a annoying problem that is making me waste lot of time. I have a package with its new object, etc... and constants in it like: #----- use constant False => 0; use constant True => 1; our %CLRFG = ( PLASMIDO => RED, POLY_A => GREEN, RESTR_SITES => BLUE, CONECTORS => MAGENTA, CONTAMINANTS => CYAN, ); our %CLRBG = ( PLASMIDO => "", POLY_A => "", RESTR_SITES => "", CONECTORS => "", CONTAMINANTS => "", ); #------ this constants are include with require "h.pl" from the main package file. I use this module from the mail command line driver to test it "using" it. 
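For reference, variables declared with "our" belong to whichever package is in effect at the point of declaration, and can be reached from elsewhere by their fully qualified name. A minimal sketch of the setup described above, assuming a hypothetical package name Colours and made-up escape sequences for the colour constants (the poster's own definitions of RED, GREEN and so on are not shown in the message):

```perl
use strict;
use warnings;

# Hypothetical package standing in for the poster's module; in real use
# this part would live in its own file (e.g. Colours.pm) loaded with "use".
package Colours;

use constant RED   => "\e[31m";
use constant GREEN => "\e[32m";

# "our" creates package variables, visible elsewhere as %Colours::CLRFG.
our %CLRFG = (
    PLASMIDO => RED,
    POLY_A   => GREEN,
);

# A constant holding a hash reference is one way to get a "constant" hash;
# the reference is fixed, although the hash contents can still be changed.
use constant CLRBG => { PLASMIDO => "", POLY_A => "" };

package main;

# Fully qualified access from outside the package.
my $fg = $Colours::CLRFG{POLY_A};
my $bg = Colours::CLRBG()->{PLASMIDO};
print "POLY_A foreground sequence is ", length($fg), " bytes long\n";
```

If the included file has no package statement of its own, its "our" variables end up in whatever package was current when it was compiled, so a name such as %modulename::CLRFG would then point at an empty hash.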
In the command line driver I can use the constants False and True directly
with no complaint, for example "return True", without any reference to the
origin of those constants.

But with respect to the variables %CLRFG and %CLRBG (I would like them to be
constants too, but how?), I can't find a way of referring to them from outside
the module. In the end I gave up and _copied_ the definitions to where I
needed them (in the sub where I print ANSI terminal colouring sequences).

I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Any help?

--
Juan Falgueras
Profesor del Depto. de Lenguajes y Ciencias de la Computación
Universidad de Málaga

From cjfields at uiuc.edu Mon Oct 2 16:52:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 15:52:11 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To:
Message-ID: <000001c6e664$a25538d0$15327e82@pyrimidine>

I have updated the Deprecation page with the Bio::Root::* modules that we plan
on deprecating (note that I have them being removed for rel. 1.5.2). I have
left out Bio::Root::Storable for now based on Will's response.

http://www.bioperl.org/wiki/Deprecated_modules

I'll update the DEPRECATED doc in CVS as well. There is a tentative schedule
for when warnings are added for modules before they are removed.

In relation to the recent trend for house-cleaning, I noticed that the
Bio::Tools::BP* BLAST-related modules are all still present but haven't been
modified or had deprecation warnings added. BPlite, like the others, was
marked for deprecation around rel 1.5 since the functionality is present in
Bio::SearchIO. Judging by the mail list, no one has used these in quite a
while, and everyone has been redirected to use Bio::SearchIO instead. Based on
that I have added warnings in CVS for deprecation to BPlite and the related
modules BPpsilite and BPbl2seq.
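How the warnings were actually added to BPlite and friends is not shown in this thread. As a rough sketch of the general idea using only core Perl, with hypothetical module names My::Old::Parser and My::New::Parser, a constructor can warn from the caller's point of view each time the old module is used:

```perl
use strict;
use warnings;

# Hypothetical deprecated module, for illustration only.
package My::Old::Parser;

use Carp qw(carp);

sub new {
    my ($class, %args) = @_;
    # carp reports the caller's file and line, which points users at
    # the code they need to change.
    carp "$class is deprecated; use My::New::Parser instead";
    return bless {%args}, $class;
}

package main;

# Collect warnings so the script can show what a user would see.
my @seen;
local $SIG{__WARN__} = sub { push @seen, @_ };

my $parser = My::Old::Parser->new(-file => 'report.txt');
print "warning seen: $seen[0]";
```

The module keeps working during the warning period, so existing scripts are not broken before the module is finally removed.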
Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Brian Osborne > Sent: Monday, October 02, 2006 9:14 AM > To: Sendu Bala; bioperl-l > Subject: Re: [Bioperl-l] Do we need Bio::Root::Object anymore? > > Sendu, > > No objection but someone should check the scripts in examples/root to make > sure that they are not used there. > > Brian O. > > > On 10/2/06 5:55 AM, "Sendu Bala" wrote: > > > Torsten Seemann wrote: > >>>>> I have removed all use/@ISA Bio::Root::Object references from > >>>>> bioperl-live, except for those in Bio::Root::* itself: > >> > >>>> So I'd say they're both relics that can be removed. In fact I was > >>>> planning on getting rid off all references to both of these modules > >>>> before you did, so thanks! :) > >> > >>> I think they can go. It's probably a pre-1.0 deprecation that somehow > >>> was never followed through on. > >> > >> Today I did a fresh CVS checkout of bioperl-live, and deleted the > >> following modules and tests, and all tests passed with BIOPERLDEBUG=0 > >> > >> * Bio::Root::Err > >> * Bio::Root::Global > >> * Bio::Root::IOManager > >> * Bio::Root::Object > >> * Bio::Root::Storable > >> * Bio::Root::Utilities # may be used by third parties? > >> * Bio::Root::Vector > >> * Bio::Root::Xref > >> * t/Root-Utilities.t # need to keep if we keep Utilities.pm > >> * t/RootStorable.t > >> > >> Should we schedule for deprecation, or deprecate immediately as Hilmar > >> suggested they were meant to be deprecated long ago ? > > > > I'm happy to get rid of them all straight away. Does anyone object? 
> > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From florin at iucha.net Mon Oct 2 16:47:01 2006 From: florin at iucha.net (Florin Iucha) Date: Mon, 2 Oct 2006 15:47:01 -0500 Subject: [Bioperl-l] Variable scope In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> Message-ID: <20061002204701.GG14409@iucha.net> On Mon, Oct 02, 2006 at 07:43:13PM +0200, ende wrote: > It is a annoying problem that is making me waste lot of time. > > I have a package with its new object, etc... and constants in it like: > > #----- > use constant False => 0; > use constant True => 1; > > our %CLRFG = ( > PLASMIDO => RED, > POLY_A => GREEN, > RESTR_SITES => BLUE, > CONECTORS => MAGENTA, > CONTAMINANTS => CYAN, > ); > > our %CLRBG = ( > PLASMIDO => "", > POLY_A => "", > RESTR_SITES => "", > CONECTORS => "", > CONTAMINANTS => "", > ); > #------ > > this constants are include with require "h.pl" from the main package > file. > > I use this module from the mail command line driver to test it > "using" it. In the command line driver I can use with no gripe the > constants False and True directly, for example "return True", etc > without any reference to the origin of that constant. It is possible you get them from somewhere else. > But, with respect to the variables (I would like they also were > constants.. but how?), %CLRFG and %CLRBG I can't find the way of > refering those int the module. Finally I have desisted and _copy_ > the definitions where I have needed it (in the sub were I print Ansi > terminal colouring seqs...). I don't find how to refer those > variables out of the module. 
> > I have tried %modulename::CLRFG, for example, but Perl gives me errors. Did you actually declare a package name in "h.pl" ? Is there any reason you don't call the file ".pm" and load it with "use"? I have attached a small example of importing that works. florin -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra -------------- next part -------------- A non-text attachment was scrubbed... Name: one.pm Type: text/x-perl Size: 118 bytes Desc: not available Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061002/cf339845/attachment.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: two.pl Type: text/x-perl Size: 69 bytes Desc: not available Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061002/cf339845/attachment-0001.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061002/cf339845/attachment-0002.bin From Kevin.M.Brown at asu.edu Mon Oct 2 19:44:50 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Mon, 2 Oct 2006 16:44:50 -0700 Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module Message-ID: <1A4207F8295607498283FE9E93B775B4021960CD@EX02.asurite.ad.asu.edu> Well, for anyone that wants to know, I found a way to capture the output of ClustalW to get at things like the score. 
Copy STDOUT to another handle:
open(OUTCOPY, ">&STDOUT") or die "Couldn't dup STDOUT: $!";
Change where STDOUT goes:
open(STDOUT, ">log.test") or die "Couldn't open log.test: $!";
Run the alignment; its output will be captured by the STDOUT redirection:
$aln = $factory->align(\@seq);
Restore STDOUT to its normal location for the rest of the script:
close STDOUT;
open(STDOUT, ">&OUTCOPY") or die "Couldn't restore STDOUT: $!";
I guess I can understand why most of this is just dropped by the ClustalW.pm module since there doesn't seem to be a way to hold it all in a SimpleAlign object.
> -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Kevin Brown > Sent: Thursday, September 28, 2006 2:48 PM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module > > I've gotten a very simple script to run using bioperl that creates an > alignment using clustalw of two sequences. I see that clustal outputs > to stdout information like the score, but I don't see any way to store > that or retrieve that from the alignment object that is > returned (unless > I'm just blind). What follows is my very basic script which used code > found in the Wiki. > > print $aln->score() spits out an error about using an uninitialized > value.
> > > #!/usr/bin/perl -w > > use strict; > use Bio::SeqIO; > use Bio::Perl; > use Bio::AlignIO; > use Getopt::Long qw(:config no_ignore_case bundling pass_through); > use POSIX; > use Bio::Tools::Run::Alignment::Clustalw; > > my $fileName = ""; # filename(s) to be parsed for > information > my $output_dir = ""; > my $format = 'fasta'; # default format for SeqIO module > > GetOptions( > 'file=s' => \$fileName, > 'output=s' => \$output_dir, > ); > > # Parse the input file for the needed information > # SeqIO supports several normal formats including , and > > > my @files = split(/\|/, $fileName); > my @seq_array; > > my $stream_out = > Bio::AlignIO->new(-file => '>test.msf', -format => 'msf', -flush => > 0); > > foreach my $fileName (@files) > { > my $file = Bio::SeqIO->new(-format => $format, -file => > $fileName); > my $seq; > while ($seq = $file->next_seq()) > { > push(@seq_array, $seq); > } > } > > my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM'); > my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params); > my $ktuple = 3; > $factory->ktuple($ktuple); # change the parameter before executing > # where @seq_array is an array of {{PM|Bio::Seq}} objects > > open my $out, ">seq.txt"; > > for (my $i = 1 ; $i <= $#seq_array ; $i++) > { > my @seq = ($seq_array[0], $seq_array[$i]); > my $aln = $factory->align(\@seq); > $stream_out->write_aln($aln); > print $aln->score; > for my $seq ($aln->each_seq) { > print $out $seq->display_id() ."\t". $seq->seq()."\n"; > } > } > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bix at sendu.me.uk Mon Oct 2 19:48:34 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 00:48:34 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 Message-ID: <4521A552.60301@sendu.me.uk> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. 
I'll upload tar.gz files when I have access to the server, then reply here with links. In the meantime, see http://www.bioperl.org/wiki/Release_1.5.2 for instructions on getting and testing this RC. Developers: Make sure you're in the AUTHORS file in all 4 packages, as appropriate. Users: Even though 1.5.2 is a 'developer' release, we consider it the most stable and capable version of Bioperl, and recommend that you use it in all but the most critical production environments. Please try it out and let us know of any problems or difficulties you run into. Thank you, Sendu. From lincoln.stein at gmail.com Mon Oct 2 17:53:38 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Mon, 2 Oct 2006 21:53:38 +0000 Subject: [Bioperl-l] Variable scope In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> Message-ID: <6dce9a0b0610021453va2132c7u73747b9253211a66@mail.gmail.com> Hi, Read the documentation for Exporter. It is much better to formally export constants, variables and functions and to import them with "use" than to use "require". Also be sure that you understand how namespaces and modules work. This is not a BioPerl topic and should have been directed to a general Perl discussion list, such as Perl Monks. Lincoln On 10/2/06, ende wrote: > > > Hi > > this may be a typical perl topic and then out of this list center > topic. My apologize for any inconvenience. > > It is a annoying problem that is making me waste lot of time. > > I have a package with its new object, etc...
and constants in it like: > > #----- > use constant False => 0; > use constant True => 1; > > our %CLRFG = ( > PLASMIDO => RED, > POLY_A => GREEN, > RESTR_SITES => BLUE, > CONECTORS => MAGENTA, > CONTAMINANTS => CYAN, > ); > > our %CLRBG = ( > PLASMIDO => "", > POLY_A => "", > RESTR_SITES => "", > CONECTORS => "", > CONTAMINANTS => "", > ); > #------ > > this constants are include with require "h.pl" from the main package > file. > > I use this module from the mail command line driver to test it > "using" it. In the command line driver I can use with no gripe the > constants False and True directly, for example "return True", etc > without any reference to the origin of that constant. > > But, with respect to the variables (I would like they also were > constants.. but how?), %CLRFG and %CLRBG I can't find the way of > refering those int the module. Finally I have desisted and _copy_ > the definitions where I have needed it (in the sub were I print Ansi > terminal colouring seqs...). I don't find how to refer those > variables out of the module. > > I have tried %modulename::CLRFG, for example, but Perl gives me errors. > > Any help? > > > > > -- > Juan Falgueras > Profesor del Depto. de Lenguajes y Ciencias de la Computación > Universidad de Málaga > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D.
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From florin at iucha.net Mon Oct 2 22:30:31 2006 From: florin at iucha.net (Florin Iucha) Date: Mon, 2 Oct 2006 21:30:31 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <20061003023031.GI14409@iucha.net> On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. > > In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. [I won't create a wiki account just to report this.] Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG not set. Lots of warnings about missing packages and all, but this looks interesting: Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. Otherwise: Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay. The failed test is: t/ESEfinder..................dubious Test returned status 255 (wstat 65280, 0xff00) DIED. FAILED test 15 florin -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra From cjfields at uiuc.edu Mon Oct 2 23:50:47 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 22:50:47 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu> So far all tests pass on Mac OS X. I'll add this to the release page. 
This RC will throw warnings for four tests I didn't remove in time (BPlite.t, BPpsilite, BPbl2seq.t, and RestrictionEnzyme.t), which correspond to their namesake deprecated Bio::Tools modules. These are no longer in CVS HEAD so should be gone by the next RC, and the relevant modules marked for deprecation. I can verify the Bio::DB::SeqFeature.t warning on Mac OS X that Florin reported, but ESEFinder.t works fine: t/BioDBSeqFeature............Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. ok .... I'll report WinXP tests tomorrow on the wiki. Chris On Oct 2, 2006, at 6:48 PM, Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. > > In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > Make sure you're in the AUTHORS file in all 4 packages, as > appropriate. > > Users: > Even though 1.5.2 is a 'developer' release, we consider it the most > stable and capable version of Bioperl, and recommend that you use > it in all but the most critical production environments. Please > try it out and let us know of any problems or difficulties you run > into. > > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 2 23:54:29 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 22:54:29 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <20061003023031.GI14409@iucha.net> References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> Message-ID: > [I won't create a wiki account just to report this.] 
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG > not set. Lots of warnings about missing packages and all, but this > looks interesting: > > Argument "+" isn't numeric in numeric lt (<) at Bio/DB/ > SeqFeature/Segment.pm line 423. This is verified on Mac OS X. > Otherwise: > > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, > 99.99% okay. > > The failed test is: > > t/ESEfinder..................dubious > Test returned status 255 (wstat 65280, 0xff00) > DIED. FAILED test 15 What do you get when you run that set of tests using 'perl -I. -w t/ ESEFinder.t'? The bad status code is odd and could be a remote server issue. Chris > > florin > > -- > If we wish to count lines of code, we should not regard them as lines > produced but as lines spent. -- Edsger Dijkstra > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From torsten.seemann at infotech.monash.edu.au Tue Oct 3 00:30:06 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Tue, 03 Oct 2006 14:30:06 +1000 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm Message-ID: <4521E74E.1040404@infotech.monash.edu.au> My understanding is that all Bioperl-compliant classes should inherit from Bio::Root::Root, not Bio::Root::RootI. Additionally, if functions such as throw() or _rearrange() are to be used without a class instance reference, they are to be used as class methods via Bio::Root::Root, not Bio::Root::RootI. Is this correct? 
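The class-method usage asked about above can be sketched in plain Perl. This is only an illustration under stated assumptions: `My::Root` is a hypothetical stand-in written for this example, not the real Bio::Root::Root, and its `throw()` simply dies with a message rather than reproducing Bioperl's exception handling.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a root class; NOT the real Bio::Root::Root.
package My::Root;

sub new {
    my ($class, %args) = @_;
    my $self = { %args };          # hash-ref based object
    return bless $self, $class;
}

# Usable as $obj->throw(...) and as My::Root->throw(...):
# the invocant is either a blessed reference or a plain class name.
sub throw {
    my ($self, $msg) = @_;
    my $who = ref($self) || $self;
    die "EXCEPTION in $who: $msg\n";
}

package main;

# Class-method call: no instance is needed.
eval { My::Root->throw("bad input") };
print "class call:    $@";

# Instance-method call: same method, object invocant.
my $obj = My::Root->new(name => 'demo');
eval { $obj->throw("bad state") };
print "instance call: $@";
```

The `ref($self) || $self` idiom is what lets one method body serve both call styles.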
My naive audit of bioperl-live CVS brought up the following statistics: # Root.pm /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l 26 /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l 346 # RootI.pm /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l 9 /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l 79 My guess would be that all RootI should be changed to plain Root ? Any help appreciated, -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From jason at bioperl.org Tue Oct 3 02:03:17 2006 From: jason at bioperl.org (Jason Stajich) Date: Mon, 2 Oct 2006 23:03:17 -0700 Subject: [Bioperl-l] t/ESEFinder.t fixed on branch Message-ID: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org> Looks like good work everyone. All tests pass for me on OSX perl 5.8.6. w and w/o BIOPERLDEBUG=1 with RC1 except for the t/ESEFinder problem which I've fixed. It skipped too few tests when BIOPERLDEBUG=0. Don't forget to merge branch changes back to head for this test when it is done. I don't want to muddy water so I'm holding off migrating the changes to main trunk as the files is substantially different (I presume pre-Test::More adoption?). -jason From bix at sendu.me.uk Tue Oct 3 03:28:48 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 08:28:48 +0100 Subject: [Bioperl-l] t/ESEFinder.t fixed on branch In-Reply-To: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org> References: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org> Message-ID: <45221130.2060405@sendu.me.uk> Jason Stajich wrote: > Looks like good work everyone. > > All tests pass for me on OSX perl 5.8.6. w and w/o BIOPERLDEBUG=1 > with RC1 except for the t/ESEFinder problem which I've fixed. > > It skipped too few tests when BIOPERLDEBUG=0. > > Don't forget to merge branch changes back to head for this test when > it is done. 
I don't want to muddy water so I'm holding off > migrating the changes to main trunk as the files is substantially > different (I presume pre-Test::More adoption?). Actually, it was the same until Torsten made his own (different) fixes to HEAD but not to branch. It was my mistake and I've corrected in yet a third way, and now branch and HEAD match. No harm done :) From bix at sendu.me.uk Tue Oct 3 03:31:10 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 08:31:10 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu> References: <4521A552.60301@sendu.me.uk> <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu> Message-ID: <452211BE.6080107@sendu.me.uk> Chris Fields wrote: > So far all tests pass on Mac OS X. I'll add this to the release page. > > This RC will throw warnings for four tests I didn't remove in time > (BPlite.t, BPpsilite, BPbl2seq.t, and RestrictionEnzyme.t), which > correspond to their namesake deprecated Bio::Tools modules. These > are no longer in CVS HEAD so should be gone by the next RC, and the > relevant modules marked for deprecation. Thanks Chris. Sorry I missed these. From bix at sendu.me.uk Tue Oct 3 03:32:08 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 08:32:08 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <20061003023031.GI14409@iucha.net> References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> Message-ID: <452211F8.8040104@sendu.me.uk> Florin Iucha wrote: > On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote: >> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll >> upload tar.gz files when I have access to the server, then reply here >> with links. >> >> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for >> instructions on getting and testing this RC. > > [I won't create a wiki account just to report this.] 
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG > not set. Lots of warnings about missing packages and all, but this > looks interesting: > > Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. > > Otherwise: > > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay. > > The failed test is: > > t/ESEfinder..................dubious > Test returned status 255 (wstat 65280, 0xff00) > DIED. FAILED test 15 Thanks for your feedback Florin. The ESEfinder fail will be fixed in the next RC. From bix at sendu.me.uk Tue Oct 3 04:29:37 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 09:29:37 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <45221F71.40206@sendu.me.uk> Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. 
Live/core: http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.gz http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.bz2 http://bioperl.org/DIST/bioperl-1.5.2-RC1.zip Run: http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.gz http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.bz2 http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.zip DB: http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.gz http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.bz2 http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.zip Network: http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.gz http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.bz2 http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.zip Md5 checksums are in: http://bioperl.org/DIST/SIGNATURES.md5 From jason at bioperl.org Tue Oct 3 02:11:30 2006 From: jason at bioperl.org (Jason Stajich) Date: Mon, 2 Oct 2006 23:11:30 -0700 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm Message-ID: <87F9B64E-8BDA-464B-814D-3F117AA646A1@bioperl.org> I only briefly saw your question - but RootI is for interfaces, Root.pm is for instantiated objects. From florin at iucha.net Tue Oct 3 07:39:12 2006 From: florin at iucha.net (Florin Iucha) Date: Tue, 3 Oct 2006 06:39:12 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> Message-ID: <20061003113912.GJ14409@iucha.net> On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote: > >Otherwise: > > > > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, > >99.99% okay. > > > >The failed test is: > > > > t/ESEfinder..................dubious > > Test returned status 255 (wstat 65280, 0xff00) > > DIED. FAILED test 15 $ perl -I. 
-w t/ESEfinder.t 1..15 ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder; ok 2 - use Data::Dumper; ok 3 - use Bio::PrimarySeq; ok 4 - use Bio::Seq; ok 5 ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test # Looks like you planned 15 tests but only ran 14. $ grep Id t/ESEfinder.t # $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $ florin -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra From hlapp at gmx.net Tue Oct 3 08:27:46 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 3 Oct 2006 08:27:46 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au> References: <4521E74E.1040404@infotech.monash.edu.au> Message-ID: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net> The interface classes (those ending in 'I') should actually inherit from RootI, not Root. In reality this recommendation is more theoretical than it makes that much of a difference I think. The motivation is that interface classes should not determine the actual implementation of a class (hash ref, array ref, whatever), and since Root.pm contains lots of implementation using a hash ref that decision will basically have been made. 
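To make the motivation concrete, here is a minimal sketch of an interface class that leaves the storage decision to its implementers. All names (`My::ShapeI`, `My::Square`, `My::Rect`) are hypothetical and invented for this example; nothing here is taken from the Bioperl code base.

```perl
use strict;
use warnings;

# Hypothetical interface class (name ends in 'I' by convention).
# It has no constructor and stores no data, so it makes no
# assumption about hash-ref versus array-ref objects.
package My::ShapeI;

sub area {
    my $self = shift;
    die ref($self) . " does not implement area()\n";
}

# Shared behaviour built only on the abstract method above.
sub describe {
    my $self = shift;
    return sprintf "%s with area %.2f", ref($self), $self->area;
}

# Hash-ref based implementation.
package My::Square;
our @ISA = ('My::ShapeI');

sub new  { my ($class, $side) = @_; return bless { side => $side }, $class }
sub area { my $self = shift; return $self->{side} ** 2 }

# Array-ref based implementation of the same interface.
package My::Rect;
our @ISA = ('My::ShapeI');

sub new  { my ($class, $w, $h) = @_; return bless [ $w, $h ], $class }
sub area { my $self = shift; return $self->[0] * $self->[1] }

package main;

print $_->describe, "\n" for My::Square->new(2), My::Rect->new(2, 3);
```

Had `My::ShapeI` provided a `new` that blessed a hash, the array-based `My::Rect` would have had to work around it, which is the coupling the recommendation tries to avoid.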
On the contrary though, RootI contains implementation too, although I'm not sure it would prescribe the object implementation as opposed to merely implementing static methods (like throw(), warn(), etc). That would need to be checked. -hilmar On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote: > My understanding is that all Bioperl-compliant classes should inherit > from Bio::Root::Root, not Bio::Root::RootI. > > Additionally, if functions such as throw() or _rearrange() are to be > used without a class instance reference, they are to be used as class > methods via Bio::Root::Root, not Bio::Root::RootI. > > Is this correct? > > My naive audit of bioperl-live CVS brought up the following > statistics: > > # Root.pm > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > 26 > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l > 346 > > # RootI.pm > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > 9 > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l > 79 > > My guess would be that all RootI should be changed to plain Root ? 
> > Any help appreciated, > > -- > Dr Torsten Seemann http://www.vicbioinformatics.com > Victorian Bioinformatics Consortium, Monash University, Australia > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 3 08:33:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 3 Oct 2006 07:33:37 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <20061003113912.GJ14409@iucha.net> References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> <20061003113912.GJ14409@iucha.net> Message-ID: <44724E16-74CD-4778-B04F-529475B47E37@uiuc.edu> Florin, Looks like this is fixed and should be working in the next release. Chris On Oct 3, 2006, at 6:39 AM, Florin Iucha wrote: > On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote: >>> Otherwise: >>> >>> Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, >>> 99.99% okay. >>> >>> The failed test is: >>> >>> t/ESEfinder..................dubious >>> Test returned status 255 (wstat 65280, 0xff00) >>> DIED. FAILED test 15 > > $ perl -I. 
-w t/ESEfinder.t > 1..15 > ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder; > ok 2 - use Data::Dumper; > ok 3 - use Bio::PrimarySeq; > ok 4 - use Bio::Seq; > ok 5 > ok 6 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 7 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 8 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 9 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 10 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 11 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 12 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 13 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 14 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > # Looks like you planned 15 tests but only ran 14. > $ grep Id t/ESEfinder.t > # $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $ > > florin > > -- > If we wish to count lines of code, we should not regard them as lines > produced but as lines spent. -- Edsger Dijkstra > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 3 10:29:51 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 3 Oct 2006 09:29:51 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net> Message-ID: <002101c6e6f8$67b4ae10$15327e82@pyrimidine> > The interface classes (those ending in 'I') should actually inherit > from RootI, not Root. 
> > In reality this recommendation is more theoretical than it makes that > much of a difference I think. The motivation is that interface > classes should not determine the actual implementation of a class > (hash ref, array ref, whatever), and since Root.pm contains lots of > implementation using a hash ref that decision will basically have > been made. > > On the contrary though, RootI contains implementation too, although > I'm not sure it would prescribe the object implementation as opposed > to merely implementing static methods (like throw(), warn(), etc). > That would need to be checked. > > -hilmar
The constructor in Bio::Root::RootI lets one know that its use is deprecated, so you shouldn't have any cases of 'our qw(Bio::Root::RootI)'; there should be some way of inheriting Root directly or indirectly. I would say that any direct use of RootI is not good practice, though. For the current implementation we should only inherit Bio::Root::Root, which implements RootI. Is there any reason to shut off the warning with BIOPERLDEBUG?
From RootI:
sub new {
    my $class = shift;
    my @args = @_;
    unless ( $ENV{'BIOPERLDEBUG'} ) {
        carp("Use of new in Bio::Root::RootI is deprecated. Please use Bio::Root::Root instead");
    }
    eval "require Bio::Root::Root";
    return Bio::Root::Root->new(@args);
}
Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > > On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote: > > > My understanding is that all Bioperl-compliant classes should inherit > > from Bio::Root::Root, not Bio::Root::RootI. > > > > Additionally, if functions such as throw() or _rearrange() are to be > > used without a class instance reference, they are to be used as class > > methods via Bio::Root::Root, not Bio::Root::RootI. > > > > Is this correct?
> > > > My naive audit of bioperl-live CVS brought up the following > > statistics: > > > > # Root.pm > > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > > 26 > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l > > 346 > > > > # RootI.pm > > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > > 9 > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l > > 79 > > > > My guess would be that all RootI should be changed to plain Root ? > > > > Any help appreciated, > > > > -- > > Dr Torsten Seemann http://www.vicbioinformatics.com > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From slenk at emich.edu Tue Oct 3 13:31:47 2006 From: slenk at emich.edu (Stephen Gordon Lenk) Date: Tue, 03 Oct 2006 13:31:47 -0400 Subject: [Bioperl-l] Perl 6 has 'roles' - may be cleanly applicable to the Root/RootI issue Message-ID: <5147da5514e402.514e4025147da5@emich.edu> I looked at the Perl6 site, there is an RFC on interfaces: http://dev.perl.org/perl6/rfc/265.html Roles seem to be the Perl 6 answer to the Root/RootI issue in Bioperl. Maybe it is too early to suggest this. http://dev.perl.org/perl6/doc/design/apo/A12.html: The primary role of a class is to manage instances, that is, objects. So a class must worry about object creation and destruction, and everything that happens in between. 
Classes have a secondary role as units of software reuse, in that they can be inherited from or delegated to. However, because this is a secondary role, and because of weaknesses in models of inheritance, composition, and delegation, Perl 6 will split out the notion of software reuse into a separate class-like entity called a "role". Roles are an abstraction mechanism for use by classes that don't care about the secondary aspects of software reuse, or that (looking at it the other way) care so much about it that they want to encapsulate any decisions about implementation, composition, delegation, and maybe even inheritance. Sounds fancy, but just think of them as includes of partial classes, with some safety checks. Roles don't manage objects. They manage interfaces and other abstract behavior (like default implementations), and they help classes manage objects. As such, a role may only be composed into a class or into another role, never inherited from or delegated to. That's what classes are for. From slenk at emich.edu Tue Oct 3 12:45:15 2006 From: slenk at emich.edu (Stephen Gordon Lenk) Date: Tue, 03 Oct 2006 12:45:15 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm Message-ID: <5120d6a511f5a7.511f5a75120d6a@emich.edu> The separation of interface and implementation is generally regarded as a good idea. Right now the Bioperl community is doing this as part of the implementation of Bioperl. I suggest that this is an example of something which you might want to have as part of the Perl implementation. If Perl 6 (or even Perl 5) does not have this as a core part of the language or as a standard package (reusable by all in a common fashion), you may want to suggest to the Perl implementers that a way for interface/implementation distinctions be made part of the core language. My 2 cents, as you people are the experts on your own code. 
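For readers wondering what is possible in Perl 5 without language support, one simple approach is a runtime check that a class provides every method an interface expects. The sketch below is an assumption-laden illustration, not an existing Bioperl facility: `missing_methods`, `My::Seq`, and the required method list are all hypothetical names made up here. It relies only on the built-in `can` method that every Perl class inherits from UNIVERSAL.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the required methods that $class lacks.
sub missing_methods {
    my ($class, @required) = @_;
    return grep { !$class->can($_) } @required;
}

# Hypothetical class that implements only part of the contract.
package My::Seq;

sub new { my ($class, $s) = @_; return bless { seq => $s }, $class }
sub seq { my $self = shift; return $self->{seq} }

package main;

my @required = qw(seq revcom);     # assumed interface contract
my @missing  = missing_methods('My::Seq', @required);

print @missing
    ? "My::Seq lacks: @missing\n"
    : "My::Seq implements the full interface\n";
```

A check like this only confirms that the methods exist, not that they behave correctly, so it is a far weaker guarantee than a language-level interface or role.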
----- Original Message ----- From: Chris Fields Date: Tuesday, October 3, 2006 10:29 am Subject: Re: [Bioperl-l] Use of Root.pm versus RootI.pm > > The interface classes (those ending in 'I') should actually inherit > > from RootI, not Root. > > > > In reality this recommendation is more theoretical than it makes > that> much of a difference I think. The motivation is that interface > > classes should not determine the actual implementation of a class > > (hash ref, array ref, whatever), and since Root.pm contains lots of > > implementation using a hash ref that decision will basically have > > been made. > > > > On the contrary though, RootI contains implementation too, although > > I'm not sure it would prescribe the object implementation as opposed > > to merely implementing static methods (like throw(), warn(), etc). > > That would need to be checked. > > > > -hilmar > > The constructor in Bio::Root::RootI lets one know that its use is > deprecated, so you shouldn't have any cases of 'our > qw(Bio::Root::RootI)';there should be some way of inheriting Root > directly or indirectly. I would > say that any direct use of RootI is not good practice, though. > For the > current implementation we should only inherit Bio::Root::Root, which > implements RootI. > > Is there any reason to shut off the warning with BIOPERLDEBUG? > > >From RootI: > > sub new { > my $class = shift; > my @args = @_; > unless ( $ENV{'BIOPERLDEBUG'} ) { > carp("Use of new in Bio::Root::RootI is deprecated. Please use > Bio::Root::Root instead"); > } > eval "require Bio::Root::Root"; > return Bio::Root::Root->new(@args); > } > > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > > > > > On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote: > > > > > My understanding is that all Bioperl-compliant classes should > inherit> > from Bio::Root::Root, not Bio::Root::RootI. 
> > > > > > Additionally, if functions such as throw() or _rearrange() are > to be > > > used without a class instance reference, they are to be used > as class > > > methods via Bio::Root::Root, not Bio::Root::RootI. > > > > > > Is this correct? > > > > > > My naive audit of bioperl-live CVS brought up the following > > > statistics: > > > > > > # Root.pm > > > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > > > 26 > > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | > wc -l > > > 346 > > > > > > # RootI.pm > > > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > > > 9 > > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | > wc -l > > > 79 > > > > > > My guess would be that all RootI should be changed to plain > Root ? > > > > > > Any help appreciated, > > > > > > -- > > > Dr Torsten Seemann http://www.vicbioinformatics.com > > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Tue Oct 3 13:49:35 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 3 Oct 2006 12:49:35 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <5120d6a511f5a7.511f5a75120d6a@emich.edu> Message-ID: 
<000001c6e714$4c2cbb80$15327e82@pyrimidine> Perl6 already has added flexibility for separation of implementation/interface (I believe they are called roles). http://dev.perl.org/perl6/doc/design/syn/S12.html To tell the truth, I'm not sure about Perl 5, except the way the Bioperl devs have up the distinction between interface and implementation. However, I find the way we use interfaces is very simple (set up interface with some/all methods as unimplemented, use the module as an abstract base class, then override the unimplemented methods). It works for me. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: Stephen Gordon Lenk [mailto:slenk at emich.edu] > Sent: Tuesday, October 03, 2006 11:45 AM > To: Chris Fields > Cc: 'Hilmar Lapp'; 'Torsten Seemann'; 'bioperl-l' > Subject: Re: [Bioperl-l] Use of Root.pm versus RootI.pm > > The separation of interface and implementation is generally > regarded as a good idea. Right now the Bioperl community is > doing this as part of the implementation of Bioperl. I suggest > that this is an example of something which you might want to > have as part of the Perl implementation. If Perl 6 (or even > Perl 5) does not have this as a core part of the language or > as a standard package (reusable by all in a common fashion), > you may want to suggest to the Perl implementers that a way > for interface/implementation distinctions be made part of the > core language. My 2 cents, as you people are the experts on > your own code. > > > ----- Original Message ----- > From: Chris Fields > Date: Tuesday, October 3, 2006 10:29 am > Subject: Re: [Bioperl-l] Use of Root.pm versus RootI.pm > > > > The interface classes (those ending in 'I') should actually inherit > > > from RootI, not Root. > > > > > > In reality this recommendation is more theoretical than it makes > > that> much of a difference I think. 
The motivation is that interface > > > classes should not determine the actual implementation of a class > > > (hash ref, array ref, whatever), and since Root.pm contains lots of > > > implementation using a hash ref that decision will basically have > > > been made. > > > > > > On the contrary though, RootI contains implementation too, although > > > I'm not sure it would prescribe the object implementation as > opposed > > > to merely implementing static methods (like throw(), warn(), etc). > > > That would need to be checked. > > > > > > -hilmar > > > > The constructor in Bio::Root::RootI lets one know that its use is > > deprecated, so you shouldn't have any cases of 'our > > qw(Bio::Root::RootI)';there should be some way of inheriting Root > > directly or indirectly. I would > > say that any direct use of RootI is not good practice, though. > > For the > > current implementation we should only inherit Bio::Root::Root, which > > implements RootI. > > > > Is there any reason to shut off the warning with BIOPERLDEBUG? > > > > >From RootI: > > > > sub new { > > my $class = shift; > > my @args = @_; > > unless ( $ENV{'BIOPERLDEBUG'} ) { > > carp("Use of new in Bio::Root::RootI is deprecated. Please use > > Bio::Root::Root instead"); > > } > > eval "require Bio::Root::Root"; > > return Bio::Root::Root->new(@args); > > } > > > > > > Christopher Fields > > Postdoctoral Researcher - Switzer Lab > > Dept. of Biochemistry > > University of Illinois Urbana-Champaign > > > > > > > > > > > > On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote: > > > > > > > My understanding is that all Bioperl-compliant classes should > > inherit> > from Bio::Root::Root, not Bio::Root::RootI. > > > > > > > > Additionally, if functions such as throw() or _rearrange() are > > to be > > > > used without a class instance reference, they are to be used > > as class > > > > methods via Bio::Root::Root, not Bio::Root::RootI. > > > > > > > > Is this correct? 
> > > > > > > > My naive audit of bioperl-live CVS brought up the following > > > > statistics: > > > > > > > > # Root.pm > > > > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > > > > 26 > > > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | > > wc -l > > > > 346 > > > > > > > > # RootI.pm > > > > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > > > > 9 > > > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | > > wc -l > > > > 79 > > > > > > > > My guess would be that all RootI should be changed to plain > > Root ? > > > > > > > > Any help appreciated, > > > > > > > > -- > > > > Dr Torsten Seemann http://www.vicbioinformatics.com > > > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > > > > > _______________________________________________ > > > > Bioperl-l mailing list > > > > Bioperl-l at lists.open-bio.org > > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > > > > -- > > > =========================================================== > > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > > =========================================================== > > > > > > > > > > > > > > > > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cmlapid at up.edu.ph Tue Oct 3 22:06:06 2006 From: cmlapid at up.edu.ph (Carlo Lapid) Date: Wed, 4 Oct 2006 10:06:06 +0800 Subject: [Bioperl-l] genbank mirror Message-ID: Hi, I'm trying to set up a local mirror of a large part of the Genbank database. 
For users to access the local database, I need to create a web-based search tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank flat files I've downloaded based on a query entered by the user. I'm trying to use Bioperl to create this from scratch, but I'm having a very hard time, especially since I want the user to have reasonable flexibility in customizing his search. The best that I've been able to accomplish is a search function that retrieves genbank sequence objects based on their primary IDs or accession numbers; by using the fetch method of the Bio::Index::GenBank module. But this doesn't help users who don't know the exact IDs for the sequences they want. Can anybody suggest a way to use Bioperl to search for an ordinary word or phrase, like "16S gene", which could be matched against the description field, or the entire genbank entry? (Alternatively, is there some other freely available tool or software that can do this?) I've been scouring the Bioperl documentation, but I couldn't find anything. I just need to be pointed in the right direction. What I thought was a relatively simple problem has been driving me crazy for days; if anybody has any suggestions I would really, really appreciate it. From torsten.seemann at infotech.monash.edu.au Tue Oct 3 22:58:03 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Wed, 04 Oct 2006 12:58:03 +1000 Subject: [Bioperl-l] genbank mirror In-Reply-To: References: Message-ID: <4523233B.7030505@infotech.monash.edu.au> > I'm trying to set up a local mirror of a large part of the Genbank database. > For users to access the local database, I need to create a web-based search > tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank > flat files I've downloaded based on a query entered by the user. Have you considered bioperl-db / BioSQL?
http://www.bioperl.org/wiki/BioPerl_db http://lists.open-bio.org/pipermail/biosql-l/ -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From osborne1 at optonline.net Tue Oct 3 23:16:20 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Tue, 03 Oct 2006 23:16:20 -0400 Subject: [Bioperl-l] genbank mirror In-Reply-To: Message-ID: Carlo, You might want to look at the Bio::DB::Query::GenBank module: http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_dat abase However this works through NCBI's own eutils API, setting it up to query a local mirror may be very difficult. Brian O. On 10/3/06 10:06 PM, "Carlo Lapid" wrote: > Hi, > > I'm trying to set up a local mirror of a large part of the Genbank database. > For users to access the local database, I need to create a web-based search > tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank > flat files I've downloaded based on a query entered by the user. > > I'm trying to use Bioperl to create this from scratch, but I'm having a very > hard time, especially since I want the user to have reasonable flexibility > in customizing his search. The best that I've been able to accomplish is a > search function that retrieves genbank sequence objects based on their > primary IDs or accession numbers; by using the fetch method of the > Bio::Index::GenBank module. But this doesn't help users who don't know the > exact IDs for the sequences they want. > > Can anybody suggest a way to use Bioperl to search for an ordinary word or > phrase, like "16S gene", which could be matched against the description > field, or the entire genbank entry? (Alternatively, is there some other > freely available tool or software that can do this?) I've been scouring the > Bioperl documentation, but I couldn't find anything. I just need to be > pointed in the right direction. 
What I thought was a relatively simple > problem has been driving me crazy for days; if anybody has any suggestions I > would really, really appreciate it. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From osborne1 at optonline.net Tue Oct 3 23:28:06 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Tue, 03 Oct 2006 23:28:06 -0400 Subject: [Bioperl-l] genbank mirror In-Reply-To: <4523233B.7030505@infotech.monash.edu.au> Message-ID: Torsten and Carlo, Right. For some simple examples of using Bio::DB::Query::BioQuery to query a BioSQL db take a look at Bio::DB::BioSQL::OBDA. You may also want to take a look at NCBI's eutils API, it's quite powerful but not local. Or the ENSEMBL API, people have set up their own local ENSEMBL dbs. There's an example of this API here: http://www.bioperl.org/wiki/Getting_Genomic_Sequences Brian O. On 10/3/06 10:58 PM, "Torsten Seemann" wrote: >> I'm trying to set up a local mirror of a large part of the Genbank database. >> For users to access the local database, I need to create a web-based search >> tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank >> flat files I've downloaded based on a query entered by the user. > > Have you coinsidered bioperl-db / BioSQL ? > > http://www.bioperl.org/wiki/BioPerl_db > http://lists.open-bio.org/pipermail/biosql-l/ From torsten.seemann at infotech.monash.edu.au Wed Oct 4 01:21:24 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Wed, 04 Oct 2006 15:21:24 +1000 Subject: [Bioperl-l] Clean-up of Bio::Root::IO Message-ID: <452344D4.8070908@infotech.monash.edu.au> Hi all, Now that we have Perl 5.6.1 as a minimum, the following modules are standard: File::Spec, File::Temp, File::Path Bio::Root::IO has functions catfile(), tmpdir(), tmpfile(), rmtree() which currently dispatch to the File:: version, or try to emulate it. 
We don't need to emulate anymore. Jason Stajich suggested in a previous post that they should be deprecated, and that users should use the File:: functions directly. I have an uncommitted simplified version of Bio::Root::IO which does this, and "all tests pass". The functions currently (silently) dispatch directly to their native counterparts. The only tricky function is tempfile() which is *mostly* like File::Temp::tempfile(), but does some voodoo of converting (TEMPLATE=>'xxx') to the non-hash first parameter of the File:: version, so I'm hesitant to commit. It may do other magic - Hilmar? Comments? -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From gianluca.debellis at itb.cnr.it Wed Oct 4 05:25:26 2006 From: gianluca.debellis at itb.cnr.it (Gianluca De Bellis) Date: Wed, 04 Oct 2006 11:25:26 +0200 Subject: [Bioperl-l] Bioperl under WinXP Message-ID: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> I'm trying to use Bioperl under WinXP-SP2 (novice) Bioperl has just been downloaded (v 1.2.3) Even the simplest program with a single command (use Bio::Perl;) ends up in an error of the Perl interpreter with these details AppName: perl.exe AppVer: 5.8.8.819 ModName: win32.dll ModVer: 0.0.0.0 Offset: 00003294 Coming from the Windows reporting system Where is the problem? Thanks in advance From epsteinj at mail.nih.gov Wed Oct 4 07:25:57 2006 From: epsteinj at mail.nih.gov (Epstein, Jonathan A (NIH/NICHD) [E]) Date: Wed, 4 Oct 2006 07:25:57 -0400 Subject: [Bioperl-l] genbank mirror References: Message-ID: <42504F69898FE546B3F0238C9BD03275532603@NIHCESMLBX7.nih.gov> There's Seqhound: http://seqhound.blueprint.org/report.html We set this up locally, and it's probably the most comprehensive free solution out there, but it's non-trivial to set up.
Also, since the Blueprint&BIND have lost most of their funding, I'm not sure how long you can count on SeqHound to remain operational (although for now it is being updated). Jonathan -----Original Message----- From: Carlo Lapid [mailto:cmlapid at up.edu.ph] Sent: Tue 10/3/2006 10:06 PM To: bioperl-l at bioperl.org Subject: [Bioperl-l] genbank mirror Hi, I'm trying to set up a local mirror of a large part of the Genbank database. For users to access the local database, I need to create a web-based search tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank flat files I've downloaded based on a query entered by the user. I'm trying to use Bioperl to create this from scratch, but I'm having a very hard time, especially since I want the user to have reasonable flexibility in customizing his search. The best that I've been able to accomplish is a search function that retrieves genbank sequence objects based on their primary IDs or accession numbers; by using the fetch method of the Bio::Index::GenBank module. But this doesn't help users who don't know the exact IDs for the sequences they want. Can anybody suggest a way to use Bioperl to search for an ordinary word or phrase, like "16S gene", which could be matched against the description field, or the entire genbank entry? (Alternatively, is there some other freely available tool or software that can do this?) I've been scouring the Bioperl documentation, but I couldn't find anything. I just need to be pointed in the right direction. What I thought was a relatively simple problem has been driving me crazy for days; if anybody has any suggestions I would really, really appreciate it. 
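For the keyword question quoted above, one low-tech option is to scan the flat files directly and collect the accessions whose DEFINITION line mentions every word of the query; those accessions could then be handed to the Bio::Index::GenBank fetch that already works. What follows is only a minimal sketch in core Perl, not a BioPerl API: the helper name search_definitions and the two sample records are made up for illustration, and it assumes that each record ends with a line containing only "//".

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return one hash per record
# whose DEFINITION field contains every word of $phrase, case-insensitively.
sub search_definitions {
    my ($fh, $phrase) = @_;
    my @words = split ' ', $phrase;
    my @hits;
    local $/ = "//\n";    # assumption: records are terminated by a "//" line
    while (my $record = <$fh>) {
        # DEFINITION may wrap onto indented continuation lines; capture up
        # to the next line that starts in column one (the next keyword).
        my ($def) = $record =~ /^DEFINITION\s+(.*?)^(?=\S)/ms or next;
        $def =~ s/\s+/ /g;    # unwrap continuation lines
        $def =~ s/\s+$//;     # drop trailing whitespace
        next if grep { $def !~ /\Q$_\E/i } @words;
        my ($acc) = $record =~ /^ACCESSION\s+(\S+)/m;
        push @hits, { accession => $acc, definition => $def };
    }
    return @hits;
}

# Made-up sample records, read through an in-memory filehandle.
my $sample = <<'GB';
LOCUS       TEST0001     10 bp    DNA     linear   BCT 01-JAN-2000
DEFINITION  Escherichia coli 16S
            ribosomal RNA gene.
ACCESSION   TEST0001
ORIGIN
        1 acgtacgtac
//
LOCUS       TEST0002     10 bp    DNA     linear   BCT 01-JAN-2000
DEFINITION  Bacillus subtilis gyrA gene.
ACCESSION   TEST0002
ORIGIN
        1 acgtacgtac
//
GB

open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
for my $hit (search_definitions($fh, '16S gene')) {
    print "$hit->{accession}\t$hit->{definition}\n";
}
close $fh;
```

A linear scan like this will be slow on a large mirror, so it is probably best treated as a way to build a keyword-to-accession table once, rather than something to run on every query.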
_______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Wed Oct 4 09:19:45 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 04 Oct 2006 14:19:45 +0100 Subject: [Bioperl-l] Bioperl under WinXP In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> References: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> Message-ID: <4523B4F1.3010305@sendu.me.uk> Gianluca De Bellis wrote: > I'm trying to use Bioperl under WinXP-SP2 (novice) > > Bioperl has been just downloaded (v 1.2.3) > > Even the simplest program with a single command (use Bio::Perl;) ends up in > an error of the Perl interpreter with these details > > AppName: perl.exe AppVer: 5.8.8.819 ModName: win32.dll > > ModVer: 0.0.0.0 Offset: 00003294 > > Coming from the windos reporting system > > Where is the problem? Hard to say. Do non-bioperl scripts work? Make sure to follow the Bioperl installation instructions carefully: http://bioperl.org/wiki/Installing_Bioperl_on_Windows And make sure to install at least version 1.4. 1.2.3 is ancient and effectively unsupported. From cjfields at uiuc.edu Wed Oct 4 10:03:34 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 4 Oct 2006 09:03:34 -0500 Subject: [Bioperl-l] Bioperl under WinXP In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> Message-ID: <000601c6e7bd$e22ad190$15327e82@pyrimidine> If you're using PPM, you can install a (much) newer version of BioPerl from here: http://www.gmod.org/ggb/ppm/ Add that as one of your repositories in PPM4 (seeing that you are using ActivePerl 5.8.8.819), then search for bioperl. The version should be 1.512. In a few weeks we'll be releasing a new developer release. A WinXP PPM is expected, as well as a bundled package to install all prerequisites. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Gianluca De Bellis > Sent: Wednesday, October 04, 2006 4:25 AM > To: bioperl-l at bioperl.org > Subject: [Bioperl-l] Bioperl under WinXP > > I'm trying to use Bioperl under WinXP-SP2 (novice) > > Bioperl has been just downloaded (v 1.2.3) > > Even the simplest program with a single command (use Bio::Perl;) ends up > in > an error of the Perl interpreter with these details > > AppName: perl.exe AppVer: 5.8.8.819 ModName: win32.dll > > ModVer: 0.0.0.0 Offset: 00003294 > > Coming from the windos reporting system > > Where is the problem? > > > > Thanks in advance > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at gmx.net Wed Oct 4 10:25:23 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 4 Oct 2006 10:25:23 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <002101c6e6f8$67b4ae10$15327e82@pyrimidine> References: <002101c6e6f8$67b4ae10$15327e82@pyrimidine> Message-ID: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net> On Oct 3, 2006, at 10:29 AM, Chris Fields wrote: > The constructor in Bio::Root::RootI lets one know that its use is > deprecated, so you shouldn't have any cases of 'our qw > (Bio::Root::RootI)'; Don't confuse the constructor with the inheritance tree. Interface classes should never be instantiated, hence the constructor, consistent with the documentation, should never get executed. > there should be some way of inheriting Root directly or > indirectly. I would > say that any direct use of RootI is not good practice, though. I don't know what you mean by 'directly' or 'indirectly' but inheritance from interfaces, and interfaces extending (inheriting from) other interfaces, is certainly standard practice. 
I'm not sure at all why it would be a bad one. > For the current implementation we should only inherit > Bio::Root::Root, which > implements RootI. For the implementation classes, yes. For the interface classes, no. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Wed Oct 4 10:43:54 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 4 Oct 2006 10:43:54 -0400 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <452344D4.8070908@infotech.monash.edu.au> References: <452344D4.8070908@infotech.monash.edu.au> Message-ID: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> On Oct 4, 2006, at 1:21 AM, Torsten Seemann wrote: > Bio::Root::IO has functions catfile(), tmpdir(), tmpfile(), rmtree() > which currently dispatch to the File:: version, or try to emulate > it. We > don't need to emulate anymore. Jason Stajich suggested in a previous > post that they should be deprecated, and that users should use > directly > the File:: functions themselves. I don't think there's a need to deprecate - if the methods just plain delegate to whatever File:: module is appropriate their implementation (supposedly) will become very simple and hence won't pose a maintenance burden anymore. One can still recommend for all new scripts or modules or code written to use the File:: modules directly, just I'm not sure there's a need to tell users that they should start changing their existing stuff. > > I have an uncommitted simplified version of Bio::Root::IO which does > this, and "all tests pass". The functions currently (silently) > dispatch > directly to their native counterparts. > > The only tricky function is tempfile() which is *mostly* like > File::Temp::tempfile(), but does some voodoo of converting > (TEMPLATE=>'xxx') to the non-hash first parameter of the File:: > version, > so I'm hesitant to commit. 
It may do other magic - Hilmar? Not that I would know of. If the tests pass (without having to change them!) I'd give it a try. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Oct 4 11:35:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 4 Oct 2006 10:35:16 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net> Message-ID: <001901c6e7ca$b12fd5b0$15327e82@pyrimidine> ... > Don't confuse the constructor with the inheritance tree. > > Interface classes should never be instantiated, hence the > constructor, consistent with the documentation, should never get > executed. I know that interfaces shouldn't be instantiated. I had noticed there are cases of 'our @ISA = qw(Bio::Root::RootI)' where it is completely acceptable to inherit the interface. Makes sense to me now. > > there should be some way of inheriting Root directly or > > indirectly. I would > > say that any direct use of RootI is not good practice, though. > > I don't know what you mean by 'directly' or 'indirectly' but > inheritance from interfaces, and interfaces extending (inheriting > from) other interfaces, is certainly standard practice. I'm not sure > at all why it would be a bad one. I was talking specifically about inheriting RootI, and not about all Bioperl interfaces in general. I completely understand the use of interface/implementation in Bioperl. However, I missed one small fact until yesterday (of course AFTER I posted my reply), which was that interfaces may inherit RootI directly. My oops. I had understood that, in general, any Bioperl implementation should not inherit the RootI interface directly (they should inherit Root, since that implements RootI). The 'constructor' present in RootI is essentially to make sure that no one inherits from the wrong class.
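To make the split concrete, here is a small sketch of the pattern as I understand it from this thread: interface classes inherit only from the interface root, and implementation classes inherit from the implementation root, which is where the hash-ref decision lives. All of the My:: package names are hypothetical stand-ins rather than the real Bio::Root code, and throw_not_implemented() here is a simplified imitation, not the BioPerl method.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the interface root (cf. Bio::Root::RootI):
# shared behaviour only, nothing about how objects are stored.
package My::RootI;

sub throw {
    my ($self, $msg) = @_;
    die "EXCEPTION: $msg\n";
}

sub throw_not_implemented {
    my $self   = shift;
    my $class  = ref($self) || $self;
    my $method = (caller(1))[3];    # name of the abstract method that called us
    $self->throw("Abstract method '$method' is not implemented by $class");
}

# Hypothetical stand-in for the implementation root (cf. Bio::Root::Root):
# this is where the choice of a hash-ref object is made.
package My::Root;
our @ISA = qw(My::RootI);

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

# An interface class: inherits from the interface root only and
# declares its methods as unimplemented.
package My::SeqI;
our @ISA = qw(My::RootI);

sub seq_length { shift->throw_not_implemented }

# An implementation class: inherits the implementation root plus the
# interface, and overrides the abstract method.
package My::Seq;
our @ISA = qw(My::Root My::SeqI);

sub seq_length { return length $_[0]->{seq} }

# An incomplete implementation that forgets to override seq_length().
package My::BrokenSeq;
our @ISA = qw(My::Root My::SeqI);

package main;

my $seq = My::Seq->new(seq => 'ACGT');
print $seq->seq_length, "\n";    # prints 4

# The incomplete class falls through to the interface's abstract method.
my $ok = eval { My::BrokenSeq->new->seq_length; 1 };
print "caught: $@" unless $ok;
```

The point of the sketch is only that My::SeqI never says anything about object layout, so an array-ref based implementation could satisfy the same interface.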
Probably a bad use of the terms 'direct' and 'indirect', so maybe I didn't get that across very well. What I meant was that all classes inherit Root in some way, either 'directly' (as the direct parent class) or 'indirectly' (through the inheritance tree). Probably comes from being primarily a molecular microbiologist and not a computer scientist. OT, but it would be nice to have an updated class diagram to sort out the inheritance hierarchy a bit easier. In the meantime, the Deobfuscator does help quite a bit. > > For the current implementation we should only inherit > > Bio::Root::Root, which > > implements RootI. > > For the implementation classes, yes. For the interface classes, no. I agree (see above). That's the one small bit about interfaces I missed along the way. Makes sense; they use throw_not_implemented(), which is a RootI method. > -hilmar Chris From pmiguel at purdue.edu Wed Oct 4 15:38:51 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Wed, 04 Oct 2006 15:38:51 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <45240DCB.2080204@purdue.edu> Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. > > In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > Make sure you're in the AUTHORS file in all 4 packages, as > appropriate. > > Users: > Even though 1.5.2 is a 'developer' release, we consider it the most > stable and capable version of Bioperl, and recommend that you use > it in all but the most critical production environments. Please > try it out and let us know of any problems or difficulties you run > into. > > > Thank you, > Sendu. > I didn't see any tests done under solaris, so I asked our sys admin to do the install on one of our machines. 
Just another data point: He installed this release candidate on a Sun E450 box running solaris. uname -a gives: SunOS descartes 5.10 Generic_118833-18 sun4u sparc SUNW,Ultra-4 perl -v gives: This is perl, v5.8.8 built for sun4-solaris (etc.) $ time make test PERL_DL_NONLAZY=1 /usr/local/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t t/AAChange...................ok t/AAReverseMutate............ok t/abi........................Bio::SeqIO::staden::read from bioperl-ext is not installed or is installed incorrectly - skipping abi.t tests t/abi........................ok t/ace........................ok t/AlignIO....................ok t/AlignStats.................ok t/AlignUtil..................ok t/alignUtilities.............ok t/Allele.....................ok t/Alphabet...................ok t/Annotation.................ok t/AnnotationAdaptor..........ok t/asciitree..................ok t/Assembly...................ok 1/19 skipped: t/Biblio.....................ok t/Biblio_biofetch............ok t/Biblio_eutils..............ok t/BiblioReferences...........ok t/BioDBGFF...................ok t/BioDBSeqFeature............ok 1/46Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. t/BioDBSeqFeature............ok t/BioDBSeqFeature_BDB........ok t/BioDBSeqFeature_mysql......ok 3/46prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) 
statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT sequence,offset FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname= ? AND offset >= ? AND offset <= ? ORDER BY offset ) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT sequence,offset FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname= ? AND offset >= ? AND offset <= ? ORDER BY offset ) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 t/BioDBSeqFeature_mysql......ok t/BioFetch_DB................ok t/BioGraphics................ok t/BlastIndex.................ok 1/13 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BlastIndex.................ok t/BPbl2seq................... 
-------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPbl2seq...................ok 1/108 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPbl2seq...................ok t/BPlite.....................ok 1/97 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPlite.....................ok 52/97 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPlite.....................ok 
88/97 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead STACK Bio::Tools::BPlite::new /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/Tools/BPlite.pm:197 STACK toplevel t/BPlite.t:127 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPlite.....................ok t/BPpsilite.................. -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPpsilite..................ok 4/11 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPpsilite..................ok t/bsml_sax...................ok t/Chain......................ok t/chaosxml...................ok t/cigarstring................ok t/ClusterIO..................ok t/Coalescent.................ok t/CodonTable.................ok t/Compatible.................ok t/consed.....................ok t/CoordinateGraph............ok t/CoordinateMapper...........ok t/Correlate..................ok t/ctf........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ctf.t tests t/ctf........................ok t/CytoMap....................ok t/DB.........................skipped all skipped: Skipping all tests since they require network access, set BIOPERLDEBUG=1 to test t/DBCUTG.....................ok 11/34 skipped: Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test t/DBFasta....................ok t/DNAMutation................ok t/Domcut.....................ok t/ECnumber...................ok t/ELM........................ok 1/13 -------------------- WARNING 
--------------------- MSG: sleeping for 1 seconds --------------------------------------------------- t/ELM........................ok t/embl.......................ok t/EMBL_DB....................ok t/EMBOSS_Tools...............ok t/EncodedSeq.................ok t/entrezgene.................ok 491/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok 695/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok 723/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok 824/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok t/ePCR.......................ok t/ESEfinder..................ok 1/15# Looks like you planned 15 tests but only ran 14. t/ESEfinder..................dubious Test returned status 255 (wstat 65280, 0xff00) DIED. 
FAILED test 15 Failed 1/15 tests, 93.33% okay (less 9 skipped tests: 5 okay, 33.33%) t/est2genome.................ok t/EUtilities.................skipped all skipped: Set BIOPERLDEBUG=1 to run tests t/Exception..................ok t/Exonerate..................ok t/exp........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping exp.t tests t/exp........................ok t/fasta......................ok t/FeatureIO..................ok 7/33 -------------------- WARNING --------------------- MSG: '##feature-ontology' directive handling not yet implemented --------------------------------------------------- -------------------- WARNING --------------------- MSG: '##attribute-ontology' directive handling not yet implemented --------------------------------------------------- -------------------- WARNING --------------------- MSG: '##source-ontology' directive handling not yet implemented --------------------------------------------------- t/FeatureIO..................ok t/flat.......................ok t/FootPrinter................ok t/game.......................ok t/GbrowseGFF.................ok t/gcg........................ok t/GDB........................ok t/Gel........................ok t/genbank....................ok t/GeneCoordinateMapper.......ok t/Geneid.....................ok t/Genewise...................ok 2/51 skipped: t/Genomewise.................ok t/Genpred....................ok t/GFF........................ok t/GOR4.......................ok t/GOterm.....................ok t/GraphAdaptor...............ok t/GuessSeqFormat.............ok t/hmmer......................ok t/hmmer_pull.................ok t/HNN........................ok t/HtSNP......................ok t/Index......................ok t/InstanceSite...............ok t/interpro...................ok t/InterProParser.............ok t/IUPAC......................ok t/kegg.......................ok t/largefasta.................ok 
t/LargeLocatableSeq..........ok t/largepseq..................ok t/lasergene..................ok t/LinkageMap.................ok t/LiveSeq....................ok t/LocatableSeq...............ok t/Location...................ok t/LocationFactory............ok t/LocusLink..................ok t/lucy.......................ok t/Map........................ok t/MapIO......................ok t/masta......................ok t/Matrix.....................ok t/Measure....................ok t/MeSH.......................ok t/metafasta..................ok t/MetaSeq....................ok t/MicrosatelliteMarker.......ok t/MiniMIMentry...............ok t/MitoProt...................ok t/Molphy.....................ok t/MultiFile..................ok t/multiple_fasta.............ok t/Mutation...................ok t/Mutator....................ok t/NetPhos....................ok 10/14 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test t/Node.......................ok t/obo_parser.................ok t/OddCodes...................ok t/OMIMentry..................ok t/OMIMentryAllelicVariant....ok t/OMIMparser.................ok t/Ontology...................ok t/OntologyEngine.............ok t/OntologyStore..............ok t/PAML.......................ok t/Perl.......................ok t/phd........................ok t/Phenotype..................ok t/PhylipDist.................ok t/PhysicalMap................ok t/pICalculator...............ok t/Pictogram..................ok t/pir........................ok t/pln........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping pln.t tests t/pln........................ok t/PopGen.....................ok 2/89 skipped: t/PopGenSims.................ok t/primaryqual................ok t/PrimarySeq.................ok t/primedseq..................ok t/Primer.....................ok t/primer3....................ok t/Promoterwise...............ok t/ProtDist...................ok 
t/protgraph..................ok t/ProtMatrix.................ok t/ProtPsm....................ok t/Pseudowise.................ok t/psm........................ok t/QRNA.......................ok t/qual.......................ok t/RandDistFunctions..........ok t/RandomTreeFactory..........ok t/Range......................ok t/RangeI.....................ok t/raw........................ok t/RefSeq.....................ok t/Registry...................ok t/Relationship...............ok t/RelationshipType...........ok t/RemoteBlast................ok 11/13 skipped: to avoid timeout t/RepeatMasker...............ok t/RestrictionAnalysis........ok t/RestrictionEnzyme..........ok 1/14 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::RestrictionEnzyme is deprecatedUse Bio::Restriction classes instead --------------------------------------------------- t/RestrictionEnzyme..........ok t/RestrictionIO..............ok t/RNAChange..................ok t/rnamotif...................ok t/RootI......................ok t/RootIO.....................ok 2/27 skipped: various reasons t/RootStorable...............ok t/Scansite...................ok t/scf........................ok t/SearchDist.................ok t/SearchIO...................ok t/Seg........................ok t/Seq........................ok t/seq_quality................ok t/SeqAnalysisParser..........ok t/SeqBuilder.................ok t/SeqDiff....................ok t/SeqFeatCollection..........ok t/SeqFeature.................ok t/seqfeaturePrimer...........ok t/SeqHound_DB................ok 4/14Writing into 'shoundlog' log file. 
t/SeqHound_DB................ok t/SeqIO......................ok t/SeqPattern.................ok t/seqread_fail...............ok t/SeqStats...................ok t/SequenceFamily.............ok t/sequencetrace..............ok t/SeqUtils...................ok t/SeqVersion.................ok t/seqwithquality.............ok t/SeqWords...................ok t/Sigcleave..................ok t/Signalp....................ok t/Sim4.......................ok t/SimilarityPair.............ok t/SimpleAlign................ok t/simpleGOparser.............ok t/singlet....................ok t/sirna......................ok t/SiteMatrix.................ok t/SNP........................ok t/Sopma......................ok t/Species....................ok 5/20 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test t/Spidey.....................ok t/splicedseq.................ok t/StandAloneBlast............ok t/StructIO...................ok t/Structure..................ok t/swiss......................ok t/Symbol.....................ok t/tab........................ok t/table......................ok t/TagHaplotype...............ok t/Taxonomy...................ok 44/98 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test t/TaxonTree..................ok t/Tempfile...................ok t/Term.......................ok t/tigrxml....................ok t/tinyseq....................ok t/Tmhmm......................ok t/Tools......................ok t/Tree.......................ok t/TreeBuild..................ok t/TreeIO.....................ok t/trim.......................ok t/tRNAscanSE.................ok t/UCSCParsers................ok t/Unflattener................ok t/Unflattener2...............ok t/UniGene....................ok t/Variation_IO...............ok t/WABA.......................ok t/XEMBL_DB...................ok 1/9 skipped: server may be down t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is 
installed incorrectly - skipping ztr.t tests t/ztr........................ok Failed Test Stat Wstat Total Fail Failed List of Failed ------------------------------------------------------------------------------- t/ESEfinder.t 255 65280 15 2 13.33% 15 2 tests and 98 subtests skipped. Failed 1/240 test scripts, 99.58% okay. 1/11910 subtests failed, 99.99% okay. *** Error code 29 make: Fatal error: Command failed for target `test_dynamic' real 13m10.064s user 11m14.891s sys 0m45.417s $ TEST_VERBOSE=1 perl t/ESEfinder.t 1..15 ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder; ok 2 - use Data::Dumper; ok 3 - use Bio::PrimarySeq; ok 4 - use Bio::Seq; ok 5 ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test # Looks like you planned 15 tests but only ran 14. From bix at sendu.me.uk Thu Oct 5 03:19:39 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 08:19:39 +0100 Subject: [Bioperl-l] EUtilities term handling Message-ID: <4524B20B.5010703@sendu.me.uk> This is actually a general question and not limited to EUtilities. As I see it EUtiltiies lets you do queries in Bioperl that you can do on a website. The question is, should a Bioperl module always work with queries that the website it is a front-end to works with? 
So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is essentially a frontend onto: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term= With a web-browser you can complete that url by supplying a term. For example, the term 'BRCA2+9606[taxid]' works and returns results: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid] If you supply the exact same term to EUtilities::esearch like so: my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => "gene", -term "BRCA2+9606[taxid]"); The search fails. From my 'user' perspective this is highly unexpected. Chris (the author) and I both understand /why/ it fails, but Chris doesn't think it is a bug, or at least something than can/should be changed. What do other people think? At the very least, if something unexpected happens, I'd suggest making a note of it in the POD somewhere. Eg. "Do not use + in term strings, even though they might work on the website". Chris: what is the disadvantage of always submitting '+' as '+' to the server? From bix at sendu.me.uk Thu Oct 5 03:24:45 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 08:24:45 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4524B20B.5010703@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> Message-ID: <4524B33D.9070607@sendu.me.uk> Sendu Bala wrote: > > With a web-browser you can complete that url by supplying a term. For > example, the term 'BRCA2+9606[taxid]' works and returns results: > > http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid] > > > If you supply the exact same term to EUtilities::esearch like so: > > my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => > "gene", -term "BRCA2+9606[taxid]"); *cough* my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => "gene", -term => "BRCA2+9606[taxid]"); > The search fails. 
From m.weimer at dkfz-heidelberg.de Thu Oct 5 08:15:53 2006 From: m.weimer at dkfz-heidelberg.de (Marc Weimer) Date: Thu, 05 Oct 2006 14:15:53 +0200 Subject: [Bioperl-l] Bio::DB::SwissProt Error Message-ID: <1160050554.18691.11.camel@localhost> When running -------------------------------------------------------------- #! /usr/bin/perl -w use strict; use Bio::DB::SwissProt; my $db_obj = new Bio::DB::SwissProt(-verbose=>1); my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); ------------------------------------------------------------- using Bioperl 1.4-1 I get the error message --------------------------------------------------------------------------------- request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch Content-Length: 45 Content-Type: application/x-www-form-urlencoded format=swissprot&db=swall&style=raw&id=P43780 ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: swissprot stream with no ID. Not swissprot in my book STACK: Error::throw STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328 STACK Bio::SeqIO::swiss::next_seq /usr/share/perl5/Bio/SeqIO/swiss.pm:179 STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/share/perl5/Bio/DB/WebDBSeqI.pm:187 STACK: ./putativeGele.pl:8 ----------------------------------------------------------- -------------------------------------------------------------------------------- Any suggestions? Thanks, Marc From bix at sendu.me.uk Thu Oct 5 09:21:23 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 14:21:23 +0100 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <1160050554.18691.11.camel@localhost> References: <1160050554.18691.11.camel@localhost> Message-ID: <452506D3.5050501@sendu.me.uk> Marc Weimer wrote: [snip] > my $db_obj = new Bio::DB::SwissProt(-verbose=>1); > > my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); [snip] > using Bioperl 1.4-1 I get the error message [snip] > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: swissprot stream with no ID. 
Not swissprot in my book [snip] > Any suggestions? It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most recent official release), but 1.5.2 does (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS (http://bioperl.org/wiki/Getting_BioPerl#CVS). From m.weimer at dkfz-heidelberg.de Thu Oct 5 09:35:06 2006 From: m.weimer at dkfz-heidelberg.de (Marc Weimer) Date: Thu, 05 Oct 2006 15:35:06 +0200 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <452506D3.5050501@sendu.me.uk> References: <1160050554.18691.11.camel@localhost> <452506D3.5050501@sendu.me.uk> Message-ID: <1160055306.18691.14.camel@localhost> Works fine with 1.5.2 Thanks, Marc > Marc Weimer wrote: > [snip] > > my $db_obj = new Bio::DB::SwissProt(-verbose=>1); > > > > my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); > [snip] > > using Bioperl 1.4-1 I get the error message > [snip] > > ------------- EXCEPTION: Bio::Root::Exception ------------- > > MSG: swissprot stream with no ID. Not swissprot in my book > [snip] > > Any suggestions? > > It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most > recent official release), but 1.5.2 does > (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS > (http://bioperl.org/wiki/Getting_BioPerl#CVS). -- ######################################## Dr. Marc Weimer German Cancer Research Center Central Unit Biostatistics Im Neuenheimer Feld 280 D-69120 Heidelberg Phone: +49 (0) 6221/42-2387 Fax: +49 (0) 6221/42-2397 ######################################## From hlapp at gmx.net Thu Oct 5 09:55:58 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 09:55:58 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4524B20B.5010703@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> Message-ID: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote: > This is actually a general question and not limited to EUtilities. 
> As I > see it EUtilities lets you do queries in Bioperl that you can do on a > website. The question is, should a Bioperl module always work with > queries that the website it is a front-end to works with? I think yes, but stick to this definition. Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez website it will actually not work. Hence, it should be no surprise that it doesn't work either using Bio::DB::EUtilities. The URL you are using to make your point is much more an example for using a web-service (SOAP, REST, or not) than it is for using a website. Using the web-service URL with a space in place of the '+' works, but yields a different result (just searches for BRCA2), so if tested for correct result the test fails. I.e., you don't expect an input form on a website to accept URL-encoded input. Instead, you expect it to do any URL-encoding for you that needs to be done. Conversely, if you are using a URL to retrieve stuff using e.g. wget or curl, it is clear that you will need to do URL encoding yourself unless there is a command line option that lets you instruct the querying program to do so. I would be careful with mangling the two definitions into one, resulting in a module that needs to serve two masters. You could consider providing an option though that lets you turn off the URL encoding on demand. Aside from that, one of the advantages of having the service wrapped in Bioperl is in fact that you can have it accept a wider variety of parameters than the actual service would allow you to have, e.g., arrays, hashes, or whatever seems appropriate. My $0.02.
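The point about a wrapper accepting richer parameters than the raw service could look roughly like the sketch below. It is only an illustration: `build_term` is a hypothetical helper, not part of Bio::DB::EUtilities, and joining array elements with an explicit ' AND ' is an assumption about what a user would want.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::DB::EUtilities: accept either a plain
# string or an array reference of terms and return a single query term.
# Assumption: array elements are combined with an explicit boolean AND.
sub build_term {
    my ($term) = @_;
    return $term unless ref($term) eq 'ARRAY';   # plain strings pass through untouched
    return join ' AND ', @$term;                 # array elements are ANDed together
}

print build_term('BRCA2 AND 9606[taxid]'), "\n";
print build_term([ 'BRCA2', '9606[taxid]' ]), "\n";
```

Both calls print the same term, so the user never has to decide between a space, a '+', or a boolean when combining terms; any URL encoding would happen in a later, separate step.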
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Thu Oct 5 10:08:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 15:08:01 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> Message-ID: <452511C1.5020709@sendu.me.uk> Hilmar Lapp wrote: > > On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote: > >> This is actually a general question and not limited to EUtilities. As I >> see it EUtiltiies lets you do queries in Bioperl that you can do on a >> website. The question is, should a Bioperl module always work with >> queries that the website it is a front-end to works with? > > I think yes, but stick to this definition. > > Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez > website it will actually not work. Hence, it should be no surprise that > it doesn't work either using Bio::DB::EUtilities. On the contrary, I find it a surprise because EUtilities is an interface to NCBI's eutils, not the entrez website. If I had previously read instructions on using eutils: http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls I might (do) expect that I /should/ use + in my term. > Aside from that, one of the advantages of having the service wrapped in > Bioperl is in fact that you can have it accept a wider variety of > parameters that the actual service would allow you to have, e.g., > arrays, hashes, or whatever seems appropriate. I was going to suggest that terms be supplied as an array, leaving Bioperl code to decide how to 'AND' all the terms (elements in the array) together. 
It would also further force the user not to think of how eutils normally works, but to only consider the Bioperl instructions on how to form a query. But I'm not sure of the value of all that. From cjfields at uiuc.edu Thu Oct 5 10:06:50 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 09:06:50 -0500 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <452506D3.5050501@sendu.me.uk> References: <1160050554.18691.11.camel@localhost> <452506D3.5050501@sendu.me.uk> Message-ID: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu> On Oct 5, 2006, at 8:21 AM, Sendu Bala wrote: > Marc Weimer wrote: > [snip] >> my $db_obj = new Bio::DB::SwissProt(-verbose=>1); >> >> my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); > [snip] >> using Bioperl 1.4-1 I get the error message > [snip] >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: swissprot stream with no ID. Not swissprot in my book > [snip] >> Any suggestions? > > It works with the latest Bioperl. I'm not sure if 1.5.1 works (the > most > recent official release), but 1.5.2 does > (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS > (http://bioperl.org/wiki/Getting_BioPerl#CVS). Marc, you'll have to update to 1.5.2 or CVS, as Sendu suggested. There were server changes for biofetch which were fixed about 4-6 months ago (post rel. 1.5.1); I think several changes were made to Bio::SeqIO::swiss as well during this period. I think the error here results from Bio::SeqIO::swiss trying to parse an empty byte stream. Sendu, do you think that Bio::SeqIO::swiss (and other SeqIO parsers) should throw a more specific message for getting an empty byte stream? Or is it more trouble than it's worth? Christopher Fields Postdoctoral Researcher Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Thu Oct 5 10:14:40 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 15:14:40 +0100 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu> References: <1160050554.18691.11.camel@localhost> <452506D3.5050501@sendu.me.uk> <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu> Message-ID: <45251350.5030608@sendu.me.uk> Chris Fields wrote: > >>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>> MSG: swissprot stream with no ID. Not swissprot in my book [snip] > I think the error here results from Bio::SeqIO::swiss trying to parse an > empty byte stream. Sendu, do you think that Bio::SeqIO::swiss (and > other SeqIO parsers) should throw a more specific message for getting an > empty byte stream? Or is it more trouble than it's worth? Trouble wise, I've no idea without looking into it. Generally speaking though I can say that the error message is pretty useless and I'm always in favour of better error messages. From hlapp at gmx.net Thu Oct 5 10:21:49 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 10:21:49 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <452511C1.5020709@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> Message-ID: On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote: >> >> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote: >> >>> This is actually a general question and not limited to >>> EUtilities. As I >>> see it EUtiltiies lets you do queries in Bioperl that you can do >>> on a >>> website. The question is, should a Bioperl module always work with >>> queries that the website it is a front-end to works with? >> >> I think yes, but stick to this definition. >> >> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez >> website it will actually not work. 
Hence, it should be no surprise >> that >> it doesn't work either using Bio::DB::EUtilities. > > On the contrary, I find it a surprise because EUtilities is an > interface > to NCBI's eutils, not the entrez website. > > If I had previously read instructions on using eutils: > http://www.ncbi.nlm.nih.gov/books/bv.fcgi? > rid=coursework.section.constructing-urls > I might (do) expect that I /should/ use + in my term. This is my point - stick to your definitions. Are you wrapping a query form on a website or are you wrapping a web service (i.e., a URL)? The examples you give are about wrapping a web-service. Your original question was about wrapping a website. Yet another question is what the author of Bio::DB::EUtilities intended to wrap. The other thing to consider is user-friendliness. If you are wrapping a web-service, do you still make not URL-encoding the user input the default? What will 90% of the users probably want or expect to be able to do? URL-encode all input themselves or expect the module to do this for them unless they turn it off? As far as I'm concerned, I'll happily count myself among those who are lazy and ignorant, don't read NCBI's documentation, don't want to know how to URL encode and why this needs to be done, but just want it to work. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Thu Oct 5 10:31:06 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 09:31:06 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4524B20B.5010703@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> Message-ID: On Oct 5, 2006, at 2:19 AM, Sendu Bala wrote: > This is actually a general question and not limited to EUtilities. > As I > see it EUtiltiies lets you do queries in Bioperl that you can do on a > website. 
The question is, should a Bioperl module always work with > queries that the website it is a front-end to works with? > > So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is > essentially a frontend onto: > > http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi? > retmode=xml&db=gene&term= > > With a web-browser you can complete that url by supplying a term. For > example, the term 'BRCA2+9606[taxid]' works and returns results: > > http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi? > retmode=xml&db=gene&term=BRCA2+9606[taxid] > > If you supply the exact same term to EUtilities::esearch like so: > > my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => > "gene", -term "BRCA2+9606[taxid]"); > > The search fails. From my 'user' perspective this is highly > unexpected. > Chris (the author) and I both understand /why/ it fails, but Chris > doesn't think it is a bug, or at least something than can/should be > changed. What do other people think? At the very least, if something > unexpected happens, I'd suggest making a note of it in the POD > somewhere. Eg. "Do not use + in term strings, even though they might > work on the website". > > Chris: what is the disadvantage of always submitting '+' as '+' to the > server? A few reasons: 1) According to NCBI, you can use '+' in queries, but not as a boolean. Global changes of '+' to a space may change the meaning of the query on a few rare occasions. So, if you really wanted to search for the string 'BRCA2+ATG', NCBI looks for that term literally. 2) '+' is a URI reserved symbol for a space delimiter. Therefore, any parameters containing '+' are URI-encoded into %2B, which is decoded on NCBI's end back to '+' (this is demonstrable with current EUtilities output and the returned XML data). 3) Why not just use a space (implicit AND)? Or an explicit boolean? Or '&' (which apparently works but is not specified in the NCBI Entrez docs)? The bug is in the query and not in the code, i.e.
it is a user-generated bug, not an EUtilities bug. And it shouldn't be unexpected, as NCBI has very specific rules for building queries for Entrez (just like any other database). If I were to use nonstandard queries for MySQL, BioFetch, UCSC, or anything else, I would expect to get bad results. As the old saying goes, garbage in, garbage out. The following link has their updated rules: http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.chapter.EntrezHelp Here is their old one: http://www.ncbi.nlm.nih.gov/entrez/query/static/help/helpdoc.html We could, of course, put something in POD, but you never presented that option to me before. I'll grant that the EUtilities API needs some cleaning up, not easy to do when the returned data varies from each utility. But it does get the URL encoding correct, at least in this case. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Thu Oct 5 10:32:49 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 15:32:49 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> Message-ID: <45251791.9040409@sendu.me.uk> Hilmar Lapp wrote: > > On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote: > >> On the contrary, I find it a surprise because EUtilities is an interface >> to NCBI's eutils, not the entrez website. >> >> If I had previously read instructions on using eutils: >> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls >> >> I might (do) expect that I /should/ use + in my term. > > This is my point - stick to your definitions. Are you wrapping a query > form on a website or are you wrapping a web service (i.e., a URL)? > > The examples you give are about wrapping a web-service. Your original > question was about wrapping a website. Right...
I don't see that that changes the answer to my question though does it? "The question is, should a Bioperl module always work with queries that the web-service it is a front-end to works with?" For me, the answer is still yes. > As far as I'm concerned, I'll happily count myself among those who are > lazy and ignorant, don't read NCBI's documentation, don't want to know > how to URL encode and why this needs to be done, but just want it to work. That's a reasonable attitude to take. Which comes back to the question I asked of Chris - naively, if you send + as + you can please everyone, can't you? Both people who have read the docs on the web-service and those who haven't? Or are there real queries in which a user may want to search for a phrase with a literal + in it (and where such a search works via eutils)? From bix at sendu.me.uk Thu Oct 5 10:44:33 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 15:44:33 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> Message-ID: <45251A51.6020802@sendu.me.uk> Chris Fields wrote: > The bug is in the query and not in the code, i.e. is is a > user-generated bug, not an EUtilities bug. And it shouldn't be > unexpected, as NCBI has very specific rules for building queries for > Entrez (just like any other database). So I guess this comes down to something Hilmar mentioned and I never even considered before. You consider your EUtilities stuff as a frontend to entrez, and therefore consider valid queries as queries that are valid for entrez and not eutils? If that's the case, fine. I understand why you don't think this is a bug. Again, something that might warrant a mention in the POD. Currently the naming of the modules and the explicit references to eutils (and me knowing the implementation uses eutils) got me confused. 
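The encoding behaviour at the centre of this thread (a literal '+' in a term goes out as %2B, while a space goes out as '+') can be sketched in core Perl. This is a hand-rolled stand-in written for illustration, assuming ordinary form-style encoding; it is not the URI module's code and not necessarily what Bio::DB::EUtilities does internally, and `form_encode` is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: form-style encoding of a single query value.
# Every character except letters, digits, '-', '_', '.', '~' and the space
# is percent-encoded; spaces are then written as '+'.
sub form_encode {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord($1))/ge;
    $value =~ tr/ /+/;
    return $value;
}

# Compare a term containing a literal '+' with one containing a space.
for my $term ('BRCA2+9606[taxid]', 'BRCA2 9606[taxid]') {
    printf "%-20s -> term=%s\n", $term, form_encode($term);
}
```

Under this scheme the first term reaches the server with %2B (a literal plus sign after decoding) and the second with '+' (a space after decoding), which is why the two searches are not equivalent.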
From cjfields at uiuc.edu Thu Oct 5 10:51:28 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 09:51:28 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <452511C1.5020709@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> Message-ID: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote: >>> This is actually a general question and not limited to >>> EUtilities. As I >>> see it EUtiltiies lets you do queries in Bioperl that you can do >>> on a >>> website. The question is, should a Bioperl module always work with >>> queries that the website it is a front-end to works with? >> >> I think yes, but stick to this definition. >> >> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez >> website it will actually not work. Hence, it should be no surprise >> that >> it doesn't work either using Bio::DB::EUtilities. > > On the contrary, I find it a surprise because EUtilities is an > interface > to NCBI's eutils, not the entrez website. It uses NCBI's CGI interface for eutils, not the SOAP interface. Very different. I have considered using the NCBI SOAP-based interface, but the web services are still somewhat incomplete, unlike the CGI interface. > If I had previously read instructions on using eutils: > http://www.ncbi.nlm.nih.gov/books/bv.fcgi? > rid=coursework.section.constructing-urls > I might (do) expect that I /should/ use + in my term. You are looking at part of the naked URL on that page. Here's what that page says: "When constructing URLs for the eUtils, please use lowercase characters for all parameters except &WebEnv. There is no required order for the URL parameters in an eUtils URL, and null values or inappropriate parameters are ignored. Avoid placing spaces in the URLs, particularly in queries. 
If a space is required, use a plus sign (+) instead of a space:

    * Incorrect: &id=352, 25125, 234, ...
    * Correct:   &id=352,25125,234,...
    * Incorrect: &term=biomol mrna[properties] AND mouse[organism]
    * Correct:   &term=biomol+mrna[properties]+AND+mouse[organism]

Other special characters, such as the # symbol used in referring to a
query key on the History server, should be represented by their URL
encodings (%23 for #)."

I use URI for building the URL with the parameters. URI specifically
encodes all of this for you, so spaces convert to '+' and '+' converts
to %2B.

>> Aside from that, one of the advantages of having the service wrapped
>> in Bioperl is in fact that you can have it accept a wider variety of
>> parameters than the actual service would allow you to have, e.g.,
>> arrays, hashes, or whatever seems appropriate.
>
> I was going to suggest that terms be supplied as an array, leaving
> Bioperl code to decide how to 'AND' all the terms (elements in the
> array) together. It would also further force the user not to think of
> how eutils normally works, but to only consider the Bioperl
> instructions on how to form a query. But I'm not sure of the value of
> all that.

Why do we need to intuit what the user is thinking at any particular
time? How would I know that someone actually wanted to search using
the literal string 'abc+123' as opposed to 'abc 123'?

I see value in your last suggestion but I think a class or set of
classes would be best suited for that:

MySQL Query  | in                         out | MySQL Query
Entrez Query |-----> Generic Query class----->| Entrez Query
SRS Query    |                                | SRS Query
ad infinitum...

The generic query object could then be used in DB searches as an
option besides using a raw string. Though it would get tricky with
SQL's complexity...

Christopher Fields
Postdoctoral Researcher
Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Thu Oct 5 10:54:04 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 10:54:04 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251791.9040409@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <45251791.9040409@sendu.me.uk> Message-ID: <9916EDEE-EA3C-4C55-A004-A46F37B559BF@gmx.net> On Oct 5, 2006, at 10:32 AM, Sendu Bala wrote: >> The examples you give are about wrapping a web-service. Your >> original question was about wrapping a website. > > Right... I don't see that that changes the answer to my question > though does it? > > "The question is, should a Bioperl module always work with > queries that the web-service it is a front-end to works with?" > > For me, the answer is still yes. The answer is still yes. My point was the query that works with a website is not necessarily the query that works with a web-service, even if that web-service also powers the website. > >> As far as I'm concerned, I'll happily count myself among those who >> are lazy and ignorant, don't read NCBI's documentation, don't want >> to know how to URL encode and why this needs to be done, but just >> want it to work. > > That's a reasonable attitude to take. Which comes back to the > question I asked of Chris - naively, if you send + as + you can > please everyone, can't you? Both people who have read the docs on > the web-service and those who haven't? Or are there real queries in > which a user may want to search for a phrase with a literal + in it > (and where such a search works via eutils)? So are you suggesting to URL-encode some characters but not others? This would move you into muddy waters and I'm wondering what the gain is from that, and for whom it is a gain. 
It sounds like it will mostly benefit those who have studied the NCBI documentation and know exactly the URL they want to send and want to ignore the EUtilities POD. My humble guess is the far majority of people will either not read any documentation, or read the module's POD. Maybe a better way to serve both types of people is to accept a parameter -querystring that is expected to include everything from 'term=' onwards (including 'term=' itself) which gives you complete control and freedom if you know what you are doing, and otherwise implement what you suggested before: > I was going to suggest that terms be supplied as an array, leaving > Bioperl code to decide how to 'AND' all the terms (elements in the > array) together. It would also further force the user not to think of > how eutils normally works, but to only consider the Bioperl > instructions > on how to form a query. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Thu Oct 5 11:02:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 16:02:01 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> Message-ID: <45251E69.7040507@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote: > >> On the contrary, I find it a surprise because EUtilities is an interface >> to NCBI's eutils, not the entrez website. > > It uses NCBI's CGI interface for eutils, not the SOAP interface. Very > different. I have considered using the NCBI SOAP-based interface, but > the web services are still somewhat incomplete, unlike the CGI interface. I don't know anything about the SOAP interface. 
I'm talking about the CGI interface that you use. >> If I had previously read instructions on using eutils: >> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls >> >> I might (do) expect that I /should/ use + in my term. > > You are looking at part of the naked URL on that page. Here's what that > page says: I know what it says... > * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] The correct query is the one that has +s in it. > I use URI for building the URL with the parameters. URI specifically > encodes all of this for you, so spaces convert to '+' and '+' converts > to %2B. Well, yes. This causes what I thought of as a bug. It prevents me from submitting a /correct/ eutils term. However it isn't a bug if you explain to users they shouldn't be submitting valid eutils terms, but only valid /entrez/ terms. From cjfields at uiuc.edu Thu Oct 5 11:15:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:15:49 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251A51.6020802@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <45251A51.6020802@sendu.me.uk> Message-ID: On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote: > Chris Fields wrote: >> The bug is in the query and not in the code, i.e. is is a user- >> generated bug, not an EUtilities bug. And it shouldn't be >> unexpected, as NCBI has very specific rules for building queries >> for Entrez (just like any other database). > > So I guess this comes down to something Hilmar mentioned and I > never even considered before. You consider your EUtilities stuff as > a frontend to entrez, and therefore consider valid queries as > queries that are valid for entrez and not eutils? The eutils tools access the same databases as the web page, in the same way, using the same search terms. 
From the EUtilities docs: "The eUtils access the core search and retrieval engine of the Entrez system and, therefore, are only capable of retrieving data that are already in Entrez." > If that's the case, fine. I understand why you don't think this is > a bug. Again, something that might warrant a mention in the POD. > Currently the naming of the modules and the explicit references to > eutils (and me knowing the implementation uses eutils) got me > confused. I'll note that in there is URI encoding in POD, but that should be a no-brainer. I don't think every Bio::DB* class specifies this, mainly because it is taken for granted. Pretty much anything that builds URL strings needs to encode based on the URI standard, and any server that accepts URLs is expected to decode using the same standard. So, again, why does that have to be specifically outlined in POD? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 11:24:39 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:24:39 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251E69.7040507@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> Message-ID: >> I use URI for building the URL with the parameters. URI >> specifically encodes all of this for you, so spaces convert to '+' >> and '+' converts to %2B. > > Well, yes. This causes what I thought of as a bug. It prevents me > from submitting a /correct/ eutils term. However it isn't a bug if > you explain to users they shouldn't be submitting valid eutils > terms, but only valid /entrez/ terms. I can specify in POD that URI encoding is in effect if that placates you, and maybe add a bit about how terms are to be built (based on the website). 
I also noticed that the esearch POD doesn't have a demo in the SYNOPSIS yet (my fault). However, I think this is all a bit silly. This is something most people already realize and take for granted (it's standard for any CGI interface to use URI encoding). Also, most Entrez users do not use a term like 'BRCA2+Human [ORGANISM]'. They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human [ORGANISM]', the latter which is implicit. All of this is on the Entrez website. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From MEC at stowers-institute.org Thu Oct 5 11:12:02 2006 From: MEC at stowers-institute.org (Cook, Malcolm) Date: Thu, 5 Oct 2006 10:12:02 -0500 Subject: [Bioperl-l] using nfreeze instead of freeze in Bio::SeqFeature::Store Message-ID: Lincoln, I committed a change to Bio::SeqFeature::Store to use nfreeze instead of freeze which should allow SeqFeature objects to survive database freeze/thaw cycles across architectures. I hope I was not presumptuous or in error in doing this.... Regards, Malcolm Cook Database Applications Manager - Bioinformatics Stowers Institute for Medical Research - Kansas City, Missouri From bix at sendu.me.uk Thu Oct 5 11:28:55 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 16:28:55 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> <45251A51.6020802@sendu.me.uk> Message-ID: <452524B7.5080003@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> The bug is in the query and not in the code, i.e. is is a >>> user-generated bug, not an EUtilities bug. And it shouldn't be >>> unexpected, as NCBI has very specific rules for building queries for >>> Entrez (just like any other database). >> >> So I guess this comes down to something Hilmar mentioned and I never >> even considered before. 
You consider your EUtilities stuff as a >> frontend to entrez, and therefore consider valid queries as queries >> that are valid for entrez and not eutils? > > The eutils tools access the same databases as the web page, in the same > way, using the same search terms. It doesn't. The eutils interface behaves differently with +s than does the entrez website interface. In eutils + means space, whilst in entrez, + means the plus symbol. >> If that's the case, fine. I understand why you don't think this is a >> bug. Again, something that might warrant a mention in the POD. >> Currently the naming of the modules and the explicit references to >> eutils (and me knowing the implementation uses eutils) got me confused. > > I'll note that in there is URI encoding in POD, but that should be a > no-brainer. Just that it is URI encoded isn't the problem. The problem is the difference in behaviour outlined above. > I don't think every Bio::DB* class specifies this, mainly > because it is taken for granted. Pretty much anything that builds URL > strings needs to encode based on the URI standard, and any server that > accepts URLs is expected to decode using the same standard. > > So, again, why does that have to be specifically outlined in POD? Because they're different. If I construct a valid eutils query it might not work. You ought to explain why. "EUtilities takes any valid entrez query and transforms it into a valid eutils query for submission. 
Do not try and provide a valid eutils query of your own, or the extra transformation will result in no results" From bix at sendu.me.uk Thu Oct 5 11:30:44 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 16:30:44 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> Message-ID: <45252524.7030006@sendu.me.uk> Chris Fields wrote: >>> I use URI for building the URL with the parameters. URI specifically >>> encodes all of this for you, so spaces convert to '+' and '+' >>> converts to %2B. >> >> Well, yes. This causes what I thought of as a bug. It prevents me from >> submitting a /correct/ eutils term. However it isn't a bug if you >> explain to users they shouldn't be submitting valid eutils terms, but >> only valid /entrez/ terms. > > I can specify in POD that URI encoding is in effect if that placates > you, and maybe add a bit about how terms are to be built (based on the > website). I also noticed that the esearch POD doesn't have a demo in > the SYNOPSIS yet (my fault). > > However, I think this is all a bit silly. This is something most people > already realize and take for granted (it's standard for any CGI > interface to use URI encoding). > > Also, most Entrez users do not use a term like 'BRCA2+Human[ORGANISM]'. > They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human[ORGANISM]', the > latter which is implicit. All of this is on the Entrez website. Exactly. You're assuming an entrez user and expecting an entrez query. I don't think its silly given the name of the modules for the user to assume the code needs an eutils query, which is a different thing with different behaviour /independent/ of URI encoding. 
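
[Editorial aside.] The encoding step described in the quoted text above
(spaces become '+', and '+' becomes %2B) can be sketched in core Perl as
follows. This is an illustration under stated assumptions, not the
module's actual code: form_encode is a hypothetical stand-in for what
the thread says the URI module does, it handles ASCII input only, and
db=gene is picked purely for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the escaping step; real code would more
# likely rely on the URI module, as described in the thread.
sub form_encode {
    my ($value) = @_;
    # Percent-encode everything except unreserved characters and spaces
    # (ASCII input assumed).
    $value =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord($1))/ge;
    $value =~ tr/ /+/;    # spaces become '+'
    return $value;
}

# Base URL as quoted in the thread; db=gene is an assumption.
my $base = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi';

# An Entrez-style term (with spaces) and a pre-encoded eutils-style term.
for my $term ('BRCA2 AND 9606[taxid]', 'BRCA2+9606[taxid]') {
    print $base, '?db=gene&term=', form_encode($term), "\n";
}
```

With this sketch the Entrez-style term ends up as
BRCA2+AND+9606%5Btaxid%5D, while the hand-encoded term ends up as
BRCA2%2B9606%5Btaxid%5D, i.e. its + is escaped a second time and reaches
the server as a literal plus sign. That double encoding is the mismatch
the two posters are arguing about.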
From cjfields at uiuc.edu Thu Oct 5 11:50:51 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:50:51 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251E69.7040507@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> Message-ID: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> > I know what it says... Ah, that's the Sendu I know and love. > >> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] > > The correct query is the one that has +s in it. Yes, that's because it's a URL, not a raw search term string (it has been URI-encoded so spaces are converted to '+'). If you use that as a direct query in Entrez you will not get the same response. You do get something if you use the new NCBI global query form on the main page, but clicking on the nucleotide or PMC hits reveals that the URL is malformed and no term is present. That is exactly the same response in EUtilities: 0 0 0 Note the QueryTranslation tag is empty. The only noticeable difference is using egquery (which I just fixed in CVS yesterday). The returned XML gives no hits for any database, which is true based on individual esearch queries for those database, and is actually more consistent than the website version. >> I use URI for building the URL with the parameters. URI specifically >> encodes all of this for you, so spaces convert to '+' and '+' >> converts >> to %2B. > > Well, yes. This causes what I thought of as a bug. It prevents me from > submitting a /correct/ eutils term. However it isn't a bug if you > explain to users they shouldn't be submitting valid eutils terms, but > only valid /entrez/ terms. If you mean that most users will actually use a URL-like search term, then I would say you have a point. But that simply isn't the case. If clarifying the docs makes it better, then so be it. 
Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 11:59:53 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:59:53 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45252524.7030006@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> Message-ID: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote: > Chris Fields wrote: >>>> I use URI for building the URL with the parameters. URI >>>> specifically encodes all of this for you, so spaces convert to >>>> '+' and '+' converts to %2B. >>> >>> Well, yes. This causes what I thought of as a bug. It prevents me >>> from submitting a /correct/ eutils term. However it isn't a bug >>> if you explain to users they shouldn't be submitting valid eutils >>> terms, but only valid /entrez/ terms. >> I can specify in POD that URI encoding is in effect if that >> placates you, and maybe add a bit about how terms are to be built >> (based on the website). I also noticed that the esearch POD >> doesn't have a demo in the SYNOPSIS yet (my fault). >> However, I think this is all a bit silly. This is something most >> people already realize and take for granted (it's standard for any >> CGI interface to use URI encoding). >> Also, most Entrez users do not use a term like 'BRCA2+Human >> [ORGANISM]'. They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human >> [ORGANISM]', the latter which is implicit. All of this is on the >> Entrez website. > > Exactly. You're assuming an entrez user and expecting an entrez > query. 
I don't think its silly given the name of the modules for > the user to assume the code needs an eutils query, which is a > different thing with different behaviour /independent/ of URI > encoding. It's a silly distinction. The POD for Bio::DB::EUtilities states: Bio::DB::EUtilities - interface for handling web queries and data retrieval from NCBI's Entrez Utilities. My question is this : why would anyone (particularly the everyday bioperl user) want to use URL-encoded parameters for a query? That seems to be your main argument here. If so, wouldn't I just paste them together then send them off NCBI eutils? Would I devote ~ 10 classes to that? I could do that in a short program using an array, join, and LWP::Simple. The purpose is quite clearly stated, but if you feel that by badgering me to add something to POD I consider common sense, then you're right. You've succeeded. Bravo. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Thu Oct 5 12:02:05 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 17:02:05 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> Message-ID: <45252C7D.3050009@sendu.me.uk> Chris Fields wrote: > >>> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] >> >> The correct query is the one that has +s in it. > > Yes, that's because it's a URL, not a raw search term string (it has > been URI-encoded so spaces are converted to '+'). If you use that as a > direct query in Entrez you will not get the same response. But we're not doing Entrez queries. 
We're using a module called EUtilities to do an eutils query, which involves forming a url in which spaces should to be converted to +. That's the source of confusion. Is the user supposed to do this, or is EUtilities? All you had to do 8 emails ago is tell me that EUtilities is supposed to do that. You /still/ haven't told me that. I give up. From cjfields at uiuc.edu Thu Oct 5 12:12:11 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 11:12:11 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45252C7D.3050009@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> <45252C7D.3050009@sendu.me.uk> Message-ID: On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote: > Chris Fields wrote: >> >>>> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] >>> >>> The correct query is the one that has +s in it. >> Yes, that's because it's a URL, not a raw search term string (it >> has been URI-encoded so spaces are converted to '+'). If you use >> that as a direct query in Entrez you will not get the same response. > > But we're not doing Entrez queries. We're using a module called > EUtilities to do an eutils query, which involves forming a url in > which spaces should to be converted to +. That's the source of > confusion. Is the user supposed to do this, or is EUtilities? > > All you had to do 8 emails ago is tell me that EUtilities is > supposed to do that. You /still/ haven't told me that. I give up. It should be apparent from the documentation and the URLs posted in debugging output the first few times you used it. Again, why would I dedicate ~ 10 classes to pasting together URI-encoded strings? Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Thu Oct 5 12:22:36 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 17:22:36 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> Message-ID: <4525314C.7020205@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote: > >> Exactly. You're assuming an entrez user and expecting an entrez query. >> I don't think its silly given the name of the modules for the user to >> assume the code needs an eutils query, which is a different thing with >> different behaviour /independent/ of URI encoding. > > It's a silly distinction. The POD for Bio::DB::EUtilities states: > > Bio::DB::EUtilities - interface for handling web queries and data > retrieval from NCBI's Entrez Utilities. > > My question is this : why would anyone (particularly the everyday > bioperl user) want to use URL-encoded parameters for a query? Well I'll tell you why I was trying to use URL-encoded parameters, if that helps you any. I read the pod for EUtilities but all the examples have very simple -term s defined with just a single word. So I wonder how I'm supposed to make an 'AND' term. I also have no idea what utilities I'm supposed to use, or what databases etc. I need to get the answer I want. The POD points me here: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html Combined with the EUtilities synopsis I know I'm supposed to start with esearch so I look at: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html And figure out what my terms are supposed to be. 
Then I test some example terms in my web browser using the esearch base url (http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?) to see if they work, and copy/paste the terms into my EUtilities-using perl script, replacing variable terms with perl variables. Then I find that my terms don't work, ask you about it, and you fail to tell me I should be testing my terms at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=gene. If you think I'm stupid, fine, but I'm probably not the only stupid person on the planet. Which is why I suggested a POD addition. You don't have to make any POD change if you don't want to. I simply thought it might help avoid anyone 'badgering' you in the future with a similar problem. From bix at sendu.me.uk Thu Oct 5 12:28:51 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 17:28:51 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> <45252C7D.3050009@sendu.me.uk> Message-ID: <452532C3.9030804@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> >>>>> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] >>>> >>>> The correct query is the one that has +s in it. >>> Yes, that's because it's a URL, not a raw search term string (it has >>> been URI-encoded so spaces are converted to '+'). If you use that as >>> a direct query in Entrez you will not get the same response. >> >> But we're not doing Entrez queries. We're using a module called >> EUtilities to do an eutils query, which involves forming a url in >> which spaces should to be converted to +. That's the source of >> confusion. Is the user supposed to do this, or is EUtilities? 
>> >> All you had to do 8 emails ago is tell me that EUtilities is supposed >> to do that. You /still/ haven't told me that. I give up. > > It should be apparent from the documentation and the URLs posted in > debugging output the first few times you used it. Again, why would I > dedicate ~ 10 classes to pasting together URI-encoded strings? I'm not sure how not doing URI-encoding would suddenly make your classes worthless. I find them to be very useful (even when I didn't know there was any URI-encoding, was incorrectly using +s and it happened to work anyway). From bernd.web at gmail.com Thu Oct 5 10:09:38 2006 From: bernd.web at gmail.com (Bernd Web) Date: Thu, 5 Oct 2006 16:09:38 +0200 Subject: [Bioperl-l] Eutilities Batch Message-ID: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com> Hi, I am using the new EUtilities. It looks great. I was trying to use epost followed by elink but i get an error. The same error is actually given with the example on http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html: Can't call method "get_databases" on an undefined value at EU.pl line 25. For completeness, the code is shown below too. Any suggestions what is going wrong? 

Regards,
Bernd

# chain EUtilities for complex queries

use Bio::DB::EUtilities;

my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                       -db         => 'pubmed',
                                       -term       => 'hutP',
                                       -usehistory => 'y');

$esearch->get_response; # parse the response, fetch a cookie

my $elink = Bio::DB::EUtilities->new(-eutil  => 'elink',
                                     -db     => 'protein,taxonomy',
                                     -dbfrom => 'pubmed',
                                     -cookie => $esearch->next_cookie,
                                     -cmd    => 'neighbor');

# this retrieves the Bio::DB::EUtilities::ElinkData object

my ($linkset) = $elink->next_linkset;
my @ids;

# step through IDs for each linked database in the ElinkData object

for my $db ($linkset->get_databases) {
    @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
    # do something here
}

From cjfields at uiuc.edu Thu Oct 5 13:31:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:31:33 -0500
Subject: [Bioperl-l] Eutilities Batch
In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
Message-ID:

I'll look into it. I'm busy updating the EUtilities tools now.

Chris

On Oct 5, 2006, at 9:09 AM, Bernd Web wrote:

> Hi,
>
> I am using the new EUtilities. It looks great.
> I was trying to use epost followed by elink but i get an error. The
> same error is actually given with the example on
> http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
> Can't call method "get_databases" on an undefined value at EU.pl
> line 25.
>
> For completeness, the code is shown below too.
>
> Any suggestions what is going wrong?
>
> Regards,
> Bernd
>
> # chain EUtilities for complex queries
>
> use Bio::DB::EUtilities;
>
> my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                        -db         => 'pubmed',
>                                        -term       => 'hutP',
>                                        -usehistory => 'y');
>
> $esearch->get_response; # parse the response, fetch a cookie
>
> my $elink = Bio::DB::EUtilities->new(-eutil  => 'elink',
>                                      -db     => 'protein,taxonomy',
>                                      -dbfrom => 'pubmed',
>                                      -cookie => $esearch->next_cookie,
>                                      -cmd    => 'neighbor');
>
> # this retrieves the Bio::DB::EUtilities::ElinkData object
>
> my ($linkset) = $elink->next_linkset;
> my @ids;
>
> # step through IDs for each linked database in the ElinkData object
>
> for my $db ($linkset->get_databases) {
>     @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
>     # do something here
> }
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From daniel.lang at biologie.uni-freiburg.de Thu Oct 5 13:12:02 2006
From: daniel.lang at biologie.uni-freiburg.de (Daniel Lang)
Date: Thu, 05 Oct 2006 19:12:02 +0200
Subject: [Bioperl-l] Bio::DB::SeqFeature
Message-ID: <45253CE2.1070208@biologie.uni-freiburg.de>

Hi,

we are storing Bio::SeqFeature::Gene::GeneStructure objects (with
multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db
(latest bioperl-live checkout).
The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch
out of a database.

The first observation is that it seems to work (fetched objects behave
like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we
get these warnings:

Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into lib/auto/Storable/_freeze.al) line 287, line 1.
Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into lib/auto/Storable/_freeze.al) line 287, line 1.
(in cleanup) Not a CODE reference at
/home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1.
prepare_cached(SELECT f.id,f.object
   FROM feature as f
   WHERE ( f.seqid=?
   AND f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?)
   OR (f.tier=? AND f.bin between ? AND ?)
   OR (f.tier=? AND f.bin between ? AND ?)
   OR (f.tier=? AND f.bin between ? AND ?)
   OR (f.tier=? AND f.bin between ? AND ?)
   OR (f.tier=? AND f.bin between ? AND ?)
   OR (f.tier=? AND f.bin between ? AND ?))
   )

) statement handle DBI::st=HASH(0x1c317cf0) still Active at
/home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm
line 1422
(in cleanup) Not a CODE reference at
/home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1.

Is this something serious? Does this mean that the stored object doesn't
have everything it had before freezing? Or are we using
Bio::DB::SeqFeature inappropriately?

The other question would be, if we can visualize these stored feature
objects easily using gbrowse? I didn't find a hint mentioning
Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages...
Is it working already? Will it?

Thanks in advance,
Daniel

--

Daniel Lang
University of Freiburg, Plant Biotechnology
Schaenzlestr. 1, D-79104 Freiburg
fax: +49 761 203 6945
phone: +49 761 203 6974
homepage: http://www.plant-biotech.net/
e-mail: daniel.lang at biologie.uni-freiburg.de

#################################################
My software never has bugs. It just develops random features.
#################################################

From cjfields at uiuc.edu Thu Oct 5 13:45:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:45:40 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452532C3.9030804@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> <45252C7D.3050009@sendu.me.uk> <452532C3.9030804@sendu.me.uk>
Message-ID: <003DD8C4-6E59-44C2-9A1C-117E036D93BC@uiuc.edu>

On Oct 5, 2006, at 11:28 AM, Sendu Bala wrote:

> I'm not sure how not doing URI-encoding would suddenly make your
> classes worthless. I find them to be very useful (even when I
> didn't know there was any URI-encoding, was incorrectly using +s
> and it happened to work anyway).

That's not my point (and sincerest apologies for the 'badgering' bit).
If you made the assumption that all the parameters had to be
URI-encoded, why couldn't I do something like:

my %param = (#make up your list of parameters here#);
my $eutil = 'esearch';
my $url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/$eutil.fcgi";
# join the key value pairs with '=', then join all those with &
# add to end of url
# post and retrieve via LWP::Simple

It's more user-friendly to set up the parameters so that you wouldn't
have to encode everything yourself, esp. when the most reliable way to
encode URI strings is to 'use URI'.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From cjfields at uiuc.edu Thu Oct 5 14:11:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 13:11:25 -0500
Subject: [Bioperl-l] Eutilities Batch
In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
Message-ID: <4A340977-C6AD-4728-8947-BF5A8A782807@uiuc.edu>

On Oct 5, 2006, at 9:09 AM, Bernd Web wrote:

> Hi,
>
> I am using the new EUtilities. It looks great.
> I was trying to use epost followed by elink but i get an error. The
> same error is actually given with the example on
> http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
> Can't call method "get_databases" on an undefined value at EU.pl
> line 25.
>
> For completeness, the code is shown below too.
>
> Any suggestions what is going wrong?
>
> Regards,
> Bernd

Grr...that's my error, sorry Bernd. The POD wasn't updated to match
the change I made and has a few errors. The elink object, for starters,
doesn't fetch the response using get_response(). Also, the ElinkData
method has changed slightly but accomplishes the same thing. Odd, since
I copied and pasted that from working code...

Just a note: these are considered highly experimental at the moment,
though they should be ready for general use and toying around. I would
like any suggestions on methods and so on you may have (Sendu has made
some very helpful ones off-list which I plan on implementing). Feel
free to let me know if something doesn't work.

Note that, because of their experimental nature, you will want to take
note of any methods changes in particular as I try to solidify the API
and clean up the POD, so expect some momentary 'outages'.
I plan on setting up a remedial interface for all the container
objects (like ElinkData) which will help clarify things and solidify
the API in the next few weeks, at least to a point where the class
methods have a consistent naming scheme. I plan on using this as a
backend web agent for a general Entrez interface at some point to get
data into Bio* objects.

In the meantime, try this:

use Bio::DB::EUtilities;

my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                       -db         => 'pubmed',
                                       -term       => 'hutP',
                                       -usehistory => 'y');

$esearch->get_response; # parse the response, fetch a cookie

my $elink = Bio::DB::EUtilities->new(-eutil  => 'elink',
                                     -db     => 'protein,taxonomy',
                                     -dbfrom => 'pubmed',
                                     -cookie => $esearch->next_cookie,
                                     -cmd    => 'neighbor');

$elink->get_response;

# this retrieves the Bio::DB::EUtilities::ElinkData object

my $linkset = $elink->next_linkset;
my @ids;

# step through IDs for each linked database in the ElinkData object

for my $db ($linkset->get_all_linkdbs) {
    @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
    print join q(,), @ids;
    # do something here
}

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From dmessina at wustl.edu Thu Oct 5 14:07:56 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 5 Oct 2006 13:07:56 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
Message-ID: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>

I'm pleased to announce a revised version of the BioPerl Deobfuscator
is now available. Many thanks to Mauricio Cuadra for updating
bioperl.org's installation:

http://bioperl.org/cgi-bin/deob_interface.cgi

I've incorporated many of the suggestions you all sent in after the
first release, and many of the modules that had non-standard
documentation have been updated in the meantime, too, so hopefully
you'll find it much improved. There are still some issues with a few
modules; please report any problems you see.
Also, it's now indexing bioperl-live instead of 1.4, which should make
it a little more useful, too. A complete list of changes is below.

I welcome your bug reports and suggestions for improvements, via
email, this list, Bugzilla, or the Wiki page.

Thanks,
Dave

Changes

0.0.3 Mon Oct 2 20:01:45 CDT 2006

FIX: change default $deob_detail_path to be a relative URL instead
     of having localhost hardcoded. Thanks to Jason Stajich for
     pointing this out.

FIX: Bio::Ontology modules are no longer missing their prefix in the
     class list, and their methods are now shown in the lower pane as
     expected. Thanks to Hilmar Lapp for reporting this bug.

FIX: can now handle (and ignore) VERSION POD section.

FIX: missing SYNOPSIS section now handled properly. In fact, the
     SYNOPSIS and DESCRIPTION sections can be in reverse order now,
     although for consistency this is not recommended.

FIX: Bug #2114: "Obfuscator doesn't show "Bio:Matrix:Generic" has
     been fixed. This bug turned out to afflict multiple modules,
     which weren't getting parsed correctly by deob_index.pl.

NEW: Table cells have been padded out to get rid of that "scrunched"
     look. Thanks to Sendu Bala for this great suggestion.

NEW: If the 'Returns' subsection of a method's documentation contains
     a POD L<> link, the Deobfuscator assumes this to be a package
     name, and wraps it in an href for display. This feature is not
     robust, but seems to work well enough for now.

NEW: the list of classes is now sorted alphabetically depth-first, so
     that subclasses appear just after their parent class. Thanks to
     Amir Karger for noticing the strange sorting behavior.

NEW: HTML page title now 'BioPerl Deobfuscator' to distinguish it
     from other Deobfuscators out there. Thanks to Amir Karger for
     suggesting this.

NEW: 'No match' search string now more prominent. Yep, kudos to Amir
     Karger again -- another great idea!

NEW: Search box caption now explicitly states that only package names
     can be searched. Big ups to Amir Karger for this suggestion.
     The ability to search method names is planned for a future
     version.

NEW: added -x option to deob_index.pl. This allows the use of an
     'excluded modules' file. This feature was added to resolve an
     issue with four modules which rely on external modules to
     compile. Class::Inspector, used by the Deobfuscator, needs to
     load a module to traverse its inheritance tree, and modules must
     compile before they can be loaded.

CHANGE: using short name now when traversing with File::Find to help
     identify excluded modules (deob_index.pl).

From lincoln.stein at gmail.com Thu Oct 5 14:41:08 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:41:08 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To:
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
Message-ID: <6dce9a0b0610051141x6b61407ar1c0a13cf7616b35f@mail.gmail.com>

The non-numeric comparison bug in Bio::DB::SeqFeature is fixed in the
latest CVS. Do I need to do anything special to get the CVS fixes into
the release candidate?

Lincoln

On 10/2/06, Chris Fields wrote:
> > [I won't create a wiki account just to report this.]
> >
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> > not set. Lots of warnings about missing packages and all, but this
> > looks interesting:
> >
> > Argument "+" isn't numeric in numeric lt (<) at
> > Bio/DB/SeqFeature/Segment.pm line 423.
>
> This is verified on Mac OS X.
>
> > Otherwise:
> >
> > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,
> > 99.99% okay.
> >
> > The failed test is:
> >
> > t/ESEfinder..................dubious
> > Test returned status 255 (wstat 65280, 0xff00)
> > DIED. FAILED test 15
>
> What do you get when you run that set of tests using
> 'perl -I. -w t/ESEFinder.t'? The bad status code is odd and could be
> a remote server issue.
>
> Chris
>
> >
> > florin
> >
> > --
> > If we wish to count lines of code, we should not regard them as lines
> > produced but as lines spent. -- Edsger Dijkstra
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From MEC at stowers-institute.org Thu Oct 5 15:18:08 2006
From: MEC at stowers-institute.org (Cook, Malcolm)
Date: Thu, 5 Oct 2006 14:18:08 -0500
Subject: [Bioperl-l] using nfreeze instead of freeze in Bio::SeqFeature::Store
Message-ID:

Yes, there is overhead (c.f. perldoc Storable)

  "When writing in network order, all fields are written out as
  standard lengths, which allows full interworking, but takes longer to
  read and write)"

And, I suppose there is also risk of losing precision in using network
order:

  You can also store data in network order to allow easy sharing across
  multiple platforms, or when storing on a socket known to be remotely
  connected. The routines to call have an initial "n" prefix for
  *network*, as in "nstore" and "nstore_fd". At retrieval time, your
  data will be correctly restored so you don't have to know whether
  you're restoring from native or network ordered data. Double values
  are stored stringified to ensure portability as well, at the slight
  risk of losing some precision in the last decimals.

So, I agree, it should be a configuration option, perhaps defaulting to
using network order.
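
A minimal sketch of the freeze/nfreeze difference quoted above, assuming only the core Storable module. The feature hash is a made-up stand-in for illustration, not a real Bio::DB::SeqFeature object, and this is not the code that was committed to the Store module:

```perl
use strict;
use warnings;
use Storable qw(freeze nfreeze thaw);

# Hypothetical stand-in for a feature; not a real Bio::DB::SeqFeature object.
my $feature = { seqid => 'chr1', start => 100, end => 200, score => 0.25 };

# freeze() serializes in native byte order; nfreeze() uses network
# (portable) order, at some cost in speed.
my $native  = freeze($feature);
my $network = nfreeze($feature);

# thaw() works out the byte order itself, so the reading side is the
# same whichever routine wrote the data.
my $from_native  = thaw($native);
my $from_network = thaw($network);

printf "native:  %s:%d-%d\n", @{$from_native}{qw(seqid start end)};
printf "network: %s:%d-%d\n", @{$from_network}{qw(seqid start end)};
```

Since only the writing call differs, a single on/off option on the writing side would seem to be enough for a switch like this; the reader needs no matching setting.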
However, given the factoring of ../Bio/DB/SeqFeature/Store.pm I'm not
sure how to best make it a configuration option since the two provided
serializers don't share a common interface. Possibly something like:

=head1 Methods for Connecting and Initializing a Database

=head2 new

 Title   : new
 Usage   : $db = Bio::DB::SeqFeature::Store->new(@options)
 Function: connect to a database
 Returns : A descendent of Bio::DB::Seqfeature::Store
 Args    : several - see below
 Status  : public

This class method creates a new database connection. The following
-name=E<gt>$value arguments are accepted:
http://iowg.brcdevel.org/gff3.html#a_fasta

 Name               Value
 ----               -----

 -adaptor           The name of the Adaptor class (default DBI::mysql)

 -serializer        The name of the serializer class (default Storable)

 -network_order     Strive to 'preserve network order' (if the
                    serializer implements it). Currently, only
                    Storable.pm does, and this will cause it to use
                    nfreeze instead of freeze. (default 1)

 -index_subfeatures Whether or not to make subfeatures searchable
                    (default true)

 -cache             Activate LRU caching feature -- size of cache

 -compress          Compresses features before storing them in database
                    using Compress::Zlib

Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri

> -----Original Message-----
> From: Lincoln Stein [mailto:lincoln.stein at gmail.com]
> Sent: Thursday, October 05, 2006 1:43 PM
> To: Cook, Malcolm
> Cc: lstein at cshl.org; bioperl-l
> Subject: Re: using nfreeze instead of freeze in Bio::SeqFeature::Store
>
> I think it's fine unless there is a significant performance hit, in
> which case the change should be made into a configuration option. Do
> you know if there is any overhead on doing this?
> > Lincoln > > On 10/5/06, Cook, Malcolm wrote: > > Lincoln, > > > > I committed a change to Bio::SeqFeature::Store to use > nfreeze instead of > > freeze which should allow SeqFeature objects to survive database > > freeze/thaw cycles across architectures. > > > > I hope I was not presumptuous or in error in doing this.... > > > > Regards, > > > > Malcolm Cook > > Database Applications Manager - Bioinformatics > > Stowers Institute for Medical Research - Kansas City, Missouri > > > > > > > -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > (516) 367-8380 (voice) > (516) 367-8389 (fax) > FOR URGENT MESSAGES & SCHEDULING, > PLEASE CONTACT MY ASSISTANT, > SANDRA MICHELSEN, AT michelse at cshl.edu > From lincoln.stein at gmail.com Thu Oct 5 14:32:40 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Thu, 5 Oct 2006 14:32:40 -0400 Subject: [Bioperl-l] Bio::DB::SeqFeature In-Reply-To: <45253CE2.1070208@biologie.uni-freiburg.de> References: <45253CE2.1070208@biologie.uni-freiburg.de> Message-ID: <6dce9a0b0610051132p7d7fcf84g27578731f9727f3f@mail.gmail.com> Hi Daniel, The warnings you are seeing are occurring because Bio::SeqFeature::Gene::GeneStructure contains a CODE reference. I think it must be registering a cleanup method via its Bio::Root::Root ancestor. When Storable serializes the object, it complains that it can't serialize the CODE reference and instead converts it into the string "CODE(0xXXXXX)". Then, after you thaw the object, Bio::Root::Root is complaining that the CODE reference is invalid because it is a string, not a reference. Yuck. I think, however, that I can fix this by setting some magic variables in Storable version 2.05 that will decompile and compile the CODE references. I will try this and send you a note when the code is in CVS. GBrowse does run off Bio::DB::SeqFeature::Store and is noticeably faster than the original Bio::DB::GFF adaptor. 
Nothing really changes except that you set the db_adaptor option to Bio::DB::SeqFeature::Store. I haven't tried it using Bio::SeqFeature::Gene::GeneStructure, so no guarantees, but I am hopeful that it will work. Lincoln On 10/5/06, Daniel Lang wrote: > Hi, > > we are storing Bio::SeqFeature::Gene::GeneStructure objects (with > multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db > (latest bioperl-live checkout). > > The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch > out of a database. > > The first observation is that is seems to work (fetched objects behave > like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we > get these warnings: > > Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into > lib/auto/Storable/_freeze.al) line 287, line 1. > Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into > lib/auto/Storable/_freeze.al) line 287, line 1. > (in cleanup) Not a CODE reference at > /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1. > prepare_cached(SELECT f.id,f.object > FROM feature as f > WHERE ( f.seqid=? > AND f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?)) > ) > > ) statement handle DBI::st=HASH(0x1c317cf0) still Active at > /home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm > line 1422 > (in cleanup) Not a CODE reference at > /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1. > > Is this something serious? Does this mean that the stored object doesn't > have everything it had before freezing? Or are we using > Bio::DB::SeqFeature inappropriately? > > The other question would be, if we can visualize these stored feature > objects easily using gbrowse? 
I didn't find a hint mentioning > Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages... > Is it working already? Will it? > > Thanks in advance, > Daniel > > -- > > Daniel Lang > University of Freiburg, Plant Biotechnology > Schaenzlestr. 1, D-79104 Freiburg > fax: +49 761 203 6945 > phone: +49 761 203 6974 > homepage: http://www.plant-biotech.net/ > e-mail: daniel.lang at biologie.uni-freiburg.de > > ################################################# > My software never has bugs. > It just develops random features. > ################################################# > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From hlapp at gmx.net Thu Oct 5 16:34:49 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 16:34:49 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4525314C.7020205@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> Message-ID: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote: > If you think I'm stupid, fine, but I'm probably not the only stupid > person on the planet. That's a great suggestion that I hope we can all agree on? 
I'll happily count myself among the stupid ones too so you're not alone, and stupid people and even more so those who are lucky enough not to be stupid have an obligation to document stuff so that even the stupid can understand, no matter how silly the documentation might get. Is that agreeable without causing yet more progressive hair loss? Actually - I'm having second thoughts. Isn't it a distinguishing feature of stupid people that - among other things - they are stupid enough to believe they don't need to read documentation? You admitted publicly that you read documentation - are you just faking the stupid? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Thu Oct 5 17:11:06 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 16:11:06 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> Message-ID: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> On Oct 5, 2006, at 3:34 PM, Hilmar Lapp wrote: > > On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote: > >> If you think I'm stupid, fine, but I'm probably not the only stupid >> person on the planet. > > That's a great suggestion that I hope we can all agree on? 
I'll > happily count myself among the stupid ones too so you're not alone, > and stupid people and even more so those who are lucky enough not > to be stupid have an obligation to document stuff so that even the > stupid can understand, no matter how silly the documentation might > get. > > Is that agreeable without causing yet more progressive hair loss? > > Actually - I'm having second thoughts. Isn't it a distinguishing > feature of stupid people that - among other things - they are > stupid enough to believe they don't need to read documentation? You > admitted publicly that you read documentation - are you just faking > the stupid? > > -hilmar If lack of good documentation == stupid, I know of a few other modules in trouble besides mine. Based on that we're in for a whole lot of stupid! And I feel stupid for my earlier remarks, Sendu, so apologies. And Hilmar, you're too late on the hair loss, at least on my end. I have corrected the EUtilities POD to reflect that all text input needs to be raw as URI encoding is done in the module, which should work (I think). I plan on committing it tonight. It also indicates that EUtilities search queries need to be made as if they are regular Entrez queries. Would that be sufficient? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From pmiguel at purdue.edu Thu Oct 5 16:42:00 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Thu, 05 Oct 2006 16:42:00 -0400 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> Message-ID: <45256E18.3080103@purdue.edu> David Messina wrote: > I'm pleased to announce a revised version of the BioPerl Deobfuscator > is now available. 
Many thanks to Mauricio Cuadra for updating
> bioperl.org's installation:
>
> http://bioperl.org/cgi-bin/deob_interface.cgi
>
> I've incorporated many of the suggestions you all sent in after the
> first release, and many of the modules that had non-standard
> documentation have been updated in the meantime, too, so hopefully
> you'll find it much improved. There are still some issues with a few
> modules; please report any problems you see. Also, it's now indexing
> bioperl-live instead of 1.4, which should make it a little more
> useful, too. A complete list of changes is below.
>
> I welcome your bug reports and suggestions for improvements, via
> email, this list, Bugzilla, or the Wiki page.
>
>
> Thanks,
> Dave
>
>

Here are some comments:

Would be good to have the column headings for the methods table in the
fixed part of the page, rather than the scroll box. That way you could
always see the column headings from anywhere in the list.

Second, I've noticed that there are a fair number of methods that have
"not documented" for "Returns" and "Usage". But in every case I've
checked both of these were documented. For example, consider methods
for Bio::Seq::SeqWithQuality. The method "accession_number" is listed
as "not documented". But if you click on the Bio::Seq::SeqWithQuality
link to the documentation, usage is defined as:
"$unique_biological_key = $obj->accession_number;" and returns is
defined as "A string".

Finally, it would be good to have the version of bioperl being
deobfuscated on the deob_interface.cgi page. Just as a quick
sanity-checking measure. After poking around a bit I found that
bioperl-live is being indexed in the wiki. But, I can tell, it is just
the sort of thing I'm going to forget and look for every time I come
back to the page after a few months...

Overall very nice, though. Just what is needed when I'm trying to
remember "which was the method that returns subseq string and which
one returns an object?"
Phillip SanMiguel
Purdue University

From bix at sendu.me.uk Thu Oct 5 17:24:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 22:24:34 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
Message-ID: <45257812.5050008@sendu.me.uk>

Chris Fields wrote:
>
> I have corrected the EUtilities POD to reflect that all text input needs
> to be raw as URI encoding is done in the module, which should work (I
> think). I plan on committing it tonight. It also indicates that
> EUtilities search queries need to be made as if they are regular Entrez
> queries. Would that be sufficient?

You may not even need to mention anything about URI encoding, which
might frighten some people. Something as simple as:

=head1 SYNOPSIS

  use Bio::DB::EUtilities;

  my $esearch = Bio::DB::EUtilities->new(-eutil => 'esearch',
                                         -db    => 'pubmed',
                                         -term  => 'hutP AND xyz',
  ...

and/or some POD for the new() method:

=head2 new

 Title : new
 ...
 Args  : -eutil => ...
         -db    => ...
         -term  => string, an entrez-style query

=cut

would get the point across, I think.

BTW, can the term string be supplied anywhere else other than new()?
It doesn't matter at all if it can't, I'm just idly wondering if I
missed anything.
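
For the raw-versus-encoded question in this thread, here is a small sketch of how a raw term such as 'hutP AND xyz' can be turned into an encoded esearch URL with the URI module. It assumes URI is installed from CPAN, uses an illustrative parameter list, and is not a description of what Bio::DB::EUtilities does internally:

```perl
use strict;
use warnings;
use URI;

# Raw, human-readable parameters, written the way one would type an
# Entrez query. The values are illustrative only.
my %param = (db => 'pubmed', term => 'hutP AND xyz', usehistory => 'y');

my $eutil = 'esearch';
my $uri   = URI->new("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/$eutil.fcgi");

# query_form() escapes keys and values itself, so the caller never has
# to encode spaces or other special characters by hand.
$uri->query_form(map { ($_, $param{$_}) } sort keys %param);

print $uri->as_string, "\n";
```

The point of the sketch is only that encoding can live in one place, which is why passing the term in raw form is the more forgiving interface.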
From dmessina at wustl.edu Thu Oct 5 17:42:49 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 5 Oct 2006 16:42:49 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45256E18.3080103@purdue.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu>
Message-ID: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu>

Thanks so much, Phillip, for taking the time to check out the new
version and send your comments. I really appreciate it!

I've added them to the wiki page so I can track them.

Best,
Dave

From cjfields at uiuc.edu Thu Oct 5 17:50:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:50:11 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45257812.5050008@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> <45257812.5050008@sendu.me.uk>
Message-ID:

Sendu,

I have the parameters all set up as get/sets at this point, but I'm
open to suggestions on that. Note in the BEGIN block the heredoc eval
{} block. Yes, nasty I know, but I hate AUTOLOAD. It works as a quick
way of getting parameter get/sets up-and-running. I plan on making
those explicit get/sets as soon as I can then sorting out particular
ones to the various eutil modules where they are primarily used.

Long story short, every parameter is a get/set at this time (including
term()). The common ones needed for most EUtilities are initialized in
the parent EUtilities::_initialize(), and eutil-specific parameters
are initialized in the individual eutil plugins. Each eutil plugin
only sets whatever parameters may be needed for operation (though you
could circumvent that, since all of them are inherited via EUtilities).

We could always simplify it to accept simple key-value pairs, but
get/sets (at least to me) allow more flexibility as long as you
remember which parameters are set and to what.

Chris

On Oct 5, 2006, at 4:24 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> I have corrected the EUtilities POD to reflect that all text input
>> needs to be raw as URI encoding is done in the module, which
>> should work (I think). I plan on committing it tonight. It also
>> indicates that EUtilities search queries need to be made as if
>> they are regular Entrez queries. Would that be sufficient?
>
> You may not even need to mention anything about URI encoding, which
> might frighten some people. Something as simple as:
>
> =head1 SYNOPSIS
>
>   use Bio::DB::EUtilities;
>
>   my $esearch = Bio::DB::EUtilities->new(-eutil => 'esearch',
>                                          -db    => 'pubmed',
>                                          -term  => 'hutP AND xyz',
>   ...
>
> and/or some POD for the new() method:
>
> =head2 new
>
>  Title : new
>  ...
>  Args  : -eutil => ...
>          -db    => ...
>          -term  => string, an entrez-style query
>
> =cut
>
> would get the point across, I think.
>
> BTW, can the term string be supplied anywhere else other than new()?
> It doesn't matter at all if it can't, I'm just idly wondering
> if I missed anything.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From cjfields at uiuc.edu Thu Oct 5 17:51:06 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:51:06 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45257812.5050008@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> <45257812.5050008@sendu.me.uk>
Message-ID: <5B2E844F-7B8B-4F69-9005-138826B835FB@uiuc.edu>

> You may not even need to mention anything about URI encoding, which
> might frighten some people. Something as simple as:
>
> =head1 SYNOPSIS
>
>   use Bio::DB::EUtilities;
>
>   my $esearch = Bio::DB::EUtilities->new(-eutil => 'esearch',
>                                          -db    => 'pubmed',
>                                          -term  => 'hutP AND xyz',
>   ...
>
> and/or some POD for the new() method:
>
> =head2 new
>
>  Title : new
>  ...
>  Args  : -eutil => ...
>          -db    => ...
>          -term  => string, an entrez-style query
>
> =cut
>
> would get the point across, I think.

Oops, forgot. I'll add this in and update new() when I can. Thanks!

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From arareko at campus.iztacala.unam.mx Thu Oct 5 18:12:49 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 05 Oct 2006 17:12:49 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45256E18.3080103@purdue.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu>
Message-ID: <45258361.8080803@campus.iztacala.unam.mx>

Phillip San Miguel wrote:
> Finally, it would be good to have the version of bioperl being
> deobfuscated on the deob_interface.cgi page. Just as a quick
> sanity-checking measure. After poking around a bit I found that
> bioperl-live is being indexed in the wiki. But, I can tell, it is just
> the sort of thing I'm going to forget and look for every time come back
> to the page after a few months...

Dave, I think this value can be stored in one of the index files and
passed as an argument to the deob_index.pl script. What do you think?

Mauricio.

--
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From lincoln.stein at gmail.com Thu Oct 5 14:42:41 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:42:41 -0400
Subject: [Bioperl-l] using nfreeze instead of freeze in Bio::SeqFeature::Store
In-Reply-To:
References:
Message-ID: <6dce9a0b0610051142h56479843ofc5429d959cb6e3@mail.gmail.com>

I think it's fine unless there is a significant performance hit, in
which case the change should be made into a configuration option. Do
you know if there is any overhead on doing this?

Lincoln

On 10/5/06, Cook, Malcolm wrote:
> Lincoln,
>
> I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
> freeze which should allow SeqFeature objects to survive database
> freeze/thaw cycles across architectures.
> > I hope I was not presumptuous or in error in doing this.... > > Regards, > > Malcolm Cook > Database Applications Manager - Bioinformatics > Stowers Institute for Medical Research - Kansas City, Missouri > > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From torsten.seemann at infotech.monash.edu.au Fri Oct 6 01:26:10 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Fri, 06 Oct 2006 15:26:10 +1000 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> References: <452344D4.8070908@infotech.monash.edu.au> <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> Message-ID: <4525E8F2.1000704@infotech.monash.edu.au> Hilmar, > I don't think there's a need to deprecate - if the methods just plain > delegate to whatever File:: module is appropriate their > implementation (supposedly) will become very simple and hence won't > pose a maintenance burden anymore. >> I have an uncommitted simplified version of Bio::Root::IO which does >> this, and "all tests pass". The functions currently (silently) >> dispatch >> directly to their native counterparts. >> >> The only tricky function is tempfile() which is *mostly* like >> File::Temp::tempfile(), but does some voodoo of converting >> (TEMPLATE=>'xxx') to the non-hash first parameter of the File:: >> version, >> so I'm hesitant to commit. It may do other magic - Hilmar? > > Not that I would know of. If the tests pass (without having to change > them!) I'd give it a try. Tempfile.t had two tests that failed. It seems that Bio::Root::IO had some magic whereby it would keep a list of all tempfilenames created with UNLINK != 0 and when the Bio::Root::IO object was destroyed (eg. undef $obj) it would MANUALLY unlink each of them. 
This would occur before File::Temp got to unlink them. Not sure why it was written like this (as File::Temp will delete them at the end of the script anyway) but maybe it was legacy for when File::Temp::tempfile WASN'T available. Anyway, I've kept backward compatibility there, although I think eventually it should be removed and Tempfile.t adjusted. Although all tests pass with my new trim Bio/Root/IO.pm I am still concerned about committing as the assumption is that the BioPerl test suite is good enough to handle such a change to an important module, but the reality may be different :-) Let me know if you think I should commit anyway, Your advice is appreciated. -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From dmessina at wustl.edu Fri Oct 6 01:25:56 2006 From: dmessina at wustl.edu (David Messina) Date: Fri, 6 Oct 2006 00:25:56 -0500 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <45258361.8080803@campus.iztacala.unam.mx> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> <45258361.8080803@campus.iztacala.unam.mx> Message-ID: On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote: > I think this value can be stored in one of the index files and > passed as an argument to the deob_index.pl script. What do you think? Yep, I think that works nicely. I added this feature and committed it to CVS. Here's what the new header looks like if you do deob_index.pl -s "bioperl-live": ? Thanks for the suggestions, guys. Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061006/1c5819f9/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... 
Name: deob_header.jpg Type: image/jpeg Size: 25739 bytes Desc: not available Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061006/1c5819f9/attachment-0001.jpg From deep_ans at yahoo.com Fri Oct 6 09:22:49 2006 From: deep_ans at yahoo.com (deepak shingan) Date: Fri, 6 Oct 2006 06:22:49 -0700 (PDT) Subject: [Bioperl-l] Sort blast file result according to evalues Message-ID: <20061006132249.49450.qmail@web51711.mail.yahoo.com> Hi , Is there any way to parse the blast file according to evalue for each hit. I want the output sorted according to hit evalue. I am using SearchIO algorithm and already tried sorting the hits according to bits, gaps, but I am not able to sort the hits by evalue. As evalues are mainly associated with hsp and each hit may have multiple hsps. waiting for help. Thanks, Dun Dansi --------------------------------- How low will we go? Check out Yahoo! Messenger?s low PC-to-Phone call rates. From hlapp at gmx.net Fri Oct 6 10:03:04 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 6 Oct 2006 10:03:04 -0400 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <4525E8F2.1000704@infotech.monash.edu.au> References: <452344D4.8070908@infotech.monash.edu.au> <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> <4525E8F2.1000704@infotech.monash.edu.au> Message-ID: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net> This is a 1.5, i.e. developers release that's in the works, and also you'd be doing this on the main trunk. If you get the tests to pass there's no reason to hold back. You may be right and in reality it has repercussions somewhere, but those will be the opportunities to improve our test suite. 
-hilmar On Oct 6, 2006, at 1:26 AM, Torsten Seemann wrote: > Although all tests pass with my new trim Bio/Root/IO.pm I am still > concerned about committing as the assumption is that the BioPerl > test suite is good enough to handle such a change to an important > module, but the reality may be different :-) > > Let me know if you think I should commit anyway, > > Your advice is appreciated. -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Oct 6 10:58:09 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 6 Oct 2006 09:58:09 -0500 Subject: [Bioperl-l] Sort blast file result according to evalues In-Reply-To: <20061006132249.49450.qmail@web51711.mail.yahoo.com> References: <20061006132249.49450.qmail@web51711.mail.yahoo.com> Message-ID: The evalue for the hit is retrieved by the BlastHit::significance() method, if I remember correctly. So if $hit is a Bio::Search::Hit::BlastHit object, you use $hit->significance. If you want individual HSP evalues, you would use $hsp->evalue on the individual HSP objects. The hits are normally returned in the order they appear in the alignments and table of the report, which is typically by increasing evalue or decreasing bits (score). So they are already sorted. If you wanted to run a sort yourself you could use a sort block such as '{$a->significance() <=> $b->significance()} @hits', but as pointed out on the wiki it may be safer to run a Schwartzian transform instead: http://www.bioperl.org/wiki/Bioperl_Best_Practices#Sorting Chris On Oct 6, 2006, at 8:22 AM, deepak shingan wrote: > Hi , > Is there any way to parse the blast file according to evalue for > each hit. I want the output sorted according to hit evalue. I am > using SearchIO algorithm and already tried sorting the hits > according to bits, gaps, but I am not able to sort the hits by evalue.
> As evalues are mainly associated with hsp and each hit may have > multiple hsps. > > waiting for help. > > Thanks, > Dun Dansi > > > > > > --------------------------------- > How low will we go? Check out Yahoo! Messenger?s low PC-to-Phone > call rates. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Oct 6 11:03:45 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 6 Oct 2006 10:03:45 -0500 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net> References: <452344D4.8070908@infotech.monash.edu.au> <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> <4525E8F2.1000704@infotech.monash.edu.au> <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net> Message-ID: <265AD609-F74E-4545-B3DD-FF94290BE0B4@uiuc.edu> On Oct 6, 2006, at 9:03 AM, Hilmar Lapp wrote: > This is a 1.5, i.e. developers release that's in the works, and also > you'd be doing this on the main trunk. If you get the tests to pass > there's no reason to hold back. > > You may be right and in reality it has repercussions somewhere, but > those will be the opportunities to improve our test suite. > > -hilmar Agreed, though I think Sendu only wants bug fixes for 1.5.2. You could always commit to CVS HEAD and it could be in 1.5.3. Let me rethink that. There were some subtle tempfile/tempdir issues that were popping up on WinXP where the some tempfiles were not being deleted b/c of permissions issues; I had planned on adding that to Bugzilla today or tomorrow. Maybe changing to File::Temp would fix that, so in essence it would be a bug fix! I'll go ahead and post the bug. 
Chris >> Although all tests pass with my new trim Bio/Root/IO.pm I am still >> concerned about committing as the assumption is that the BioPerl >> test suite is good enough to handle such a change to an important >> module, but the reality may be different :-) >> >> Let me know if you think I should commit anyway, >> >> Your advice is appreciated. > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From pmiguel at purdue.edu Fri Oct 6 11:06:56 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Fri, 06 Oct 2006 11:06:56 -0400 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu> Message-ID: <45267110.7030905@purdue.edu> David Messina wrote: > Thanks so much, Phillip, for taking the time to check out the new > version and send your comments. I really appreciate it! I've added > them to the wiki page so I can track them. > > Best, > Dave > Dave, No problem. I've just added a "keyword" to search BioPerl Deobfuscator to my Firefox browser. That way I can just type "deob qual" in my URL bar in firefox and the browser jumps directly to BioPerl Deobfuscator (like a bookmark) but it pre-submits the search item "qual". I heard about the Firefox "keywords" in a TWiT/FLOSS episode on mozilla. You just go to any search page and right-click in the search box of interest and one of the choices is "Add a Keyword for this Search". 
Then you just have to fill out "Name" and "Keyword" fields and drop the keyword into whatever folder you like. The "Keyword" then becomes the word to invoke that search with parameters that follow it when it is typed into the URL bar. Phillip From arareko at campus.iztacala.unam.mx Fri Oct 6 11:18:02 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Fri, 06 Oct 2006 10:18:02 -0500 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> <45258361.8080803@campus.iztacala.unam.mx> Message-ID: <452673AA.7070305@campus.iztacala.unam.mx> Looks great! I'll update it during the weekend. Mauricio. David Messina wrote: > > On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote: >> I think this value can be stored in one of the index files and passed >> as an argument to the deob_index.pl script. What do you think? > > Yep, I think that works nicely. I added this feature and committed it to > CVS. Here's what the new header looks like if you do deob_index.pl -s > "bioperl-live": > > > Thanks for the suggestions, guys. 
> > Dave -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From bix at sendu.me.uk Fri Oct 6 11:27:14 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 06 Oct 2006 16:27:14 +0100 Subject: [Bioperl-l] Sort blast file result according to evalues In-Reply-To: References: <20061006132249.49450.qmail@web51711.mail.yahoo.com> Message-ID: <452675D2.9090803@sendu.me.uk> Chris Fields wrote: > The evalue for the hit is retrieved by the BlastHit::significance() > method, if I remember correctly. So if $hit is a > Bio::Search::Hit::BlastHit object, you use $hit->significance. If > you want individual HSP evalues, you would use $hsp->evalue for the > individual HSP objects. > > The output is normally sorted by the order they appear in the > alignments and table, which is typically by increasing evalue or > decreasing bits (score). So they are already sorted. Concur. > If you wanted to run a sort yourself you could use a sort block using > '{$a->significance() <=> $b->significance()} @hits' Actually, it is best to call the sort_hits() method of the result object before asking for any hits, as this allows for potential optimization in the parser. ->significance is still the thing you need to sort on, though.
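To illustrate the sort on significance discussed in this thread, here is a minimal sketch of the Schwartzian transform mentioned earlier. It uses plain hashes as hypothetical stand-ins for hit objects, so the names and e-values are invented for illustration and nothing here is the BioPerl API; with real Bio::Search::Hit objects the sort key would presumably come from $hit->significance instead.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for hit objects: a name plus an e-value.
# These values are made up for illustration only.
my @hits = (
    { name => 'hitA', significance => 2e-10 },
    { name => 'hitB', significance => 1e-50 },
    { name => 'hitC', significance => 0.003 },
);

# Schwartzian transform: compute the sort key once per element,
# sort numerically on that key (ascending e-value), then unwrap.
my @sorted =
    map  { $_->[1] }
    sort { $a->[0] <=> $b->[0] }
    map  { [ $_->{significance}, $_ ] }
    @hits;

print join(',', map { $_->{name} } @sorted), "\n";   # prints hitB,hitA,hitC
```

The benefit over a bare sort block is that the key is looked up once per hit rather than once per comparison, which matters mainly when fetching the key is expensive.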
From cjfields at uiuc.edu Fri Oct 6 11:52:57 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 6 Oct 2006 10:52:57 -0500 Subject: [Bioperl-l] Sort blast file result according to evalues In-Reply-To: <452675D2.9090803@sendu.me.uk> References: <20061006132249.49450.qmail@web51711.mail.yahoo.com> <452675D2.9090803@sendu.me.uk> Message-ID: <31A6FC3A-8BEB-42B8-B51D-66E659EF7495@uiuc.edu> On Oct 6, 2006, at 10:27 AM, Sendu Bala wrote: >> If you wanted to run a sort yourself you could use a sort block using >> '{$a->significance() <=> $b->significance()} @hits' > > Actually, it is best to use the sort_hits() method of the result > object prior to asking for any hits. (As this allows for potential > optimization in the parser.) Ah, forgot about that one! Chris Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Fri Oct 6 14:36:49 2006 From: jason at bioperl.org (Jason Stajich) Date: Fri, 6 Oct 2006 11:36:49 -0700 Subject: [Bioperl-l] tempfile cleanup In-Reply-To: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu> References: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu> Message-ID: <0FCEC6B2-E190-4800-AAB1-89559C552FA6@bioperl.org> I think the reason for the magic trickery in the cleanup is that File::Temp only removes tempfiles when Perl exits, not when the Root::IO object goes out of scope, so this can be a problem for people running CGI scripts that stay resident in memory and never have their tempfiles cleaned up. Managing the list ourselves allows us to call _cleanup periodically (perhaps before the start of every Blast run) to ensure that tempfiles are removed. Perhaps newer File::Temp versions can solve this better now, but I believe that was the behavior we were trying to deal with by having the Root::IO object manage the list of to-be-deleted files. This is some hackery that also had to do with not expecting File::Temp to be installed, I believe.
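As a possible illustration of the scope-based cleanup wished for above, here is a small sketch using the object interface of File::Temp, where the file is removed when the object is destroyed rather than at program exit. This assumes a File::Temp version that provides the object interface; it is not how Bio::Root::IO is implemented, and the variable names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp;

my $path;
{
    # Object interface: the temp file is tied to the lifetime of $tmp.
    my $tmp = File::Temp->new( UNLINK => 1 );
    $path = $tmp->filename;
    print {$tmp} "scratch data\n";
    print "exists in scope: ", ( -e $path ? "yes" : "no" ), "\n";
}
# $tmp has gone out of scope here, so its destructor has removed the file.
print "exists after scope: ", ( -e $path ? "yes" : "no" ), "\n";
```

In a long-running process this would mean cleanup happens whenever the owning object is released, without keeping a separate list of files to unlink.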
-jason From torsten.seemann at infotech.monash.edu.au Mon Oct 9 00:52:29 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Mon, 09 Oct 2006 14:52:29 +1000 Subject: [Bioperl-l] Multiple packages in the one .pm file Message-ID: <4529D58D.1080004@infotech.monash.edu.au> Hi all, The following modules have more than one "package xxxx;" declaration in them. For small, internal classes I guess this is fine, but for others, they should be split up into the filesystem - otherwise they are troublesome to locate and the online documentation doesn't list them! eg. bioperl-run/Bio/Tools/Run/Analysis/Job.pm is in bioperl-run/Bio/Tools/Run/Analysis.pm Here's the culprits: % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | sed 's/:.*$//' | sort | uniq -d ; done bioperl-live/Bio/AnalysisI.pm bioperl-live/Bio/DB/Fasta.pm bioperl-live/Bio/DB/GFF.pm bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm bioperl-live/Bio/DB/SeqFeature/Store/memory.pm bioperl-live/Bio/SeqIO/interpro.pm bioperl-run/Bio/Tools/Run/Analysis.pm bioperl-run/Bio/Tools/Run/Analysis/soap.pm -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From pmiguel at purdue.edu Mon Oct 9 15:57:12 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Mon, 09 Oct 2006 15:57:12 -0400 Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC? Message-ID: <452AA998.5010104@purdue.edu> I found a bug in Bio::SeqIO::phd and am wondering if the fix will propagate into the next release candidate? The bug is here: http://bugzilla.open-bio.org/show_bug.cgi?id=2120 I also created a patch that fixes it (on my machine, anyway). It is a fairly minor change, so it seems like it would be worth propagating it into the next release candidate. 
-- Phillip SanMiguel From bix at sendu.me.uk Mon Oct 9 16:57:28 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 09 Oct 2006 21:57:28 +0100 Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC? In-Reply-To: <452AA998.5010104@purdue.edu> References: <452AA998.5010104@purdue.edu> Message-ID: <452AB7B8.4040404@sendu.me.uk> Phillip San Miguel wrote: > I found a bug in Bio::SeqIO::phd and am wondering if the fix will > propagate into the next release candidate? > > The bug is here: > > http://bugzilla.open-bio.org/show_bug.cgi?id=2120 > > I also created a patch that fixes it (on my machine, anyway). It is a > fairly minor change, so it seems like it would be worth propagating it > into the next release candidate. If it gets committed to HEAD before I make the next candidate, then yes. I'll do that if no one beats me to it (and if someone does, please add a new test for this). BTW Phillip, thank you for the bug report, but in future please use the attachment capabilities for files rather than pasting them into the comments box. From bix at sendu.me.uk Mon Oct 9 17:01:56 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 09 Oct 2006 22:01:56 +0100 Subject: [Bioperl-l] Analysis soap problem Message-ID: <452AB8C4.1010704@sendu.me.uk> I thought I'd 'advertise' this bug on the list so more people see it: http://bugzilla.open-bio.org/show_bug.cgi?id=2117 I don't want to make the next 1.5.2 release candidate until it's fixed. Does anyone have any idea about it? Even if you can't fix it, just explaining what's supposed to be going on would help a lot. Thank you, Sendu.
From Kevin.M.Brown at asu.edu Mon Oct 9 18:40:54 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Mon, 9 Oct 2006 15:40:54 -0700 Subject: [Bioperl-l] Analysis soap problem Message-ID: <1A4207F8295607498283FE9E93B775B40219690B@EX02.asurite.ad.asu.edu> If I had to guess from looking at the snippet provided, the variable $seq holds no data so when you try to setup the regex /^$seq$/ you end up with /^$/ (blank line) and the warning. > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Monday, October 09, 2006 2:02 PM > To: bioperl-l List > Subject: [Bioperl-l] Analysis soap problem > > I thought I'd 'advertise' this bug on the list so more people see it: > http://bugzilla.open-bio.org/show_bug.cgi?id=2117 > > I don't want to make the next 1.5.2 release candidate until > its fixed. > Does anyone have any idea about it? Even if you can't fix it, just > explaining what's (supposed) to be going on would help a lot. > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Mon Oct 9 22:34:23 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 9 Oct 2006 21:34:23 -0500 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <452AB8C4.1010704@sendu.me.uk> References: <452AB8C4.1010704@sendu.me.uk> Message-ID: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> I have 'fixed' this in CVS. Note the quotes; it depends on what you might consider fixed. Multiple calls to results() were returning empty hash refs, so no data was being returned. For now, I stored the hash reference in a variable then tested each one. All tests now pass, including the 'outseq' one. Maybe it's just me, but shouldn't results() either consistently return the same information, or contain documentation that it doesn't do so? 
Anyway, I have left the bugzilla report open for now. Chris On Oct 9, 2006, at 4:01 PM, Sendu Bala wrote: > I thought I'd 'advertise' this bug on the list so more people see it: > http://bugzilla.open-bio.org/show_bug.cgi?id=2117 > > I don't want to make the next 1.5.2 release candidate until its fixed. > Does anyone have any idea about it? Even if you can't fix it, just > explaining what's (supposed) to be going on would help a lot. > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bosborne11 at verizon.net Mon Oct 9 22:09:45 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 09 Oct 2006 22:09:45 -0400 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au> Message-ID: Torsten, Fixed interpro.pm, it could have been written more simply (or more like other SeqIO modules). Can't really address the others. Brian O. On 10/9/06 12:52 AM, "Torsten Seemann" wrote: > Hi all, > > The following modules have more than one "package xxxx;" declaration in > them. For small, internal classes I guess this is fine, but for others, > they should be split up into the filesystem - otherwise they are > troublesome to locate and the online documentation doesn't list them! > > eg. 
> bioperl-run/Bio/Tools/Run/Analysis/Job.pm > is in > bioperl-run/Bio/Tools/Run/Analysis.pm > > Here's the culprits: > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | > sed 's/:.*$//' | sort | uniq -d ; done > > bioperl-live/Bio/AnalysisI.pm > bioperl-live/Bio/DB/Fasta.pm > bioperl-live/Bio/DB/GFF.pm > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > bioperl-live/Bio/SeqIO/interpro.pm > > bioperl-run/Bio/Tools/Run/Analysis.pm > bioperl-run/Bio/Tools/Run/Analysis/soap.pm From bix at sendu.me.uk Tue Oct 10 03:03:20 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 08:03:20 +0100 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> References: <452AB8C4.1010704@sendu.me.uk> <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> Message-ID: <452B45B8.8010401@sendu.me.uk> Chris Fields wrote: > I have 'fixed' this in CVS. Note the quotes; it depends on what you > might consider fixed. Multiple calls to results() were returning > empty hash refs, so no data was being returned. For now, I stored > the hash reference in a variable then tested each one. All tests now > pass, including the 'outseq' one. > > Maybe it's just me, but shouldn't results() either consistently > return the same information, or contain documentation that it doesn't > do so? Anyway, I have left the bugzilla report open for now. Judging by the tests there seems a clear expectation that multiple calls to results() should work, and certainly that makes sense and seems natural. So I'd say that results() should be fixed and the test script reverted. 
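The expectation that repeated calls to results() return the same data could be met by caching the first fetch. The sketch below shows that general pattern only: My::Job and its fetch callback are hypothetical names invented here, and nothing is implied about how Bio::Tools::Run::Analysis actually retrieves results or what caused the bug.

```perl
use strict;
use warnings;

package My::Job;    # hypothetical class, not part of BioPerl

sub new {
    my ( $class, %args ) = @_;
    # 'fetch' is an assumed callback that retrieves results once.
    return bless { fetch => $args{fetch}, cache => undef }, $class;
}

sub results {
    my ($self) = @_;
    # Fetch only on the first call; later calls reuse the cached hash ref.
    $self->{cache} = $self->{fetch}->() unless defined $self->{cache};
    return $self->{cache};
}

package main;

my $calls = 0;
my $job   = My::Job->new( fetch => sub { $calls++; return { outseq => 'ACGT' } } );

my $first  = $job->results;
my $second = $job->results;

printf "same ref: %s, fetches: %d\n", ( $first == $second ? 'yes' : 'no' ), $calls;
```

With this shape a test script can call results() as often as it likes, which is the behaviour the existing tests appear to assume.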
From cjfields at uiuc.edu Tue Oct 10 07:42:33 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 06:42:33 -0500 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <452B45B8.8010401@sendu.me.uk> References: <452AB8C4.1010704@sendu.me.uk> <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> <452B45B8.8010401@sendu.me.uk> Message-ID: I agree, though I think Martin Senger should be contacted, at least to get his thoughts. Has anyone tried yet? Chris On Oct 10, 2006, at 2:03 AM, Sendu Bala wrote: > Chris Fields wrote: >> I have 'fixed' this in CVS. Note the quotes; it depends on what you >> might consider fixed. Multiple calls to results() were returning >> empty hash refs, so no data was being returned. For now, I stored >> the hash reference in a variable then tested each one. All tests now >> pass, including the 'outseq' one. >> >> Maybe it's just me, but shouldn't results() either consistently >> return the same information, or contain documentation that it doesn't >> do so? Anyway, I have left the bugzilla report open for now. > > Judging by the tests there seems a clear expectation that multiple > calls > to results() should work, and certainly that makes sense and seems > natural. So I'd say that results() should be fixed and the test script > reverted. > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Tue Oct 10 08:14:31 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 13:14:31 +0100 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: References: <452AB8C4.1010704@sendu.me.uk> <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> <452B45B8.8010401@sendu.me.uk> Message-ID: <452B8EA7.1080800@sendu.me.uk> Chris Fields wrote: > I agree, though I think Martin Senger should be contacted, at least to > get his thoughts. Has anyone tried yet? He's CCd on the bug report, but I haven't tried directly, no. Do you want to tackle this (contacting him and/or fixing the bug)? Cheers, Sendu. From cjfields at uiuc.edu Tue Oct 10 09:20:03 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 08:20:03 -0500 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <452B8EA7.1080800@sendu.me.uk> Message-ID: <001801c6ec6e$cc016900$15327e82@pyrimidine> I'll try giving it a closer look, just didn't have much time yesterday. I'll also try contacting Martin. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Tuesday, October 10, 2006 7:15 AM > To: bioperl-l > Subject: Re: [Bioperl-l] Analysis soap problem > > Chris Fields wrote: > > I agree, though I think Martin Senger should be contacted, at least to > > get his thoughts. Has anyone tried yet? > > He's CCd on the bug report, but I haven't tried directly, no. Do you > want to tackle this (contacting him and/or fixing the bug)? > > Cheers, > Sendu. 
From pmiguel at purdue.edu Tue Oct 10 10:26:35 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Tue, 10 Oct 2006 10:26:35 -0400 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452AB7B8.4040404@sendu.me.uk> References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk> Message-ID: <452BAD9B.5010903@purdue.edu> Sendu Bala wrote: > > BTW Phillip, thank you for the bug report but in future use the > attachment capabilities for files, please don't paste them into the > comments box. > Sendu, Sounds reasonable to me. I should note, however, that when I entered the bug I was looking for some method to attach files. There is none on the "Enter Bug: Bioperl" page: http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl Also, "bug writing guidelines" makes no mention of it. I vaguely remembered there being some method to do it, but given the "bug writing guidelines" exhortations to be specific and detailed, I thought I must put the information somewhere. So I put the files in the only place offered on that page: "Description:". I see that, once submitted, attachments can be added to a bug report. Is that normally how it is done? Doesn't each attachment result in a separate email to the bioperl guts email list? Anyway, I've just added the files to the bug report as attachments, in case someone needs them to construct a test.
-- Phillip From bix at sendu.me.uk Tue Oct 10 11:10:25 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 16:10:25 +0100 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452BAD9B.5010903@purdue.edu> References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk> <452BAD9B.5010903@purdue.edu> Message-ID: <452BB7E1.5020200@sendu.me.uk> Phillip San Miguel wrote: > Sendu Bala wrote: >> BTW Phillip, thank you for the bug report but in future use the >> attachment capabilities for files, please don't paste them into the >> comments box. >> > Sendu, Sounds reasonable to me. I should note, however; when I > entered the bug, I was looking for some method to attach files. There > is none on the "Enter Bug: Bioperl" page: > > http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl > > Also, "bug writing guidelines" makes no mention of it. I vaguely > remembered there being some method to do it--but given the "bug > writing guidelines" exhortations to be specific and detailed, I > thought I must put the information somewhere. So I put them them the > only place offered (on that page)--"Description:" I agree that things could be better here. Who looks after bugzilla, and is this an alterable feature? > I see that, once submitted, attachments can be added to a bug report. > Is that normally how it is done? Yes, AFAIK. > Doesn't each attachment result in a separate email to the bioperl > guts email list? Yes, but that's not a problem. In fact, doing it this way means you don't email everyone subscribed to guts your big files in plain text, but instead they get a small email with a link to the download. > Anyway, I've just added the files to the bug report as attachments, > in case someone needs them to construct a test. Thank you. 
From arareko at campus.iztacala.unam.mx Tue Oct 10 11:14:00 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Tue, 10 Oct 2006 10:14:00 -0500 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452BAD9B.5010903@purdue.edu> References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk> <452BAD9B.5010903@purdue.edu> Message-ID: <452BB8B8.40409@campus.iztacala.unam.mx> Phillip San Miguel wrote: > I see that, once submitted, attachments can be added to a bug report. > Is that normally how it is done? Yes, it's the normal method: create the bug report, then attach files. > Doesn't each attachment result in a separate email to the bioperl > guts email list? Adding a file will generate an informative email per bug change (attaching the file in this case) but won't send the attachment to the list. Regards, Mauricio. -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From cjfields at uiuc.edu Tue Oct 10 11:20:55 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 10:20:55 -0500 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452BAD9B.5010903@purdue.edu> Message-ID: <002801c6ec7f$ae8d85f0$15327e82@pyrimidine> > Also, "bug writing guidelines" makes no mention of it. I vaguely > remembered there being some method to do it--but given the "bug writing > guidelines" exhortations to be specific and detailed, I thought I must > put the information somewhere. So I put them them the only place offered > (on that page)--"Description:" > I see that, once submitted, attachments can be added to a bug > report. Is that normally how it is done? Doesn't each attachment result > in a separate email to the bioperl guts email list? > Anyway, I've just added the files to the bug report as attachments, > in case someone needs them to construct a test. 
Phillip, Initial bug reports only require the general description, OS used, bioperl version, etc. That's quite normal. Any relevant attachments are added afterward. We should probably make that clearer upfront on the wiki page; I don't know if anyone can make similar changes to bugzilla. Any bug changes, CVS commits, etc are mailed to bioperl-guts, yes. That isn't an issue though; it keeps the developers updated on the various bugs/commits that are going on and is a pretty common practice. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 10 12:48:22 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 11:48:22 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au> References: <4529D58D.1080004@infotech.monash.edu.au> Message-ID: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> There are a number of other bioperl-run examples (the Bio::Tools::Run::Analysis::soap issue I looked into revealed such). I agree with both points, 1) that it depends on the size of the classes, and 2) from a maintainability standpoint, it can be very frustrating when looking for documentation. Is there really any advantage to doing this? Chris On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote: > Hi all, > > The following modules have more than one "package xxxx;" > declaration in > them. For small, internal classes I guess this is fine, but for > others, > they should be split up into the filesystem - otherwise they are > troublesome to locate and the online documentation doesn't list them! > > eg. 
> bioperl-run/Bio/Tools/Run/Analysis/Job.pm > is in > bioperl-run/Bio/Tools/Run/Analysis.pm > > Here's the culprits: > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | > sed 's/:.*$//' | sort | uniq -d ; done > > bioperl-live/Bio/AnalysisI.pm > bioperl-live/Bio/DB/Fasta.pm > bioperl-live/Bio/DB/GFF.pm > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > bioperl-live/Bio/SeqIO/interpro.pm > > bioperl-run/Bio/Tools/Run/Analysis.pm > bioperl-run/Bio/Tools/Run/Analysis/soap.pm > > -- > Dr Torsten Seemann http://www.vicbioinformatics.com > Victorian Bioinformatics Consortium, Monash University, Australia > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign 
From lzhtom at hotmail.com Tue Oct 10 15:42:48 2006 From: lzhtom at hotmail.com (zhihua li) Date: Tue, 10 Oct 2006 19:42:48 +0000 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? Message-ID: Hi netters. I've installed Bioperl 1.5.1, both core and run modules. But when I tried to use the Pise module, an error occured saying that there's no "new" method in this package. 
My script is: use strict; use warnings; use Bio::Tools::Run::AnalysisFactory::Pise; my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); my $program=$factory->program('mfold'); $program->seq('my_input_file'); my $job = $program->run(); print STDERR $job->contect('mfold.out'); The error message I got is: Can't locate object method "new" via package "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load "Bio::Tools::Run::AnalysisFactor::Pise"?) I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm and it DOES contain a sub new. So what's going on? Anyone could give me a hint? Thanks a lot! From cjfields at uiuc.edu Tue Oct 10 16:27:27 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 15:27:27 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Message-ID: Makes sense to me. I think, as long as they're documented, it shouldn't be a problem. I think the main point is that the class methods for these don't show up using perldoc (something I ran into with Bio::DB::Fasta's inclusion of Bio::PrimarySeq::Fasta), but they do show up when using other documentation. So 'perldoc Bio::DB::Fasta' works, but 'perldoc Bio::PrimarySeq::Fasta' doesn't. So these can be problematic when looking for specific methods. However, I think pod2html handles multiple package declarations in one module, and the PDOC online do as well. Does the Deobfuscator? 
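For anyone trying to track down such embedded classes, here is a minimal sketch (not part of BioPerl) that lists every package declared in a piece of Perl source. That is one way to work out which .pm file to open when perldoc cannot find a class by name. The My::DB::Store names are made-up stand-ins, and the regex is only a heuristic that does not skip POD or here-docs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every package name declared in a chunk of Perl source.
# Heuristic only: it matches lines that start with "package Name;"
# and makes no attempt to skip POD blocks or here-docs.
sub declared_packages {
    my ($source) = @_;
    my @names;
    while ( $source =~ /^\s*package\s+([\w:]+)\s*;/mg ) {
        push @names, $1;
    }
    return @names;
}

# Hypothetical stand-in for a module file that embeds a small iterator class.
my $source = <<'END_SOURCE';
package My::DB::Store;
sub new { return bless {}, shift }

package My::DB::Store::Iterator;
sub new  { return bless { items => $_[1] }, $_[0] }
sub next { return shift @{ $_[0]{items} } }

1;
END_SOURCE

# Print one declared package per line.
my @packages = declared_packages($source);
print "$_\n" for @packages;
```

Reading a real module file into a string and passing it to declared_packages() would work the same way; any name it reports beyond the first is a class whose documentation lives in the host file's POD.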
Chris On Oct 10, 2006, at 3:11 PM, Lincoln Stein wrote: > Hi, > > These ones are all mine: > > > bioperl-live/Bio/DB/Fasta.pm > > bioperl-live/Bio/DB/GFF.pm > > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > > In each case, the second modules are teeny tiny ones that implement > iterators which are at most two methods long (typically a new() and > a next()). I prefer not to split them out because they will just > clutter up the file tree with stuff that is already well documented > in the "parent ship" modules. > > Lincoln > > > On 10/10/06, Chris Fields wrote: There are a > number of other bioperl-run examples (the > Bio::Tools::Run::Analysis::soap issue I looked into revealed such). > > I agree with both points, 1) that it depends on the size of the > classes, and 2) from a maintainability standpoint, it can be very > frustrating when looking for documentation. Is there really any > advantage to doing this? > > Chris > > On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote: > > > Hi all, > > > > The following modules have more than one "package xxxx;" > > declaration in > > them. For small, internal classes I guess this is fine, but for > > others, > > they should be split up into the filesystem - otherwise they are > > troublesome to locate and the online documentation doesn't list > them! > > > > eg. 
> > bioperl-run/Bio/Tools/Run/Analysis/Job.pm > > is in > > bioperl-run/Bio/Tools/Run/Analysis.pm > > > > Here's the culprits: > > > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/ > Bio | > > sed 's/:.*$//' | sort | uniq -d ; done > > > > bioperl-live/Bio/AnalysisI.pm > > bioperl-live/Bio/DB/Fasta.pm > > bioperl-live/Bio/DB/GFF.pm > > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > > bioperl-live/Bio/SeqIO/interpro.pm > > > > bioperl-run/Bio/Tools/Run/Analysis.pm > > bioperl-run/Bio/Tools/Run/Analysis/soap.pm > > > > -- > > Dr Torsten Seemann http://www.vicbioinformatics.com > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > (516) 367-8380 (voice) > (516) 367-8389 (fax) > FOR URGENT MESSAGES & SCHEDULING, > PLEASE CONTACT MY ASSISTANT, > SANDRA MICHELSEN, AT michelse at cshl.edu Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 10 16:30:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 15:30:16 -0500 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? 
In-Reply-To: References: Message-ID: <870B7500-AA83-42D7-965B-865B91AA8E7F@uiuc.edu> On Oct 10, 2006, at 2:42 PM, zhihua li wrote: > Hi netters. > > I've installed Bioperl 1.5.1, both core and run modules. But when > I tried to use the Pise module, an error occured saying that > there's no "new" method in this package. > > My script is: > > use strict; > use warnings; > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); > my $program=$factory->program('mfold'); > $program->seq('my_input_file'); > my $job = $program->run(); > print STDERR $job->contect('mfold.out'); > > The error message I got is: > > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) > > I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/ > Pise.pm and it DOES contain a sub new. > > So what's going on? Anyone could give me a hint? > > Thanks a lot! Well, according to your error output you have AnalysisFactory misspelled ('AnalysisFactor'), which should tell you what the problem is. Look for the same thing in your script. Chris > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Tue Oct 10 16:43:06 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 21:43:06 +0100 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? In-Reply-To: References: Message-ID: <452C05DA.5050803@sendu.me.uk> zhihua li wrote: > Hi netters. > > I've installed Bioperl 1.5.1, both core and run modules. But when I > tried to use the Pise module, an error occured saying that there's no > "new" method in this package. 
> > My script is: > > use strict; > use warnings; > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); > my $program=$factory->program('mfold'); > $program->seq('my_input_file'); > my $job = $program->run(); > print STDERR $job->contect('mfold.out'); > > The error message I got is: > > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) > > I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm > and it DOES contain a sub new. > > So what's going on? Anyone could give me a hint? You have a typo. Bio::Tools::Run::AnalysisFactory::Pise, not Bio::Tools::Run::AnalysisFactor::Pise From lincoln.stein at gmail.com Tue Oct 10 16:11:00 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 10 Oct 2006 16:11:00 -0400 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> Message-ID: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Hi, These ones are all mine: > bioperl-live/Bio/DB/Fasta.pm > bioperl-live/Bio/DB/GFF.pm > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm In each case, the second modules are teeny tiny ones that implement iterators which are at most two methods long (typically a new() and a next()). I prefer not to split them out because they will just clutter up the file tree with stuff that is already well documented in the "parent ship" modules. Lincoln On 10/10/06, Chris Fields wrote: > > There are a number of other bioperl-run examples (the > Bio::Tools::Run::Analysis::soap issue I looked into revealed such). 
> > I agree with both points, 1) that it depends on the size of the > classes, and 2) from a maintainability standpoint, it can be very > frustrating when looking for documentation. Is there really any > advantage to doing this? > > Chris > > On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote: > > > Hi all, > > > > The following modules have more than one "package xxxx;" > > declaration in > > them. For small, internal classes I guess this is fine, but for > > others, > > they should be split up into the filesystem - otherwise they are > > troublesome to locate and the online documentation doesn't list them! > > > > eg. > > bioperl-run/Bio/Tools/Run/Analysis/Job.pm > > is in > > bioperl-run/Bio/Tools/Run/Analysis.pm > > > > Here's the culprits: > > > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | > > sed 's/:.*$//' | sort | uniq -d ; done > > > > bioperl-live/Bio/AnalysisI.pm > > bioperl-live/Bio/DB/Fasta.pm > > bioperl-live/Bio/DB/GFF.pm > > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > > bioperl-live/Bio/SeqIO/interpro.pm > > > > bioperl-run/Bio/Tools/Run/Analysis.pm > > bioperl-run/Bio/Tools/Run/Analysis/soap.pm > > > > -- > > Dr Torsten Seemann http://www.vicbioinformatics.com > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From asjo at koldfront.dk Tue Oct 10 16:04:35 2006 From: asjo at koldfront.dk (Adam Sjøgren) Date: Tue, 10 Oct 2006 22:04:35 +0200 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? References: Message-ID: <871wpglyy4.fsf@topper.koldfront.dk> On Tue, 10 Oct 2006 19:42:48 +0000, zhihua wrote: > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); ^ y [...] > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) You missed a 'y' in "Factory". Best wishes, -- "We've reached a special place... Spiritually... ecumenically... grammatically." Adam Sjøgren asjo at koldfront.dk From dmessina at wustl.edu Tue Oct 10 17:08:45 2006 From: dmessina at wustl.edu (David Messina) Date: Tue, 10 Oct 2006 16:08:45 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Message-ID: > However, I think pod2html handles multiple package declarations in > one module, and the PDOC online do as well. Does the Deobfuscator? Nope. From my cursory examination at the time they mostly were, as Lincoln said, short and sweet, so I didn't consider it a big deal. I do think the Deobfuscator should theoretically handle such cases anyway, though. I'll add it as a feature request on the wiki page. Or if you're chomping at the bit for it, I could certainly be beer-suaded to do it sooner rather than later... 
:) Dave From cjfields at uiuc.edu Tue Oct 10 17:33:39 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 16:33:39 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Message-ID: <7F35F565-7D28-4B06-A501-4D4083652C5C@uiuc.edu> Me? I'm a lowly postdoc. Lincoln's got the cash! Chris On Oct 10, 2006, at 4:08 PM, David Messina wrote: >> However, I think pod2html handles multiple package declarations in >> one module, and the PDOC online do as well. Does the Deobfuscator? > > Nope. From my cursory examination at the time they mostly were, as > Lincoln said, short and sweet, so I didn't consider it a big deal. > > I do think the Deobfuscator should theoretically handle such cases > anyway, though. I'll add it as a feature request on the wiki page. > Or if you're chomping at the bit for it, I could certainly be beer- > suaded to do it sooner rather than later... :) > > Dave > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From sdavis2 at mail.nih.gov Wed Oct 11 05:43:35 2006 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Wed, 11 Oct 2006 05:43:35 -0400 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? In-Reply-To: References: Message-ID: <452CBCC7.30108@mail.nih.gov> zhihua li wrote: > Hi netters. > > I've installed Bioperl 1.5.1, both core and run modules. But when I > tried to use the Pise module, an error occured saying that there's no > "new" method in this package. 
> > My script is: > > use strict; > use warnings; > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); > my $program=$factory->program('mfold'); > $program->seq('my_input_file'); > my $job = $program->run(); > print STDERR $job->contect('mfold.out'); > > The error message I got is: > > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) > > I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm > and it DOES contain a sub new. > > So what's going on? Anyone could give me a hint? > > Thanks a lot! The module name is Bio::Tools::Run::AnalysisFactory::Pise. Note that it is not "factor" but "factory". That should probably fix your problem. Sean From jay at jays.net Sat Oct 7 18:34:23 2006 From: jay at jays.net (Jay Hannah) Date: Sat, 07 Oct 2006 17:34:23 -0500 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult Message-ID: <45282B6F.1030308@jays.net> I just updated my bioperl-live this morning, so I think I'm current. :) perldoc Bio::Search::Result::GenericResult ------------ SYNOPSIS # typically one gets Results from a SearchIO stream use Bio::SearchIO; my $io = new Bio::SearchIO(-format => 'blast', -file => 't/data/HUMBETGLOA.tblastx'); while( my $result = $io->next_result) { # process all search results within the input stream while( my $hit = $result->next_hits()) { ------------- Except that "next_hits()" does not exist. Should be "next_hit()". (Should I have posted a patch instead?) Thanks, j From bosborne11 at verizon.net Tue Oct 10 18:42:25 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 10 Oct 2006 18:42:25 -0400 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult In-Reply-To: <45282B6F.1030308@jays.net> Message-ID: j, No need, not for something so simple. Brian O. 
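With the method name corrected, the loop from that SYNOPSIS can be sketched roughly as below. The helper hit_names() is a hypothetical name introduced only for this example, not something BioPerl provides. The report path is an assumption supplied on the command line, and Bio::SearchIO is loaded only when a path is actually given.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hit_names() is a hypothetical helper, not a BioPerl function. It walks a
# SearchIO-style stream and collects the name of every hit in every result.
sub hit_names {
    my ($io) = @_;
    my @names;
    while ( my $result = $io->next_result ) {
        # next_hit(), singular: this is the call the SYNOPSIS misspells
        while ( my $hit = $result->next_hit ) {
            push @names, $hit->name;
        }
    }
    return @names;
}

# Usage (path is an example): perl hit_names.pl t/data/HUMBETGLOA.tblastx
if (@ARGV) {
    require Bio::SearchIO;
    my $io = Bio::SearchIO->new( -format => 'blast', -file => $ARGV[0] );
    print "$_\n" for hit_names($io);
}
```

Keeping the loop in a function that only relies on next_result(), next_hit() and name() means it can be exercised with stand-in objects when no BLAST report is at hand.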
On 10/7/06 6:34 PM, "Jay Hannah" wrote: > Except that "next_hits()" does not exist. Should be "next_hit()". > > (Should I have posted a patch instead?) From zchou at cau.edu.cn Wed Oct 11 02:34:24 2006 From: zchou at cau.edu.cn (zhuocheng Hou) Date: Wed, 11 Oct 2006 14:34:24 +0800 Subject: [Bioperl-l] about retreive alinged sequence Message-ID: <000a01c6ecff$4ea4b2f0$0915020a@zchou> Hello, everyone, I am a new user of Bioperl. I want to align multiple DNA sequences based on their translated proteins by using clustalw. However, I don't know how to retrieve the aligned sequences. The code is as follows (from the PAML HOWTO): # # This code runs and prints the clustalw output to the screen ....... my $aa_aln = $aln_factory->align(\@prots,@params); # project the protein alignment back to CDS coordinates my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs); my @each = $dna_aln->each_seq(); # The following code was written by me. However, it doesn't work. I want to get the $dna_aln contents and write them to a file. my $in = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta'); my $aln=$dna_aln; my $out = Bio::AlignIO->new(-file => ">out.msf" , -format => 'msf'); #print $out $_ while <$in>; while ($aln = $in->next_aln() ) { my $out->write_aln($aln); } Best regards, Zhuocheng CAU From n.haigh at sheffield.ac.uk Wed Oct 11 10:00:33 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 11 Oct 2006 15:00:33 +0100 Subject: [Bioperl-l] about retreive alinged sequence In-Reply-To: <000a01c6ecff$4ea4b2f0$0915020a@zchou> References: <000a01c6ecff$4ea4b2f0$0915020a@zchou> Message-ID: <452CF901.6020409@sheffield.ac.uk> Dear Zhuocheng I'm not familiar with the aa_to_dna_aln method but it appears from your code that it returns an alignment object. Please find comments inserted below - hope they help! Nathan zhuocheng Hou wrote: > Hello,everyone, > > I am a new user of Bioperl. I want to align mutiple DNA sequences based on translated proteins by using clustalw. 
However, I don't know how to retreive aligned sequences out. > > The codes as follows (from the tutorials of HOWTOPAML): > > # > # These codes run and can find the screen print out of clustalw > ....... > my $aa_aln = $aln_factory->align(\@prots,@params); > # project the protein alignment back to CDS coordinates > my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs); > $dna_aln should be an alignment object (a Bio::SimpleAlign, not a Bio::AlignIO stream), so all you need to do is set up the output stream to write the alignment object, similar to what you wrote below. i.e. my $out = Bio::AlignIO->new(-file => ">out.msf" , -format => 'msf'); Then simply write the input alignment ($dna_aln) to the output stream with this: $out->write_aln($dna_aln); > my @each = $dna_aln->each_seq(); > > # The following codes were writed by me. However, it doesn't work. I want to get teh $dna_aln contents and write it to files. > > > my $in = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta'); > my $aln=$dna_aln; > my $out = Bio::AlignIO->new(-file => ">out.msf" , > -format => 'msf'); > #print $out $_ while <$in>; > while ($aln = $in->next_aln() ) { > my $out->write_aln($aln); > } > > > Best regards, > > Zhuocheng > CAU > > > ------------------------------------------------------------------------ > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From melcher at rescomp.berkeley.edu Wed Oct 11 17:09:17 2006 From: melcher at rescomp.berkeley.edu (Graham Melcher) Date: Wed, 11 Oct 2006 14:09:17 -0700 Subject: [Bioperl-l] Accessing GO through MYSQL? Message-ID: <20061011210917.GA783@rescomp.berkeley.edu> Hey all, Preface: This is my first post to this list, please redirect if my questions belong elsewhere. I need to look up GO ontology information given GO:Accessors, and I have a local mysql db that mirrors the GO db from that website. 
I am not sure if the Bio::Ontology::* libraries were designed to be used in a dynamic, load-as-you-need sort of way, and am wondering how other people have gone about solving this problem. Details follow... Right now I'm using Class::DBI to access the MySQL database, and I have made a new set of classes implementing Bio::Ontology::TermI and Bio::Ontology::RelationshipI which use these Class::DBI objects to access the relevant information in the database on the fly. Unfortunately, I got stuck implementing some of the other Bio::Ontology::*I interfaces, especially Ontology. Making all of these subclasses seems infeasible, or at least enough work that it might already be available somewhere. Are mysql accessors out there, and I just haven't found them, or is Bio::Ontology possibly not the way to go? Alternatively, if I end up having to write this sort of Bio::Ontology - Class::DBI interface, would anyone be interested in it being made generally usable and available? Finally, I just found go-perl, but although I haven't had a lot of time to look into it, it doesn't seem to use mysql either. Thanks! Graham -- Graham Melcher From sdavis2 at mail.nih.gov Thu Oct 12 07:51:14 2006 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Thu, 12 Oct 2006 07:51:14 -0400 Subject: [Bioperl-l] Accessing GO through MYSQL? In-Reply-To: <20061011210917.GA783@rescomp.berkeley.edu> References: <20061011210917.GA783@rescomp.berkeley.edu> Message-ID: <452E2C32.7070502@mail.nih.gov> Graham Melcher wrote: > Finally, I just found go-perl, but although I haven't had a lot of time > to look into it, it doesn't seem to use mysql either. > Yep. Keep going. 
Go-perl and Go-db-perl: http://www.godatabase.org/dev/go-db-perl/doc/go-db-perl-doc.html Sean From hlapp at gmx.net Thu Oct 12 00:44:49 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 12 Oct 2006 00:44:49 -0400 Subject: [Bioperl-l] NESCent Phyloinformatics Hackathon Message-ID: <939B253E-2F87-450A-A277-78B5645D3494@gmx.net> (apologies in advance to those who receive this multiple times) The National Evolutionary Synthesis Center (NESCent) in collaboration with Arlin Stoltzfus (U. Maryland, NIST), Aaron Mackey (GSK), Rutger Vos (UBC), and Mark Holder (FSU) sponsors a Phyloinformatics Hackathon to take place Dec 11-15 in Durham, NC. The (wiki) website with more information and a formal proposal is at https://www.nescent.org/wg_phyloinformatics/ In short, the goal is to leverage the Bio* toolkits to provide the "glue" for evolutionary analyses of various types that depend on automation, interoperability, and data integration. CALL FOR INPUT: The specific objectives are driven by "use cases", that is, specific target problems of interest to evolutionary biologists (click 'Use Cases' at the above website). We invite community input in order to focus efforts on the most urgent or pervasive problems. The wiki for the hackathon allows direct editing of the use cases after registration. You may also upload data files, or add comments to the "Forum" page. Alternatively, send email to hlapp at nescent.org. You may also contact any of the organizers with questions or comments. ATTENDANCE: The hackathon is scheduled for Dec 11-15, 2006 in Durham NC. Space is limited, and attendance is by invitation. If you have not been contacted but desire to attend, please contact Hilmar Lapp (hlapp at nescent.org). 
ORGANIZERS: Hilmar Lapp (NESCent; hlapp at nescent.org) Aaron Mackey (GSK; aaron.j.mackey at gsk.com) Mark Holder (FSU; mholder at scs.fsu.edu) Arlin Stoltzfus (CARB, NIST; arlin.stoltzfus at nist.gov) Todd Vision (NESCent; tjv at bio.unc.edu) Rutger Vos (UBC; rvosa at sfu.ca) From neetisomaiya at gmail.com Thu Oct 12 02:03:20 2006 From: neetisomaiya at gmail.com (neeti somaiya) Date: Thu, 12 Oct 2006 11:33:20 +0530 Subject: [Bioperl-l] need help urgently Message-ID: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> We are using BioPerl to parse a BLAST output file, and then we want to load full alignments into a CLOB column in one of our database tables. We are trying to use sql loader for the same. Anyone has an idea how we can go about it? We have tried loading sequences into CLOB columns using sql loader, and that works fine, but the same syntax when used for loading alignments, is not working. -- -Neeti Even my blood says, B positive From neetisomaiya at gmail.com Thu Oct 12 02:03:20 2006 From: neetisomaiya at gmail.com (neeti somaiya) Date: Thu, 12 Oct 2006 11:33:20 +0530 Subject: [Bioperl-l] need help urgently Message-ID: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> We are using BioPerl to parse a BLAST output file, and then we want to load full alignments into a CLOB column in one of our database tables. We are trying to use sql loader for the same. Anyone has an idea how we can go about it? We have tried loading sequences into CLOB columns using sql loader, and that works fine, but the same syntax when used for loading alignments, is not working. 
From sayali_salodkar at persistent.co.in Thu Oct 12 06:16:34 2006 From: sayali_salodkar at persistent.co.in (Sayali) Date: Thu, 12 Oct 2006 15:46:34 +0530 Subject: [Bioperl-l] regarding polyphred output Message-ID: <006e01c6ede7$7f10bef0$00b1580a@persistent.co.in> Hi, I want to parse the output of polyphred http://droog.gs.washington.edu/PolyPhred.html. Is there a parser already available in Bioperl which would help me in doing the same. Thanks, Sayali DISCLAIMER ========== This e-mail may contain privileged and confidential information which is the property of Persistent Systems Pvt. Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Pvt. Ltd. does not accept any liability for virus infected mails.
From sdavis2 at mail.nih.gov Thu Oct 12 06:40:12 2006 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Thu, 12 Oct 2006 06:40:12 -0400 Subject: [Bioperl-l] need help urgently In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> Message-ID: <200610120640.12250.sdavis2@mail.nih.gov> On Thursday 12 October 2006 02:03, neeti somaiya wrote: > We are using BioPerl to parse a BLAST output file, and then we want to load > full alignments into a CLOB column in one of our database tables. We are > trying to use sql loader for the same. Anyone has an idea how we can go > about it? > We have tried loading sequences into CLOB columns using sql loader, and > that works fine, but the same syntax when used for loading alignments, is > not working. Neeti, You'll need to be a bit more specific about what you are doing. Can you post the code you are using and error messages? Also, what is "sql loader"?
And what database are you trying to use? Sean
From crabtree at tigr.ORG Thu Oct 12 07:28:06 2006 From: crabtree at tigr.ORG (Jonathan Crabtree) Date: Thu, 12 Oct 2006 07:28:06 -0400 Subject: [Bioperl-l] need help urgently In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> Message-ID: <452E26C6.6040800@tigr.org> Hi Neeti- neeti somaiya wrote: > We are using BioPerl to parse a BLAST output file, and then we want to load > full alignments into a CLOB column in one of our database tables. We are > trying to use sql loader for the same. Anyone has an idea how we can go > about it? > This doesn't sound like a BioPerl issue per se, so this list might not be the best venue for your question. Since SQL*Loader is an Oracle utility you may have better luck in a forum frequented by Oracle DBAs and/or general bioinformatics people. (Not that this isn't such a forum, but unless your difficulty is actually being caused by BioPerl, or there's some kind of SQL*Loader wrapper in BioPerl--which I don't think is the case--you run the risk of having people complain that your question doesn't have enough to do with BioPerl.) > We have tried loading sequences into CLOB columns using sql loader, and that > works fine, but the same syntax when used for loading alignments, is not > working. > It's been a while since I've done any work with SQL*Loader, but I'd guess that the reason it works with sequences and not alignments is because there are characters in the alignments (newlines, perhaps?) that SQL*Loader is incorrectly interpreting as either column (field) or row (record) delimiters. You may need to change your flat file encoding to use delimiters other than the defaults (and alter the SQL*Loader control file accordingly.)
As Sean pointed out, however, it's difficult to be much help without seeing an example of a failed input and the corresponding error(s)! One other thing I remember about SQL*Loader (as of Oracle 8-9 or so) is that all the CLOB values had to appear *last* in the SQL*Loader record, at least if you were using variable-length fields. But since you've loaded sequences successfully, I doubt this is the issue. One final thought is that I believe SQL*Loader has an option whereby you can place your LOB values in files external to the main SQL*Loader input file, which sidesteps the field/row delimiter issue completely; you may want to look into this if you're not already loading your Oracle database this way. Jonathan From bix at sendu.me.uk Fri Oct 13 04:56:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 13 Oct 2006 09:56:01 +0100 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au> References: <4521E74E.1040404@infotech.monash.edu.au> Message-ID: <452F54A1.7010908@sendu.me.uk> Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's certainly interface-like, but doesn't follow the normal interface naming convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed WrapperBaseI? Left alone? From cjfields at uiuc.edu Fri Oct 13 08:20:58 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 13 Oct 2006 07:20:58 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <452F54A1.7010908@sendu.me.uk> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> Message-ID: <43CC4E80-8F15-4C83-929D-DDC719360C8F@uiuc.edu> I would say, according to BioPerl convention, it should be renamed WrapperBaseI. It has a few interface-like methods and (importantly) lacks a constructor. Unless someone else out there has other reasoning? Note that this will require lots of bioperl-run changes as well, at least I think it will. 
Chris On Oct 13, 2006, at 3:56 AM, Sendu Bala wrote: > Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's > certainly interface-like, but doesn't follow the normal interface > naming > convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed > WrapperBaseI? Left alone? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From avilella at gmail.com Fri Oct 13 11:26:47 2006 From: avilella at gmail.com (Albert Vilella) Date: Fri, 13 Oct 2006 16:26:47 +0100 Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method Message-ID: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com> Hi all, While using the remove_gaps method in Bio::SimpleAlign I found out that if the alignment is (bad enough for) having no columns without any gap at all, the method will give a: Use of uninitialized value in split at this line in add_seq: map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq); So my idea was to tweak this line to something like: map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || ''); But I am unsure about any other side effects this may have. Anyone? Albert. From cjfields at uiuc.edu Fri Oct 13 11:51:38 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 13 Oct 2006 10:51:38 -0500 Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method In-Reply-To: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com> References: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com> Message-ID: You can check to see if it passes all tests. I'm guessing SimpleAlign.t tests this method out in some way (though it's always safer to check). 
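For reference, a minimal sketch of the kind of guard Albert is proposing, written in plain Perl without Bio::SimpleAlign. The helper name collect_symbols and the sample strings are hypothetical; only the pattern of calling split on a possibly undefined sequence string comes from the thread. An explicit defined() check is used here rather than `|| ''`, because `||` would also replace a sequence string that is merely false, such as "0".

```perl
use strict;
use warnings;

# Hypothetical stand-in for the symbol bookkeeping in add_seq:
# record every character of a sequence string in a hash.
sub collect_symbols {
    my ($symbols, $seq_string) = @_;
    # Treat a missing sequence as empty so split() never sees undef.
    $seq_string = '' unless defined $seq_string;
    $symbols->{$_} = 1 for split //, $seq_string;
    return $symbols;
}

my %symbols;
collect_symbols(\%symbols, 'AC-GT');    # ordinary gapped sequence
collect_symbols(\%symbols, undef);      # sequence left undefined
print join('', sort keys %symbols), "\n";    # prints "-ACGT"
```

Whether an alignment holding an empty sequence should be accepted at all is a separate question that this guard does not answer.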
Chris On Oct 13, 2006, at 10:26 AM, Albert Vilella wrote: > Hi all, > > While using the remove_gaps method in Bio::SimpleAlign I found out > that if the alignment is (bad enough for) having no columns without > any gap at all, the method will give a: > > Use of uninitialized value in split at this line in add_seq: > > map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq); > > So my idea was to tweak this line to something like: > > map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || ''); > > But I am unsure about any other side effects this may have. > > Anyone? > > Albert. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jay at jays.net Fri Oct 13 12:09:16 2006 From: jay at jays.net (Jay Hannah) Date: Fri, 13 Oct 2006 11:09:16 -0500 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult In-Reply-To: References: Message-ID: <452FBA2C.7070003@jays.net> Thanks Brian! My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :) /home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v ---------------------------- revision 1.27 date: 2006/10/10 22:41:46; author: bosborne; state: Exp; lines: +4 -4 next_hit, not next_hits ---------------------------- I'm a simple man who takes great satisfaction in the simple things. :) j Brian Osborne wrote: > j, > > No need, not for something so simple. > > Brian O. > > > On 10/7/06 6:34 PM, "Jay Hannah" wrote: >> Except that "next_hits()" does not exist. Should be "next_hit()". >> >> (Should I have posted a patch instead?) > From jay at jays.net Fri Oct 13 12:24:48 2006 From: jay at jays.net (Jay Hannah) Date: Fri, 13 Oct 2006 11:24:48 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? 
Message-ID: <452FBDD0.2070008@jays.net> So I'm doing the following: 1) Using Bio::SeqIO to read in a genbank file and kick out fasta. 2) Reading that fasta file w/ command line formatdb. 3) Using that output for command line blastall. 4) Using Bio::SearchIO to read the blast results. (If there's a better way, do tell. -grin-) This sequence is working great for nucleotide BLASTing, but I'm stuck on step 1 when trying protein BLAST. my $seq_in = Bio::SeqIO->new( -file => "<...", -format => "genbank", -alphabet => "protein" ); my $seq_out_protein = Bio::SeqIO->new( -file => ">out", -format => 'fasta', -alphabet => 'protein' ); while (my $inseq = $seq_in->next_seq) { $inseq->molecule("protein"); $seq_out_protein->write_seq($inseq); } This creates a nucleotide file "out". Setting -alphabet doesn't seem to do anything. Setting molecule("protein") doesn't seem to do anything either. I was expecting that it would just pull all the CDS strings out of the genbank file and dump those into fasta format? Am I missing something obvious? Thanks, j From bosborne11 at verizon.net Fri Oct 13 12:54:02 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 13 Oct 2006 12:54:02 -0400 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <452FBDD0.2070008@jays.net> Message-ID: Jay, You're looking for the "translation" string in the CDS section, yes? You need to delve a bit into features, the CDS is considered to be a feature of the main or parent nucleotide sequence and the translation is part of CDS feature: http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank Brian O.
On 10/13/06 12:24 PM, "Jay Hannah" wrote: > Am I missing something From bix at sendu.me.uk Fri Oct 13 12:59:46 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 13 Oct 2006 17:59:46 +0100 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult In-Reply-To: <452FBA2C.7070003@jays.net> References: <452FBA2C.7070003@jays.net> Message-ID: <452FC602.3080302@sendu.me.uk> Jay Hannah wrote: > Thanks Brian! > > My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :) > > /home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v > ---------------------------- > revision 1.27 > date: 2006/10/10 22:41:46; author: bosborne; state: Exp; lines: +4 -4 > next_hit, not next_hits > ---------------------------- Congratulations! :D Next it will be two byte corrections and from there, the sky's the limit! :) From hlapp at gmx.net Fri Oct 13 13:28:50 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 13 Oct 2006 13:28:50 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <452F54A1.7010908@sendu.me.uk> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> Message-ID: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> What does the POD (and the code) say about instantiating it? -hilmar On Oct 13, 2006, at 4:56 AM, Sendu Bala wrote: > Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's > certainly interface-like, but doesn't follow the normal interface > naming > convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed > WrapperBaseI? Left alone? 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From jay at jays.net Fri Oct 13 14:56:38 2006 From: jay at jays.net (Jay Hannah) Date: Fri, 13 Oct 2006 13:56:38 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: References: Message-ID: <452FE166.5080405@jays.net> Brian Osborne wrote: > You're looking for the "translation" string in the CDS section, yes? You > need to delve a bit into features, the CDS is considered to be a feature of > the main or parent nucleotide sequence and the translation is part of CDS > feature: > > http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank Yes. Thanks. I "rolled my own" -- I'm now doing this: while (my $inseq = $seq_in->next_seq) { my @features = $inseq->get_SeqFeatures(); foreach my $feat ( @features ) { next unless ($feat->primary_tag eq "CDS"); my @db_xrefs = $feat->annotation->get_Annotations("db_xref"); @db_xrefs = grep { /^GI:/ } @db_xrefs; die "Panic! More than one GI: db_xref?" if (@db_xrefs > 1); die "Panic! No GI: db_xref?" unless (@db_xrefs == 1); my $gi = $db_xrefs[0]; $gi =~ s/^GI://; my @translations = $feat->annotation->get_Annotations("translation"); die "Panic! More than one translation?" if (@translations > 1); my @protein_ids = $feat->annotation->get_Annotations("protein_id"); die "Panic! More than one protein_id?" if (@protein_ids > 1); my @product = $feat->annotation->get_Annotations("product"); die "Panic! More than one product?" if (@product > 1); print ">gi|$gi|gb|$protein_ids[0]|"; print $inseq->id . " $product[0]\n"; print "$translations[0]\n"; } } To generate a homebrew fasta file for a protein BLAST. 
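The header layout Jay prints above could be factored into a small helper. This is only a sketch in plain Perl, with no BioPerl calls: the function name fasta_record, its named arguments, and the sample identifiers are all hypothetical. It assumes the GI number, protein_id, product and translation have already been pulled from the CDS feature as in the loop above, and it wraps the sequence at 60 columns, which is a common convention rather than anything the thread asks for.

```perl
use strict;
use warnings;

# Hypothetical helper: build one FASTA record in the
# ">gi|GI|gb|PROTEIN_ID|LOCUS PRODUCT" layout used in the loop above.
sub fasta_record {
    my (%arg) = @_;
    my $header = sprintf '>gi|%s|gb|%s|%s %s',
        @arg{qw(gi protein_id locus product)};
    # Drop any whitespace, then wrap at 60 residues per line.
    (my $seq = $arg{translation}) =~ s/\s+//g;
    my $body = join "\n", $seq =~ /(.{1,60})/g;
    return "$header\n$body\n";
}

# Made-up values, for illustration only.
print fasta_record(
    gi          => '12345',
    protein_id  => 'AAA00000.1',
    locus       => 'EXAMPLE01',
    product     => 'hypothetical protein',
    translation => 'MKT' x 25,    # 75 residues
);
```

Keeping the formatting in one place makes it easier to change the header later if formatdb turns out to want a different identifier layout.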
I just thought that -alphabet and molecule() would do that stuff for me? What else would "protein" mean in those? Does anyone use -alphabet and/or molecule()? For what? How? Again, here's what I'm talking about: ========== my $seq_out_protein = Bio::SeqIO->new( -file => ">out", -format => 'fasta', -alphabet => 'protein' # No effect? ); while (my $inseq = $seq_in->next_seq) { $inseq->molecule("protein"); # No effect? ========== Thanks, j From bosborne11 at verizon.net Fri Oct 13 17:20:40 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 13 Oct 2006 17:20:40 -0400 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <452FE166.5080405@jays.net> Message-ID: Jay, Yes, people use the -alphabet parameter. If you set it to something then Bioperl will not try to determine whether the sequence is protein, rna, or dna and this is particularly useful when the sequence contains characters that Bioperl would object to (sequences with distasteful characters can be created by various applications, for example, or you might introduce some weird character for some reason). Setting the -alphabet would also speed up Bioperl a bit, for the same reason. Brian O. On 10/13/06 2:56 PM, "Jay Hannah" wrote: > > I just thought that -alphabet and molecule() would do that stuff for me? What > else would "protein" mean in those? From jay at jays.net Sat Oct 14 11:25:05 2006 From: jay at jays.net (Jay Hannah) Date: Sat, 14 Oct 2006 10:25:05 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: References: Message-ID: <45310151.5050901@jays.net> Brian Osborne wrote: > Yes, people use the -alphabet parameter. 
If you set it to something then > Bioperl will not try to determine whether the sequence is protein, rna, or > dna and this is particularly useful when the sequence contains characters > that Bioperl would object to (sequences with distasteful characters can be > created by various applications, for example, or you might introduce some > weird character for some reason). Setting the -alphabet would also speed up > Bioperl a bit, for the same reason. Huh. That's what I assumed when I stumbled into the -alphabet parameter. So I thought this would read the protein sequences out of my genbank file and write a fasta file for me: my $seq_in = Bio::SeqIO->new( -file => "<$file", -format => "genbank", -alphabet => "protein" # No effect? ); my $seq_out = Bio::SeqIO->new( -file => ">$outfile", -format => "fasta", -alphabet => "protein" # No effect? ); while (my $inseq = $seq_in->next_seq) { $inseq->molecule("protein"); # No effect? $seq_out->write_seq($inseq); } It didn't. Would it be a Good Thing if it did what I was expecting? (Like I said I rolled my own, but I'm always looking for ways to enhance BioPerl that other people might find useful... Someday I will contribute something useful, by golly. -grin-) (Background: I'm doing protein BLASTs from genbank files. To make formatdb happy I have to have fasta files full of the protein sequences.) j From bosborne11 at verizon.net Sat Oct 14 14:40:21 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Sat, 14 Oct 2006 14:40:21 -0400 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <45310151.5050901@jays.net> Message-ID: Jay, What you expected was that setting the -alphabet to "protein" would make Bioperl translate the input nucleotide sequence to output protein. In Bioperl this is accomplished by using the translate() method, no surprise there. 
If you take a look at the documentation on translate() in the online Bioperl Tutorial you'll see that this is a fairly sophisticated method, you can do all sorts of different things with it. So using -alphabet for this purpose won't really work, there are too many different ways to translate. Brian O. On 10/14/06 11:25 AM, "Jay Hannah" wrote: > Would it be a Good Thing if it did what I was expecting? From cjfields at uiuc.edu Sat Oct 14 20:44:04 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 14 Oct 2006 19:44:04 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <45310151.5050901@jays.net> Message-ID: <000601c6eff3$084663c0$15327e82@pyrimidine> ... > Huh. That's what I assumed when I stumbled into the -alphabet parameter. > So I thought this would read the protein sequences out of my genbank file > and write a fasta file for me: You have to think about it this way: the GenBank record you are using is for the nucleotide sequence only, and all other information in that record describes the sequence. Similarly, if you used a 'GenPept' sequence, the focus would be the protein sequence. Both normally contain annotations which describe the sequence globally, such as references, organism info, etc. Both also may contain features (or SeqFeatures), which describe a feature bound to a particular location on the sequence. However, features are not an absolute requirement for a sequence; they're sort of 'window dressing', albeit almost always essential for describing the main sequence. I would do exactly as Brian suggests. See the Feature/Annotation HOWTO for ideas on how to screen out the particular features you want and either grab the 'translation' tag data or get the sequence object from the feature and translate it directly. You should get the same result either way though getting the tag may be faster. ... > It didn't. Would it be a Good Thing if it did what I was expecting? 
(Like > I said I rolled my own, but I'm always looking for ways to enhance BioPerl > that other people might find useful... Someday I will contribute something > useful, by golly. -grin-) > > (Background: I'm doing protein BLASTs from genbank files. To make formatdb > happy I have to have fasta files full of the protein sequences.) > > j You could, theoretically, write up a method to only retrieve features which correspond to coding regions only (CDS). You may want to optionally screen out pseudogenes but that's up to you. Chris From avilella at gmail.com Sun Oct 15 07:08:23 2006 From: avilella at gmail.com (Albert Vilella) Date: Sun, 15 Oct 2006 12:08:23 +0100 Subject: [Bioperl-l] no_residues test in SimpleAlign.t Message-ID: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com> Hi all, Can somebody check the SimpleAlign.t test? perl t/SimpleAlign.t I get a few errors, I am looking at one that deals with no_residues. I don't understand if this is suposed to work: sub no_residues { my $self = shift; my $count = 0; foreach my $seq ($self->each_seq) { my $str = $seq->seq(); $count += ($str =~ s/[^A-Za-z]//g); #is this the same as: # $str =~ s/[^A-Za-z]//g; # $count += length($str); } Cheers, Albert. return $count; } From cjfields at uiuc.edu Sun Oct 15 13:53:50 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 15 Oct 2006 12:53:50 -0500 Subject: [Bioperl-l] no_residues test in SimpleAlign.t In-Reply-To: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com> References: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com> Message-ID: Albert, I get all 75 tests passing. SimpleAlign.t was recently switched over to Test::More, so you should be seeing more explicit test descriptions. It looks like test 27 is no_residues(). Were there any more that failed? I usually run 'perl -I. t/test.t' from the main bioperl directory to check individual tests from the local directory. 
Otherwise you are checking your installed version which may be older (and may not match tests and recent bug fixes). Could that be the problem? Chris On Oct 15, 2006, at 6:08 AM, Albert Vilella wrote: > Hi all, > > Can somebody check the SimpleAlign.t test? > > perl t/SimpleAlign.t > > I get a few errors, I am looking at one that deals with no_residues. I > don't understand if this is suposed to work: > > sub no_residues { > my $self = shift; > my $count = 0; > > foreach my $seq ($self->each_seq) { > my $str = $seq->seq(); > > $count += ($str =~ s/[^A-Za-z]//g); > #is this the same as: > # $str =~ s/[^A-Za-z]//g; > # $count += length($str); > } > > Cheers, > > Albert. > return $count; > } > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From DGroskreutz at twt.com Mon Oct 16 02:00:39 2006 From: DGroskreutz at twt.com (DGroskreutz at twt.com) Date: Mon, 16 Oct 2006 01:00:39 -0500 Subject: [Bioperl-l] CN=Deb Groskreutz/OU=MSN/O=TWT is out of the office. Message-ID: I will be out of the office starting 10/13/2006 and will not return until 10/30/2006. I will be out of the office until October 30, 2006. I will reply to your message at that time. Thanks, Deb NOTICE OF CONFIDENTIALITY: The information contained in this communication, including attachments, is intended for the specific delivery to and use by the individual(s) to whom it is addressed. This email includes confidential information that may be attorney-client privileged. Any review, retransmission, dissemination, or unauthorized use of this communication is strictly prohibited and may be unlawful. 
If you have received this communication in error, please reply to the sender immediately and delete the original communication and any copy of it from your computer system, including all attachments. From bix at sendu.me.uk Mon Oct 16 04:08:34 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 09:08:34 +0100 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> Message-ID: <45333E02.9070808@sendu.me.uk> Hilmar Lapp wrote: > What does the POD (and the code) say about instantiating it? =head1 SYNOPSIS # do not use this object directly, it provides the following methods # for its subclasses ... =head1 DESCRIPTION This is a basic module from which to build executable wrapper modules. It has some basic methods to help when implementing new modules. There is no new() method. From bix at sendu.me.uk Mon Oct 16 09:23:41 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 14:23:41 +0100 Subject: [Bioperl-l] Bio::WebAgent sleep warning Message-ID: <453387DD.3040105@sendu.me.uk> Hi, Does anyone think it's appropriate for Bio::WebAgent to issue warnings every time it sleeps? I'd consider the sleeping part of its normal, expected and desired behaviour so I don't need to be warned about it. Perhaps change the $self->warn to a $self->debug? From cjfields at uiuc.edu Mon Oct 16 10:12:10 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 09:12:10 -0500 Subject: [Bioperl-l] Bio::WebAgent sleep warning In-Reply-To: <453387DD.3040105@sendu.me.uk> Message-ID: <000c01c6f12d$121b5000$15327e82@pyrimidine> > Hi, > > Does anyone think it's appropriate for Bio::WebAgent to issue warnings > every time it sleeps? I'd consider the sleeping part of its normal, > expected and desired behaviour so I don't need to be warned about it. 
> Perhaps change the $self->warn to a $self->debug? That sounds fine. Using debugging output for sleep would be similar behavior to Bio::DB::NCBIHelper and Bio::DB::GenericWebDBI. You may want to pass it by Heikki (I think that's his module). The only reason I would want to see sleep output, personally, is to make sure it is working properly. Almost looks like that class has the same intent that GenericWebDBI has (even down to using LWP::UserAgent as a superclass). I may look into it to see if I can use this as a superclass for GenericWebDBI. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Mon Oct 16 10:26:21 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Mon, 16 Oct 2006 15:26:21 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig Message-ID: <4533968D.6040009@sheffield.ac.uk> Did anyone reconfigure the bioperl web server (whichever server hosts http://bioperl.org/DIST) by adding the following lines to the httpd.conf file: RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*) http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1 This will be required as a workaround to a bug in ActivePerl 5.8.8.819 which will result in a failed install of Bioperl via PPM. Cheers Nath From n.haigh at sheffield.ac.uk Mon Oct 16 11:30:16 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Mon, 16 Oct 2006 16:30:16 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A257.2000207@campus.iztacala.unam.mx> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> Message-ID: <4533A588.9020505@sheffield.ac.uk> Mauricio Herrera Cuadra wrote: > Done. Could you please check if it works as it should? > > Cheers, > Mauricio. Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got someone to pop it in http://bioperl/DIST Volunteers?
BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for the PPD? I seem to remember that there was talk about having to maintain a separate Bundle::BioPerl for each release of Bioperl. Any ideas on this front? Nath From arareko at campus.iztacala.unam.mx Mon Oct 16 11:16:39 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 16 Oct 2006 10:16:39 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533968D.6040009@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> Message-ID: <4533A257.2000207@campus.iztacala.unam.mx> Done. Could you please check if it works as it should? Cheers, Mauricio. Nathan Haigh wrote: > Did anyone reconfigure the bioperl web server (which ever server hosts > http://bioperl.org/DIST) by adding the following lines to the httpd.conf > file: > > RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*) > http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1 > > This will be required as a workaround to a bug in ActivePerl 5.8.8.819 > which will result in a failed install of Bioperl via PPM. > > Cheers > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From arareko at campus.iztacala.unam.mx Mon Oct 16 11:33:33 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 16 Oct 2006 10:33:33 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A588.9020505@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> Message-ID: <4533A64D.6040203@campus.iztacala.unam.mx> Nathan Haigh wrote: > Mauricio Herrera Cuadra wrote: >> Done.
Could you please check if it works as it should?
>>
>> Cheers,
>> Mauricio.
> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
> someone to pop it in http://bioperl/DIST
>
> Volunteers?

You can send it to me.

--
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From akarger at CGR.Harvard.edu Mon Oct 16 11:54:33 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 16 Oct 2006 11:54:33 -0400
Subject: [Bioperl-l] Bio::Location::Split
Message-ID:

I recently came across bug 2101, where Bio::Location::Split::to_FTstring
gives the incorrect order for multi-sublocation locations on the minus
strand. That is, I found it by getting incorrect results, and then found
it in Bugzilla and in the September archives.

I'm converting CDS files from one format to another. E.g., I read an
EMBL file with a chromosome and CDS features, and want to output the
location in a FASTA header. If I do something like:

    # $in is assumed to be a Bio::SeqIO stream reading the EMBL file
    while (my $seq = $in->next_seq) {
        foreach my $feat ($seq->get_SeqFeatures) {
            print $feat->location->to_FTstring(), "\n";
        }
    }

I get the wrong results for multi-exon CDSs on the -1 strand, as
described in the bug report.

Is there a relatively easy way around this? I assume I can't get at the
original string of the location, which in this case is all I need. Can I
just flip the order of the exons in certain cases? Chris F, can you tell
me the preliminary solution you mentioned?

I must say I'm sort of surprised this wasn't found before. It seems like
a not-that-rare occurrence. Oh well.
Thanks, - Amir Karger Research Computing Life Sciences Division Harvard University From bix at sendu.me.uk Mon Oct 16 12:14:39 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 17:14:39 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A588.9020505@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> Message-ID: <4533AFEF.8080103@sendu.me.uk> Nathan Haigh wrote: > Mauricio Herrera Cuadra wrote: >> Done. Could you please check if it works as it should? >> >> Cheers, >> Mauricio. > Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got > someone to pop it in http://bioperl/DIST > > Volunteers? I'm sure Mauricio would be happy to do it, but so am I. You may want to hold off a little while until I release rc2, which may be a few hours away. > BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for > the PPD? I seem to remember that there was talk about having to maintain > a separate Bundle::BioPerl for each release of Bioperl. Any ideas on > this front? It depends on what is in the PPD and what kind of auto-dependency features the ActiveState installer has. Given Perl 5.8 and your current PPD, does Bioperl install with the same or fewer number of skips if you also install Bundle::BioPerl first? That is, does Bundle::BioPerl even do anything useful anymore? If not, obviously don't bother making it a pre-req. If it does, my opinion is that you make it a pre-req. If people really don't want to install the optional stuff they can download the .zip file and install manually without even a make. From Kevin.M.Brown at asu.edu Mon Oct 16 12:14:51 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Mon, 16 Oct 2006 09:14:51 -0700 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? Message-ID: <1A4207F8295607498283FE9E93B775B402196FAA@EX02.asurite.ad.asu.edu> > > Yes, people use the -alphabet parameter. 
If you set it to something then
> > Bioperl will not try to determine whether the sequence is protein, rna, or
> > dna and this is particularly useful when the sequence contains characters
> > that Bioperl would object to (sequences with distasteful characters can be
> > created by various applications, for example, or you might introduce some
> > weird character for some reason). Setting the -alphabet would also speed up
> > Bioperl a bit, for the same reason.
>
> Huh. That's what I assumed when I stumbled into the -alphabet
> parameter. So I thought this would read the protein sequences
> out of my genbank file and write a fasta file for me:
>
> my $seq_in = Bio::SeqIO->new(
>     -file     => "<$file",
>     -format   => "genbank",
>     -alphabet => "protein" # No effect?
> );
> my $seq_out = Bio::SeqIO->new(
>     -file     => ">$outfile",
>     -format   => "fasta",
>     -alphabet => "protein" # No effect?
> );
> while (my $inseq = $seq_in->next_seq) {
>     $inseq->molecule("protein"); # No effect?
>     $seq_out->write_seq($inseq);
> }
>
> It didn't. Would it be a Good Thing if it did what I was
> expecting? (Like I said I rolled my own, but I'm always
> looking for ways to enhance BioPerl that other people might
> find useful... Someday I will contribute something useful, by
> golly. -grin-)
>
> (Background: I'm doing protein BLASTs from genbank files. To
> make formatdb happy I have to have fasta files full of the
> protein sequences.)

This might work for your needs (CDS to protein FASTA).

    use Bio::SeqIO;

    my $seq_in = Bio::SeqIO->new(
        -file   => "<$file",
        -format => "genbank",
    );
    # Three-argument open, with an error check
    open my $seq_out, '>', $outfile or die "Cannot write $outfile: $!";
    while (my $inseq = $seq_in->next_seq) {
        print $seq_out ">" . $inseq->display_id() . "\n";
        # translate() returns a sequence object; seq() gives the residue string
        print $seq_out $inseq->translate()->seq() . "\n";
    }
    close $seq_out;

From bix at sendu.me.uk Mon Oct 16 11:44:19 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 16:44:19 +0100
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
Message-ID: <4533A8D3.90709@sendu.me.uk> I think Chris recently deprecated this, but should it be? For me, its POD description justifies its existence, and perhaps more importantly, Bio::Index::Blast relies on it. I took a quick peek at the latter and it didn't seem trivial to move it over to Bio::SearchIO instead. Should it be undeprecated? From n.haigh at sheffield.ac.uk Mon Oct 16 12:39:02 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Mon, 16 Oct 2006 17:39:02 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533AFEF.8080103@sendu.me.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> <4533AFEF.8080103@sendu.me.uk> Message-ID: <4533B5A6.1070701@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: >> Mauricio Herrera Cuadra wrote: >>> Done. Could you please check if it works as it should? >>> >>> Cheers, >>> Mauricio. >> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got >> someone to pop it in http://bioperl/DIST >> >> Volunteers? > > I'm sure Mauricio would be happy to do it, but so am I. You may want > to hold off a little while until I release rc2, which may be a few > hours away. Just e-mailed Mauricio links to the files off list, It's not a big job for me to remake the bioperl PPD, so Mauricio it's up to you if you want to wait 18hrs for me to make the PPDs for 1.5.2-rc2. > > >> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for >> the PPD? I seem to remember that there was talk about having to maintain >> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on >> this front? > > It depends on what is in the PPD and what kind of auto-dependency > features the ActiveState installer has. Given Perl 5.8 and your > current PPD, does Bioperl install with the same or fewer number of > skips if you also install Bundle::BioPerl first? That is, does > Bundle::BioPerl even do anything useful anymore? 
If not, obviously
> don't bother making it a pre-req. If it does, my opinion is that you
> make it a pre-req. If people really don't want to install the optional
> stuff they can download the .zip file and install manually without
> even a make.

As far as the PPDs are concerned - no tests are run during installation.
PPM more or less just copies files into the correct place for Perl to
find so both approaches result in the same thing. However, I've not
tried making a CPAN distribution file for either Bioperl or
Bundle::BioPerl - I wouldn't know where to start!

Makefile.PL now only documents the prereqs in one place (%packages), and
this is used to add the prereqs to the bioperl PPD when issuing "nmake
ppd". This way, each release of BioPerl should be up-to-date with its
prereqs as long as developers add their modules' prereqs to %packages.
If we have Bundle::BioPerl, most of those prereqs need to be moved from
the Bioperl PPD to the Bundle::BioPerl PPD - a bit of a pain because
there are no guidelines as to what should/should not go in
Bundle::BioPerl. Therefore, as far as the PPDs are concerned, it is far
easier to do away with Bundle::BioPerl.

Nath

From hlapp at gmx.net Mon Oct 16 13:04:24 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:04:24 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <45333E02.9070808@sendu.me.uk>
References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> <45333E02.9070808@sendu.me.uk>
Message-ID: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>

So it looks like an abstract base class, not an interface that
defines a contract or API? Should use Root.pm then, would be my vote.

-hilmar

On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> What does the POD (and the code) say about instantiating it?
>
> =head1 SYNOPSIS
>
> # do not use this object directly, it provides the following methods
> # for its subclasses
>
> ...
> > > =head1 DESCRIPTION > > This is a basic module from which to build executable wrapper modules. > It has some basic methods to help when implementing new modules. > > > There is no new() method. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Mon Oct 16 13:08:28 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 16 Oct 2006 13:08:28 -0400 Subject: [Bioperl-l] Bio::WebAgent sleep warning In-Reply-To: <453387DD.3040105@sendu.me.uk> References: <453387DD.3040105@sendu.me.uk> Message-ID: It depends. What triggers the sleeping? If it's part of every request that it processes then I'd agree. If it is triggered by failure to precede the next try then the failure is probably not expected (though possible), and hence should be reported by warn(). If it is just part of the polling cycle then there should probably be a limit up to which the time waited is considered 'normal' and after which it is considered 'excessive' and hence should be reported through warn(). My $0.02. -hilmar On Oct 16, 2006, at 9:23 AM, Sendu Bala wrote: > Hi, > > Does anyone think it's appropriate for Bio::WebAgent to issue warnings > every time it sleeps? I'd consider the sleeping part of its normal, > expected and desired behaviour so I don't need to be warned about it. > Perhaps change the $self->warn to a $self->debug? 
> >
_______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From bix at sendu.me.uk Mon Oct 16 13:13:53 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 18:13:53 +0100
Subject: [Bioperl-l] Bio::WebAgent sleep warning
In-Reply-To:
References: <453387DD.3040105@sendu.me.uk>
Message-ID: <4533BDD1.8060204@sendu.me.uk>

Hilmar Lapp wrote:
> It depends. What triggers the sleeping? If it's part of every request
> that it processes then I'd agree. If it is triggered by failure to
> precede the next try then the failure is probably not expected (though
> possible), and hence should be reported by warn().
>
> If it is just part of the polling cycle then there should probably be a
> limit up to which the time waited is considered 'normal' and after which
> it is considered 'excessive' and hence should be reported through warn().

=head2 sleep

 Title   : sleep
 Usage   : $self->sleep
 Function: sleep for a number of seconds indicated by the delay policy
 Returns : none
 Args    : none

 NOTE: This method keeps track of the last time it was called and only
 imposes a sleep if it was called more recently than the delay_policy()
 allows.

=cut

It issues a warning every time it actually sleeps. I find it
inappropriate that a method warns me that it did what I asked it to do.
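
The change proposed in this thread (report the sleep through debug rather than warn) might look roughly like the sketch below. This is not the Bio::WebAgent source: the package name My::ThrottledAgent, its constructor arguments and its debug() helper are hypothetical, and unlike the documented method this sleep() returns the number of seconds it waited so the behaviour can be checked.

```perl
use strict;
use warnings;

package My::ThrottledAgent;    # hypothetical class, not Bio::WebAgent itself

sub new {
    my ($class, %args) = @_;
    my $self = {
        delay   => defined $args{delay} ? $args{delay} : 3,  # minimum gap in seconds
        verbose => $args{verbose} || 0,                      # debug output only if > 0
        last    => undef,                                    # time of the previous call
    };
    return bless $self, $class;
}

# Print a message only when verbose is switched on (assumed debug semantics).
sub debug {
    my ($self, $msg) = @_;
    print STDERR $msg if $self->{verbose} > 0;
    return;
}

# Sleep only if called again sooner than the delay allows.
# Returns the number of seconds slept (0 if no sleep was needed).
sub sleep {
    my $self  = shift;
    my $slept = 0;
    if (defined $self->{last}) {
        my $wait = $self->{delay} - (time - $self->{last});
        if ($wait > 0) {
            # debug instead of warn: sleeping is normal, expected behaviour
            $self->debug("sleeping for $wait seconds\n");
            CORE::sleep($wait);
            $slept = $wait;
        }
    }
    $self->{last} = time;
    return $slept;
}

package main;

my $agent = My::ThrottledAgent->new(delay => 2, verbose => 1);
$agent->sleep();    # first call: nothing to wait for
$agent->sleep();    # second call: waits, and says so only because verbose is on
```

With verbose left at 0 the second call still waits but prints nothing. A threshold above which the wait is reported through warn(), as suggested in the quoted reply, could be added inside the same if block.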
From arareko at campus.iztacala.unam.mx Mon Oct 16 13:14:06 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 16 Oct 2006 12:14:06 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533B5A6.1070701@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> <4533AFEF.8080103@sendu.me.uk> <4533B5A6.1070701@sheffield.ac.uk>
Message-ID: <4533BDDE.2040801@campus.iztacala.unam.mx>

Nathan Haigh wrote:
> Sendu Bala wrote:
>> Nathan Haigh wrote:
>>> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
>>> someone to pop it in http://bioperl/DIST
>>>
>>> Volunteers?
>> I'm sure Mauricio would be happy to do it, but so am I. You may want
>> to hold off a little while until I release rc2, which may be a few
>> hours away.
>
> Just e-mailed Mauricio links to the files off list, It's not a big job
> for me to remake the bioperl PPD, so Mauricio it's up to you if you want
> to wait 18hrs for me to make the PPDs for 1.5.2-rc2.

Too late, I've already placed 1.5.2-rc1 in DIST. hehe :)

--
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From bix at sendu.me.uk Mon Oct 16 12:32:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 17:32:11 +0100
Subject: [Bioperl-l] Swissprot problems
Message-ID: <4533B40B.2030908@sendu.me.uk>

t/Biofetch.t and t/DB.t are skipping their swissprot database fetches.
Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for
maintenance but is now back up. However I'm guessing the databases must
have changed. I've manually looked for the test case 'YNB3_YEAST' in
database 'UniProtKB' and it came back with no result, even though I can
find the test case manually at the expasy website.

Is this an EBI bug or deliberate change that makes sense to someone?
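
For readers unfamiliar with how a test file skips its remote fetches, here is a minimal sketch of the Test::More SKIP pattern. It is not the code in t/Biofetch.t or t/DB.t: fetch_swissprot() is a hypothetical stand-in that always fails, so the example runs without network access.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical stand-in for a remote database fetch. It always dies here,
# imitating a server that returns no result for the requested entry.
sub fetch_swissprot {
    my ($id) = @_;
    die "no result for $id\n";
}

SKIP: {
    my $seq = eval { fetch_swissprot('YNB3_YEAST') };
    if ($@) {
        chomp(my $why = $@);
        # Skip both remaining tests in this block instead of failing them
        skip("remote fetch failed: $why", 2);
    }
    ok(defined $seq, 'fetch returned something');
    like($seq, qr/^\w+$/, 'result looks like a sequence string');
}
```

Both tests are reported as skipped rather than failed, so a harness summary still counts the file as passing; the skip reason is the only trace that the fetch returned nothing.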
From m.weimer at dkfz-heidelberg.de Mon Oct 16 12:43:38 2006
From: m.weimer at dkfz-heidelberg.de (Marc Weimer)
Date: Mon, 16 Oct 2006 18:43:38 +0200
Subject: [Bioperl-l] Bio::DB::SwissProt Problem
Message-ID: <1161017019.5203.6.camel@localhost>

Dear list members,

when running

######################################################################
#! /usr/bin/perl -w

use strict;
use Bio::DB::SwissProt;

my $db_obj = new Bio::DB::SwissProt(-verbose => 1);

my $seq_obj = $db_obj->get_Seq_by_acc("O02938");
######################################################################

using Bioperl 1.5.2 I get the following message:

##########################################################################################

request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch
Content-Length: 49
Content-Type: application/x-www-form-urlencoded

format=swissprot&db=UniProtKB&style=raw&id=O02938


------------- EXCEPTION: Bio::Root::Exception -------------
MSG: acc O02938 does not exist
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181
STACK: ./get.test.pl:8
-----------------------------------------------------------

##########################################################################################

But the accession number does exist. Surprisingly, everything worked
fine a few days ago. Any ideas of what might have happened?

Thanks and best regards,

Marc

From hlapp at gmx.net Mon Oct 16 13:15:50 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:15:50 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533A8D3.90709@sendu.me.uk>
References: <4533A8D3.90709@sendu.me.uk>
Message-ID:

The problem is it is not maintained, and there are outstanding bug
reports.

If you un-deprecate it, then we need a response to people who come
across problems with it when using it.
Either you change the POD to say exactly who and when one should use it (or rather not) and point to the fact that it is unsupported for all other cases. Or what would you suggest? -hilmar On Oct 16, 2006, at 11:44 AM, Sendu Bala wrote: > I think Chris recently deprecated this, but should it be? For me, its > POD description justifies its existence, and perhaps more importantly, > Bio::Index::Blast relies on it. > > I took a quick peek at the latter and it didn't seem trivial to > move it > over to Bio::SearchIO instead. > > Should it be undeprecated? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Oct 16 13:21:46 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:21:46 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533A8D3.90709@sendu.me.uk> Message-ID: <000001c6f147$8efdfd60$15327e82@pyrimidine> Bio::Tools::BPlite was placed on the deprecation list a while back (~ rel 1.5); the other related Bio::Tools::BP* modules were also supposed to be on that list as well. If we want to undeprecate (de-deprecate? reprecate?) BPlite we also would need to do the same for the others. They must be updated to parse current BLAST/PSI-BLAST/bl2seq text output, something that Bio::SearchIO::blast is currently capable of (so the functionality is redundant). And someone needs to take them over. In my opinion it may be more trouble than it's worth as they haven't been touched in a while. Seems if we 'revive' BPlite we're not really moving forward esp. since you have added the PullParser recently and made substantial improvements to SearchIO. 
Maybe Bio::Index::Blast just needs to be deprecated or rewritten to use SearchIO? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Monday, October 16, 2006 10:44 AM > To: bioperl-l > Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? > > I think Chris recently deprecated this, but should it be? For me, its > POD description justifies its existence, and perhaps more importantly, > Bio::Index::Blast relies on it. > > I took a quick peek at the latter and it didn't seem trivial to move it > over to Bio::SearchIO instead. > > Should it be undeprecated? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Mon Oct 16 13:21:58 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 18:21:58 +0100 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: References: <4533A8D3.90709@sendu.me.uk> Message-ID: <4533BFB6.5070504@sendu.me.uk> Hilmar Lapp wrote: > The problem is it is not maintained, and there are outstanding been bug > reports. > > If you un-deprecate it, then we need a response to people who come > across problems with it when using it. Either you change the POD to say > exactly who and when one should use it (or rather not) and point to the > fact that it is unsupported for all other cases. > > Or what would you suggest? I'm not sure. Does Bio::Index::Blast even work correctly? Does it suffer from whatever bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should that be deprecated as well? Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO and then Bio::Tools::BPlite can be deprecated. 
But like I say, it didn't seem trivial (or even appropriate). Ultimately I just wanted to solve the warnings in the test suite. Thoughts, Chris? From cjfields at uiuc.edu Mon Oct 16 13:30:05 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:30:05 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A588.9020505@sheffield.ac.uk> Message-ID: <000101c6f148$b8538b20$15327e82@pyrimidine> > Mauricio Herrera Cuadra wrote: > > Done. Could you please check if it works as it should? > > > > Cheers, > > Mauricio. > Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got > someone to pop it in http://bioperl/DIST > > Volunteers? > > BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for > the PPD? I seem to remember that there was talk about having to maintain > a separate Bundle::BioPerl for each release of Bioperl. Any ideas on > this front? > > Nath Nathan, I think Chris Dagdigian still maintains Bundle::Bioperl on CPAN. That version should be the common basis for prereqs for any Bioperl core installation. It's relatively easy to add/remove modules to the Bundle::Bioperl. Contact Chris D. and let him know if anything needs to be changed. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 13:33:50 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:33:50 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net> Message-ID: <000201c6f149$3ed63490$15327e82@pyrimidine> > So it looks like an abstract base class, not an interface that > defines a contract or API? Should use Root.pm then, would be my vote. > > -hilmar Makes sense to me. Maybe another audit is needed to catch similar instances, or has this been done already? 
Chris > On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote: > > > Hilmar Lapp wrote: > >> What does the POD (and the code) say about instantiating it? > > > > =head1 SYNOPSIS > > > > # do not use this object directly, it provides the following > > methods > > # for its subclasses > > > > ... > > > > > > =head1 DESCRIPTION > > > > This is a basic module from which to build executable wrapper modules. > > It has some basic methods to help when implementing new modules. > > > > > > There is no new() method. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 13:57:35 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:57:35 -0500 Subject: [Bioperl-l] Bio::Location::Split In-Reply-To: Message-ID: <000301c6f14c$8fb0e060$15327e82@pyrimidine> > I recently came across bug 2101, where Bio::Location::Split::to_FTstring > gives the incorrect order for multi-sublocation locations on the minus > strand. That is, I found it by getting incorrect results, and then found > it in Bugzilla and in the September archives. > > I'm converting CDS files from one format to another. E.g., I read an > EMBL file with a chromosome and CDS features, and want to output the > location in a FASTA header. If I do something like: > > foreach (<$in>) { > foreach my $feat ($seq->getSeqFeatures) { > print $feat->location->to_FTstring() > } > } > > I get the wrong results for multi-exon CDSs on the -1 strand, as > described in the bug report. > > Is there a relatively easy way around this? I assume I can't get at the > original string of the location, which in this case is all I need. Can I > just flip the order of the exons in certain cases? Chris F, can you tell > me the preliminary solution you mentioned? > > I must say I'm sort of surprised this wasn't found before. It seems like > a not-that-rare occurrence. Oh well. 
> > Thanks, > > - Amir Karger > Research Computing > Life Sciences Division > Harvard University Could you let me know specifically which EMBL file contains the odd locations? The bug report uses theoretical locations, not actual ones, so it would be nice to have a real-world example to test against. As for the lack of catching this, the particular types of locations that cause the issue are quite rare. Note that there are two bugs for that bug report. The first (and more serious) is still unresolved. The second (where remote locations are treated differently in Location::Split, which caused more problems than it was worth) had a fix committed about a month ago. Any fixes I have made for the first bug invariably break several other methods, which use the current Location::Split object logic for retrieving sequences, building feature strings, etc. Since a new RC is imminent and the bug only affects a small number of locations, I have held off until after a final release is made (the last thing I want to do is fix something that breaks ~6-8 other methods), but I'll try looking at it again this week. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 14:29:02 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 13:29:02 -0500 Subject: [Bioperl-l] Swissprot problems In-Reply-To: <4533B40B.2030908@sendu.me.uk> Message-ID: <000401c6f150$f57dfc30$15327e82@pyrimidine> > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Monday, October 16, 2006 11:32 AM > To: bioperl-l > Subject: [Bioperl-l] Swissprot problems > > t/Biofetch.t and t/DB.t are skipping their swissprot database fetches. > Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for > maintenance but is now back up. However I'm guessing the databases must > have changed. 
I've manually looked for the test case 'YNB3_YEAST' in > database 'UniProtKB' and it came back with no result, even though I can > find the test case manually at the expasy website. > > Is this an EBI bug or deliberate change that makes sense to someone? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l I can confirm that. It's not our end, though. Entering the same data on the DBFetch web page also gets no data. I have emailed EBI about the problem and will let you know if I hear anything; I think it's related to the maintenance issue. Notably, nothing on the web page indicates any database name changes yet. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 14:29:52 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 13:29:52 -0500 Subject: [Bioperl-l] Bio::DB::SwissProt Problem In-Reply-To: <1161017019.5203.6.camel@localhost> Message-ID: <000501c6f151$12918710$15327e82@pyrimidine> We think there is a problem on the SwissProt (DBFetch) server. I have contacted them about the problem and will post something when I hear something back. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Marc Weimer > Sent: Monday, October 16, 2006 11:44 AM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Bio::DB::SwissProt Problem > > Dear list members, > > when running > > ###################################################################### > #! 
/usr/bin/perl -w > > use strict; > use Bio::DB::SwissProt; > > my $db_obj = new Bio::DB::SwissProt(-verbose => 1); > > my $seq_obj = $db_obj->get_Seq_by_acc("O02938"); > ###################################################################### > > using Bioperl 1.5.2 I get the following message: > > ########################################################################## > ################ > > request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch > Content-Length: 49 > Content-Type: application/x-www-form-urlencoded > > format=swissprot&db=UniProtKB&style=raw&id=O02938 > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: acc O02938 does not exist > STACK: Error::throw > STACK: > Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350 > STACK: > Bio::DB::WebDBSeqI::get_Seq_by_acc > /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181 > STACK: ./get.test.pl:8 > ----------------------------------------------------------- > > ########################################################################## > ################ > > But the accession number does exist. Surprisingly, everything worked > fine a few days ago. Any ideas of what might have happened? > > Thanks and best regards, > > Marc > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Mon Oct 16 14:39:28 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 13:39:28 -0500 Subject: [Bioperl-l] SwissProt Down Message-ID: <000601c6f152$6997dbd0$15327e82@pyrimidine> Looks like the swissprot problem stems from maintenance at EBI. From the EBI page http://www.ebi.ac.uk/Information/ (not on the DBFetch page, BTW): Please Note: Monday October 16th 12:00-15:00 - Due to general maintenance, some services from the EBI may be temporarily unavailable. We apologise for any inconvenience. 
At least we know that Test::More skips are working! Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Mon Oct 16 14:51:31 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 19:51:31 +0100 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: References: Message-ID: <4533D4B3.2000809@sendu.me.uk> Brian Osborne wrote: > Sendu, > > I just made a commit that makes Bio::Index::Blast use SearchIO instead of > BPlite. I was concerned about the whole id_parser thing. Did you determine that your change still allows for id_parser to be used and have the intended effect, or that id_parser is in someway meaningless and should be removed as a method? From cjfields at uiuc.edu Mon Oct 16 15:03:33 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 14:03:33 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533BFB6.5070504@sendu.me.uk> Message-ID: <000301c6f155$c7029ff0$15327e82@pyrimidine> > Hilmar Lapp wrote: > > The problem is it is not maintained, and there are outstanding been bug > > reports. > > > > If you un-deprecate it, then we need a response to people who come > > across problems with it when using it. Either you change the POD to say > > exactly who and when one should use it (or rather not) and point to the > > fact that it is unsupported for all other cases. > > > > Or what would you suggest? > > I'm not sure. > > Does Bio::Index::Blast even work correctly? Does it suffer from whatever > bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should > that be deprecated as well? > > Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO > and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't > seem trivial (or even appropriate). > > Ultimately I just wanted to solve the warnings in the test suite. > Thoughts, Chris? 
My opinion is we either have to completely support BPlite (and the others) or drop it altogether. I don't think we can state "use BPLite only with Bio::Index::Blast, use SearchIO everywhere else." That's too inconsistent. It seems simpler to deprecate the various Bio::Tools::BP* classes and either fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working on) or deprecate Bio::Index::Blast as well. The warnings in the test suite belong to BlastIndex.t, correct? I updated using Brian's Bio::Index::blast fix and it passes now w/o warnings. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From akarger at CGR.Harvard.edu Mon Oct 16 15:00:28 2006 From: akarger at CGR.Harvard.edu (Amir Karger) Date: Mon, 16 Oct 2006 15:00:28 -0400 Subject: [Bioperl-l] Bio::Location::Split Message-ID: > -----Original Message----- > From: Chris Fields [mailto:cjfields at uiuc.edu] > > > > I'm converting CDS files from one format to another. E.g., I read an > > EMBL file with a chromosome and CDS features, and want to output the > > location in a FASTA header.> > > > I get the wrong results for multi-exon CDSs on the -1 strand, as > > described in the bug report. > > > > Could you let me know specifically which EMBL file contains the odd > locations? The bug report uses theoretical locations, not > actual ones, so > it would be nice to have a real-world example to test against. I downloaded candida glabrata chromosome B from EBI: http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948 testportal>perl location.pl new_glabrata_B.embl > bio testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/' new_glabrata_B.embl > nonbio testportal>wc bio nonbio 217 217 4537 bio 217 217 4549 nonbio 434 434 9086 total testportal>diff bio nonbio 4c4 < complement(join(10632..11157,10347..10372)) --- > join(complement(10632..11157),complement(10347..10372)) Just one example here, but see below. 
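To make the discrepancy in the diff above concrete, here is a minimal plain-Perl sketch of the rewrite one would expect when a join of complemented ranges is collapsed into a single complement(join(...)). It is only an illustration under stated assumptions: consolidate_complement_join is a hypothetical helper, not a BioPerl method, and it handles only simple start..end ranges (no fuzzy or remote locations). The point it shows is that moving complement() outside join() requires reversing the order of the segments.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (NOT part of BioPerl): rewrite
#   join(complement(A),complement(B),...)
# as
#   complement(join(...,B,A))
# Any string that does not match that simple shape is returned unchanged.
sub consolidate_complement_join {
    my ($loc) = @_;

    # Accept only a join whose members are all simple complemented ranges.
    return $loc
        unless $loc =~ /^join\(\s*((?:complement\(\d+\.\.\d+\)\s*,?\s*)+)\)$/;

    # Copy the capture before running another match against it.
    my $inner  = $1;
    my @ranges = $inner =~ /complement\((\d+\.\.\d+)\)/g;

    # Pulling complement() outside join() reverses the segment order.
    return 'complement(join(' . join(',', reverse @ranges) . '))';
}

print consolidate_complement_join(
    'join(complement(10632..11157),complement(10347..10372))'
), "\n";
```

With the location from the diff as input, the segments come out in ascending order inside the join, which is the form the output would need for the two strings to describe the same CDS.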
> As for the lack of catching this, the particular types of > locations that > cause the issue are quite rare. Really? I guess our definitions of rare depend on which sequences we're working with. I'm doing fungal genomes, and here's a grep for a few species' entire genomes: testportal>foreach i ( *.embl ) foreach? echo $i foreach? grep CDS $i | grep join | grep -c complement foreach? end glabrata_orf.embl 29 hansenii_orf.embl 151 lactis_orf.embl 70 lipolytica_orf.embl 337 pombe_orf.embl 1137 You might like to use pombe as a test case, as it has lots of these complement joins, including ones with multiple introns. Anyway, I'd question the "rare" designation. It seems to me like any species that has introns will have situations like this in their CDSs. Not to mention any other sequence that uses Bio::Location::Split. (Since I'm not a Real Biologist, I can't think up more examples here, but I'm sure they exist.) Or are you saying it's rare to use join (complement(C..D), complement(A..B)) instead of complement(join(A..B, C..D))? In that case, I guess I just got really unlucky in that five fungal genomes I was using decided to use the "rare" syntax. > Note that there are two bugs > for that bug > report. The first (and more serious) is still unresolved. The second > (where remote locations are treated differently in > Location::Split, which > caused more problems than it was worth) had a fix committed > about a month > ago. Sadly, it's the first (and in my case, more common (I have no remote locations.)) bug for me. > Any fixes I have made for the first bug invariably break several other > methods, which use the current Location::Split object logic > for retrieving > sequences, building feature strings, etc.
Since a new RC is > imminent and > the bug only affects a small number of locations, I have held > off until > after a final release is made (the last thing I want to do is > fix something > that breaks ~6-8 other methods), but I'll try looking at it > again this week. IMO this is a pretty serious bug (if these kinds of sequences aren't that rare as I've shown above), because you're outputting sequence descriptions that are just plain wrong. Anyone who uses FTLocationFactory to read these output description will have incorrect sequence, incorrect translated proteins, etc. And it's even more serious if other methods are depending on it. I know I can't dictate your time, and should be volunteering to work on fixing it. But if it affects other modules, then I will no doubt break things even more than you have in your attempts. -Amir > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > From bosborne11 at verizon.net Mon Oct 16 14:25:14 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 14:25:14 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533A8D3.90709@sendu.me.uk> Message-ID: Sendu, I just made a commit that makes Bio::Index::Blast use SearchIO instead of BPlite. The BlastIndex.t test is giving a few warnings so I need to take a look at that but all tests are passing. An awful lot of work has gone into the SearchIO system, for more on why its approach is deemed to be superior in the context of Bioperl see the SearchIO HOWTO. One key feature of this upcoming release is an emphasis on removing extraneous modules, I think it's safe to say that BPlite has been considered extraneous for a number of years now. Brian O. On 10/16/06 11:44 AM, "Sendu Bala" wrote: > I think Chris recently deprecated this, but should it be? For me, its > POD description justifies its existence, and perhaps more importantly, > Bio::Index::Blast relies on it. 
> > I took a quick peek at the latter and it didn't seem trivial to move it > over to Bio::SearchIO instead. > > Should it be undeprecated? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bosborne11 at verizon.net Mon Oct 16 14:59:38 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 14:59:38 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533D4B3.2000809@sendu.me.uk> Message-ID: Sendu, OK. I _think_ this change shouldn't affect id_parser() but I will test this in BlastIndex.t. The id_parser() method is relevant to all these Index* modules - don't know how much it's used but it certainly is nice to have it available. Brian O. On 10/16/06 2:51 PM, "Sendu Bala" wrote: > Brian Osborne wrote: >> Sendu, >> >> I just made a commit that makes Bio::Index::Blast use SearchIO instead of >> BPlite. > > I was concerned about the whole id_parser thing. Did you determine that > your change still allows for id_parser to be used and have the intended > effect, or that id_parser is in someway meaningless and should be > removed as a method? From cjfields at uiuc.edu Mon Oct 16 16:51:08 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 15:51:08 -0500 Subject: [Bioperl-l] Bio::Location::Split In-Reply-To: Message-ID: <000001c6f164$d1380190$15327e82@pyrimidine> ... 
> I downloaded candida glabrata chromosome B from EBI: > http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948 > > testportal>perl location.pl new_glabrata_B.embl > bio > testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/' > new_glabrata_B.embl > nonbio > testportal>wc bio nonbio > 217 217 4537 bio > 217 217 4549 nonbio > 434 434 9086 total > testportal>diff bio nonbio > 4c4 > < complement(join(10632..11157,10347..10372)) > --- > > join(complement(10632..11157),complement(10347..10372)) > > Just one example here, but see below. > > > As for the lack of catching this, the particular types of > > locations that > > cause the issue are quite rare. > > Really? I guess our definitions of rare depend on which sequences we're > working with. I'm doing fungal genomes, and here's a grep for a few > species' entire genomes: > > testportal>foreach i ( *.embl ) > foreach? echo $i > foreach? grep CDS $i | grep join | grep -c complement > foreach? end > glabrata_orf.embl > 29 > hansenii_orf.embl > 151 > lactis_orf.embl > 70 > lipolytica_orf.embl > 337 > pombe_orf.embl > 1137 > > You might like to use pombe as a test case, as it has lots of these > complement joins, including ones with multiple introns. I'll use those. I'll see if an analogous GenBank file exists as well. I can probably make a preliminary fix for FT_string() so that it arranges the sublocations correctly, but I think the best way to go is to have FTLocationFactory not modify the various sublocations to start with, which it currently does when it sets strand() (strand() propagates the strand info to sublocations). > Anyway, I'd question the "rare" designation. It seems to me like any > species that has introns will have situations like this in their CDSs. > Not to mention any other sequence that uses Bio::Location::Split. (Since > I'm not a Real Biologist, I can't think up mor examples here, but I'm > sure they exist.) I think that additional tests are definitely needed for pulling out sequences. 
What I mean by 'rare' is that the majority of sequences do not have problems. Also, this seems to be a 'silent' bug since the error shows up in to_FTstring() but the object sublocations seem to be processed correctly when using the location object directly (such as via SeqFeatureI). Round-tripping the sequence should pick it up though, since complement(join(10632..11157,10347..10372)) is not the same as join(complement(10632..11157),complement(10347..10372)). That is essentially what you are doing, correct? i.e. getting the sequences using Bioperl, saving them (which passes them through SeqIO), reading them again (back through SeqIO with the malformed location string). > Or are you saying it's rare to use join (complement(C..D), > complement(A..B)) instead of complement(join(A..B, C..D)). In that case, > I guess I just got really unlucky in that five fungal genomes I was > using decided to use the "rare" syntax. Location::Split is supposed to handle all variations, but apparently it doesn't. > > Note that there are two bugs > > for that bug > > report. The first (and more serious) is still unresolved. The second > > (where remote locations are treated differently in > > Location::Split, which > > caused more problems than it was worth) had a fix committed > > about a month > > ago. > > Sadly, it's the first (and in my case, more common (I have no remote > locations.)) bug for me. > > > Any fixes I have made for the first bug invariably break several other > > methods, which use the current Location::Split object logic > > for retrieving > > sequences, building feature strings, etc. Since a new RC is > > imminent and > > the bug only affects a small number of locations, I have held > > off until > > after a final release is made (the last thing I want to do is > > fix something > > that breaks ~6-8 other methods), but I'll try looking at it > > again this week.
> > IMO this is a pretty serious bug (if these kinds of sequences aren't > that rare as I've shown above), because you're outputting sequence > descriptions that are just plain wrong. Anyone who uses > FTLocationFactory to read these output description will have incorrect > sequence, incorrect translated proteins, etc. And it's even more serious > if other methods are depending on it. > > I know I can't dictate your time, and should be volunteering to work on > fixing it. But if it affects other modules, then I will no doubt break > things even more than you have in your attempts. > > -Amir I'll give it a look over the next week. Like I mentioned above, I may be able to fix it in Split::to_FTstring() w/o breaking other tests (in which case I'll commit it for the 1.5.2 release), but it would be a temporary hack until I can work out why other tests are failing. Chris From jason at bioperl.org Mon Oct 16 18:45:21 2006 From: jason at bioperl.org (Jason Stajich) Date: Mon, 16 Oct 2006 15:45:21 -0700 Subject: [Bioperl-l] split location problems Message-ID: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org> The whole point of split locations is to represent genes with introns so that is not the "rare" case. I'm confused where the problem is. The locations that I get out with to_FTstring on the location object are exactly the same as those input. I have processed the genbank fungal genomes into GFF3 and have had no problems so I'm confused where you are breaking down. If I write them out as embl I also get the correct thing. This is using the CVS version of bioperl from the HEAD. I've added code to test this to bug 2101 including a C.glabrata chromsome downloaded from genbank. Perhaps the problem is on the EMBL parsing side, I didn't test that. On the technical side, I still am not sure I fully know where the strand information should be stored - the top level container or the sub-features. 
I'll try and stay up on the discussion if anything has been decided that I should know about. -jason From torsten.seemann at infotech.monash.edu.au Mon Oct 16 18:23:23 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Tue, 17 Oct 2006 08:23:23 +1000 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <000201c6f149$3ed63490$15327e82@pyrimidine> References: <000201c6f149$3ed63490$15327e82@pyrimidine> Message-ID: <4534065B.9020309@infotech.monash.edu.au> Chris Fields wrote: >> So it looks like an abstract base class, not an interface that >> defines a contract or API? Should use Root.pm then, would be my vote. >> -hilmar > > Makes sense to me. Maybe another audit is needed to catch similar > instances, or has this been done already? The purpose of my original (poorly phrased) question was to try and sort out where Root and RootI were being used the wrong way around. I'm currently "all-audited out" so I leave this task to another volunteer. -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From cjfields at uiuc.edu Mon Oct 16 21:07:55 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 20:07:55 -0500 Subject: [Bioperl-l] split location problems In-Reply-To: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org> References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org> Message-ID: On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote: > The whole point of split locations is to represent genes with > introns so that is not the "rare" case. > > I'm confused where the problem is. The locations that I get out > with to_FTstring on the location object are exactly the same as > those input. The problem is with a subset of split locations described in the bug report.
The following works: complement(join(2691..4571,4918..5163)) whereas this: join(complement(4918..5163),complement(2691..4571)) gives this: complement(join(4918..5163,2691..4571)) which is not syntactically the same. It should be: complement(join(2691..4571,4918..5163)) since 'join' implies that the order of the segments to be joined is important ('order' and 'bond' do not, I guess). > I have processed the genbank fungal genomes into GFF3 and have had > no problems so I'm confused where you are breaking down. If I > write them out as embl I also get the correct thing. This is using > the CVS version of bioperl from the HEAD. > > I've added code to test this to bug 2101 including a C.glabrata > chromsome downloaded from genbank. Perhaps the problem is on the > EMBL parsing side, I didn't test that. > > On the technical side, I still am not sure I fully know where the > strand information should be stored - the top level container or > the sub-features. I'll try and stay up on the discussion if > anything has been decided that I should know about. > > -jason Split::strand() sets the sublocations as well, which seems to confuse the situation more but it is consistent with LocationI, as Hilmar points out. I'm looking into a few solutions now, including a fix in Split::to_FTstring(). Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Mon Oct 16 22:48:14 2006 From: jason at bioperl.org (Jason Stajich) Date: Mon, 16 Oct 2006 19:48:14 -0700 Subject: [Bioperl-l] split location problems In-Reply-To: References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org> Message-ID: <8273f6c20610161948w201537a5v2fcfa189eb809283@mail.gmail.com> This probably was exposed by the fact that the Split object used to explicitly sort the features by start*strand always. 
But with remote locations and needing to be able to explicitly set the order (for features that are not required to be 5' -> 3') that code must have been removed. I think there is just one place that must be missing a 'reverse' on the list of sub-locations when the top-level feature is a complement. I'll wait for your fix before wading in - we probably might want to figure out a 'consolidate' method to shrink redundant and equivalent representations to the shortest possible form. Ugh this really starts to resemble trying to write a boolean logic toolkit.... -jason On 10/16/06, Chris Fields wrote: > > > On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote: > > > The whole point of split locations is to represent genes with > > introns so that is not the "rare" case. > > > > I'm confused where the problem is. The locations that I get out > > with to_FTstring on the location object are exactly the same as > > those input. > > The problem is with the a subset of split locations described in the > bug report. The following works: > > complement(join(2691..4571,4918..5163)) > > whereas this: > > join(complement(4918..5163),complement(2691..4571)) > > gives this: > > complement(join(4918..5163,2691..4571)) > > which is not syntactically the same. It should be: > > complement(join(2691..4571,4918..5163)) > > since 'join' implies that the order of the segments to be joined is > important ('order' and 'bond' do not, I guess). > > > I have processed the genbank fungal genomes into GFF3 and have had > > no problems so I'm confused where you are breaking down. If I > > write them out as embl I also get the correct thing. This is using > > the CVS version of bioperl from the HEAD. > > > > I've added code to test this to bug 2101 including a C.glabrata > > chromsome downloaded from genbank. Perhaps the problem is on the > > EMBL parsing side, I didn't test that. 
> > > > On the technical side, I still am not sure I fully know where the > > strand information should be stored - the top level container or > > the sub-features. I'll try and stay up on the discussion if > > anything has been decided that I should know about. > > > > -jason > > Split::strand() sets the sublocations as well, which seems to confuse > the situation more but it is consistent with LocationI, as Hilmar > points out. I'm looking into a few solutions now, including a fix in > Split::to_FTstring(). > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > -- Jason Stajich jason at bioperl.org http://www.duke.edu/~jes12/ From cjfields at uiuc.edu Mon Oct 16 23:34:25 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 22:34:25 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: References: Message-ID: On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote: > Chris and Sendu, > > Sendu was correct in wondering whether id_parser() in Blast.pm > would work > after the module was altered to use SearchIO but what I've found > out from my > local tests is that id_parser() didn't work when BPlite was being used > either. I can continue to work on this but it's safe to say that > removing > BPlite doesn't cause a problem with id_parser, it was already there. > > Brian O. .... It may be one reason (the main reason?) the method wasn't tested. Maybe it should be removed if it can't be easily fixed; I don't think it makes sense keeping it otherwise. Chris Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bosborne11 at verizon.net Mon Oct 16 23:24:59 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 23:24:59 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? 
In-Reply-To: <000301c6f155$c7029ff0$15327e82@pyrimidine> Message-ID: Chris and Sendu, Sendu was correct in wondering whether id_parser() in Blast.pm would work after the module was altered to use SearchIO but what I've found out from my local tests is that id_parser() didn't work when BPlite was being used either. I can continue to work on this but it's safe to say that removing BPlite doesn't cause a problem with id_parser, it was already there. Brian O. On 10/16/06 3:03 PM, "Chris Fields" wrote: >> Hilmar Lapp wrote: >>> The problem is it is not maintained, and there are outstanding been bug >>> reports. >>> >>> If you un-deprecate it, then we need a response to people who come >>> across problems with it when using it. Either you change the POD to say >>> exactly who and when one should use it (or rather not) and point to the >>> fact that it is unsupported for all other cases. >>> >>> Or what would you suggest? >> >> I'm not sure. >> >> Does Bio::Index::Blast even work correctly? Does it suffer from whatever >> bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should >> that be deprecated as well? >> >> Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO >> and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't >> seem trivial (or even appropriate). >> >> Ultimately I just wanted to solve the warnings in the test suite. >> Thoughts, Chris? > > My opinion is we either have to completely support BPlite (and the others) > or drop it altogether. I don't think we can state "use BPLite only with > Bio::Index::Blast, use SearchIO everywhere else." That's too inconsistent. > > > It seems simpler to deprecate the various Bio::Tools::BP* classes and either > fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working > on) or deprecate Bio::Index::Blast as well. > > The warnings in the test suite belong to BlastIndex.t, correct? I updated > using Brian's Bio::Index::blast fix and it passes now w/o warnings. 
> > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bosborne11 at verizon.net Mon Oct 16 23:48:56 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 23:48:56 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: Message-ID: Chris, OK. In fact there's no written guarantee that all Bio::Index* modules have an id_parser() method. It happens that most do, and it's useful. I'll fix the documentation in Bio::Index::Blast and add an enhancement request to Bugzilla, may be able to get around to before 1.5.2 release but no promises. Brian O. On 10/16/06 11:34 PM, "Chris Fields" wrote: > > On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote: > >> Chris and Sendu, >> >> Sendu was correct in wondering whether id_parser() in Blast.pm >> would work >> after the module was altered to use SearchIO but what I've found >> out from my >> local tests is that id_parser() didn't work when BPlite was being used >> either. I can continue to work on this but it's safe to say that >> removing >> BPlite doesn't cause a problem with id_parser, it was already there. >> >> Brian O. > > .... > > It may be one reason (the main reason?) the method wasn't tested. > Maybe it should be removed if it can't be easily fixed; I don't think > it makes sense keeping it otherwise. > > Chris > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Tue Oct 17 02:35:43 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Tue, 17 Oct 2006 07:35:43 +0100 Subject: [Bioperl-l] INSTALL and INSTALL.WIN Message-ID: <453479BF.90408@sheffield.ac.uk> I'm a bit unclear as to what is happening with these files. Are these files now superseded by the wikified versions? If so, should these files now just simply contain a link to the wikified versions - otherwise things could get in a mess since I updated the wiki version of INSTALL.WIN and I see Sendu updated the cvs versions a couple of weeks ago - hopefully these differences aren't that big. Nath From faruque at ebi.ac.uk Tue Oct 17 04:19:44 2006 From: faruque at ebi.ac.uk (Nadeem Faruque) Date: Tue, 17 Oct 2006 09:19:44 +0100 Subject: [Bioperl-l] split location problems Message-ID: EMBL currently outputs join-complements in the format join(complement(30..40),complement(10..20)) instead of the Genbank preferred complement(join(10..20,30..40)) EMBL's format may reflect what happens in the cell a little more than Genbank's, but it is less readable and less concise. NB I've also seen a couple of people construct these incorrectly eg join(complement(10..20),complement(30..40)) I believe we are moving to the complement-join format but I can't give a date for the transition. Having said that, trans-splicing will still give us the joys of complex locations, eg join(1..5,complement(join(10..20,30..40))) complement(join(30..40,10..20)) <- looks wrong (unless it is a very small circle) but mis-ordered exons are resolved by the trans-splicing machinery. Nadeem -- S.M. Nadeem N.
Faruque EMBL Nucleotide Database Curation Team EMBL Outstation Tel: +44 1223 494611 Fax: +44 1223 494472 The European Bioinformatics Institute URL: http://www.ebi.ac.uk/ Email for data submissions: datasubs at ebi.ac.uk Email for updates: update at ebi.ac.uk ======================================================== From bix at sendu.me.uk Tue Oct 17 04:59:36 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 17 Oct 2006 09:59:36 +0100 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> <45333E02.9070808@sendu.me.uk> <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net> Message-ID: <45349B78.8090905@sendu.me.uk> Hilmar Lapp wrote: > So it looks like an abstract base class, not an interface that > defines a contract or API? Should use Root.pm then, would be my vote. Agreed, that was actually what I did in my local copy when I made a new inheriting class (so discovering the problem). This change is harmless to other modules, but does mean they'll have redundant use of Bio::Root::Root which will want cleaning up at some stage. From bix at sendu.me.uk Tue Oct 17 06:32:54 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 17 Oct 2006 11:32:54 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 Message-ID: <4534B156.4090501@sendu.me.uk> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. See http://www.bioperl.org/wiki/Release_1.5.2 for instructions on getting and testing this RC. Developers: This should be the last RC before release ~next Monday. Now would be a good time for last-minute documentation updates and additions. Users: Even though 1.5.2 is a 'developer' release, we consider it the most stable and capable version of Bioperl, and recommend that you use it in all but the most critical production environments.
Please try it out and let us know of any problems or difficulties you run into. Thank you, Sendu. From cjfields at uiuc.edu Tue Oct 17 07:16:47 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 06:16:47 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <453479BF.90408@sheffield.ac.uk> References: <453479BF.90408@sheffield.ac.uk> Message-ID: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu> The general consensus was to keep text versions available; we could add URL links to the wiki pages for the most up-to-date version. BTW, I have modified INSTALL already. INSTALL.WIN is next in line (I was waiting for your changes). Chris On Oct 17, 2006, at 1:35 AM, Nathan S. Haigh wrote: > I'm a bit unclear as to what is happening with these files. > > Are these files now superseded by the wikified versions? If so, should > these files now just simply contain a link to the wikified versions - > otherwise things could get in a mess since I updated the wiki > version of > INSTALL.WIN and I see Sendu updated the cvs versions a couple of weeks > ago - hopefully these differences aren't that big. > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Tue Oct 17 07:45:45 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 12:45:45 +0100 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu> References: <453479BF.90408@sheffield.ac.uk> <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu> Message-ID: <4534C269.5050704@sheffield.ac.uk> Chris Fields wrote: > The general consensus was to keep text versions available; we could > add URL links to the wiki pages for the most up-to-date version.
BTW, > I have modified INSTALL already. INSTALL.WIN is next in line (I was > waiting for your changes). > Is it possible to generate these files from the wiki whenever there is a release? I know edits shouldn't be too severe or too often - but I can see things getting a little messy/annoying if edits have to be made in 2 places. Nath From cjfields at uiuc.edu Tue Oct 17 10:04:32 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:04:32 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <4534C269.5050704@sheffield.ac.uk> Message-ID: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> There isn't a very easy way since so many links have to be removed/modified. I have found a few CPAN modules that could help, but for now I just dump the text output from a text browser (elinks) using the 'printable version' page and hand-edit, which works very quickly. That works for the time being until I can find another more automated solution. Fortunately there have been very few edits to either INSTALL wiki page so they should remain relatively stable. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] > Sent: Tuesday, October 17, 2006 6:46 AM > To: Chris Fields > Cc: bioperl-l > Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN > > Chris Fields wrote: > > The general consensus was to keep text versions available; we could > > add URL links to the wiki pages for the most up-to-date version. BTW, > > I have modified INSTALL already. INSTALL.WIN is next in line (I was > > waiting for your changes). > > > Is it possible to generate these files from the wiki whenever there is a > release? I know edits shouldn't be too severe or too often - but I can > see things getting a little messy/annoying if edits have to be made in 2 > places.
> > Nath From cjfields at uiuc.edu Tue Oct 17 10:12:09 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:12:09 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: Message-ID: <000401c6f1f6$424b5580$15327e82@pyrimidine> > Chris, > > OK. In fact there's no written guarantee that all Bio::Index* modules have > an id_parser() method. It happens that most do, and it's useful. I'll fix > the documentation in Bio::Index::Blast and add an enhancement request to > Bugzilla, may be able to get around to before 1.5.2 release but no > promises. > > Brian O. Do the various Bio::Index* modules share a common interface? I wouldn't worry too much about it for this release, unless you really have time. It is still, after all, a developer's release, and you've noted it in Bugzilla. We could try for another dev release in winter (rel 1.5.3, I guess) to get any bug fixes or new modules added. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > On 10/16/06 11:34 PM, "Chris Fields" wrote: > > > > > On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote: > > > >> Chris and Sendu, > >> > >> Sendu was correct in wondering whether id_parser() in Blast.pm > >> would work > >> after the module was altered to use SearchIO but what I've found > >> out from my > >> local tests is that id_parser() didn't work when BPlite was being used > >> either. I can continue to work on this but it's safe to say that > >> removing > >> BPlite doesn't cause a problem with id_parser, it was already there. > >> > >> Brian O. > > > > .... > > > > It may be one reason (the main reason?) the method wasn't tested. > > Maybe it should be removed if it can't be easily fixed; I don't think > > it makes sense keeping it otherwise. > > > > Chris > > > > Christopher Fields > > Postdoctoral Researcher > > Lab of Dr. 
Robert Switzer > > Dept of Biochemistry > > University of Illinois Urbana-Champaign > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Tue Oct 17 10:15:17 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 15:15:17 +0100 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> Message-ID: <4534E575.5050308@sheffield.ac.uk> Chris Fields wrote: > There isn't a very easy way since so many links have to be removed/modified. > I have found a few CPAN modules that could help, but for now I just dump the > text output from a text browser (elinks) using the 'printable version' page > and hand-edit, which works very quickly. That works for the time being > until I can find another more automated solution. > > Fortunately there have been very few edits to either INSTALL wiki page so > they should remain relatively stable. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > So am I correct in saying that the best way is to make all updates to the wikified versions of these files, and then at regular intervals/major releases you (or someone else) will update the CVS version of the files in the way describe above? 
Cheers Nath From bix at sendu.me.uk Tue Oct 17 10:00:39 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 17 Oct 2006 15:00:39 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534E09C.9030707@genomics.dk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> Message-ID: <4534E207.8030508@sendu.me.uk> Niels Larsen wrote: > Greetings, > > I am no perl beginner, but I am a BioPerl beginner. Today I looked > for remote similarity services that can be used from Perl. I found > the EBI SOAP interface where their example script returns > > Can't find method element in the message at > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. What script exactly? There was a problem with the SOAP server that was fixed earlier today. > and the DDBJ service which (from Denmark) returns > > undef What returned undef? Specifics please. > and then the NCBI server accessed through BioPerls RemoteBlast which > seems to spin in a loop that fills TMPDIR with many tempfiles. Will > release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall > is working towards that). What version of Bioperl were you testing with? What did you do to get it to 'spin in a loop'? I can tell you that remote blasting certainly works in Bioperl 1.5.2, but you'll have to give more details on the things you tried and the problems you encountered. You can also answer the questions yourself by trying the release candidate. From B.Beckert at ibmc.u-strasbg.fr Tue Oct 17 09:59:30 2006 From: B.Beckert at ibmc.u-strasbg.fr (Bertrand Beckert) Date: Tue, 17 Oct 2006 15:59:30 +0200 Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast Message-ID: hi, I am running a large number of blasts via a connexion to ncbi blast page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi'). I try to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have some problems. I make a simple example with only one sequence in order to understand how work this module. 
This is my simple input file, a DNA sequence in fasta form:

>test
TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA

I have made some modifications to the example available in the Bioperl
documentation. It gives me a RID which contains the results of my blast,
but I have a problem with the "$result=$factory->retrieve_blast($rid)"
in my script.
The documentation says that $result=$factory->retrieve_blast($rid)
returns, when it works, a Bio::Tools::Bplite or Bio::Tools::Blast
object. In my case it returns a Bio::SearchIO::blast... I don't
understand why I don't get the right type of object back (see PART I).
I also tried to resolve the problem by replacing the foreach loop in my
script with a new one in order to explore the blast result page, but
that doesn't work either (see PART II).

Could you help me please?

Thank you
Bertrand Beckert.

PART I:
Here is my script with a few annotations, followed by the shell window output:
------------------------------------------------------------------------
#!/usr/bin/perl -w
use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;

sub blast {
    my $prog='blastn';
    my $db='refseq_genomic';
    my $e_val='1e-10';
    my $Input='Seq.fasta';

    my @params = ('-prog' => $prog,
                  '-data' => $db,
                  '-expect' => $e_val,
                  '-readmethod' => 'SearchIO');

    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

    #changes parameters
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
    $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';

    $factory->submit_blast($Input);
    print STDERR "waiting...\n";

    while (my @rids=$factory->each_rid) {
        print "my rid: ",@rids,"\n";
        #gives me the ID of the submitted blast, i.e. RID: 1161079157-766-185099855365.BLASTQ2
        #this page contains the result of my blast...
        foreach my $rid (@rids) {
            $result=$factory->retrieve_blast($rid);
            #print the result to see what type of object retrieve_blast returns
            print "rc:", $result,"\n";
        }
    }
}

&blast;
------------------------------------------------------------------------
Here you can see the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...

PART II:
I also tried to resolve the problem by replacing the foreach loop in my script with:
------------------------------------------------------------------------
foreach my $rid (@rids) {
    while(1) {
        $result=$factory->retrieve_blast($rid)->next_result();
        print "rc:", $result,"\n";
        if ($result) {
            print $result->num_hits(),"\n";
        }
------------------------------------------------------------------------
With this loop I could explore the Blast result page. This is what I get in the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834)
----
--
Bertrand BECKERT
PhD student
IBMC - UPR 9002 du CNRS - ARN
15, rue Rene Descartes
F-67084 STRASBOURG Cedex
b.beckert at ibmc.u-strasbg.fr

From niels at genomics.dk Tue Oct 17 09:54:36 2006
From: niels at genomics.dk (Niels Larsen)
Date: Tue, 17 Oct 2006 15:54:36 +0200
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4534E09C.9030707@genomics.dk>

Greetings,

I am no perl beginner, but I am a BioPerl beginner. Today I looked for remote similarity services that can be used from Perl.
I found the EBI SOAP interface where their example script returns Can't find method element in the message at /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. and the DDBJ service which (from Denmark) returns undef and then the NCBI server accessed through BioPerls RemoteBlast which seems to spin in a loop that fills TMPDIR with many tempfiles. Will release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall is working towards that). Niels L ------------------------------------------------------------------------ Niels Larsen Danish Genome Institute Gustav Wieds vej 10 C DK-8000 Aarhus C Denmark Electronic mail: niels at genomics.dk Skype: niels_larsen_denmark Telephone: +45-8942-5268 Telefax: +45-8620-1222 ------------------------------------------------------------------------ From cjfields at uiuc.edu Tue Oct 17 10:28:40 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:28:40 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <4534E575.5050308@sheffield.ac.uk> Message-ID: <000501c6f1f8$8b78efe0$15327e82@pyrimidine> ... > So am I correct in saying that the best way is to make all updates to > the wikified versions of these files, and then at regular > intervals/major releases you (or someone else) will update the CVS > version of the files in the way describe above? > > Cheers > Nath Yes. I think the online docs will stay relatively stable. A week or so ago Mauricio and I were discussing moving the dependencies list to it's own CVS document (since they pertain to all Bioperl installations, not just UNIX'y flavors). I haven't done that yet since I was waiting on the INSTALL.WIN changes before I made any more changes. Well, that and I've been really busy doing other things. One way we could make sure that changes to the online docs would match the CVS docs would be to only allow certain wiki users (such as sysadmins) make modifications to those pages. 
That way any changes would have to go through someone who also has CVS access and could make similar changes to the distribution docs. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Tue Oct 17 10:37:38 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 15:37:38 +0100 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <000501c6f1f8$8b78efe0$15327e82@pyrimidine> References: <000501c6f1f8$8b78efe0$15327e82@pyrimidine> Message-ID: <4534EAB2.50609@sheffield.ac.uk> Chris Fields wrote: > ... > >> So am I correct in saying that the best way is to make all updates to >> the wikified versions of these files, and then at regular >> intervals/major releases you (or someone else) will update the CVS >> version of the files in the way describe above? >> >> Cheers >> Nath >> > > Yes. I think the online docs will stay relatively stable. A week or so ago > Mauricio and I were discussing moving the dependencies list to it's own CVS > document (since they pertain to all Bioperl installations, not just UNIX'y > flavors). I haven't done that yet since I was waiting on the INSTALL.WIN > changes before I made any more changes. Well, that and I've been really > busy doing other things. > Sounds good. > One way we could make sure that changes to the online docs would match the > CVS docs would be to only allow certain wiki users (such as sysadmins) make > modifications to those pages. That way any changes would have to go through > someone who also has CVS access and could make similar changes to the > distribution docs. > Ugh, not sure I like the sound of maintaining 2 copies of any files - sounds like a future headache even if they are pretty stable. It also makes it unclear which of the two file should be considered first (i.e. 
is the most up-to-date) on pages such as: http://www.bioperl.org/wiki/Installing_BioPerl It suggests that INSTALL and INSTALL.WIN should be looked at first, but there are online copies of those files available - this should now be the other way around - shouldn't it? I might just be making a mountain out of a molehill, so I'll shut up on this topic and make any future edits to the wiki pages instead. > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > From bosborne11 at verizon.net Tue Oct 17 10:48:54 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 17 Oct 2006 10:48:54 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <000401c6f1f6$424b5580$15327e82@pyrimidine> Message-ID: Chris, The Bio::Index modules either 'use base qw(Bio::Index::Abstract)' or 'use base qw(Bio::Index::AbstractSeq)'. Neither of these modules has an id_parser() method. Brian O. On 10/17/06 10:12 AM, "Chris Fields" wrote: > Do the various Bio::Index* modules share a common interface? From cjfields at uiuc.edu Tue Oct 17 10:45:53 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:45:53 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <4534EAB2.50609@sheffield.ac.uk> Message-ID: <000601c6f1fa$f260b560$15327e82@pyrimidine> ... > > One way we could make sure that changes to the online docs would match > the > > CVS docs would be to only allow certain wiki users (such as sysadmins) > make > > modifications to those pages. That way any changes would have to go > through > > someone who also has CVS access and could make similar changes to the > > distribution docs. > > > Ugh, not sure I like the sound of maintaining 2 copies of any files - > sounds like a future headache even if they are pretty stable. It also > makes it unclear which of the two file should be considered first (i.e. 
> is the most up-to-date) on pages such as: > http://www.bioperl.org/wiki/Installing_BioPerl > > It suggests that INSTALL and INSTALL.WIN should be looked at first, but > there are online copies of those files available - this should now be > the other way around - shouldn't it? I might just be making a mountain > out of a molehill, so I'll shut up on this topic and make any future > edits to the wiki pages instead. Yes that should be the other way around (the wiki would be the most up-to-date), so the CVS docs should point to the wiki, not vice-versa. Getting the docs right is as important as getting the code to work. So I don't consider it a 'mountain-out-of-a-molehill' problem. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 17 11:07:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 10:07:49 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534E207.8030508@sendu.me.uk> Message-ID: <001001c6f1fe$02fd4de0$15327e82@pyrimidine> > Niels Larsen wrote: > > Greetings, > > > > I am no perl beginner, but I am a BioPerl beginner. Today I looked > > for remote similarity services that can be used from Perl. I found > > the EBI SOAP interface where their example script returns > > > > Can't find method element in the message at > > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. > > What script exactly? There was a problem with the SOAP server that was > fixed earlier today. > > > > and the DDBJ service which (from Denmark) returns > > > > undef > > What returned undef? Specifics please. > The first problem, like Sendu mentions, was fixed on the remote server (I get them to pass now). Those were from bioperl-run, though, not the bioperl core distribution. As for DDBJ, do you mean EBI or SwissProt? I ask b/c you mention Denmark. EBI were having server maintenance outages yesterday, which was announced here. 
As Sendu mentions, please be more specific. > > and then the NCBI server accessed through BioPerls RemoteBlast which > > seems to spin in a loop that fills TMPDIR with many tempfiles. Will > > release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall > > is working towards that). > > What version of Bioperl were you testing with? What did you do to get it > to 'spin in a loop'? I can tell you that remote blasting certainly works > in Bioperl 1.5.2, but you'll have to give more details on the things you > tried and the problems you encountered. > > You can also answer the questions yourself by trying the release > candidate. The tempfiles showing up are from the repeated RID requests and are deleted after the BLAST run (at least they should be); this is quite normal. They don't 'spin in a loop' unless the BLAST query is taking a particularly long time, which can happen depending on how the BLAST query is set up, i.e. what type of BLAST program is requested, if comp-based stats are requested, length of query, database requested, etc. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Tue Oct 17 11:14:07 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 17 Oct 2006 16:14:07 +0100 Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast In-Reply-To: References: Message-ID: <4534F33F.3070809@sendu.me.uk> Bertrand Beckert wrote: > hi, > > I am running a large number of blasts via a connexion to ncbi blast > page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi'). > I try to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have > some problems. [snip] > In the documentation it wrote that $result=$factory->retrieve_blast > ($rid) return when it work a Bio::Tools::Bplite or Bio::Tools::Blast > object. In my case it returns a Bio::SearchIO::blast... I don't > understand why I don't have the good type of object return (see PART I). 
I take it you're using some old version of Bioperl where unfortunately the documentation was incorrect. In fact you're supposed to get a Bio::SearchIO object, so it is a good thing that you are. The latest version of Bioperl has (as far as I can see) correct documentation and behaviour. Bio::Tools::Bplite and Bio::Tools::Blast are deprecated. You want Bio::SearchIO::blast. All is well. > I also try to resolve the problem by replace the foreach loop in my > script by a new one in order to explore the blast page result but it > also don't work (see part II). I'm not really sure what problem you might be facing there, but take a look at some up-to-date documentation, using the new example code: http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html From n.haigh at sheffield.ac.uk Tue Oct 17 12:10:15 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 17:10:15 +0100 Subject: [Bioperl-l] [Fwd: Re: Bundle::BioPerl] Message-ID: <45350067.6070604@sheffield.ac.uk> FYI on Bundle::BioPerl Nathan -------- Original Message -------- Subject: Re: Bundle::BioPerl Date: Tue, 17 Oct 2006 11:52:00 -0400 From: Chris Dagdigian To: Nathan S. Haigh References: <45348FB8.4050009 at sheffield.ac.uk> Hi Nathan, I've updated the Bundle and uploaded it to CPAN. I *think* the rationale for keeping it still exists but I'm removed enough from Bioperl now that I'll defer to others on the decision. The basic idea was that BioPerl has a heck of a lot of dependencies that it requires of (other perl modules) in order to get all the functionality out of it. Many of these dependencies may not be present in default Perl installations. Tracking down all of the dependencies and installing them (along with all of the dependencies- of-the-dependencies) by hand is a massive pain. 
The nice thing about the Bundle is that it lists the core module dependencies and it works great with the CPAN.pm module to automate the downloading and installation of everything that BioPerl requires. The CPAN module is smart enough that when processing *our* bundle it will also track down and install anything that our bundle entries themselves list as a dependency. So for unix/Linux systems the Bundle is a great one-liner ("perl - MCPAN -e 'install Bundle::BioPerl'" ) way to auto-install or update the many perl modules that BioPerl makes use of. On the windows side, not sure if it is of any help though. Regards, Chris On Oct 17, 2006, at 4:09 AM, Nathan S. Haigh wrote: > Hi Chris > > I've been working on making a PPD for the upcoming Bioperl 1.5.2 > release. During this time I also updated Bundle::BioPerl to include > up-to-date prereqs. I was wondering if you could update the CPAN > package? The updated BioPerl.pm file is attached. > > There is some talk about why and if we need Bundle::BioPerl > anymore. What was the rationale for having it in the first place, > and does it still hold true now? > > Cheers > Nath > From plu5even at gmail.com Tue Oct 17 12:26:34 2006 From: plu5even at gmail.com (Peter H. Baenziger) Date: Tue, 17 Oct 2006 12:26:34 -0400 Subject: [Bioperl-l] LocatableSeq object vs Sequence Object Message-ID: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com> All, This is my first bioperl script (but not my first Perl script) so please forgive my naivety. I've read through documentation and looked through cookbooks and the like but to no avail. Any advice is appreciated. So...I am working with an alignment object of several sequences. My intentions is to loop through all the sequences of the alignment to find what amino acid they have at a known position in the alignment (not the position in the sequence). 
I was thinking I could use:

    foreach $seq ($alignment->each_seq())

to loop through the sequences and call:

    $seq->location_from_column($pos)

on each of the sequences. However, I don't think I have
"LocatableSequences" (the type of object that has the method
"location_from_column") being returned by $alignment->each_seq().

So, how do I bridge this gap here? Or is there a better way?

My appreciation in advance!
Peter

code:

my $swissObj = $swissdb->get_Seq_by_acc($query);
#put several of these in @sequenceObjects
...
my $alignFactory = Bio::Tools::Run::Alignment::Clustalw->new();
my $alignment = $alignFactory->align(\@sequenceObjects);
#print $alignment->overall_percentage_identity(); #works

#now we find the "alignment position" of the mutation we have on the
#human version and get the amino acid at that "alignment position" for all seq
my $humanSequence = $prefix."HUMAN";
my $pos = $alignment->column_from_residue_number($humanSequence, $aa_seqpos);
#this is the "alignment position" equivalent to the mutation position

#we'll keep track of what amino acid each species has at the "alignment
#equivalent" location listed as being a mutation on the human version
foreach $seq ($alignment->each_seq()) {
    #print $seq->species() . "\n";
    #won't work because $alignment->each_seq() actually returns a
    #locatableSeq object, not a normal sequence object
    $speciesAA{$species} = $seq->location_from_column($pos);
}

--
<<->>
Peter H. Baenziger

From akarger at CGR.Harvard.edu Tue Oct 17 12:53:19 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Tue, 17 Oct 2006 12:53:19 -0400
Subject: [Bioperl-l] split location problems
Message-ID:

> From: Jason Stajich [mailto:jason.stajich at gmail.com]
>
> The whole point of split locations is to represent genes with introns
> so that is not the "rare" case.

Absolutely.

> I have processed the genbank fungal genomes into GFF3 and have had no
> problems so I'm confused where you are breaking down.
If I write > them out as embl I also get the correct thing. This is using > the CVS > version of bioperl from the HEAD. > > I've added code to test this to bug 2101 including a C.glabrata > chromsome downloaded from genbank. Perhaps the problem is on the > EMBL parsing side, I didn't test that. Well, I don't know whether it's EMBL parsing, or a bit further down the pipe, but I downloaded C.glabrata chromosome B for GenBank (NC_005968), and it describes the complement/joins in the way that Bioperl is handling correctly. GenBank: CDS complement(join(10347..10372,10632..11157)) /locus_tag="CAGL0B00242g" EMBL: FT CDS join(complement(10632..11157),complement(10347..10372)) FT /locus_tag="CAGL0B00242g" Here's the diff when I run the location-printing script I posted yesterday: diff biogb bio 1c1,5 < complement(join(10347..10372,10632..11157)) --- > complement(1701..2651) > complement(2635..3345) > complement(3980..4408) > complement(join(10632..11157,10347..10372)) > 10379..10615 209a214,217 > 498198..498890 > 499712..500062 > 499851..500702 > 500579..501364 As you can see, the complement/join CDS is written out in a different order, which is Bad. (I looked at at least one of the other differences: the GB file says it's a "misc feature" and EMBL says it's a CDS. But they don't seem to be relevant here.) -Amir > > On the technical side, I still am not sure I fully know where the > strand information should be stored - the top level container or the > sub-features. I'll try and stay up on the discussion if > anything has > been decided that I should know about. > > -jason > > > > From paul.boutros at utoronto.ca Tue Oct 17 12:57:19 2006 From: paul.boutros at utoronto.ca (Paul Boutros) Date: Tue, 17 Oct 2006 12:57:19 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 Message-ID: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca> Hi, Here's a quick make test on AIX 5.2 with Perl 5.8.8. 
I get two failed tests, the first seems to be just a result of me not having DBD::mysql installed. Paul Test Summary ============ Failed Test Stat Wstat Total Fail List of Failed ------------------------------------------------------------------------------- t/BioDBSeqFeature_mysql.t 46 46 1-46 t/SearchIO.t 22 5632 1337 2671 2-1337 2 tests and 106 subtests skipped. Failed 2/236 test scripts. 1382/11688 subtests failed. Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = 159.61 CPU) BioDBSeqFeature_mysql ===================== pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t 1..46 install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at (eval 37) line 3. Perhaps the DBD::mysql perl module hasn't been fully installed, or perhaps the capitalisation of 'mysql' isn't right. Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 SearchIO ======== pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more 1..1337 ok 1 -------------------- WARNING --------------------- MSG: XML::SAX::Expat not currently supported; must have local copies of NCBI DTD docs! 
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: error in parsing a report:

404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' does not exist file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
Handler couldn't resolve external entity at line 2, column 82, byte 104
error in processing external entity reference at line 2, column 82, byte 104 at /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line 187
---------------------------------------------------
not ok 2
# Failed test 2 in t/SearchIO.t at line 68
Can't call method "database_name" on an undefined value at t/SearchIO.t line 69.

------------------------------

Message: 10
Date: Tue, 17 Oct 2006 11:32:54 +0100
From: Sendu Bala
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
To: bioperl-l at bioperl.org
Message-ID: <4534B156.4090501 at sendu.me.uk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.

See http://www.bioperl.org/wiki/Release_1.5.2 for instructions on
getting and testing this RC.

Developers: This should be the last RC before release ~next Monday. Now
would be a good time for last minute documentation updates and additions.

Users: Even though 1.5.2 is a 'developer' release, we consider it the
most stable and capable version of Bioperl, and recommend that you use
it in all but the most critical production environments. Please try it
out and let us know of any problems or difficulties you run into.

Thank you,
Sendu.

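The BioDBSeqFeature_mysql.t failure reported above comes from a missing optional driver (DBD::mysql), not from a bug in the code under test. What follows is a minimal sketch, assuming Test::More, of how a test script can report such tests as skipped rather than failed. It is not how the Bioperl test suite is actually written; missing_modules and the placeholder test descriptions are hypothetical names used only for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More;

# Return the subset of the given module names that cannot be loaded here.
sub missing_modules {
    my @missing;
    for my $module (@_) {
        ( my $file = "$module.pm" ) =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
        push @missing, $module unless eval { require $file; 1 };
    }
    return @missing;
}

# Assumption: the database-backed checks only make sense with the MySQL driver.
my @needed  = ( 'DBI', 'DBD::mysql' );
my @missing = missing_modules(@needed);

plan tests => 3;
ok( 1, 'placeholder for a check that needs no optional module' );

SKIP: {
    # Count the two database tests as skipped, not failed, when a driver is absent.
    skip "optional modules not installed: @missing", 2 if @missing;

    # Real tests would connect to a test database and exercise it here.
    pass('connected to test database (placeholder)');
    pass('stored and retrieved a feature (placeholder)');
}
```

Run on a machine without DBD::mysql, this should report two skipped subtests and still exit successfully, so an installation is not blocked by an optional dependency.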
From barry.moore at genetics.utah.edu Tue Oct 17 12:57:48 2006 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Tue, 17 Oct 2006 10:57:48 -0600 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> Message-ID: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix does a reasonable job of textifying html. You get the links as numbered references at the bottom or: lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | perl -ane 's/\[?\[\d+\](edit\])?//g;print' to remove the links all together. Barry P.S. Looks like this: #Creative Commons copyright Installing Bioperl for Unix From BioPerl Jump to: navigation, search Contents * 1 BIOPERL INSTALLATION * 2 SYSTEM REQUIREMENTS * 3 OPTIONAL * 4 ADDITIONAL INSTALLATION INFORMATION * 5 THE BIOPERL BUNDLE * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' * 8 WHERE ARE THE MAN PAGES? * 9 EXTERNAL PROGRAMS + 9.1 Environment Variables * 10 INSTALLING BIOPERL SCRIPTS * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA * 12 INSTALLING BIOPERL MODULES THE HARD WAY * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION * 14 THE TEST SYSTEM * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE + 15.1 CONFIGURING for BSD and Solaris boxes + 15.2 INSTALLATION * 16 DEPENDENCIES AND Bundle::BioPerl BIOPERL INSTALLATION Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP, and on Mac OS X (see the PLATFORMS file for more details). Following are instructions for installing Bioperl for Unix/Linux/Mac OS X; Windows installation instructions can be found here. For installing Bioperl for Mac OS X using Fink, see Getting BioPerl. SYSTEM REQUIREMENTS * Perl 5.005 or later; version 5.6 and greater are recommended. Note that most modules will work with earlier versions of Perl. 
The only ones that will not are Bio::SimpleAlign and the Bio::Index::* modules. If you don't need these modules and you want to install Bioperl using an earlier version of Perl, edit the "require 5.005;" line in Makefile.PL as necessary. * External modules: Bioperl uses functionality provided in other Perl modules. Some of these are included in the standard perl package but some need to be obtained from the CPAN site. The list of external modules is included at the bottom of this document. The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of these external modules easy. Simply install the bundle using your CPAN shell and all necessary modules will be installed. See THE BIOPERL BUNDLE, below. OPTIONAL * ANSI C or GNU C compiler (gcc) for XS extensions (the bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext PACKAGE, below). ADDITIONAL INSTALLATION INFORMATION * Additional information on Bioperl and MAC OS: + OS 9 - http://bioperl.org/Core/mac-bioperl.html + OSX-http://www.tc.umn.edu/~cann0010/ Bioperl_OSX_install.html + OS X - Installing using Fink (in Getting BioPerl) THE BIOPERL BUNDLE You typically need root privileges to install using CPAN. If you don't have these privileges please see INSTALLING BIOPERL IN A PERSONAL MODULE AREA for additional information. Install Bundle::Bioperl using CPAN. One way: >perl -MCPAN -e "install Bundle::BioPerl" Another way: >perl -MCPAN -e shell cpan>install Bundle::BioPerl On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: > There isn't a very easy way since so many links have to be removed/ > modified. > I have found a few CPAN modules that could help, but for now I just > dump the > text output from a text browser (elinks) using the 'printable > version' page > and hand-edit, which works very quickly. That works for the time > being > until I can find another more automated solution. > > Fortunately there have been very few edits to either INSTALL wiki > page so > they should remain relatively stable. 
> > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > >> -----Original Message----- >> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] >> Sent: Tuesday, October 17, 2006 6:46 AM >> To: Chris Fields >> Cc: bioperl-l >> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN >> >> Chris Fields wrote: >>> The general consensus was to keep text versions available; we could >>> add URL links to the wiki pages for the most up-to-dat version. >>> BTW, >>> I have modified INSTALL already. INSTALL.WIN is next in line (I was >>> waiting for your changes). >>> >> Is it possible to generate these files from the wiki whenever >> there is a >> release? I now edits shouldn't be too severe or too often - but I can >> see things getting a little messy/annoying if edits have to be >> made in 2 >> places. >> >> Nath > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From niels at genomics.dk Tue Oct 17 12:58:14 2006 From: niels at genomics.dk (Niels Larsen) Date: Tue, 17 Oct 2006 18:58:14 +0200 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534E207.8030508@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> Message-ID: <45350BA6.3040102@genomics.dk> Ok, here are ways to reproduce; I sure apologize if I made the test scripts wrong. And I suppose EBI/DDBJ's interfaces are not a bioperl issue really. Niels ------------ EBI I invoked the EBI script http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip like this WSWUBlastClient.pl -p blastn -D embl test.fasta where the content of test.fasta is below, and got Can't find method element in the message at /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. >Planctomyces sp. 
282; Genbank Taxonomy ID: 79927
AATGAACGTTGGCGGCATGGATTAGGCATGCAAGTCGAGGGAGAACCCGCAAGGGGACACCGGCG
AACGGGGTAGGAATACATAGGTAACGTACCCTCAGGACGGGGATAGCCAAGGGAAACTTTGGGTA
ATACCCGATGTGATGGCAAGATGTGAATGCTTGTCATCAAAGGTGAGATTCCACCTGAGGAGCGG
CTTATGCATCATTAGCTTGTTGGCGGGGTAACGGCCCACCAAGGCTGCGATGATTAGGGGGTGTG
AGAGCATGGCCCCCACCACTGGCACTGAGACACTGGCCAGACACCTACGGGTGGCTGCAGTCGAG

I tried with this test sequence in fasta format and with just the
sequence.

------------ DDBJ

Inspired by this page,

http://xml.nig.ac.jp/doc/Blast.txt

I made this test script

------ cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

my ( $service, $seqstr, $result );

use SOAP::Lite;
use Data::Dumper;

$service = SOAP::Lite->service('http://xml.nig.ac.jp/wsdl/Blast.wsdl');

$seqstr = "MSSRIARALALVVTLLHLTRLALSTCPAACHCPLEAPKCAPGVGLVRDGCGCCKVCAKQL";

$result = $service->searchSimple( "blastp", "SWISS", $seqstr );

print Dumper( $result );
------ cut --

which for me prints undef.

------------- NCBI/Bioperl

I installed 1.5.2-RC2, looked at the RemoteBlast example in

http://www.bioperl.org/wiki/Bptutorial.pl

and then put that into this test code, more or less cut/paste,

--- cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

use Bio::Tools::Run::RemoteBlast;
use Data::Dumper;

my ( $remote_blast, $r, $rc, $rid, @rids );

$remote_blast = Bio::Tools::Run::RemoteBlast->new (
    -prog => 'blastn', -data => 'ecoli', -expect => '1e-10' );

$r = $remote_blast->submit_blast("ecoli.fasta");

while ( @rids = $remote_blast->each_rid )
{
#    print Dumper( \@rids );
    for $rid ( @rids )
    {
        $rc = $remote_blast->retrieve_blast($rid);
#        print Dumper( $rc );
    }

    sleep 10;
}
--- cut --

which saves the same blast report to TMPDIR for every 10 seconds. The
"ecoli.fasta" file contains this

>test
gggggctctgttggttctcccgcaacgctactctgtttaccaggtcaggtccggaaggaa
gcagccaaggcagatgacgcgtgtgccgggatgtagctggcagggcccccaccc

Maybe I am supposed to add a check for content in $rc and then stop
the inner loop?
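
On the question just above, a minimal sketch of such a check. It follows
the pattern in the RemoteBlast synopsis, where retrieve_blast() returns an
object once the report is ready, 0 while the job is still pending, and a
negative value on error (worth confirming against the installed version).
blast_and_wait and its timeout/interval options are hypothetical names,
not part of BioPerl, and the sketch assumes the returned object offers
next_result(), as Bio::SearchIO does.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): submit one query through a
# RemoteBlast-style factory and block until every request has finished or
# failed, or until the timeout has passed.
sub blast_and_wait {
    my ( $factory, $input, %opt ) = @_;
    my $timeout  = $opt{timeout}  // 600;    # seconds (assumed default)
    my $interval = $opt{interval} // 10;     # seconds between polls
    my $deadline = time + $timeout;
    my @results;

    $factory->submit_blast($input);

    while ( my @rids = $factory->each_rid ) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( ref $rc ) {
                # Report is ready: collect every result it contains.
                while ( my $result = $rc->next_result ) {
                    push @results, $result;
                }
                # Forget the RID, otherwise the outer loop never ends.
                $factory->remove_rid($rid);
            }
            elsif ( $rc < 0 ) {
                # Error code: give up on this request.
                warn "request $rid failed, giving up on it\n";
                $factory->remove_rid($rid);
            }
            # Any other value: still pending, keep the RID and poll again.
        }
        my @pending = $factory->each_rid;
        last unless @pending;
        die "timed out waiting for BLAST results\n" if time >= $deadline;
        sleep $interval;
    }
    return @results;
}
```

With the factory from the test script above, this would be called as
blast_and_wait( $remote_blast, "ecoli.fasta", timeout => 300 ). The key
step is remove_rid(): while a RID stays registered, each_rid keeps
returning it and the same report is fetched again on every pass.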
I could figure that out maybe, but I wish there was a function which
simply takes a single sequence + arguments and only returns a list of
matches when done, and does not return until then (or until a specified
timeout).

------------------------------------------------------------------------
Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark
Telephone: +45-8942-5268
Telefax: +45-8620-1222
------------------------------------------------------------------------

From bertrand.beckert at gmail.com Tue Oct 17 10:52:36 2006
From: bertrand.beckert at gmail.com (bertrand beckert)
Date: Tue, 17 Oct 2006 16:52:36 +0200
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
Message-ID: <500217090610170752q565cfc08t5208e3b64f99ef7f@mail.gmail.com>

Hi,

I am running a large number of BLASTs via a connection to the NCBI BLAST
page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi'). I am trying to use
'Bio::Tools::Run::RemoteBlast', but unfortunately I have some problems.
I made a simple example with only one sequence in order to understand
how this module works. This is my simple input file, a DNA sequence in
FASTA format:

>test
TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA

I have made some modifications to the example available in the BioPerl
documentation. It gives me a RID which contains the results of my BLAST,
but I have a problem with "$result=$factory->retrieve_blast($rid)" in my
script. The documentation says that, when it works,
$result=$factory->retrieve_blast($rid) returns a Bio::Tools::BPlite or
Bio::Tools::Blast object. In my case it returns a
Bio::SearchIO::blast... I don't understand why I don't get the expected
type of object (see PART I).
I also tried to resolve the problem by replacing the foreach loop in my
script with a new one in order to explore the BLAST result page, but that
doesn't work either (see PART II).

Could you help me, please?

Thank you,
Bertrand Beckert.

PART I:
Here is my script with a few annotations, followed by the shell window
output:
------------------------------------------------------------------------
#!/usr/bin/perl -w

use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;

sub blast {
    my $prog='blastn';
    my $db='refseq_genomic';
    my $e_val='1e-10';
    my $Input='Seq.fasta';

    my @params = ('-prog' => $prog,
                  '-data' => $db,
                  '-expect' => $e_val,
                  '-readmethod' => 'SearchIO');

    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

    # change parameters
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
    $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';

    $factory->submit_blast($Input);
    print STDERR "waiting...\n";

    while (my @rids=$factory->each_rid) {
        print "my rid: ",@rids,"\n";
        # prints the ID of the submitted BLAST, e.g.
        # RID: 1161079157-766-185099855365.BLASTQ2
        # this page contains the result of my BLAST...

        foreach my $rid (@rids) {
            $result=$factory->retrieve_blast($rid);
            # print the return value to see what type of object
            # retrieve_blast returns
            print "rc:", $result,"\n";
        }
    }
}
&blast;
------------------------------------------------------------------------
Here you can see the shell window:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...

PART II:
I also tried to resolve the problem by replacing the foreach loop in my
script with:
------------------------------------------------------------------------
foreach my $rid (@rids) {
    while(1) {
        $result=$factory->retrieve_blast($rid)->next_result();
        print "rc:", $result,"\n";
        if ($result) {
            print $result->num_hits(),"\n";
        }
------------------------------------------------------------------------
With this loop I expected to be able to explore the BLAST result page.
This is what I get in the shell window:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834)
----

--
Bertrand BECKERT
PhD student
IBMC - UPR 9002 du CNRS - ARN
15, rue Rene Descartes
F-67084 STRASBOURG Cedex
b.beckert at ibmc.u-strasbg.fr
bertrand.beckert at gmail.com

From cjfields at uiuc.edu Tue Oct 17 13:50:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 12:50:49 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>
Message-ID: <001201c6f214$c8934440$15327e82@pyrimidine>

(Apologies for the top post, but I thought my response might get lost
below)

I use elinks in a similar fashion. It tends to format the tables a bit
better than lynx.

Chris

> -----Original Message-----
> From: Barry Moore [mailto:barry.moore at genetics.utah.edu]
> Sent: Tuesday, October 17, 2006 11:58 AM
> To: Chris Fields
> Cc: 'Nathan S. Haigh'; 'bioperl-l'
> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix
>
> does a reasonable job of textifying html. You get the links as
> numbered references at the bottom or:
>
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |
> perl -ane 's/\[?\[\d+\](edit\])?//g;print'
>
> to remove the links all together.
>
> Barry
>
> P.S. Looks like this:
>
> #Creative Commons copyright
>
> Installing Bioperl for Unix
>
> From BioPerl
>
> Jump to: navigation, search
>
> Contents
>
> * 1 BIOPERL INSTALLATION
> * 2 SYSTEM REQUIREMENTS
> * 3 OPTIONAL
> * 4 ADDITIONAL INSTALLATION INFORMATION
> * 5 THE BIOPERL BUNDLE
> * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN
> * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make'
> * 8 WHERE ARE THE MAN PAGES?
> * 9 EXTERNAL PROGRAMS > + 9.1 Environment Variables > * 10 INSTALLING BIOPERL SCRIPTS > * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA > * 12 INSTALLING BIOPERL MODULES THE HARD WAY > * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION > * 14 THE TEST SYSTEM > * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE > + 15.1 CONFIGURING for BSD and Solaris boxes > + 15.2 INSTALLATION > * 16 DEPENDENCIES AND Bundle::BioPerl > > > BIOPERL INSTALLATION > > Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP, > and on Mac OS X (see the PLATFORMS file for more details). > Following are > instructions for installing Bioperl for Unix/Linux/Mac OS X; > Windows > installation instructions can be found here. For installing > Bioperl for > Mac OS X using Fink, see Getting BioPerl. > > > SYSTEM REQUIREMENTS > > * Perl 5.005 or later; version 5.6 and greater are recommended. > Note > that most modules will work with earlier versions of Perl. > The only ones > that will not are Bio::SimpleAlign and the Bio::Index::* > modules. If > you don't need these modules and you want to install Bioperl > using an > earlier version of Perl, edit the "require 5.005;" line in > Makefile.PL > as necessary. > > * External modules: Bioperl uses functionality provided in > other Perl > modules. Some of these are included in the standard perl > package but > some need to be obtained from the CPAN site. The list of > external > modules is included at the bottom of this document. > > The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of > these > external modules easy. Simply install the bundle using your CPAN > shell and > all necessary modules will be installed. See THE BIOPERL BUNDLE, > below. > > > OPTIONAL > > * ANSI C or GNU C compiler (gcc) for XS extensions (the > bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext > PACKAGE, below). 
> > > > ADDITIONAL INSTALLATION INFORMATION > > * Additional information on Bioperl and MAC OS: > + OS 9 - http://bioperl.org/Core/mac-bioperl.html > + OSX-http://www.tc.umn.edu/~cann0010/ > Bioperl_OSX_install.html > + OS X - Installing using Fink (in Getting BioPerl) > > > > THE BIOPERL BUNDLE > > You typically need root privileges to install using CPAN. If you > don't > have these privileges please see INSTALLING BIOPERL IN A PERSONAL > MODULE > AREA for additional information. > > Install Bundle::Bioperl using CPAN. One way: > >perl -MCPAN -e "install Bundle::BioPerl" > > Another way: > >perl -MCPAN -e shell > cpan>install Bundle::BioPerl > > > > On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: > > > There isn't a very easy way since so many links have to be removed/ > > modified. > > I have found a few CPAN modules that could help, but for now I just > > dump the > > text output from a text browser (elinks) using the 'printable > > version' page > > and hand-edit, which works very quickly. That works for the time > > being > > until I can find another more automated solution. > > > > Fortunately there have been very few edits to either INSTALL wiki > > page so > > they should remain relatively stable. > > > > Christopher Fields > > Postdoctoral Researcher - Switzer Lab > > Dept. of Biochemistry > > University of Illinois Urbana-Champaign > > > >> -----Original Message----- > >> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] > >> Sent: Tuesday, October 17, 2006 6:46 AM > >> To: Chris Fields > >> Cc: bioperl-l > >> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN > >> > >> Chris Fields wrote: > >>> The general consensus was to keep text versions available; we could > >>> add URL links to the wiki pages for the most up-to-dat version. > >>> BTW, > >>> I have modified INSTALL already. INSTALL.WIN is next in line (I was > >>> waiting for your changes). > >>> > >> Is it possible to generate these files from the wiki whenever > >> there is a > >> release? 
I now edits shouldn't be too severe or too often - but I can > >> see things getting a little messy/annoying if edits have to be > >> made in 2 > >> places. > >> > >> Nath > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Tue Oct 17 13:52:36 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 12:52:36 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca> Message-ID: <001301c6f215$07a9a070$15327e82@pyrimidine> What do you get when you run the SearchIO.t test by itself using 'perl -I. t/SearchIO.t'? It looks like something pretty catastrophic happened. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Paul Boutros > Sent: Tuesday, October 17, 2006 11:57 AM > To: bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 > > Hi, > Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed > tests, the first seems to be just a result of me not having DBD::mysql > installed. > Paul > > Test Summary > ============ > > Failed Test Stat Wstat Total Fail List of Failed > -------------------------------------------------------------------------- > ----- > t/BioDBSeqFeature_mysql.t 46 46 1-46 > t/SearchIO.t 22 5632 1337 2671 2-1337 > 2 tests and 106 subtests skipped. > Failed 2/236 test scripts. 1382/11688 subtests failed. 
> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = > 159.61 CPU) > > BioDBSeqFeature_mysql > ===================== > pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t > 1..46 > install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC > contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t > /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 > /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi > /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at > (eval 37) line 3. > Perhaps the DBD::mysql perl module hasn't been fully installed, > or perhaps the capitalisation of 'mysql' isn't right. > Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. > at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 > > SearchIO > ======== > pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more > 1..1337 > ok 1 > > -------------------- WARNING --------------------- > MSG: XML::SAX::Expat not currently supported; must have local copies > of NCBI DTD docs! > --------------------------------------------------- > > -------------------- WARNING --------------------- > MSG: error in parsing a report: > > 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' > does not exist > file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd > Handler couldn't resolve external entity at line 2, column 82, byte 104 > error in processing external entity reference at line 2, column 82, > byte 104 at > /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line > 187 > > --------------------------------------------------- > not ok 2 > # Failed test 2 in t/SearchIO.t at line 68 > Can't call method "database_name" on an undefined value at > t/SearchIO.t line 69. 
> > ------------------------------ > > Message: 10 > Date: Tue, 17 Oct 2006 11:32:54 +0100 > From: Sendu Bala > Subject: [Bioperl-l] Bioperl 1.5.2 RC2 > To: bioperl-l at bioperl.org > Message-ID: <4534B156.4090501 at sendu.me.uk> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. > See http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > This should be the last RC before release ~next monday. Now would > be a good time for last minute documentaiton updates and additions. > > Users: > Even though 1.5.2 is a 'developer' release, we consider it the most > stable and capable version of Bioperl, and recommend that you use > it in all but the most critical production environments. Please > try it out and let us know of any problems or difficulties you run > into. > > > Thank you, > Sendu. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From paul.boutros at utoronto.ca Tue Oct 17 13:59:33 2006 From: paul.boutros at utoronto.ca (Paul Boutros) Date: Tue, 17 Oct 2006 13:59:33 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine> References: <001301c6f215$07a9a070$15327e82@pyrimidine> Message-ID: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca> Hi Chris, Here it is: pcboutro at ccb690[643] >> perl -I. t/SearchIO.t 1..1337 ok 1 -------------------- WARNING --------------------- MSG: XML::SAX::Expat not currently supported; must have local copies of NCBI DTD docs! 
--------------------------------------------------- -------------------- WARNING --------------------- MSG: error in parsing a report: 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' does not exist file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd Handler couldn't resolve external entity at line 2, column 82, byte 104 error in processing external entity reference at line 2, column 82, byte 104 at /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line 187 --------------------------------------------------- not ok 2 # Failed test 2 in t/SearchIO.t at line 68 Can't call method "database_name" on an undefined value at t/SearchIO.t line 69. Quoting Chris Fields : > What do you get when you run the SearchIO.t test by itself using 'perl -I. > t/SearchIO.t'? It looks like something pretty catastrophic happened. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > >> -----Original Message----- >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros >> Sent: Tuesday, October 17, 2006 11:57 AM >> To: bioperl-l at lists.open-bio.org >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 >> >> Hi, >> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed >> tests, the first seems to be just a result of me not having DBD::mysql >> installed. >> Paul >> >> Test Summary >> ============ >> >> Failed Test Stat Wstat Total Fail List of Failed >> -------------------------------------------------------------------------- >> ----- >> t/BioDBSeqFeature_mysql.t 46 46 1-46 >> t/SearchIO.t 22 5632 1337 2671 2-1337 >> 2 tests and 106 subtests skipped. >> Failed 2/236 test scripts. 1382/11688 subtests failed. 
>> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = >> 159.61 CPU) >> >> BioDBSeqFeature_mysql >> ===================== >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t >> 1..46 >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at >> (eval 37) line 3. >> Perhaps the DBD::mysql perl module hasn't been fully installed, >> or perhaps the capitalisation of 'mysql' isn't right. >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. >> at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 >> >> SearchIO >> ======== >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more >> 1..1337 >> ok 1 >> >> -------------------- WARNING --------------------- >> MSG: XML::SAX::Expat not currently supported; must have local copies >> of NCBI DTD docs! >> --------------------------------------------------- >> >> -------------------- WARNING --------------------- >> MSG: error in parsing a report: >> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' >> does not exist >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >> Handler couldn't resolve external entity at line 2, column 82, byte 104 >> error in processing external entity reference at line 2, column 82, >> byte 104 at >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line >> 187 >> >> --------------------------------------------------- >> not ok 2 >> # Failed test 2 in t/SearchIO.t at line 68 >> Can't call method "database_name" on an undefined value at >> t/SearchIO.t line 69. 
>> >> ------------------------------ >> >> Message: 10 >> Date: Tue, 17 Oct 2006 11:32:54 +0100 >> From: Sendu Bala >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2 >> To: bioperl-l at bioperl.org >> Message-ID: <4534B156.4090501 at sendu.me.uk> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >> >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. >> See http://www.bioperl.org/wiki/Release_1.5.2 for >> instructions on getting and testing this RC. >> >> Developers: >> This should be the last RC before release ~next monday. Now would >> be a good time for last minute documentaiton updates and additions. >> >> Users: >> Even though 1.5.2 is a 'developer' release, we consider it the most >> stable and capable version of Bioperl, and recommend that you use >> it in all but the most critical production environments. Please >> try it out and let us know of any problems or difficulties you run >> into. >> >> >> Thank you, >> Sendu. >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From barry.moore at genetics.utah.edu Tue Oct 17 14:07:12 2006 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Tue, 17 Oct 2006 12:07:12 -0600 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: References: Message-ID: <588DE26B-8F18-4540-BAEE-2B479CBDE8B3@genetics.utah.edu> In fact, I think it was you who taught me that trick in the first place. B On Oct 17, 2006, at 11:40 AM, Brian Osborne wrote: > Barry, > > I second that. lynx does the best job of converting HTML to text > I've seen. > > Brian O. > > > On 10/17/06 12:57 PM, "Barry Moore" > wrote: > >> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix >> >> does a reasonable job of textifying html. 
You get the links as >> numbered references at the bottom or: >> >> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | >> perl -ane 's/\[?\[\d+\](edit\])?//g;print' >> >> to remove the links all together. >> >> Barry >> >> P.S. Looks like this: >> >> #Creative Commons copyright >> >> Installing Bioperl for Unix >> >> From BioPerl >> >> Jump to: navigation, search >> >> Contents >> >> * 1 BIOPERL INSTALLATION >> * 2 SYSTEM REQUIREMENTS >> * 3 OPTIONAL >> * 4 ADDITIONAL INSTALLATION INFORMATION >> * 5 THE BIOPERL BUNDLE >> * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN >> * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' >> * 8 WHERE ARE THE MAN PAGES? >> * 9 EXTERNAL PROGRAMS >> + 9.1 Environment Variables >> * 10 INSTALLING BIOPERL SCRIPTS >> * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA >> * 12 INSTALLING BIOPERL MODULES THE HARD WAY >> * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION >> * 14 THE TEST SYSTEM >> * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE >> + 15.1 CONFIGURING for BSD and Solaris boxes >> + 15.2 INSTALLATION >> * 16 DEPENDENCIES AND Bundle::BioPerl >> >> >> BIOPERL INSTALLATION >> >> Bioperl has been installed on many forms of Unix, Win9X/NT/ >> 2000/XP, >> and on Mac OS X (see the PLATFORMS file for more details). >> Following are >> instructions for installing Bioperl for Unix/Linux/Mac OS X; >> Windows >> installation instructions can be found here. For installing >> Bioperl for >> Mac OS X using Fink, see Getting BioPerl. >> >> >> SYSTEM REQUIREMENTS >> >> * Perl 5.005 or later; version 5.6 and greater are recommended. >> Note >> that most modules will work with earlier versions of Perl. >> The only ones >> that will not are Bio::SimpleAlign and the Bio::Index::* >> modules. If >> you don't need these modules and you want to install Bioperl >> using an >> earlier version of Perl, edit the "require 5.005;" line in >> Makefile.PL >> as necessary. 
>> >> * External modules: Bioperl uses functionality provided in >> other Perl >> modules. Some of these are included in the standard perl >> package but >> some need to be obtained from the CPAN site. The list of >> external >> modules is included at the bottom of this document. >> >> The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of >> these >> external modules easy. Simply install the bundle using your CPAN >> shell and >> all necessary modules will be installed. See THE BIOPERL BUNDLE, >> below. >> >> >> OPTIONAL >> >> * ANSI C or GNU C compiler (gcc) for XS extensions >> (the >> bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext >> PACKAGE, below). >> >> >> >> ADDITIONAL INSTALLATION INFORMATION >> >> * Additional information on Bioperl and MAC OS: >> + OS 9 - http://bioperl.org/Core/mac-bioperl.html >> + OSX-http://www.tc.umn.edu/~cann0010/ >> Bioperl_OSX_install.html >> + OS X - Installing using Fink (in Getting BioPerl) >> >> >> >> THE BIOPERL BUNDLE >> >> You typically need root privileges to install using CPAN. If you >> don't >> have these privileges please see INSTALLING BIOPERL IN A PERSONAL >> MODULE >> AREA for additional information. >> >> Install Bundle::Bioperl using CPAN. One way: >>> perl -MCPAN -e "install Bundle::BioPerl" >> >> Another way: >>> perl -MCPAN -e shell >> cpan>install Bundle::BioPerl >> >> >> >> On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: >> >>> There isn't a very easy way since so many links have to be removed/ >>> modified. >>> I have found a few CPAN modules that could help, but for now I just >>> dump the >>> text output from a text browser (elinks) using the 'printable >>> version' page >>> and hand-edit, which works very quickly. That works for the time >>> being >>> until I can find another more automated solution. >>> >>> Fortunately there have been very few edits to either INSTALL wiki >>> page so >>> they should remain relatively stable. 
>>>
>>> Christopher Fields
>>> Postdoctoral Researcher - Switzer Lab
>>> Dept. of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>
>>>> -----Original Message-----
>>>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
>>>> Sent: Tuesday, October 17, 2006 6:46 AM
>>>> To: Chris Fields
>>>> Cc: bioperl-l
>>>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>>>>
>>>> Chris Fields wrote:
>>>>> The general consensus was to keep text versions available; we
>>>>> could
>>>>> add URL links to the wiki pages for the most up-to-dat version.
>>>>> BTW,
>>>>> I have modified INSTALL already. INSTALL.WIN is next in line
>>>>> (I was
>>>>> waiting for your changes).
>>>>>
>>>> Is it possible to generate these files from the wiki whenever
>>>> there is a
>>>> release? I now edits shouldn't be too severe or too often - but
>>>> I can
>>>> see things getting a little messy/annoying if edits have to be
>>>> made in 2
>>>> places.
>>>>
>>>> Nath
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From bix at sendu.me.uk Tue Oct 17 14:07:04 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 19:07:04 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
References: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
Message-ID: <45351BC8.9080507@sendu.me.uk>

Paul Boutros wrote:
> Hi,
> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed
> tests, the first seems to be just a result of me not having DBD::mysql
> installed.
[snip]

Thanks for those, very useful. Not something that's come up before
afaik; I'll look into them.
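
For reports like the one quoted above, a quick check of which optional
modules actually load can help separate missing prerequisites from real
test failures. A minimal sketch using only core Perl; module_status is a
hypothetical helper name, and the list is limited to the two module names
that appear in the quoted test output.

```perl
use strict;
use warnings;

# Hypothetical helper: map each module name to 1 if it can be loaded
# on this system, or 0 if require fails for any reason.
sub module_status {
    my %status;
    for my $module (@_) {
        # Turn Foo::Bar into Foo/Bar.pm so require takes it as a file name.
        ( my $file = "$module.pm" ) =~ s{::}{/}g;
        $status{$module} = eval { require $file; 1 } ? 1 : 0;
    }
    return %status;
}

my %status = module_status(qw(DBD::mysql XML::SAX::Expat));
printf "%-18s %s\n", $_, $status{$_} ? 'installed' : 'not found'
    for sort keys %status;
```

A module that is listed as installed can still fail at run time (for
example a DBI driver with no reachable server), so this only rules out
the simplest cause.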
From cjfields at uiuc.edu Tue Oct 17 14:31:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 13:31:51 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca>
Message-ID: <001401c6f21a$836f9fc0$15327e82@pyrimidine>

Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
backend parser. For some reason BLAST XML parsing doesn't work with that
parser (it tries to verify the XML first before parsing, hence the DTD
error). I may try getting this to work again, but so far I haven't found
an easy way to prevent XML verification via XML::SAX::Expat.

There are two options:

1) install XML::SAX::ExpatXS (the better option), which works AND is 4x
   faster than XML::SAX::Expat, or
2) set the default parser in the ParserDetails.ini file of your local
   copy of XML::SAX to XML::SAX::PurePerl.

BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
hasn't officially happened yet); the latter hasn't had significant
development in about three years.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: Paul Boutros [mailto:paul.boutros at utoronto.ca]
> Sent: Tuesday, October 17, 2006 1:00 PM
> To: Chris Fields
> Cc: bioperl-l at lists.open-bio.org
> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2
>
> Hi Chris,
>
> Here it is:
> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
> 1..1337
> ok 1
>
> -------------------- WARNING ---------------------
> MSG: XML::SAX::Expat not currently supported; must have local copies
> of NCBI DTD docs!
> --------------------------------------------------- > > -------------------- WARNING --------------------- > MSG: error in parsing a report: > > 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' > does not exist > file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd > Handler couldn't resolve external entity at line 2, column 82, byte 104 > error in processing external entity reference at line 2, column 82, > byte 104 at > /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line > 187 > > --------------------------------------------------- > not ok 2 > # Failed test 2 in t/SearchIO.t at line 68 > Can't call method "database_name" on an undefined value at > t/SearchIO.t line 69. > > > Quoting Chris Fields : > > > What do you get when you run the SearchIO.t test by itself using 'perl - > I. > > t/SearchIO.t'? It looks like something pretty catastrophic happened. > > > > Christopher Fields > > Postdoctoral Researcher - Switzer Lab > > Dept. of Biochemistry > > University of Illinois Urbana-Champaign > > > > > >> -----Original Message----- > >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros > >> Sent: Tuesday, October 17, 2006 11:57 AM > >> To: bioperl-l at lists.open-bio.org > >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 > >> > >> Hi, > >> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed > >> tests, the first seems to be just a result of me not having DBD::mysql > >> installed. > >> Paul > >> > >> Test Summary > >> ============ > >> > >> Failed Test Stat Wstat Total Fail List of Failed > >> ----------------------------------------------------------------------- > --- > >> ----- > >> t/BioDBSeqFeature_mysql.t 46 46 1-46 > >> t/SearchIO.t 22 5632 1337 2671 2-1337 > >> 2 tests and 106 subtests skipped. > >> Failed 2/236 test scripts. 1382/11688 subtests failed. 
> >> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = > >> 159.61 CPU) > >> > >> BioDBSeqFeature_mysql > >> ===================== > >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t > >> 1..46 > >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC > >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t > >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 > >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi > >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at > >> (eval 37) line 3. > >> Perhaps the DBD::mysql perl module hasn't been fully installed, > >> or perhaps the capitalisation of 'mysql' isn't right. > >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. > >> at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 > >> > >> SearchIO > >> ======== > >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more > >> 1..1337 > >> ok 1 > >> > >> -------------------- WARNING --------------------- > >> MSG: XML::SAX::Expat not currently supported; must have local copies > >> of NCBI DTD docs! > >> --------------------------------------------------- > >> > >> -------------------- WARNING --------------------- > >> MSG: error in parsing a report: > >> > >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' > >> does not exist > >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd > >> Handler couldn't resolve external entity at line 2, column 82, byte 104 > >> error in processing external entity reference at line 2, column 82, > >> byte 104 at > >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line > >> 187 > >> > >> --------------------------------------------------- > >> not ok 2 > >> # Failed test 2 in t/SearchIO.t at line 68 > >> Can't call method "database_name" on an undefined value at > >> t/SearchIO.t line 69. 
> >> > >> ------------------------------ > >> > >> Message: 10 > >> Date: Tue, 17 Oct 2006 11:32:54 +0100 > >> From: Sendu Bala > >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2 > >> To: bioperl-l at bioperl.org > >> Message-ID: <4534B156.4090501 at sendu.me.uk> > >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed > >> > >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. > >> See http://www.bioperl.org/wiki/Release_1.5.2 for > >> instructions on getting and testing this RC. > >> > >> Developers: > >> This should be the last RC before release ~next monday. Now would > >> be a good time for last minute documentaiton updates and additions. > >> > >> Users: > >> Even though 1.5.2 is a 'developer' release, we consider it the most > >> stable and capable version of Bioperl, and recommend that you use > >> it in all but the most critical production environments. Please > >> try it out and let us know of any problems or difficulties you run > >> into. > >> > >> > >> Thank you, > >> Sendu. > >> > >> > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > From cjfields at uiuc.edu Tue Oct 17 15:05:59 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 14:05:59 -0500 Subject: [Bioperl-l] split location problems In-Reply-To: Message-ID: <001b01c6f21f$48640420$15327e82@pyrimidine> > > From: Jason Stajich [mailto:jason.stajich at gmail.com] > > > > The whole point of split locations is to represent genes with > > introns > > so that is not the "rare" case. > > Absolutely. Right, but that specific kind of join statement is not commonly used in GenBank files, which seems to be the format predominately used (no offense to EBI). This may explain why we haven't seen this pop up more often. 
I believe what we're seeing is a difference in the way these locations
are described at NCBI vs EBI, which Nadeem Faruque seems to corroborate.
He indicated that EBI may move to using similar GenBank-like location
strings. Regardless, FTLocationFactory and Bio::Location::Split should
handle both forms if they are present, but they only seem to like the
GenBank version.

> > I've added code to test this to bug 2101 including a C.glabrata
> > chromosome downloaded from genbank. Perhaps the problem is on the
> > EMBL parsing side, I didn't test that.
>
> Well, I don't know whether it's EMBL parsing, or a bit further down the
> pipe, but I downloaded C.glabrata chromosome B for GenBank (NC_005968),
> and it describes the complement/joins in the way that Bioperl is
> handling correctly.
>
> GenBank:
> CDS complement(join(10347..10372,10632..11157))
> /locus_tag="CAGL0B00242g"
>
> EMBL:
> FT CDS
> join(complement(10632..11157),complement(10347..10372))
> FT /locus_tag="CAGL0B00242g"

Yes, something that I found out independently (and corroborated by
Nadeem).

> Here's the diff when I run the location-printing script I posted
> yesterday:
>
> diff biogb bio
> 1c1,5
> < complement(join(10347..10372,10632..11157))
> ---
> > complement(1701..2651)
> > complement(2635..3345)
> > complement(3980..4408)
> > complement(join(10632..11157,10347..10372))
> > 10379..10615
> 209a214,217
> > 498198..498890
> > 499712..500062
> > 499851..500702
> > 500579..501364
>
> As you can see, the complement/join CDS is written out in a different
> order, which is Bad.

I think this can be handled directly in to_FTstring(). I'll have to add
a method to get the strand info from the Split object w/o going through
strand(). However, I'm thinking about trying a different tack which is a
bit simpler and, if it proves fruitful, may simplify Split locations
somewhat. It won't be ready for 1.5.2 but maybe the next release.
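
To make the two notations concrete, here is a minimal sketch of the
rewrite from the EMBL-style string (each sub-location complemented
individually) to the GenBank-style string (one complement around the
join, sub-locations in reverse order). It is not BioPerl code:
`embl_to_genbank_style` is a hypothetical helper, and it assumes plain
start..end ranges only, returning anything else unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: rewrite
#   join(complement(a..b),complement(c..d))
# as
#   complement(join(c..d,a..b))
# Anything that is not a join of simple complemented ranges is
# returned unchanged.
sub embl_to_genbank_style {
    my ($loc) = @_;
    return $loc unless $loc =~ /^join\((.+)\)$/;

    my @ranges;
    for my $part ( split /,/, $1 ) {
        # Give up on mixed strands, fuzzy positions, remote accessions...
        return $loc unless $part =~ /^complement\((\d+\.\.\d+)\)$/;
        push @ranges, $1;
    }

    # On the minus strand the sub-locations are listed in the opposite
    # order once the complement is moved outside the join.
    return 'complement(join(' . join( ',', reverse @ranges ) . '))';
}

print embl_to_genbank_style(
    'join(complement(10632..11157),complement(10347..10372))'), "\n";
# prints complement(join(10347..10372,10632..11157))
```

The example input and output are the EMBL and GenBank strings for
CAGL0B00242g quoted above.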
> (I looked at at least one of the other differences: the GB file says
> it's a "misc feature" and EMBL says it's a CDS. But they don't seem to
> be relevant here.)
> -Amir

Probably not but something to keep in mind.

-c

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From er at xs4all.nl Tue Oct 17 15:01:48 2006
From: er at xs4all.nl (Erikjan)
Date: Tue, 17 Oct 2006 21:01:48 +0200 (CEST)
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
Message-ID: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>

Hello,

I noticed a little problem with the Annotation "DBLink" from GenBank
entries.

When I run:

perl -MBio::DB::GenBank -e 'my $gi = 56205924;
  $db = Bio::DB::GenBank->new(-format => "genbank");
  my $seqio = $db->get_Stream_by_id($gi);
  my $seq = $seqio->next_seq;
  my $ac = $seq->annotation();
  my @annotations = $ac->get_Annotations("dblink");
  for (@annotations) { print $_, "\n"; }
  print $INC{ "Bio/Annotation/DBLink.pm" }, "\n"; '

This yields:

GenBank:AL591065.17.17

and the place where the used Bio/Annotation/DBLink.pm resides.

Can others repeat this?

I have dug into the source a little and Bio::Annotation::DBLink seems to
be the place where this happens: it has a concatenation which leads to
that repeated version number.

Is this something that I should fix "client-side", so to speak, or is it
worthwhile to add some logic to that concatenation to prevent this?

Thanks,

Eric

From bosborne11 at verizon.net Tue Oct 17 13:40:54 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 17 Oct 2006 13:40:54 -0400
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>
Message-ID:

Barry,

I second that. lynx does the best job of converting HTML to text I've
seen.

Brian O.
On 10/17/06 12:57 PM, "Barry Moore" wrote: > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix > > does a reasonable job of textifying html. You get the links as > numbered references at the bottom or: > > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | > perl -ane 's/\[?\[\d+\](edit\])?//g;print' > > to remove the links all together. > > Barry > > P.S. Looks like this: > > #Creative Commons copyright > > Installing Bioperl for Unix > > From BioPerl > > Jump to: navigation, search > > Contents > > * 1 BIOPERL INSTALLATION > * 2 SYSTEM REQUIREMENTS > * 3 OPTIONAL > * 4 ADDITIONAL INSTALLATION INFORMATION > * 5 THE BIOPERL BUNDLE > * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN > * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' > * 8 WHERE ARE THE MAN PAGES? > * 9 EXTERNAL PROGRAMS > + 9.1 Environment Variables > * 10 INSTALLING BIOPERL SCRIPTS > * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA > * 12 INSTALLING BIOPERL MODULES THE HARD WAY > * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION > * 14 THE TEST SYSTEM > * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE > + 15.1 CONFIGURING for BSD and Solaris boxes > + 15.2 INSTALLATION > * 16 DEPENDENCIES AND Bundle::BioPerl > > > BIOPERL INSTALLATION > > Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP, > and on Mac OS X (see the PLATFORMS file for more details). > Following are > instructions for installing Bioperl for Unix/Linux/Mac OS X; > Windows > installation instructions can be found here. For installing > Bioperl for > Mac OS X using Fink, see Getting BioPerl. > > > SYSTEM REQUIREMENTS > > * Perl 5.005 or later; version 5.6 and greater are recommended. > Note > that most modules will work with earlier versions of Perl. > The only ones > that will not are Bio::SimpleAlign and the Bio::Index::* > modules. 
If > you don't need these modules and you want to install Bioperl > using an > earlier version of Perl, edit the "require 5.005;" line in > Makefile.PL > as necessary. > > * External modules: Bioperl uses functionality provided in > other Perl > modules. Some of these are included in the standard perl > package but > some need to be obtained from the CPAN site. The list of > external > modules is included at the bottom of this document. > > The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of > these > external modules easy. Simply install the bundle using your CPAN > shell and > all necessary modules will be installed. See THE BIOPERL BUNDLE, > below. > > > OPTIONAL > > * ANSI C or GNU C compiler (gcc) for XS extensions (the > bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext > PACKAGE, below). > > > > ADDITIONAL INSTALLATION INFORMATION > > * Additional information on Bioperl and MAC OS: > + OS 9 - http://bioperl.org/Core/mac-bioperl.html > + OSX-http://www.tc.umn.edu/~cann0010/ > Bioperl_OSX_install.html > + OS X - Installing using Fink (in Getting BioPerl) > > > > THE BIOPERL BUNDLE > > You typically need root privileges to install using CPAN. If you > don't > have these privileges please see INSTALLING BIOPERL IN A PERSONAL > MODULE > AREA for additional information. > > Install Bundle::Bioperl using CPAN. One way: >> perl -MCPAN -e "install Bundle::BioPerl" > > Another way: >> perl -MCPAN -e shell > cpan>install Bundle::BioPerl > > > > On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: > >> There isn't a very easy way since so many links have to be removed/ >> modified. >> I have found a few CPAN modules that could help, but for now I just >> dump the >> text output from a text browser (elinks) using the 'printable >> version' page >> and hand-edit, which works very quickly. That works for the time >> being >> until I can find another more automated solution. 
>> >> Fortunately there have been very few edits to either INSTALL wiki >> page so >> they should remain relatively stable. >> >> Christopher Fields >> Postdoctoral Researcher - Switzer Lab >> Dept. of Biochemistry >> University of Illinois Urbana-Champaign >> >>> -----Original Message----- >>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] >>> Sent: Tuesday, October 17, 2006 6:46 AM >>> To: Chris Fields >>> Cc: bioperl-l >>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN >>> >>> Chris Fields wrote: >>>> The general consensus was to keep text versions available; we could >>>> add URL links to the wiki pages for the most up-to-dat version. >>>> BTW, >>>> I have modified INSTALL already. INSTALL.WIN is next in line (I was >>>> waiting for your changes). >>>> >>> Is it possible to generate these files from the wiki whenever >>> there is a >>> release? I now edits shouldn't be too severe or too often - but I can >>> see things getting a little messy/annoying if edits have to be >>> made in 2 >>> places. >>> >>> Nath >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Tue Oct 17 16:30:15 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 15:30:15 -0500 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> Message-ID: <0FB91820-B2A1-4F7F-866C-8D4791DD8306@uiuc.edu> I can confirm this using bioperl-live: GenBank:AL591065.17.17 /Users/cjfields/src/bioperl-live/Bio/Annotation/DBLink.pm Could you file a bug report via bugzilla? 
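
For reference, the kind of guard Eric asks about could look like the
sketch below. This is not the code in Bio::Annotation::DBLink:
`dblink_string` and its arguments are hypothetical, and the sketch
assumes the duplication arises because the ID already carries the
version suffix when the version is appended.

```perl
use strict;
use warnings;

# Hypothetical helper, not the actual DBLink code: build "db:id.version"
# but only append the version when the ID does not already end in it.
sub dblink_string {
    my ( $database, $primary_id, $version ) = @_;
    my $string = "$database:$primary_id";

    if ( defined $version
        && length $version
        && $primary_id !~ /\.\Q$version\E$/ )
    {
        $string .= ".$version";
    }
    return $string;
}

print dblink_string( 'GenBank', 'AL591065.17', 17 ), "\n";
print dblink_string( 'GenBank', 'AL591065',    17 ), "\n";
# both print GenBank:AL591065.17
```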
Chris On Oct 17, 2006, at 2:01 PM, Erikjan wrote: > Hello, > > I noticed a little problem with the Annotation "DBLink" from > GenBank entries > > When I run: > > perl -MBio::DB::GenBank -e 'my $gi = > 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio = > $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > ("dblink"); > for(@annotations) { print $_, "\n";} print $INC{ > "Bio/Annotation/DBLink.pm" }, "\n"; ' > > This yields: > > GenBank:AL591065.17.17 > > and the place where the used Bio/Annotation/DBLink.pm resides. > > Can others repeat this? > > I have dug into the source a little and Bio::Annotation::DBLink > seems to > be the place where this happens: it has a concatenation which leads to > that repeated version number. > > It this something that I should fix "client-side", so to speak, or > is it > worthwhile to add some logic to that concatenation to prevent this? > > > Thanks, > > Eric > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From paul.boutros at utoronto.ca Tue Oct 17 19:49:52 2006 From: paul.boutros at utoronto.ca (Paul Boutros) Date: Tue, 17 Oct 2006 19:49:52 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <001401c6f21a$836f9fc0$15327e82@pyrimidine> References: <001401c6f21a$836f9fc0$15327e82@pyrimidine> Message-ID: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> Hi Chris, Yup, that's it. I installed XML::SAX::ExpatXS (make test output below). Should there be a note somewhere in the INSTALL docs saying basically what you just wrote? Or maybe it's already there somewhere and I missed it. 

Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks if
DBD::mysql can be loaded, and if not doesn't run the test. Since the
file is only one line long, here's the modified file rather than a
patch:

################################################################
BEGIN {
    # DBD::mysql is required
    eval {
        require DBD::mysql;
    };
    if ( $@ ) {
        # Load Test::More at run time: a 'use' here would execute at
        # compile time and skip the tests even when DBD::mysql is present.
        require Test::More;
        Test::More::plan( skip_all => "DBD::mysql is not installed or is installed incorrectly - skipping BioDBSeqFeature_mysql.t" );
        exit(0);
    }
}

system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 -dsn test";
################################################################

And when I run it I get:

t/BioDBSeqFeature_mysql......skipped
        all skipped: DBD::mysql is not installed or is installed
incorrectly - skipping BioDBSeqFeature_mysql.t

And for the overall make test:

All tests successful, 3 tests and 106 subtests skipped.
Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys =
164.24 CPU)

Hope this helps,
Paul

Quoting Chris Fields:

> Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
> backend parser. For some reason BLAST XML parsing doesn't work with that
> parser (it tries to verify the XML before parsing, hence the DTD
> error). I may try getting this to work again, but so far I haven't found an
> easy way to prevent XML verification via XML::SAX::Expat.
>
> There are two options: 1) install XML::SAX::ExpatXS (the better option),
> which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
> parser in the ParserDetails.ini file of your local XML::SAX installation
> to XML::SAX::PurePerl.
>
> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
> hasn't officially happened yet); the latter hasn't had significant
> development in about three years.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept.
of Biochemistry > University of Illinois Urbana-Champaign > > >> -----Original Message----- >> From: Paul Boutros [mailto:paul.boutros at utoronto.ca] >> Sent: Tuesday, October 17, 2006 1:00 PM >> To: Chris Fields >> Cc: bioperl-l at lists.open-bio.org >> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2 >> >> Hi Chris, >> >> Here it is: >> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t >> 1..1337 >> ok 1 >> >> -------------------- WARNING --------------------- >> MSG: XML::SAX::Expat not currently supported; must have local copies >> of NCBI DTD docs! >> --------------------------------------------------- >> >> -------------------- WARNING --------------------- >> MSG: error in parsing a report: >> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' >> does not exist >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >> Handler couldn't resolve external entity at line 2, column 82, byte 104 >> error in processing external entity reference at line 2, column 82, >> byte 104 at >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line >> 187 >> >> --------------------------------------------------- >> not ok 2 >> # Failed test 2 in t/SearchIO.t at line 68 >> Can't call method "database_name" on an undefined value at >> t/SearchIO.t line 69. >> >> >> Quoting Chris Fields : >> >> > What do you get when you run the SearchIO.t test by itself using 'perl - >> I. >> > t/SearchIO.t'? It looks like something pretty catastrophic happened. >> > >> > Christopher Fields >> > Postdoctoral Researcher - Switzer Lab >> > Dept. 
of Biochemistry >> > University of Illinois Urbana-Champaign >> > >> > >> >> -----Original Message----- >> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >> >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros >> >> Sent: Tuesday, October 17, 2006 11:57 AM >> >> To: bioperl-l at lists.open-bio.org >> >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 >> >> >> >> Hi, >> >> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed >> >> tests, the first seems to be just a result of me not having DBD::mysql >> >> installed. >> >> Paul >> >> >> >> Test Summary >> >> ============ >> >> >> >> Failed Test Stat Wstat Total Fail List of Failed >> >> ----------------------------------------------------------------------- >> --- >> >> ----- >> >> t/BioDBSeqFeature_mysql.t 46 46 1-46 >> >> t/SearchIO.t 22 5632 1337 2671 2-1337 >> >> 2 tests and 106 subtests skipped. >> >> Failed 2/236 test scripts. 1382/11688 subtests failed. >> >> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = >> >> 159.61 CPU) >> >> >> >> BioDBSeqFeature_mysql >> >> ===================== >> >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t >> >> 1..46 >> >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC >> >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t >> >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 >> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi >> >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at >> >> (eval 37) line 3. >> >> Perhaps the DBD::mysql perl module hasn't been fully installed, >> >> or perhaps the capitalisation of 'mysql' isn't right. >> >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. 
>> >> at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 >> >> >> >> SearchIO >> >> ======== >> >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more >> >> 1..1337 >> >> ok 1 >> >> >> >> -------------------- WARNING --------------------- >> >> MSG: XML::SAX::Expat not currently supported; must have local copies >> >> of NCBI DTD docs! >> >> --------------------------------------------------- >> >> >> >> -------------------- WARNING --------------------- >> >> MSG: error in parsing a report: >> >> >> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' >> >> does not exist >> >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >> >> Handler couldn't resolve external entity at line 2, column 82, byte 104 >> >> error in processing external entity reference at line 2, column 82, >> >> byte 104 at >> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line >> >> 187 >> >> >> >> --------------------------------------------------- >> >> not ok 2 >> >> # Failed test 2 in t/SearchIO.t at line 68 >> >> Can't call method "database_name" on an undefined value at >> >> t/SearchIO.t line 69. >> >> >> >> ------------------------------ >> >> >> >> Message: 10 >> >> Date: Tue, 17 Oct 2006 11:32:54 +0100 >> >> From: Sendu Bala >> >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2 >> >> To: bioperl-l at bioperl.org >> >> Message-ID: <4534B156.4090501 at sendu.me.uk> >> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >> >> >> >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. >> >> See http://www.bioperl.org/wiki/Release_1.5.2 for >> >> instructions on getting and testing this RC. >> >> >> >> Developers: >> >> This should be the last RC before release ~next monday. Now would >> >> be a good time for last minute documentaiton updates and additions. 
>> >> >> >> Users: >> >> Even though 1.5.2 is a 'developer' release, we consider it the most >> >> stable and capable version of Bioperl, and recommend that you use >> >> it in all but the most critical production environments. Please >> >> try it out and let us know of any problems or difficulties you run >> >> into. >> >> >> >> >> >> Thank you, >> >> Sendu. >> >> >> >> >> >> >> >> _______________________________________________ >> >> Bioperl-l mailing list >> >> Bioperl-l at lists.open-bio.org >> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > >> > >> > > > From cjfields at uiuc.edu Tue Oct 17 20:51:35 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 19:51:35 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> References: <001401c6f21a$836f9fc0$15327e82@pyrimidine> <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> Message-ID: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu> On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote: > Hi Chris, > > Yup, that's it. I installed XML::SAX::ExpatXS (make test output > below). Should there be a note somewhere in the INSTALL docs saying > basically what you just wrote? Or maybe it's already there somewhere > and I missed it. The INSTALL docs should have this, yes. I'll double-check though. Pretty much anything that plugs into XML::SAX except XML::SAX::Expat works (XML::LibXML also works, I found). > Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks > if DBD::mysql can be loaded, and if not doesn't run the test. 
Since > the file is only one-line long, here's the modified file rather than a > patch: > ################################################################ > BEGIN { > # DBD::mysql is required > eval { > require DBD::mysql; > }; > if ( $@ ) { > use Test::More skip_all => "DBD::mysql is not > installed or is installed incorrectly - skipping BioDBSeqFeature > _mysql.t"; > exit(0); > } > } > > system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 > -dsn test"; > ################################################################ > > And when I run it I get: > t/BioDBSeqFeature_mysql......skipped > all skipped: DBD::mysql is not installed or is installed > incorrectly - skipping BioDBSeqFeature_mysql.t > > And for the overall make test: > All tests successful, 3 tests and 106 subtests skipped. > Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys = > 164.24 CPU) It should check this when using 'perl Makefile.PL', since the tests are only set up if MySQL is present (so you would assume that it checks for DBD::mysql). I'll look into it. Chris > Hope this helps, > Paul > > > Quoting Chris Fields : > >> Your local copy of XML::SAX has XML::SAX::Expat set as the default >> SAX >> backend parser. For some reason BLAST XML parsing doesn't work >> with that >> parser (it tries to verify the XML first before parsing, hence the >> DTD >> error). I may try getting this to work again, but so far I >> haven't found an >> easy way to prevent XML verification via XML::SAX::Expat. >> >> There are two options: 1) install XML::SAX::ExpatXS (the better >> option), >> which works AND is 4x faster than XML::SAX::Expat, or 2) set the >> default >> parser in the PareserDetails.ini file in your local to use >> XML::SAX::PurePerl. >> >> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it >> just >> hasn't officially happened yet); the latter hasn't had significant >> development in about three years. 
>> >> Christopher Fields >> Postdoctoral Researcher - Switzer Lab >> Dept. of Biochemistry >> University of Illinois Urbana-Champaign >> >> >>> -----Original Message----- >>> From: Paul Boutros [mailto:paul.boutros at utoronto.ca] >>> Sent: Tuesday, October 17, 2006 1:00 PM >>> To: Chris Fields >>> Cc: bioperl-l at lists.open-bio.org >>> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2 >>> >>> Hi Chris, >>> >>> Here it is: >>> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t >>> 1..1337 >>> ok 1 >>> >>> -------------------- WARNING --------------------- >>> MSG: XML::SAX::Expat not currently supported; must have local copies >>> of NCBI DTD docs! >>> --------------------------------------------------- >>> >>> -------------------- WARNING --------------------- >>> MSG: error in parsing a report: >>> >>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' >>> does not exist >>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >>> Handler couldn't resolve external entity at line 2, column 82, >>> byte 104 >>> error in processing external entity reference at line 2, column 82, >>> byte 104 at >>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm >>> line >>> 187 >>> >>> --------------------------------------------------- >>> not ok 2 >>> # Failed test 2 in t/SearchIO.t at line 68 >>> Can't call method "database_name" on an undefined value at >>> t/SearchIO.t line 69. >>> >>> >>> Quoting Chris Fields : >>> >>>> What do you get when you run the SearchIO.t test by itself using >>>> 'perl - >>> I. >>>> t/SearchIO.t'? It looks like something pretty catastrophic >>>> happened. >>>> >>>> Christopher Fields >>>> Postdoctoral Researcher - Switzer Lab >>>> Dept. 
of Biochemistry >>>> University of Illinois Urbana-Champaign >>>> >>>> >>>>> -----Original Message----- >>>>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >>>>> bounces at lists.open-bio.org] On Behalf Of Paul Boutros >>>>> Sent: Tuesday, October 17, 2006 11:57 AM >>>>> To: bioperl-l at lists.open-bio.org >>>>> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 >>>>> >>>>> Hi, >>>>> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two >>>>> failed >>>>> tests, the first seems to be just a result of me not having >>>>> DBD::mysql >>>>> installed. >>>>> Paul >>>>> >>>>> Test Summary >>>>> ============ >>>>> >>>>> Failed Test Stat Wstat Total Fail List of Failed >>>>> ------------------------------------------------------------------ >>>>> ----- >>> --- >>>>> ----- >>>>> t/BioDBSeqFeature_mysql.t 46 46 1-46 >>>>> t/SearchIO.t 22 5632 1337 2671 2-1337 >>>>> 2 tests and 106 subtests skipped. >>>>> Failed 2/236 test scripts. 1382/11688 subtests failed. >>>>> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 >>>>> csys = >>>>> 159.61 CPU) >>>>> >>>>> BioDBSeqFeature_mysql >>>>> ===================== >>>>> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t >>>>> 1..46 >>>>> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC >>>>> (@INC >>>>> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t >>>>> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 >>>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi >>>>> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/ >>>>> site_perl) at >>>>> (eval 37) line 3. >>>>> Perhaps the DBD::mysql perl module hasn't been fully installed, >>>>> or perhaps the capitalisation of 'mysql' isn't right. >>>>> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. 
>>>>> at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 >>>>> >>>>> SearchIO >>>>> ======== >>>>> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more >>>>> 1..1337 >>>>> ok 1 >>>>> >>>>> -------------------- WARNING --------------------- >>>>> MSG: XML::SAX::Expat not currently supported; must have local >>>>> copies >>>>> of NCBI DTD docs! >>>>> --------------------------------------------------- >>>>> >>>>> -------------------- WARNING --------------------- >>>>> MSG: error in parsing a report: >>>>> >>>>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/ >>>>> NCBI_BlastOutput.dtd' >>>>> does not exist >>>>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >>>>> Handler couldn't resolve external entity at line 2, column 82, >>>>> byte 104 >>>>> error in processing external entity reference at line 2, column >>>>> 82, >>>>> byte 104 at >>>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/ >>>>> Parser.pm line >>>>> 187 >>>>> >>>>> --------------------------------------------------- >>>>> not ok 2 >>>>> # Failed test 2 in t/SearchIO.t at line 68 >>>>> Can't call method "database_name" on an undefined value at >>>>> t/SearchIO.t line 69. >>>>> >>>>> ------------------------------ >>>>> >>>>> Message: 10 >>>>> Date: Tue, 17 Oct 2006 11:32:54 +0100 >>>>> From: Sendu Bala >>>>> Subject: [Bioperl-l] Bioperl 1.5.2 RC2 >>>>> To: bioperl-l at bioperl.org >>>>> Message-ID: <4534B156.4090501 at sendu.me.uk> >>>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >>>>> >>>>> Bioperl 1.5.2 Release Candidate 2 is ready and available for >>>>> testing. >>>>> See http://www.bioperl.org/wiki/Release_1.5.2 for >>>>> instructions on getting and testing this RC. >>>>> >>>>> Developers: >>>>> This should be the last RC before release ~next monday. Now >>>>> would >>>>> be a good time for last minute documentaiton updates and >>>>> additions. 
>>>>>
>>>>> Users:
>>>>> Even though 1.5.2 is a 'developer' release, we consider it the most
>>>>> stable and capable version of Bioperl, and recommend that you use
>>>>> it in all but the most critical production environments. Please
>>>>> try it out and let us know of any problems or difficulties you run
>>>>> into.
>>>>>
>>>>> Thank you,
>>>>> Sendu.
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From bix at sendu.me.uk Wed Oct 18 02:52:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 07:52:05 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4535CF15.4090502@sendu.me.uk>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
> See http://www.bioperl.org/wiki/Release_1.5.2 for
> instructions on getting and testing this RC.
>
> Developers:
> This should be the last RC before release ~next Monday. Now would
> be a good time for last minute documentation updates and additions.

Given the few issues that have come up, it would be prudent to have
another RC, so expect one around the time the 'Needs investigation'
issues on the release page have been solved. If you think there are more
things that need investigation, please add them, but note the bias
toward things that affect the successful completion of the test suite,
as opposed to general bugs, which should go to Bugzilla as normal.
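
The BioDBSeqFeature_mysql failure reported earlier in this thread (all 46
tests failing because DBD::mysql could not be located) is the sort of case
where a test script could skip instead of fail. Below is a minimal sketch
of that idea using a Test::More SKIP block. It is an illustration only and
is not taken from the Bioperl test suite: can_load() is a hypothetical
helper, and the two checks inside the block are placeholders.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical helper: true if $module can be loaded, false otherwise.
# It never dies, so a missing optional module cannot abort the script.
sub can_load {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# A test that does not depend on any optional module always runs.
ok( can_load('File::Spec'), 'core module File::Spec loads' );

SKIP: {
    # Skip (rather than fail) the two mysql-only tests when the driver
    # is absent; the reason is printed in the test output.
    skip 'DBD::mysql not installed', 2 unless can_load('DBD::mysql');

    # Placeholders standing in for real database tests.
    pass('first mysql-only check would go here');
    pass('second mysql-only check would go here');
}
```

With this layout the plan stays at three tests whether or not the driver
is present, and the skip reason shows up when the script is run directly
with perl.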
From bix at sendu.me.uk Wed Oct 18 04:55:21 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 09:55:21 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <45350BA6.3040102@genomics.dk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> Message-ID: <4535EBF9.1090706@sendu.me.uk> Niels Larsen wrote: > ------------ EBI > > I invoked the EBI script > > http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip > > like this > > WSWUBlastClient.pl -p blastn -D embl test.fasta > > where the content of test.fasta is below, and got > > Can't find method element in the message at > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. As you admit, this is not a Bioperl issue. I would suggest you contact EBI support. In the mean time/alternatively I'd suggest investigating the Bioperl interface to the SOAP server, which is part of the Bioperl-run package. http://doc.bioperl.org/releases/bioperl-current/bioperl-run/Bio/Tools/Run/Analysis.html > ------------ DDBJ > > Inspired by this page, > > http://xml.nig.ac.jp/doc/Blast.txt > > I made this test script [snip] > which for me prints undef. Again, not something I can really help you with. You'll need to triple-check your code and then seek support from the providers of that SOAP service. > ------------- NCBI/Bioperl > > I installed 1.5.2-RC2, looked at the RemoteBlast example in > > http://www.bioperl.org/wiki/Bptutorial.pl > > and then put that into this test code, more or less cut/paste, [snip] > Maybe I am supposed to add a check for content in $rc and then stop > the inner loop? Yes, the wiki page example isn't really adequate. I'll update it. 
For a better code example see the RemoteBlast documentation:
http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html

> I could figure that out maybe, but I wish there was a
> function which simply takes a single sequence + arguments and only
> returns a list of matches when done, and does not return until then
> (or until a specified timeout).

Yes, I hardly find dealing with RIDs that pleasant. You might like to
add a feature request to Bugzilla.

From n.haigh at sheffield.ac.uk Wed Oct 18 05:58:00 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 10:58:00 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
Message-ID: <4535FAA8.2050506@sheffield.ac.uk>

I get all tests passing except for BioDBSeqFeature_mysql which fails all
tests (1-46).

During perl Makefile.PL I get:
"I see you have Berkeleydb installed. I will create the DBD tests for
Bio::DB::SeqFeature::Store..."

I notice under the "needs investigation" there is mention about tests
being generated even if DBD::mysql isn't installed. I assume this is the
problem? If this is the problem should DBD::mysql be added to the
dependencies in Makefile.PL?

Is there an easy way to find out what tests are being skipped due to
absent modules?

Cheers
Nath

From n.haigh at sheffield.ac.uk Wed Oct 18 07:34:21 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 12:34:21 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4535EBF9.1090706@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk>
Message-ID: <4536113D.1080307@sheffield.ac.uk>

I've just added test results for 1.5.2 RC2 to the wiki.

There are lots of fails for packages other than bioperl-live. I'm not
sure exactly how the test fails/skips are/should be handled since my
setups are as follows.
Clean WinXP Pro: This is a clean install of WinXP Pro SP2 with no major software installed, other than ActivePerl 5.8.8.819 and a few tools for archive extracting, anti virus etc. Therefore, I'm unsure how tests in bioperl-network and bioperl-db should return. For example, I have made no effort to setup biosql-schema but I thought that maybe there would be a test that would detect this, and fail, then skip over other tests gracefully - like the bioperl-run tests when a piece of software is not installed??? Debian Linux: This is a Bio-Linux machine with quite a lot of bioinformatics software installed in the Path. So most of the tests in bioperl-run should probably have passed. The same goes for bioperl-network and bioperl-db as with my Windows setup. If my thoughts are totally wrong - let me know! Nath From bix at sendu.me.uk Wed Oct 18 08:03:11 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 13:03:11 +0100 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <4535FAA8.2050506@sheffield.ac.uk> References: <4535FAA8.2050506@sheffield.ac.uk> Message-ID: <453617FF.9080508@sendu.me.uk> Nathan Haigh wrote: > I get all tests passing except for BioDBSeqFeature_mysql which fails all > tests (1-46). > > During perl Makefile.PL I get: > "I see you have Berkeleydb installed. I will create the DBD tests for > Bio::DB::SeqFeature::Store..." > > I notice under the "needs investigation" there is mention about tests > been generated even if DBD::mysql isn't installed. I assume this is the > problem? Probably. I'm looking into it. Not sure why it wasn't causing a problem before now. > If this is the problem should DBD::mysql be added to the > dependencies in Makefile.PL? No. You can use the modules in question without mysql (presumably; ie. you have a different sql setup), so it makes no sense to warn people they don't have a module they absolutely do not need. > Is there an easy way to find out what tests are being skipped due to > absent modules? 
Ideally, when the skip occurs the test script will issue a message. I
think that happens in most, if not all cases.

From bix at sendu.me.uk Wed Oct 18 09:02:50 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 14:02:50 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <453617FF.9080508@sendu.me.uk>
References: <4535FAA8.2050506@sheffield.ac.uk> <453617FF.9080508@sendu.me.uk>
Message-ID: <453625FA.6090907@sendu.me.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> I notice under the "needs investigation" there is mention about tests
>> being generated even if DBD::mysql isn't installed. I assume this is the
>> problem?
>
> Probably. I'm looking into it. Not sure why it wasn't causing a problem
> before now.
>
> > If this is the problem should DBD::mysql be added to the
> > dependencies in Makefile.PL?
>
> No. You can use the modules in question without mysql (presumably; i.e.
> you have a different sql setup), so it makes no sense to warn people
> they don't have a module they absolutely do not need.

Oops. It /is/ in the pre-reqs in Makefile.PL. Maybe DBD::mysql is the
only supported driver?

From bix at sendu.me.uk Wed Oct 18 09:16:24 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 14:16:24 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine> <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>
Message-ID: <45362928.8070104@sendu.me.uk>

Chris Fields wrote:
> On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote:
>
>> Hi Chris,
>>
>> Yup, that's it. I installed XML::SAX::ExpatXS (make test output
>> below). Should there be a note somewhere in the INSTALL docs saying
>> basically what you just wrote? Or maybe it's already there somewhere
>> and I missed it.
>
> The INSTALL docs should have this, yes. I'll double-check though.
>
> Pretty much anything that plugs into XML::SAX except XML::SAX::Expat
> works (XML::LibXML also works, I found).
>
>> Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
>> if DBD::mysql can be loaded,
[snip]
> It should check this when using 'perl Makefile.PL', since the tests
> are only set up if MySQL is present (so you would assume that it
> checks for DBD::mysql). I'll look into it.

This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in
my t directory when I packed it up for release.

I'm tweaking Makefile.PL right now in any case; there are a few errors
and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean.

From cjfields at uiuc.edu Wed Oct 18 09:55:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 08:55:37 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ
Message-ID: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>

Ding dong the witch is dead! As announced previously, from the latest
GenBank release (156.0):

-----------------------------------------------
1.3.8 Feature location syntax X.Y no longer supported

The Feature Table has supported feature locations of the form 'X.Y',
to represent a base position which is greater or equal to X, and less
than or equal to Y. For example:

    misc_feature    1.10..20
    misc_feature    join(100..150,200.210..250)

In the first example, the misc_feature starts somewhere between bases
1 and 10 (inclusive), and ends at basepair 20. In the second, the 51
bases from 100..150 are joined together with a second basepair interval,
which could be anywhere from 200..250 to 210..250 .

Although this syntax seems like a reasonable way to capture an uncertain
interval, it is used for features on a vanishingly small number of
sequence records, most database submission mechanisms don't support it,
and the meaning of its use in a join() context is not entirely clear.

As of October 2006, this type of location is no longer supported.
Those records with features which utilize X.Y locations will be reviewed and converted to a non-uncertain format. ----------------------------------------------- EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. Not sure about UniProt/SwissProt. I guess we're keeping this in for backwards compatibility only, but how do we handle any bugs that pop up related to this? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Oct 18 10:10:07 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 09:10:07 -0500 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <453617FF.9080508@sendu.me.uk> Message-ID: <001f01c6f2bf$20737270$15327e82@pyrimidine> > Nathan Haigh wrote: > > I get all tests passing except for BioDBSeqFeature_mysql which fails all > > tests (1-46). > > > > During perl Makefile.PL I get: > > "I see you have Berkeleydb installed. I will create the DBD tests for > > Bio::DB::SeqFeature::Store..." > > > > I notice under the "needs investigation" there is mention about tests > > been generated even if DBD::mysql isn't installed. I assume this is the > > problem? > > Probably. I'm looking into it. Not sure why it wasn't causing a problem > before now. Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP because 'perl Makefile.PL' doesn't detect my MySQL installation, so the MySQL-based tests don't run even though I have DBD::mysql installed. I thought this might just be a WinXP issue, but apparently not. If I can get to it I'll run a few checks. > > If this is the problem should DBD::mysql be added to the > > dependencies in Makefile.PL? > > No. You can use the modules in question without mysql (presumably; ie. > you have a different sql setup), so it makes no sense to warn people > they don't have a module they absolutely do not need. 
Agreed, though I don't know if other relational DB's are supported like PostgreSQL. > > Is there an easy way to find out what tests are being skipped due to > > absent modules? > > Ideally, when the skip occurs the test script will issue a message. I > think that happens in most, if not all cases. Yes, though we may run into the same issue we had with XEMBL tests not reporting the reasons it skipped. Each test suite should run an eval{} to check the required modules, then only skip blocks of tests that rely on those modules. I think we have caught most of those, but who knows w/o doing a complete test suite audit? Our eventual complete switchover to Test::More should hopefully clean these up. I don't consider it a pressing issue for this release, though Sendu may feel differently. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Oct 18 10:12:52 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 09:12:52 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <45362928.8070104@sendu.me.uk> Message-ID: <002001c6f2bf$807849c0$15327e82@pyrimidine> ... > This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in > my t directory when I packed it up for release. > > I'm tweaking Makefile.PL right now in any case; there are a few errors > and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean. Okay, makes sense now. No big deal, it's still an RC (a developer's RC at that!). Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Oct 18 10:17:35 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 18 Oct 2006 15:17:35 +0100 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <001f01c6f2bf$20737270$15327e82@pyrimidine> References: <001f01c6f2bf$20737270$15327e82@pyrimidine> Message-ID: <4536377F.6000408@sheffield.ac.uk> Chris Fields wrote: >> Nathan Haigh wrote: >> >>> I get all tests passing except for BioDBSeqFeature_mysql which fails all >>> tests (1-46). >>> >>> During perl Makefile.PL I get: >>> "I see you have Berkeleydb installed. I will create the DBD tests for >>> Bio::DB::SeqFeature::Store..." >>> >>> I notice under the "needs investigation" there is mention about tests >>> been generated even if DBD::mysql isn't installed. I assume this is the >>> problem? >>> >> Probably. I'm looking into it. Not sure why it wasn't causing a problem >> before now. >> > > Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP > because 'perl Makefile.PL' doesn't detect my MySQL installation, so the > MySQL-based tests don't run even though I have DBD::mysql installed. I > thought this might just be a WinXP issue, but apparently not. If I can get > to it I'll run a few checks. > > This was on WinXP. >> > If this is the problem should DBD::mysql be added to the >> > dependencies in Makefile.PL? >> >> No. You can use the modules in question without mysql (presumably; ie. >> you have a different sql setup), so it makes no sense to warn people >> they don't have a module they absolutely do not need. >> > > Agreed, though I don't know if other relational DB's are supported like > PostgreSQL. > > >>> Is there an easy way to find out what tests are being skipped due to >>> absent modules? >>> >> Ideally, when the skip occurs the test script will issue a message. I >> think that happens in most, if not all cases. 
>> > > Yes, though we may run into the same issue we had with XEMBL tests not > reporting the reasons it skipped. Each test suite should run an eval{} to > check the required modules, then only skip blocks of tests that rely on > those modules. I think we have caught most of those, but who knows w/o > doing a complete test suite audit? > > Our eventual complete switchover to Test::More should hopefully clean these > up. I don't consider it a pressing issue for this release, though Sendu may > feel differently. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > From hlapp at gmx.net Wed Oct 18 10:36:31 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 18 Oct 2006 10:36:31 -0400 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine> References: <001e01c6f2bd$1769aca0$15327e82@pyrimidine> Message-ID: On Oct 18, 2006, at 9:55 AM, Chris Fields wrote: > how do we handle any bugs that pop up related to this? By an evil grin, followed by deflecting the blame to NCBI, followed by another evil grin. -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Oct 18 10:43:31 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 09:43:31 -0500 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: Message-ID: <002401c6f2c3$c83c7e30$15327e82@pyrimidine> > On Oct 18, 2006, at 9:55 AM, Chris Fields wrote: > > > how do we handle any bugs that pop up related to this? > > By an evil grin, followed by deflecting the blame to NCBI, followed > by another evil grin. 
> -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== Sounds good to me! One less thing to worry about. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Oct 18 10:45:57 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 18 Oct 2006 15:45:57 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4536113D.1080307@sheffield.ac.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> Message-ID: <45363E25.8010806@sheffield.ac.uk> Nathan Haigh wrote: > I've just added test results for 1.5.2 RC2 to the wiki. > > There are lots of fails for packages other than bioperl-live. I'm not > sure excatly how the test fails/skipps are/should be handled since my > setups are as follows. > > Clean WinXP Pro: > This is a clean install of WinXP Pro SP2 with no major software > installed, other than ActivePerl 5.8.8.819 and a few tools for archive > extracting, anti virus etc. Therefore, I'm unsure how tests in > bioperl-network and bioperl-db should return. For example, I have made > no effort to setup biosql-schema but I thought that maybe there would be > a test that would detect this, and fail, then skip over other tests > gracefully - like the bioperl-run tests when a piece of software is not > installed??? > > Debian Linux: > This is a Bio-Linux machine with quite a lot of bioinformatics software > installed in the Path. So most of the tests in bioperl-run should > probably have passed. The same goes for bioperl-network and bioperl-db > as with my Windows setup. > > If my thoughts are totally wrong - let me know! 
> Nath

Just looking into the failed Linux tests. Several of the tests result in
errors like:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Alignment::Exonerate::AUTOLOAD /home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:126
STACK: Bio::Tools::Run::Alignment::Exonerate::new /home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:154
STACK: t/Exonerate.t:32
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: 'arguments' !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Hmmer::AUTOLOAD Bio/Tools/Run/Hmmer.pm:172
STACK: Bio::Tools::Run::Hmmer::_run Bio/Tools/Run/Hmmer.pm:253
STACK: Bio::Tools::Run::Hmmer::run Bio/Tools/Run/Hmmer.pm:228
STACK: t/Hmmer.t:54
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Phrap::AUTOLOAD Bio/Tools/Run/Phrap.pm:137
STACK: Bio::Tools::Run::Phrap::new Bio/Tools/Run/Phrap.pm:165
STACK: t/Phrap.t:34
-----------------------------------------------------------

Any ideas??
Nath From hlapp at gmx.net Wed Oct 18 10:51:36 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 18 Oct 2006 10:51:36 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4536113D.1080307@sheffield.ac.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> Message-ID: On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote: > For example, I have made > no effort to setup biosql-schema but I thought that maybe there > would be > a test that would detect this I'm afraid there isn't. Bioperl-db is meaningless without biosql-schema. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bosborne11 at verizon.net Wed Oct 18 10:43:06 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 18 Oct 2006 10:43:06 -0400 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine> Message-ID: Chris, I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all of the more recent examples in t/LocationFactory.t come from there. Brian O. On 10/18/06 9:55 AM, "Chris Fields" wrote: > EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. > Not sure about UniProt/SwissProt. From cjfields at uiuc.edu Wed Oct 18 11:00:30 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 10:00:30 -0500 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: Message-ID: <002501c6f2c6$27625540$15327e82@pyrimidine> Do they still use the X.Y notations? Those are the most troublesome. I guess we still don't support the ones containing '?'. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: Brian Osborne [mailto:bosborne11 at verizon.net] > Sent: Wednesday, October 18, 2006 9:43 AM > To: Chris Fields; bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in > GenBank/EMBL/DDBJ > > Chris, > > I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all > of the more recent examples in t/LocationFactory.t come from there. > > Brian O. > > > On 10/18/06 9:55 AM, "Chris Fields" wrote: > > > EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. > > Not sure about UniProt/SwissProt. From Kevin.M.Brown at asu.edu Wed Oct 18 11:16:50 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Wed, 18 Oct 2006 08:16:50 -0700 Subject: [Bioperl-l] Blast information Message-ID: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu> I just recently upgraded to 1.5.1 on WinXP to bring this version closer to live to parse some locally created blast files. I'm trying to find the method that returns the values that are underneath the Identities and Positives information as I'm trying to replicate the output of an old blast parser we have here written in RealBasic which is showing its age. Once I have it replicating the old output I then intend to add more features in terms of filtering returned hits (like not returning self->self hits or a->b so don't show b->a). Example: I'm looking for the methods that will return 117 from identities and 117 from positives. I can't just use num_identical/percent_identity as that isn't 100% accurate. 
>BurkM_2016 Length = 241 Score = 43.2 bits (88), Expect = 7e-005 Identities = 26/117 (22%), Positives = 51/117 (43%) Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL 357 Q F F + A+ ++ + + + L +R GL + P E + A+L Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL 170 Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 Thanks, Kevin From cjfields at uiuc.edu Wed Oct 18 11:25:59 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 10:25:59 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4536113D.1080307@sheffield.ac.uk> Message-ID: <002601c6f2c9$b6d04a90$15327e82@pyrimidine> > I've just added test results for 1.5.2 RC2 to the wiki. > > There are lots of fails for packages other than bioperl-live. I'm not > sure excatly how the test fails/skipps are/should be handled since my > setups are as follows. > > Clean WinXP Pro: > This is a clean install of WinXP Pro SP2 with no major software > installed, other than ActivePerl 5.8.8.819 and a few tools for archive > extracting, anti virus etc. Therefore, I'm unsure how tests in > bioperl-network and bioperl-db should return. For example, I have made > no effort to setup biosql-schema but I thought that maybe there would be > a test that would detect this, and fail, then skip over other tests > gracefully - like the bioperl-run tests when a piece of software is not > installed??? > > Debian Linux: > This is a Bio-Linux machine with quite a lot of bioinformatics software > installed in the Path. So most of the tests in bioperl-run should > probably have passed. The same goes for bioperl-network and bioperl-db > as with my Windows setup. > > If my thoughts are totally wrong - let me know! 
> Nath

The bioperl-db tests rely on a local BioSQL database and on having a
properly set up configuration file (these are detailed in the bioperl-db
INSTALL doc). Furthermore, there are serious problems with bioperl-db and
WinXP (see Bug 1938 in Bugzilla). There is a workaround, but it isn't
perfect by any means.

http://bugzilla.open-bio.org/show_bug.cgi?id=1938

Many of the bioperl-run tests rely on environment variables being set
properly, so maybe that's why they failed. These should all be detailed in
the INSTALL file (but maybe they aren't?).

I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac
OS X yet but intend to do this within the week. The INSTALL file details
the requirements for the packages (Graph 0.80 is the only one for
bioperl-network, for instance, and there isn't a PPM for that version
available yet).

It would be nice to skip the tests based on absence of the particular
modules or installed programs, and I think the final goal is to attempt to
do this. However, all of the bioperl-related distributions have their own
documentation which outlines their installation, requirements, and use. At
least we can point to that, which works for now. We could always start up a
wiki page for the various bioperl distributions to monitor problems or
issues with each based on OS, proposed enhancements/ideas, etc.

Also, most (if not all, including core) have been primarily tested on some
*nix-related system, which means that they may not work on Win32 systems.
Though the Windows support is light-years ahead of what it used to be circa
rel 0.7, I don't think it is foolproof yet, as witnessed by the bioperl-db
bug. Frankly, we need more WinXP users for those packages willing to test
them out and offer suggestions.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry
University of Illinois Urbana-Champaign

From bosborne11 at verizon.net Wed Oct 18 11:13:51 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 11:13:51 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ
In-Reply-To: <002501c6f2c6$27625540$15327e82@pyrimidine>
Message-ID:

Chris,

No, I don't think they use the form X.Y. See below, from
t/LocationFactory.t, we do support most of the forms using ?. Supposedly
these tests accommodate all of the possible fuzzy locations encountered in
Swissprot, I wrote these a year or so ago.

Brian O.

    # UNCERTAIN locations and positions (Swissprot)
    "?2465..2774" => [$fuzzy_impl,
        2465, 2465, "UNCERTAIN", 2774, 2774, "EXACT", "EXACT", 1, 1],
    "22..?64" => [$fuzzy_impl,
        22, 22, "EXACT", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
    "?22..?64" => [$fuzzy_impl,
        22, 22, "UNCERTAIN", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
    "?..>393" => [$fuzzy_impl,
        undef, undef, "UNCERTAIN", 393, undef, "AFTER", "UNCERTAIN", 1, 1],
    "<1..?" => [$fuzzy_impl,
        undef, 1, "BEFORE", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
    "?..536" => [$fuzzy_impl,
        undef, undef, "UNCERTAIN", 536, 536, "EXACT", "UNCERTAIN", 1, 1],
    "1..?" => [$fuzzy_impl,
        1, 1, "EXACT", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
    "?..?" => [$fuzzy_impl,
        undef, undef, "UNCERTAIN", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
    # Not working yet:
    #"12..?1" => [$fuzzy_impl,
    #    1, 1, "UNCERTAIN", 12, 12, "EXACT", "EXACT", 1, 1]

On 10/18/06 11:00 AM, "Chris Fields" wrote:

> Do they still use the X.Y notations? Those are the most troublesome. I
> guess we still don't support the ones containing '?'.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept.
of Biochemistry
> University of Illinois Urbana-Champaign
>
>> -----Original Message-----
>> From: Brian Osborne [mailto:bosborne11 at verizon.net]
>> Sent: Wednesday, October 18, 2006 9:43 AM
>> To: Chris Fields; bioperl-l at lists.open-bio.org
>> Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in
>> GenBank/EMBL/DDBJ
>>
>> Chris,
>>
>> I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all
>> of the more recent examples in t/LocationFactory.t come from there.
>>
>> Brian O.
>>
>> On 10/18/06 9:55 AM, "Chris Fields" wrote:
>>
>>> EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
>>> Not sure about UniProt/SwissProt.

From cjfields at uiuc.edu Wed Oct 18 12:56:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 11:56:07 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <002601c6f2c9$b6d04a90$15327e82@pyrimidine>
Message-ID: <000401c6f2d6$5144e2f0$15327e82@pyrimidine>

...
> I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac
> OS X yet but intended on doing this within the week. The INSTALL file
> details the requirements for the packages (Graph 0.80 is the only one for
> bioperl-network, for instance, and there isn't a PPM for that version
> available yet).
...

All,

As a followup on this, I tried bioperl-network and had similar failed tests
with Graph 0.79 (the only PPM available from ActiveState). However, the
INSTALL docs state that Graph 0.80 is needed, and the test run gave several
warnings about not having Graph 0.80 installed. I made a PPM of Graph 0.80,
installed it, retried the bioperl-network tests, and everything passed.
Maybe we need to have a Graph PPM available for those who want
bioperl-network?
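
A version requirement like the one above (Graph 0.80 rather than 0.79) can
be checked in Perl before any tests run. What follows is only a rough
sketch of such a check, not code from bioperl-network: has_min_version()
is a hypothetical helper, and it relies on the standard VERSION method
that every Perl package inherits, which dies when the installed version
is lower than the one requested.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $module loads and reports a version of at
# least $min_version; false if it is missing or too old. Never dies.
sub has_min_version {
    my ($module, $min_version) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval {
        require $file;
        $module->VERSION($min_version);   # dies if installed version is too old
        1;
    } ? 1 : 0;
}

if ( has_min_version('Graph', '0.80') ) {
    print "Graph 0.80 or later found\n";
}
else {
    print "Graph 0.80 or later not found; Graph-dependent tests should be skipped\n";
}
```

A test script could use the result to skip its Graph-dependent tests with
a clear message instead of failing on an older installed version.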
As for bioperl-run, all tests passed from a new CVS checkout even though I have none of the programs installed, so they seem to skip properly. The test run also printed warnings when a program wasn't available or installed.

Chris

From bosborne11 at verizon.net Wed Oct 18 13:10:34 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 13:10:34 -0400
Subject: [Bioperl-l] Blast information
In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
Message-ID:

Kevin,

Are you looking for hsp_length()? See the SearchIO HOWTO for a list of methods:

http://www.bioperl.org/wiki/HOWTO:SearchIO

Brian O.

On 10/18/06 11:16 AM, "Kevin Brown" wrote:

> I just recently upgraded to 1.5.1 on WinXP to bring this version closer
> to live to parse some locally created blast files. I'm trying to find
> the method that returns the values that are underneath the Identities
> and Positives information as I'm trying to replicate the output of an
> old blast parser we have here written in RealBasic which is showing its
> age. Once I have it replicating the old output I then intend to add
> more features in terms of filtering returned hits (like not returning
> self->self hits or a->b so don't show b->a).
>
> Example:
> I'm looking for the methods that will return 117 from identities and 117
> from positives. I can't just use num_identical/percent_identity as that
> isn't 100% accurate.
> >> BurkM_2016 > Length = 241 > > Score = 43.2 bits (88), Expect = 7e-005 > Identities = 26/117 (22%), Positives = 51/117 (43%) > > Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL > 357 > Q F F + A+ ++ + + + L +R GL + P E + A+L > Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL > 170 > > Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 > A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L > Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 > > Thanks, > Kevin > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From Kevin.M.Brown at asu.edu Wed Oct 18 17:25:48 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Wed, 18 Oct 2006 14:25:48 -0700 Subject: [Bioperl-l] Blast information Message-ID: <1A4207F8295607498283FE9E93B775B4022A71C3@EX02.asurite.ad.asu.edu> Yes, that does indeed look like what I was after. > -----Original Message----- > From: Brian Osborne [mailto:bosborne11 at verizon.net] > Sent: Wednesday, October 18, 2006 10:11 AM > To: Kevin Brown; bioperl-l > Subject: Re: [Bioperl-l] Blast information > > Kevin, > > Are you looking for hsp_length()? See the SearchIO HOWTO for a list of > methods: > > http://www.bioperl.org/wiki/HOWTO:SearchIO > > > Brian O. > > > On 10/18/06 11:16 AM, "Kevin Brown" wrote: > > > I just recently upgraded to 1.5.1 on WinXP to bring this > version closer > > to live to parse some locally created blast files. I'm > trying to find > > the method that returns the values that are underneath the > Identities > > and Positives information as I'm trying to replicate the > output of an > > old blast parser we have here written in RealBasic which is > showing its > > age. 
Once I have it replicating the old output I then intend to add > > more features in terms of filtering returned hits (like not > returning > > self->self hits or a->b so don't show b->a). > > > > Example: > > I'm looking for the methods that will return 117 from > identities and 117 > > from positives. I can't just use > num_identical/percent_identity as that > > isn't 100% accurate. > > > >> BurkM_2016 > > Length = 241 > > > > Score = 43.2 bits (88), Expect = 7e-005 > > Identities = 26/117 (22%), Positives = 51/117 (43%) > > > > Query: 298 > QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL > > 357 > > Q F F + A+ ++ + + + L +R GL + > P E + A+L > > Sbjct: 111 > QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL > > 170 > > > > Query: 358 > MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 > > A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L > > Sbjct: 171 > KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 > > > > Thanks, > > Kevin > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > From n.appleby at uq.edu.au Wed Oct 18 17:58:06 2006 From: n.appleby at uq.edu.au (Nikki Appleby) Date: Thu, 19 Oct 2006 07:58:06 +1000 Subject: [Bioperl-l] CONTIG dealing Message-ID: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD> I have just entered the wonderful new world of BioPerl, so the answer to my question may be obvious to any of the gurus reading this. I need to collect sequence features and ontology annotations. Here goes. I am retrieving sequences from SwissProt via Bio::DB::SwissProt and get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into an RDBMS format that I am happy with I can get at the xref ids. In this case, they are AP003451; BAB86144.1; -; Genomic_DNA. AP008207; BAF07116.1; -; Genomic_DNA. AB103395; BAC81207.1; -; mRNA. 
I can happily go off and fetch those from Bio::DB::GenBank (first column), and Bio::DB::GenPept (second). All good, except...

AP008207 is a contig. I don't want to get all of the features for the entire thing, just the single contig that actually matches the original sequence. It takes a couple of hours to get at it and then it gives me way too much.

I will come across this problem with other sequences. How do I (a) find out if it is a contig without downloading it in its entirety and (b) extract the list of sequences that are about to be contigged together?

I have searched the web for answers, including this list, but see nothing. Help!

Nikki Appleby.

From bosborne11 at verizon.net Wed Oct 18 20:54:04 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 20:54:04 -0400
Subject: [Bioperl-l] LocatableSeq object vs Sequence Object
In-Reply-To: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com>
Message-ID:

Peter,

I'm not understanding your question, partly because your letter and your code are saying different things. You say you want to call location_from_column() but your code shows you calling species(). What happens when you call location_from_column? Do you see errors?

Brian O.

On 10/17/06 12:26 PM, "Peter H. Baenziger" wrote:

> I was thinking I could use:
> foreach $seq ($alignment->each_seq())
> to loop through the sequences and call:
> $seq->location_from_column($pos)
> on each of the sequences.

From cjfields at uiuc.edu Wed Oct 18 22:46:14 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 21:46:14 -0500
Subject: [Bioperl-l] CONTIG dealing
In-Reply-To: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>
References: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>
Message-ID:

On Oct 18, 2006, at 4:58 PM, Nikki Appleby wrote:

>
> I have just entered the wonderful new world of BioPerl, so the answer to my
> question may be obvious to any of the gurus reading this.
>
> I need to collect sequence features and ontology annotations. Here goes.
>
> I am retrieving sequences from SwissProt via Bio::DB::SwissProt and
> get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into an RDBMS
> format that I am happy with I can get at the xref ids. In this case, they
> are
>
> AP003451; BAB86144.1; -; Genomic_DNA.
> AP008207; BAF07116.1; -; Genomic_DNA.
> AB103395; BAC81207.1; -; mRNA.
>
> I can happily go off and fetch those from Bio::DB::GenBank (first column),
> and Bio::DB::GenPept (second). All good, except...
>
> AP008207 is a contig. I don't want to get all of the features for the entire
> thing, just the single contig that actually matches the original sequence.
> It takes a couple of hours to get at it and then it gives me way too much.
>
> I will come across this problem with other sequences. How do I (a) find out
> if it is a contig without downloading it in its entirety and (b) extract
> the list of sequences that are about to be contigged together.
>
> I have searched the web for answers, including this list, but see nothing.
> Help!
>
> Nikki Appleby.

The default setting for the retrieval format for GenBank is 'gbwithparts' (which gets the full sequence at all times). You can set this to 'gb' using request_format() to retrieve the sequence file with the contig information instead of the sequence, if it contains such (otherwise it just retrieves the sequence anyway).

However, I have noticed this particular file does not represent a true contig record but is the entire chromosome sequence. The contig information is in the comments section, probably b/c the record is converted over. You could just download the sequence record and run a regexp to grab the comments section, then parse out the contigs (a pain) if you really want that. Or you could try to find the equivalent GenBank record, such as the ones derived from the WGS records.
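
A minimal sketch of the request_format() suggestion above, assuming BioPerl is installed and NCBI is reachable. The helper name fetch_gb_record and the idea of running it from the command line are illustrative only, and the 'contig' annotation key is an assumption about how the GenBank parser stores a CONTIG line, not something confirmed in this thread:

```perl
use strict;
use warnings;

# Hypothetical helper: fetch a GenBank record in 'gb' format rather than
# the default 'gbwithparts', so that a CONTIG-style record comes back
# without the assembled sequence. BioPerl is loaded only when the helper
# is called, which needs network access.
sub fetch_gb_record {
    my ($accession) = @_;
    require Bio::DB::GenBank;
    my $db = Bio::DB::GenBank->new;
    $db->request_format('gb');
    return $db->get_Seq_by_acc($accession);
}

# Only runs when an accession is given, e.g.: perl fetch_gb.pl AP008207
if (@ARGV) {
    my $seq = fetch_gb_record( $ARGV[0] );
    print $seq->accession_number, "\n";

    # Assumption: a CONTIG line, if present, is kept as a simple-value
    # annotation under the 'contig' key.
    for my $ann ( $seq->annotation->get_Annotations('contig') ) {
        print "CONTIG: ", $ann->value, "\n";
    }
}
```

If nothing is printed after the accession, the record either has no CONTIG line or the parser stores it elsewhere, which fits the caveat about this particular record.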
I did notice the list of dbxrefs in your swissprot record indicates three EMBL sequences. If the order is consistent for the SwissProt entries you want, they probably represent:

The contig (what you want):
AP003451; BAB86144.1; -; Genomic_DNA.

The supercontig (chromosome):
AP008207; BAF07116.1; -; Genomic_DNA.

The cDNA:
AB103395; BAC81207.1; -; mRNA.

I checked the first one (AP003451), which seems to confirm this. Since the chromosome supercontig is built from the smaller sequence contigs you could just grab the first EMBL dbxref instead of all of them. It parses much faster than the chromosome file.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Wed Oct 18 11:47:14 2006
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Oct 2006 08:47:14 -0700
Subject: [Bioperl-l] Blast information
In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
References: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
Message-ID: <6B7D24F3-69F1-498D-AB53-B4CEB14E4F3D@bioperl.org>

I think this will work for you. The seq_inds method parses the middle homology sequence and classifies each alignment column and returns a list of the columns meeting the criteria. You can interrogate query or hit in this case since you are requiring it to be identical.

my $identicalbases = scalar $hsp->seq_inds('query', 'identical');
my $conservedbases = scalar $hsp->seq_inds('query','conserved');

Conserved returns those identical or conserved; if you want just those with conservative replacements use 'conserved-not-identical'.

See http://bioperl.org/wiki/HOWTO:SearchIO#Table_of_Methods for more info.

-jason

On Oct 18, 2006, at 8:16 AM, Kevin Brown wrote:

> I just recently upgraded to 1.5.1 on WinXP to bring this version
> closer
> to live to parse some locally created blast files.
I'm trying to find > the method that returns the values that are underneath the Identities > and Positives information as I'm trying to replicate the output of an > old blast parser we have here written in RealBasic which is showing > its > age. Once I have it replicating the old output I then intend to add > more features in terms of filtering returned hits (like not returning > self->self hits or a->b so don't show b->a). > > Example: > I'm looking for the methods that will return 117 from identities > and 117 > from positives. I can't just use num_identical/percent_identity as > that > isn't 100% accurate. > >> BurkM_2016 > Length = 241 > > Score = 43.2 bits (88), Expect = 7e-005 > Identities = 26/117 (22%), Positives = 51/117 (43%) > > Query: 298 > QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL > 357 > Q F F + A+ ++ + + + L +R GL + P E + > A+L > Sbjct: 111 > QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL > 170 > > Query: 358 > MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 > A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L > Sbjct: 171 > KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 > > Thanks, > Kevin > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From jason at bioperl.org Thu Oct 19 01:00:28 2006 From: jason at bioperl.org (Jason Stajich) Date: Wed, 18 Oct 2006 22:00:28 -0700 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> Message-ID: 
<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>

So I'm unsure what we should do here.

We can certainly fix the problem which you report which is relying on the "" method -- if you were to do instead:

print $_->database, ":", $_->primary_id, "\n";

you'll get the right answer. We can at a minimum just fix the auto-string converting method to do The Right Thing.

But I am not sure if we should keep the version out of the primary_id field. This will require some rejiggering in several modules when it comes to printing DBlinks and I don't want to do this before the release. I also am not sure if there was an explicit reason why someone did put the version information in the primary_id. (I hope it wasn't me because I don't think I'm going to remember why).

Does anyone else have a strong feeling?

-jason

On Oct 17, 2006, at 12:01 PM, Erikjan wrote:

> Hello,
>
> I noticed a little problem with the Annotation "DBLink" from
> GenBank entries
>
> When I run:
>
> perl -MBio::DB::GenBank -e 'my $gi =
> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
> ("dblink");
> for(@annotations) { print $_, "\n";} print $INC{
> "Bio/Annotation/DBLink.pm" }, "\n"; '
>
> This yields:
>
> GenBank:AL591065.17.17
>
> and the place where the used Bio/Annotation/DBLink.pm resides.
>
> Can others repeat this?
>
> I have dug into the source a little and Bio::Annotation::DBLink seems to
> be the place where this happens: it has a concatenation which leads to
> that repeated version number.
>
> Is this something that I should fix "client-side", so to speak, or is it
> worthwhile to add some logic to that concatenation to prevent this?
> > > Thanks, > > Eric > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From n.haigh at sheffield.ac.uk Thu Oct 19 02:41:02 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 19 Oct 2006 07:41:02 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <000401c6f2d6$5144e2f0$15327e82@pyrimidine> References: <000401c6f2d6$5144e2f0$15327e82@pyrimidine> Message-ID: <45371DFE.6050306@sheffield.ac.uk> > As a followup in this, I tried bioperl-network and had similar failed tests > with Graph 0.79 (the only PPM available from ActiveState). However, the > INSTALL docs state that Graph 0.80 is needed, and the test run gave several > warnings about not having Graph 0.80 installed. > > I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and > everything passed. Maybe we need to have a Graph PPM available for those > who want bioperl-network? > > As for bioperl-run, all tests passed from a new CVS checkout even though I > have none of the programs installed, so they seem to skip properly. The > test run also printed warnings when a program wasn't available or installed. > > > Chris > > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make modifications to integrate them into the package.xml file for PPM4 clients. Nath From n.haigh at sheffield.ac.uk Thu Oct 19 06:40:21 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh)
Date: Thu, 19 Oct 2006 11:40:21 +0100
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
Message-ID: <45375615.1020603@sheffield.ac.uk>

Should line 25 read:

require Bio::Factory::EMBOSS

instead of:

require Bio::EMBOSS::Factory;

Nath

From hlapp at gmx.net Thu Oct 19 09:56:05 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 19 Oct 2006 09:56:05 -0400
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
Message-ID: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>

Here is the overload code:

use overload '""' => sub {
    (($_[0]->database ? $_[0]->database . ':' : '' )
     . ($_[0]->primary_id ? $_[0]->primary_id : '')
     . ($_[0]->version ? '.' . $_[0]->version : ''))
    || '' };

Except that the last '||' is redundant and unnecessary (it either does nothing or replaces an empty string with an empty string), I don't see the potential for duplicating the version number here - unless primary_id() did that, which I don't see it doing.

So, to me this seems to come from a parsing error in the beginning, rather than an erroneous mangling of version into primary_id later.

Is someone in the position to confirm this?

-hilmar

On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:

> So I'm unsure what we should do here.
>
> We can certainly fix the problem which you report which is relying on
> the "" method -- if you were to do instead:
> print $_->database, ":", $_->primary_id, "\n";
>
> you'll get the right answer. We at a minimum just fix the auto-
> string converting method to do The Right Thing.
>
> But I am not sure if we should keep the version out of the primary_id
> field. This will require some rejiggering in several modules when it
> comes to printing DBlinks and I don't want to do this before the
> release.
I also am not sure if there was an explicit reason why > someone did put the version information in the primary_id. (I hope it > wasn't me because I don't think I'm going to remember why). > > Does anyone else have a strong feeling? > > -jason > On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > >> Hello, >> >> I noticed a little problem with the Annotation "DBLink" from >> GenBank entries >> >> When I run: >> >> perl -MBio::DB::GenBank -e 'my $gi = >> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio = >> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >> ("dblink"); >> for(@annotations) { print $_, "\n";} print $INC{ >> "Bio/Annotation/DBLink.pm" }, "\n"; ' >> >> This yields: >> >> GenBank:AL591065.17.17 >> >> and the place where the used Bio/Annotation/DBLink.pm resides. >> >> Can others repeat this? >> >> I have dug into the source a little and Bio::Annotation::DBLink >> seems to >> be the place where this happens: it has a concatenation which >> leads to >> that repeated version number. >> >> It this something that I should fix "client-side", so to speak, or >> is it >> worthwhile to add some logic to that concatenation to prevent this? 
>> >> >> Thanks, >> >> Eric >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich, PhD > Miller Research Fellow > University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From dmessina at wustl.edu Thu Oct 19 09:55:31 2006 From: dmessina at wustl.edu (David Messina) Date: Thu, 19 Oct 2006 08:55:31 -0500 Subject: [Bioperl-l] missing documentation (request for help) Message-ID: <69453D5F-7794-4DC7-BAE1-A8B2191752E6@wustl.edu> Hi all, There are a few modules missing a one-line description, and by one- line description, I'm referring to the part that comes after the module name in the POD. e.g. in =head1 NAME Bio::SearchIO - Driver for parsing Sequence Database Searches (BLAST, FASTA, ...) =head1 SYNOPSIS [etc...] "Driver for parsing Sequence Database Searches (BLAST, FASTA, ...)" is the one-line description (even though it falls onto two lines) :). I fixed the modules that I knew something about, but there are some I haven't used. Perhaps the author, or someone else familiar with these modules, could fill in an appropriate short description? 
Here is the list of affected modules:

Bio::DB::Expression
Bio::Expression::Contact
Bio::Expression::DataSet
Bio::Expression::Platform
Bio::Expression::Sample
Bio::Search::Processor
Bio::DB::EUtilities::ElinkData
Bio::DB::GFF::Adaptor::memory::feature_serializer
Bio::DB::SeqFeature::Store::DBI::Iterator
Bio::Expression::FeatureGroup::FeatureGroupMas50
Bio::Expression::FeatureSet::FeatureSetMas50
Bio::Matrix::PSM::PsmHeaderI
Bio::OntologyIO::Handlers::BaseSAXHandler

Some of these are missing other POD parts as well -- please add those too if you can.

Thanks,
Dave

From mckays at cshl.edu Thu Oct 19 09:51:18 2006
From: mckays at cshl.edu (Sheldon McKay)
Date: Thu, 19 Oct 2006 09:51:18 -0400
Subject: [Bioperl-l] chromosome ideograms
Message-ID: <6b0de00426b3c04b0d0d7641bc8e14e3@cshl.edu>

Hi,

Sorry for the late reply. I have been working on a karyotype drawing tool as part of the Generic Genome Browser that may be useful. In addition to drawing features next to chromosome ideograms, it also supports making chromosome 'bands' from any kind of scored features to create a sort of heat map on the chromosome itself.

I have a demo running at http://mckay.cshl.edu/cgi-bin/gbrowse_karyotype and the source is available from the GMOD CVS HEAD http://www.gmod.org/cvs

Sheldon

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Sheldon McKay, PhD
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724

From n.haigh at sheffield.ac.uk Thu Oct 19 11:37:31 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 19 Oct 2006 15:37:31 +0000
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
In-Reply-To: <45375615.1020603@sheffield.ac.uk>
References: <45375615.1020603@sheffield.ac.uk>
Message-ID: <45379BBB.1040400@sheffield.ac.uk>

Thanks for committing that change Brian.
Now the tests proceed from this point, I get the following error:

------------- EXCEPTION: Bio::Root::NotImplemented -------------
MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not implemented by package Bio::Tools::Run::EMBOSSApplication.
This is not your fault - author of Bio::Tools::Run::EMBOSSApplication should be blamed!

STACK: Error::throw
STACK: Bio::Root::Root::throw /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350
STACK: Bio::Root::RootI::throw_not_implemented /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522
STACK: Bio::Tools::Run::WrapperBase::program_dir /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346
STACK: Bio::Tools::Run::WrapperBase::program_path /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327
STACK: Bio::Tools::Run::WrapperBase::executable /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297
STACK: t/EMBOSS.t:58
----------------------------------------------------------------

From N.Haigh at sheffield.ac.uk Thu Oct 19 11:03:00 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 16:03:00 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45379BBB.1040400@sheffield.ac.uk>
References: <45375615.1020603@sheffield.ac.uk> <45379BBB.1040400@sheffield.ac.uk>
Message-ID: <1161270180.453793a432e4f@webmail.shef.ac.uk>

I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be consistent with other tests.

Failing that - Is there a good test writing style I should follow in one of the other test files?

Thanks
Nathan

From bosborne11 at verizon.net Thu Oct 19 11:06:08 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 19 Oct 2006 11:06:08 -0400
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
In-Reply-To: <45379BBB.1040400@sheffield.ac.uk>
Message-ID:

Nathan,

Yes, I see.
Those EMBOSS programs work a bit differently from the typical app run by bioperl-run, there's no need for WrapperBase methods like program_dir(), executable(), it seems. Well, I can try and take a look at this tonight but there's probably someone better suited to this than me, I've spent very little time with bioperl-run. Volunteer? Brian O. On 10/19/06 11:37 AM, "Nathan S. Haigh" wrote: > Thanks for committing that change Brian. Now the tests proceed from this > point, I get the following error: > > ------------- EXCEPTION: Bio::Root::NotImplemented ------------- > MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not > implemented by package Bio::Tools::Run::EMBOSSApplication. > This is not your fault - author of Bio::Tools::Run::EMBOSSApplication > should be blamed! > > STACK: Error::throw > STACK: Bio::Root::Root::throw > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350 > STACK: Bio::Root::RootI::throw_not_implemented > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522 > STACK: Bio::Tools::Run::WrapperBase::program_dir > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346 > STACK: Bio::Tools::Run::WrapperBase::program_path > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327 > STACK: Bio::Tools::Run::WrapperBase::executable > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297 > STACK: t/EMBOSS.t:58 > ---------------------------------------------------------------- > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From niels at genomics.dk Thu Oct 19 11:16:37 2006 From: niels at genomics.dk (Niels Larsen) Date: Thu, 19 Oct 2006 17:16:37 +0200 Subject: [Bioperl-l] From EBI support re WU-Blast SOAP service In-Reply-To: <4535EBF9.1090706@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> 
<45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk>
Message-ID: <453796D5.2070808@genomics.dk>

Sendu Bala wrote:

>> I invoked the EBI script
>>
>> http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip
>>
>> like this
>>
>> WSWUBlastClient.pl -p blastn -D embl test.fasta
>>
>> where the content of test.fasta is below, and got
>>
>> Can't find method element in the message at
>> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.
>
> As you admit, this is not a Bioperl issue. I would suggest you contact
> EBI support.
>

To use EBI's WU-blast SOAP interface from perl, EBI support says one must use SOAP::Lite v 0.60 (no later version) and include '--email you.example.com' on the command line. This is evident neither from their web pages nor from the script usage statement, but they promised to fix it.

------------------------------------------------------------------------
Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark
Telephone: +45-8942-5268
Telefax: +45-8620-1222
------------------------------------------------------------------------

From cjfields at uiuc.edu Thu Oct 19 11:31:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 10:31:45 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45371DFE.6050306@sheffield.ac.uk>
Message-ID: <001501c6f393$b66bd4a0$15327e82@pyrimidine>

> > As a followup in this, I tried bioperl-network and had similar failed tests
> > with Graph 0.79 (the only PPM available from ActiveState). However, the
> > INSTALL docs state that Graph 0.80 is needed, and the test run gave several
> > warnings about not having Graph 0.80 installed.
> >
> > I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and
> > everything passed. Maybe we need to have a Graph PPM available for those
> > who want bioperl-network?
> > > > As for bioperl-run, all tests passed from a new CVS checkout even though > I > > have none of the programs installed, so they seem to skip properly. The > > test run also printed warnings when a program wasn't available or > installed. > > > > > > Chris > > > > > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make > modifications to integrate them into the package.xml file for PPM4 > clients. > > Nath Will do. Should these be forwarded to Mauricio? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From N.Haigh at sheffield.ac.uk Thu Oct 19 11:38:05 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 16:38:05 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <001501c6f393$b66bd4a0$15327e82@pyrimidine> References: <001501c6f393$b66bd4a0$15327e82@pyrimidine> Message-ID: <1161272285.45379bdd1aea4@webmail.shef.ac.uk> > > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make > > modifications to integrate them into the package.xml file for PPM4 > > clients. > > > > Nath > > Will do. Should these be forwarded to Mauricio? > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > If you don't have access to the web, you can send them to me - I now have an account on that server. Cheers Nath From cjfields at uiuc.edu Thu Oct 19 11:45:00 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 10:45:00 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk> Message-ID: <001601c6f395$8a752ed0$15327e82@pyrimidine> > I thought I'd have my first proper try at writing some tests. I was > wondering if there is a template test file that I should use/study in > order to be > consistent with other tests. > > Failing that - Is there a good test writing style I should follow in one > of the other test files? 
> > Thanks > Nathan I would start with the Test::Simple and Test::More perldoc; they're pretty self-explanatory. You can look at the various test suites using Test::More as well for pointers. By far, most tests will use is(). You can use SKIP blocks to skip tests that have a requirement, or skip all tests if they all require something. Pretty flexible. We should probably get a wiki page for the developers underway, maybe a HOWTO on writing tests. At least have these focus on BioPerl, OOP, remote DB tests, etc. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 19 12:23:40 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 11:23:40 -0500 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> Message-ID: <001b01c6f39a$f0288ba0$15327e82@pyrimidine> > Here is the overload code: > > use overload '""' => sub { > (($_[0]->database ? $_[0]->database . ':' : '' ) > . ($_[0]->primary_id ? $_[0]->primary_id : '') > . ($_[0]->version ? '.' . $_[0]->version : '')) > || '' }; > > Except that the last '||' is redundant and unnecessary (it either > does nothing or replaces an empty string with an empty string), I > don't see the potential for duplicating the version number here - > unless primary_id() did that, which I don't see it doing. > > So, to me this seems to come from a parsing error in the beginning, > rather than an erroneous mangling of version into primary_id later. > > Is someone in the position to confirm this? > > -hilmar I have attached a script to the bug report on bugzilla, as well as the test output sequence and the actual GenBank record. There are a number of problems: 1) primary_id() is assigned both the id and version. 2) version() is still assigned the version. The above explain when printing the object directly using the overload (it concatenates them). 
However, there are a few more issues. The ID is printed normally
(accession.version), but the source DB is not present when SeqIO handles the
sequence. I have attached the output and the original GenBank record to the
bug report.

I can look into it but it won't be today; got my hands full with enzyme
assays.

Chris

From N.Haigh at sheffield.ac.uk Thu Oct 19 12:50:57 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 17:50:57 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine>
References: <001601c6f395$8a752ed0$15327e82@pyrimidine>
Message-ID: <1161276657.4537acf1edc80@webmail.shef.ac.uk>

Quoting Chris Fields:

> > I thought I'd have my first proper try at writing some tests. I was
> > wondering if there is a template test file that I should use/study in
> > order to be
> > consistent with other tests.
> >
> > Failing that - Is there a good test writing style I should follow in one
> > of the other test files?
> >
> > Thanks
> > Nathan
>
> I would start with the Test::Simple and Test::More perldoc; they're pretty
> self-explanatory. You can look at the various test suites using Test::More
> as well for pointers. By far, most tests will use is(). You can use SKIP
> blocks to skip tests that have a requirement, or skip all tests if they all
> require something. Pretty flexible.
>
> We should probably get a wiki page for the developers underway, maybe a
> HOWTO on writing tests. At least have these focus on BioPerl, OOP, remote
> DB tests, etc.
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>

Just working through some test things now. I thought I'd start on the
bioperl-run stuff as I thought it might be a bit more straightforward; I'm
familiar with some of them and they seem to get neglected. I'm heavily
commenting my tests with the thought of starting a wiki guide to testing
Bioperl modules.
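
For reference, the is() and SKIP usage suggested in the quoted advice could be sketched roughly as below. This is only an illustration under stated assumptions: remote_tests_enabled() is a hypothetical helper, keying the skip on the BIOPERLDEBUG environment variable is borrowed from the remote-database discussion elsewhere on the list, and the two ok() calls are placeholders rather than real BioPerl checks.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical helper: decides whether network-dependent tests should run.
# Assumes the BIOPERLDEBUG convention (set to 1 to enable remote tests).
sub remote_tests_enabled {
    return $ENV{BIOPERLDEBUG} ? 1 : 0;
}

# is() compares the value obtained against the value expected.
is( lc('ACGT'), 'acgt', 'is() compares got vs. expected' );

SKIP: {
    # skip() takes the reason and the number of tests in the block,
    # so the plan still adds up when the block is skipped.
    skip( 'BIOPERLDEBUG not set; skipping remote tests', 2 )
        unless remote_tests_enabled();

    ok( 1, 'placeholder for a remote DB query' );
    ok( 1, 'placeholder for checking the returned record' );
}
```

Skipped tests are still counted against the plan, which is why the count passed to skip() has to match the number of tests inside the block.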
See how far I get! Nath From hlapp at gmx.net Thu Oct 19 13:11:27 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 19 Oct 2006 13:11:27 -0400 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org> Message-ID: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net> Actually you did that Jason: http://tinyurl.com/ye2edk Apparently the motivation was to "parse swissprot fields in genpept file (dbsource)"? It clearly looks wrong to add the version. You've probably had a reason why you did this at the time but if we (you :) can't recover that I guess it's best to just fix it to do the right thing (in both places obviously). -hilmar On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote: > Well there is explicit addition of the version to the primary id so > it isn't so much a parsing error as a deliberate decision to append > it. > see Bio::SeqIO::genbank > > to make the dblink > $annotation- > >add_Annotation > ('dblink', > > Bio::Annotation::DBLink->new > (-primary_id > => $id . "." . $version, > -version => > $version, > -database => > $db, > -tagname => > 'dblink')); > > and the code to print the dblink back out in the writer already > assumes the version number is appended... > > foreach my $ref ( $seq->annotation->get_Annotations > ('dblink') ) { > # if ($ref->comment eq 'DBSOURCE') { > $self->_print('DBSOURCE accession ', > $ref->primary_id, "\n"); > # } > } > > On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > >> Here is the overload code: >> >> use overload '""' => sub { >> (($_[0]->database ? $_[0]->database . ':' : '' ) >> . ($_[0]->primary_id ? $_[0]->primary_id : '') >> . ($_[0]->version ? '.' . 
$_[0]->version : '')) >> || '' }; >> >> Except that the last '||' is redundant and unnecessary (it either >> does nothing or replaces an empty string with an empty string), I >> don't see the potential for duplicating the version number here - >> unless primary_id() did that, which I don't see it doing. >> >> So, to me this seems to come from a parsing error in the >> beginning, rather than an erroneous mangling of version into >> primary_id later. >> >> Is someone in the position to confirm this? >> >> -hilmar >> >> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: >> >>> So I'm unsure what we should do here. >>> >>> We can certainly fix the problem which you report which is >>> relying on >>> the "" method -- if you were to do instead: >>> print $_->database, ":", $_->primary_id, "\n"; >>> >>> you'll get the right answer. We at a minimum just fix the auto- >>> string converting method to do The Right Thing. >>> >>> But I am not sure if we should keep the version out of the >>> primary_id >>> field. This will require some rejiggering in several modules >>> when it >>> comes to printing DBlinks and I don't want to do this before the >>> release. I also am not sure if there was an explicit reason why >>> someone did put the version information in the primary_id. (I >>> hope it >>> wasn't me because I don't think I'm going to remember why). >>> >>> Does anyone else have a strong feeling? 
>>> >>> -jason >>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: >>> >>>> Hello, >>>> >>>> I noticed a little problem with the Annotation "DBLink" from >>>> GenBank entries >>>> >>>> When I run: >>>> >>>> perl -MBio::DB::GenBank -e 'my $gi = >>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my >>>> $seqio = >>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >>>> ("dblink"); >>>> for(@annotations) { print $_, "\n";} print $INC{ >>>> "Bio/Annotation/DBLink.pm" }, "\n"; ' >>>> >>>> This yields: >>>> >>>> GenBank:AL591065.17.17 >>>> >>>> and the place where the used Bio/Annotation/DBLink.pm resides. >>>> >>>> Can others repeat this? >>>> >>>> I have dug into the source a little and Bio::Annotation::DBLink >>>> seems to >>>> be the place where this happens: it has a concatenation which >>>> leads to >>>> that repeated version number. >>>> >>>> It this something that I should fix "client-side", so to speak, or >>>> is it >>>> worthwhile to add some logic to that concatenation to prevent this? 
>>>> >>>> >>>> Thanks, >>>> >>>> Eric >>>> >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> -- >>> Jason Stajich, PhD >>> Miller Research Fellow >>> University of California >>> Dept of Plant and Microbial Biology >>> 321 Koshland Hall #3102 >>> Berkeley, CA 94720-3102 >>> lab: 510.642.8441 >>> http://pmb.berkeley.edu/~taylor/people/js.html >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> > > -- > Jason Stajich, PhD > Miller Research Fellow > University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From N.Haigh at sheffield.ac.uk Thu Oct 19 13:17:33 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 18:17:33 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine> References: <001601c6f395$8a752ed0$15327e82@pyrimidine> Message-ID: <1161278253.4537b32dd3d15@webmail.shef.ac.uk> Quoting Chris Fields : > > I thought I'd have my first proper try at writing some tests. I was > > wondering if there is a template test file that I should use/study in > > order to be > > consistent with other tests. > > > > Failing that - Is there a good test writing style I should follow in one > > of the other test files? 
> >
> > Thanks
> > Nathan
>
> I would start with the Test::Simple and Test::More perldoc; they're pretty
> self-explanatory. You can look at the various test suites using Test::More
> as well for pointers. By far, most tests will use is(). You can use SKIP
> blocks to skip tests that have a requirement, or skip all tests if they all
> require something. Pretty flexible.
>
> We should probably get a wiki page for the developers underway, maybe a
> HOWTO on writing tests. At least have these focus on BioPerl, OOP, remote
> DB tests, etc.
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>

Just wrote a partial and small test script for t/Amap.t in bioperl-run. When
I run "perl -I. t/Amap.t" I get the following output:

1..10
ok 1 - use Bio::Tools::Run::Alignment::Amap;
ok 2 - use Bio::AlignIO;
ok 3 - use Bio::SeqIO;
ok 4 - use Bio::Root::IO;
ok 5 - All the required modules are present
ok 6 - new() returned something
ok 7 - and its the right class
not ok 8 - executable() got the correct filename
#   Failed test 'executable() got the correct filename'
#   in t/Amap.t at line 90.
#          got: undef
#     expected: 'filename'
ok 9 # skip Got incorrect filename for executable
ok 10 # skip Got incorrect filename for executable
# Looks like you failed 1 test of 10.

So far this looks good (well, in that it's failing and passing the tests I
expected). However, when I run "make test" the output is unexpected and I
don't know why. It seems to die and produce the results of the testing
before the rest of the test suite is run:

t/Amap....................NOK 8
#   Failed test 'executable() got the correct filename'
#   in t/Amap.t at line 90.
#          got: undef
#     expected: 'filename'
# Looks like you failed 1 test of 10.
t/Amap....................dubious
        Test returned status 1 (wstat 256, 0x100)
DIED.
FAILED test 8 Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay, 70.00%) t/Analysis_soap...........ok 7/17make: *** wait: No child processes. Stop. Is there something I'm missing?? If it's something less obvious, let me know and i'll post whole test file. Nath From cjfields at uiuc.edu Thu Oct 19 13:26:45 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 12:26:45 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <1161278253.4537b32dd3d15@webmail.shef.ac.uk> Message-ID: <002001c6f3a3$c00b9080$15327e82@pyrimidine> ... > Just wrote a partial and small test script for t/Amap.t in bioperl-run. > When I run "perl -I. t/Amap.t" I get the following output: > 1..10 > ok 1 - use Bio::Tools::Run::Alignment::Amap; > ok 2 - use Bio::AlignIO; > ok 3 - use Bio::SeqIO; > ok 4 - use Bio::Root::IO; > ok 5 - All the required modules are present > ok 6 - new() returned something > ok 7 - and its the right class > not ok 8 - executable() got the correct filename > # Failed test 'executable() got the correct filename' > # in t/Amap.t at line 90. > # got: undef > # expected: 'filename' > ok 9 # skip Got incorrect filename for executable > ok 10 # skip Got incorrect filename for executable > # Looks like you failed 1 test of 10. > > > So far this looks good (well, that it's failing passing expected tests). > However, when i run "make test" the output is unexpected and I don't know > why. It seems to die and produce the results of the testing before the > rest of the test suit is run: > t/Amap....................NOK 8 > # Failed test 'executable() got the correct filename' > # in t/Amap.t at line 90. > # got: undef > # expected: 'filename' > # Looks like you failed 1 test of 10. > t/Amap....................dubious > Test returned status 1 (wstat 256, 0x100) > DIED. FAILED test 8 > Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay, > 70.00%) > t/Analysis_soap...........ok 7/17make: *** wait: No child processes. > Stop. 
> > > > Is there something I'm missing?? If it's something less obvious, let me > know and i'll post whole test file. > Nath Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be the problem. The only issue I can think of is that Test::More TODO blocks require a newer version of Test::Harness (which most users have anyway). Are you using a TODO block? You can send me Amap.t and I'll give it a try, but I can't promise I'll get to it immediately (busy day). Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From N.Haigh at sheffield.ac.uk Thu Oct 19 13:38:25 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 18:38:25 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine> References: <002001c6f3a3$c00b9080$15327e82@pyrimidine> Message-ID: <1161279505.4537b811e143f@webmail.shef.ac.uk> > Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be > the problem. The only issue I can think of is that Test::More TODO blocks > require a newer version of Test::Harness (which most users have anyway). > Are you using a TODO block? > > You can send me Amap.t and I'll give it a try, but I can't promise I'll get > to it immediately (busy day). > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > No TODO blocks. I must have done something wrong - it's the first time I've seen this - but then again, I don't look that closely at the output of "make test" unless something shows as a fail. Anyway, below is the short bit of code. 
Thanks
Nath

use strict;
use Bio::Root::IO; # can't test for this, might be needed to get Test::More

BEGIN {
    # Things to do ASAP once the script is run
    # even before anything else in the file is parsed
    use vars qw($NUMTESTS $DEBUG $error);
    $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;

    # Use installed Test module, otherwise fall back
    # to copy of Test.pm located in the t dir
    eval { require Test::More; };
    if ( $@ ) {
        use lib Bio::Root::IO->catfile('t','lib');
    }

    # Currently no errors
    $error = 0;

    # Setup the number of tests to be run
    # what about using:
    # use Test::More 'no_plan';
    use Test::More;
    $NUMTESTS = 10;
    plan tests => $NUMTESTS;

    # Use modules that are needed in this test that are from
    # any of the Bioperl packages: Bioperl-core, Bioperl-run ... etc
    # use_ok('');
    use_ok('Bio::Tools::Run::Alignment::Amap');
    use_ok('Bio::AlignIO');
    use_ok('Bio::SeqIO');
    use_ok('Bio::Root::IO');
}

# Multiple END blocks are run in reverse order of their definition
# Last In, First Out (LIFO)
END {
    # Things to do right at the very end, just
    # when the interpreter finishes/exits
    # E.g. deleting intermediate files produced during the test
    foreach my $file ( qw(cysprot.dnd cysprot1a.dnd) ) {
        unlink $file;
        # check it was deleted
    }
    #unlink qw(cysprot.dnd cysprot1a.dnd)
}

END {
    # Not sure what this is doing?
    #for ( $Test::ntest..$NUMTESTS ) {
    #    skip("Amap program not found. Skipping.\n",1);
    #}
}

# if we got to here, that's OK!
# is this really needed?
ok( 1, 'All the required modules are present');

# setup input files etc
my $inputfilename = Bio::Root::IO->catfile("t","data","cysprot.fa");

# setup output files etc
# none in this test

# setup global objects that are to be used in more than one test
# Also test they were initialised correctly
my @params = ();
my $aln;
my $factory = Bio::Tools::Run::Alignment::Amap->new(@params);
ok( defined $factory, 'new() returned something' );
ok( $factory->isa('Bio::Tools::Run::Alignment::Amap'), ' and its the right class' );

# Now onto the nitty gritty tests of the module's methods
my $executable_file = $factory->executable();
#is( $factory->executable(), 'filename', 'executable() got the correct filename' );

# block of tests to skip if you know the tests will fail
# under some condition. E.g.:
#   Need network access,
#   Won't work on particular OS,
#   Can't find the executable
# Do not just skip tests that seem to fail for an unknown reason
SKIP: {
    # condition used to skip this block of tests
    #skip($why, $how_many_in_block);
    skip("Got incorrect filename for executable", 2)
        unless is($factory->executable(), 'filename', 'executable() got the correct filename');
    ok( -e $executable_file, 'Found executable' );
    ok( $factory->version >= 2.0, 'Code tested on Amap versions >= 2.0' );
}

From jason at bioperl.org Thu Oct 19 13:44:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 10:44:51 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
 <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
 <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
 <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
 <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
 <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID:

Yikes - I was worried that it might have been me.....
Okay I'll look into fixing it -- ChrisF - check in with me before diving in, in case I've gotten it done and I expect your enzyme assays might take up the time. -jason On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote: > Actually you did that Jason: http://tinyurl.com/ye2edk > > Apparently the motivation was to "parse swissprot fields in genpept > file (dbsource)"? > > It clearly looks wrong to add the version. You've probably had a > reason why you did this at the time but if we (you :) can't recover > that I guess it's best to just fix it to do the right thing (in > both places obviously). > > -hilmar > > On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote: > >> Well there is explicit addition of the version to the primary id >> so it isn't so much a parsing error as a deliberate decision to >> append it. >> see Bio::SeqIO::genbank >> >> to make the dblink >> $annotation- >> >add_Annotation >> ('dblink', >> >> Bio::Annotation::DBLink->new >> (-primary_id >> => $id . "." . $version, >> -version => >> $version, >> -database => >> $db, >> -tagname => >> 'dblink')); >> >> and the code to print the dblink back out in the writer already >> assumes the version number is appended... >> >> foreach my $ref ( $seq->annotation->get_Annotations >> ('dblink') ) { >> # if ($ref->comment eq 'DBSOURCE') { >> $self->_print('DBSOURCE accession ', >> $ref->primary_id, "\n"); >> # } >> } >> >> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: >> >>> Here is the overload code: >>> >>> use overload '""' => sub { >>> (($_[0]->database ? $_[0]->database . ':' : '' ) >>> . ($_[0]->primary_id ? $_[0]->primary_id : '') >>> . ($_[0]->version ? '.' . $_[0]->version : '')) >>> || '' }; >>> >>> Except that the last '||' is redundant and unnecessary (it either >>> does nothing or replaces an empty string with an empty string), I >>> don't see the potential for duplicating the version number here - >>> unless primary_id() did that, which I don't see it doing. 
>>> >>> So, to me this seems to come from a parsing error in the >>> beginning, rather than an erroneous mangling of version into >>> primary_id later. >>> >>> Is someone in the position to confirm this? >>> >>> -hilmar >>> >>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: >>> >>>> So I'm unsure what we should do here. >>>> >>>> We can certainly fix the problem which you report which is >>>> relying on >>>> the "" method -- if you were to do instead: >>>> print $_->database, ":", $_->primary_id, "\n"; >>>> >>>> you'll get the right answer. We at a minimum just fix the auto- >>>> string converting method to do The Right Thing. >>>> >>>> But I am not sure if we should keep the version out of the >>>> primary_id >>>> field. This will require some rejiggering in several modules >>>> when it >>>> comes to printing DBlinks and I don't want to do this before the >>>> release. I also am not sure if there was an explicit reason why >>>> someone did put the version information in the primary_id. (I >>>> hope it >>>> wasn't me because I don't think I'm going to remember why). >>>> >>>> Does anyone else have a strong feeling? >>>> >>>> -jason >>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: >>>> >>>>> Hello, >>>>> >>>>> I noticed a little problem with the Annotation "DBLink" from >>>>> GenBank entries >>>>> >>>>> When I run: >>>>> >>>>> perl -MBio::DB::GenBank -e 'my $gi = >>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my >>>>> $seqio = >>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >>>>> ("dblink"); >>>>> for(@annotations) { print $_, "\n";} print $INC{ >>>>> "Bio/Annotation/DBLink.pm" }, "\n"; ' >>>>> >>>>> This yields: >>>>> >>>>> GenBank:AL591065.17.17 >>>>> >>>>> and the place where the used Bio/Annotation/DBLink.pm resides. >>>>> >>>>> Can others repeat this? 
>>>>> >>>>> I have dug into the source a little and Bio::Annotation::DBLink >>>>> seems to >>>>> be the place where this happens: it has a concatenation which >>>>> leads to >>>>> that repeated version number. >>>>> >>>>> It this something that I should fix "client-side", so to speak, or >>>>> is it >>>>> worthwhile to add some logic to that concatenation to prevent >>>>> this? >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Eric >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> -- >>>> Jason Stajich, PhD >>>> Miller Research Fellow >>>> University of California >>>> Dept of Plant and Microbial Biology >>>> 321 Koshland Hall #3102 >>>> Berkeley, CA 94720-3102 >>>> lab: 510.642.8441 >>>> http://pmb.berkeley.edu/~taylor/people/js.html >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >> >> -- >> Jason Stajich, PhD >> Miller Research Fellow >> University of California >> Dept of Plant and Microbial Biology >> 321 Koshland Hall #3102 >> Berkeley, CA 94720-3102 >> lab: 510.642.8441 >> http://pmb.berkeley.edu/~taylor/people/js.html >> >> > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From cjfields at uiuc.edu 
Thu Oct 19 14:03:52 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 13:03:52 -0500 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net> Message-ID: <000001c6f3a8$f0a46a00$15327e82@pyrimidine> Also seems that the DBSOURCE line isn't caught correctly and stuffs it by default into a GenBank dblink (the dbsource ihn the test case is EMBL, not GenBank). http://bugzilla.open-bio.org/show_bug.cgi?id=2124 It looks like NCBI may be now using: DBSOURCE embl accession Z49548.1 instead of the old version: DBSOURCE embl locus SCYJR048W, accession Z49548.1 I don't recall NCBI mentioning changes regarding DBSOURCE in any of the recent release notes. Chris > Actually you did that Jason: http://tinyurl.com/ye2edk > > Apparently the motivation was to "parse swissprot fields in genpept > file (dbsource)"? > > It clearly looks wrong to add the version. You've probably had a > reason why you did this at the time but if we (you :) can't recover > that I guess it's best to just fix it to do the right thing (in both > places obviously). > > -hilmar > > On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote: > > > Well there is explicit addition of the version to the primary id so > > it isn't so much a parsing error as a deliberate decision to append > > it. > > see Bio::SeqIO::genbank > > > > to make the dblink > > $annotation- > > >add_Annotation > > ('dblink', > > > > Bio::Annotation::DBLink->new > > (-primary_id > > => $id . "." . $version, > > -version => > > $version, > > -database => > > $db, > > -tagname => > > 'dblink')); > > > > and the code to print the dblink back out in the writer already > > assumes the version number is appended... 
> > > > foreach my $ref ( $seq->annotation->get_Annotations > > ('dblink') ) { > > # if ($ref->comment eq 'DBSOURCE') { > > $self->_print('DBSOURCE accession ', > > $ref->primary_id, "\n"); > > # } > > } > > > > On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > > > >> Here is the overload code: > >> > >> use overload '""' => sub { > >> (($_[0]->database ? $_[0]->database . ':' : '' ) > >> . ($_[0]->primary_id ? $_[0]->primary_id : '') > >> . ($_[0]->version ? '.' . $_[0]->version : '')) > >> || '' }; > >> > >> Except that the last '||' is redundant and unnecessary (it either > >> does nothing or replaces an empty string with an empty string), I > >> don't see the potential for duplicating the version number here - > >> unless primary_id() did that, which I don't see it doing. > >> > >> So, to me this seems to come from a parsing error in the > >> beginning, rather than an erroneous mangling of version into > >> primary_id later. > >> > >> Is someone in the position to confirm this? > >> > >> -hilmar > >> > >> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > >> > >>> So I'm unsure what we should do here. > >>> > >>> We can certainly fix the problem which you report which is > >>> relying on > >>> the "" method -- if you were to do instead: > >>> print $_->database, ":", $_->primary_id, "\n"; > >>> > >>> you'll get the right answer. We at a minimum just fix the auto- > >>> string converting method to do The Right Thing. > >>> > >>> But I am not sure if we should keep the version out of the > >>> primary_id > >>> field. This will require some rejiggering in several modules > >>> when it > >>> comes to printing DBlinks and I don't want to do this before the > >>> release. I also am not sure if there was an explicit reason why > >>> someone did put the version information in the primary_id. (I > >>> hope it > >>> wasn't me because I don't think I'm going to remember why). > >>> > >>> Does anyone else have a strong feeling? 
> >>> > >>> -jason > >>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > >>> > >>>> Hello, > >>>> > >>>> I noticed a little problem with the Annotation "DBLink" from > >>>> GenBank entries > >>>> > >>>> When I run: > >>>> > >>>> perl -MBio::DB::GenBank -e 'my $gi = > >>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my > >>>> $seqio = > >>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > >>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > >>>> ("dblink"); > >>>> for(@annotations) { print $_, "\n";} print $INC{ > >>>> "Bio/Annotation/DBLink.pm" }, "\n"; ' > >>>> > >>>> This yields: > >>>> > >>>> GenBank:AL591065.17.17 > >>>> > >>>> and the place where the used Bio/Annotation/DBLink.pm resides. > >>>> > >>>> Can others repeat this? > >>>> > >>>> I have dug into the source a little and Bio::Annotation::DBLink > >>>> seems to > >>>> be the place where this happens: it has a concatenation which > >>>> leads to > >>>> that repeated version number. > >>>> > >>>> It this something that I should fix "client-side", so to speak, or > >>>> is it > >>>> worthwhile to add some logic to that concatenation to prevent this? 
> >>>> > >>>> > >>>> Thanks, > >>>> > >>>> Eric > >>>> > >>>> > >>>> > >>>> _______________________________________________ > >>>> Bioperl-l mailing list > >>>> Bioperl-l at lists.open-bio.org > >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>> > >>> -- > >>> Jason Stajich, PhD > >>> Miller Research Fellow > >>> University of California > >>> Dept of Plant and Microbial Biology > >>> 321 Koshland Hall #3102 > >>> Berkeley, CA 94720-3102 > >>> lab: 510.642.8441 > >>> http://pmb.berkeley.edu/~taylor/people/js.html > >>> > >>> > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>> > >> > >> -- > >> =========================================================== > >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > >> =========================================================== > >> > >> > >> > >> > >> > > > > -- > > Jason Stajich, PhD > > Miller Research Fellow > > University of California > > Dept of Plant and Microbial Biology > > 321 Koshland Hall #3102 > > Berkeley, CA 94720-3102 > > lab: 510.642.8441 > > http://pmb.berkeley.edu/~taylor/people/js.html > > > > > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From N.Haigh at sheffield.ac.uk Thu Oct 19 14:06:11 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 19:06:11 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine> References: <002001c6f3a3$c00b9080$15327e82@pyrimidine> Message-ID: <1161281171.4537be93b63c9@webmail.shef.ac.uk> > > Well, Analysis_soap.t ran immediately after, so 
Amap.t tests shouldn't be
> the problem. The only issue I can think of is that Test::More TODO blocks
> require a newer version of Test::Harness (which most users have anyway).
> Are you using a TODO block?
>
> You can send me Amap.t and I'll give it a try, but I can't promise I'll get
> to it immediately (busy day).
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>

Never mind about this - it's working as expected! I got confused because a
previous run threw errors but they weren't included in the final table of
failed tests - working now.

Nath

From N.Haigh at sheffield.ac.uk Thu Oct 19 14:14:54 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:14:54 +0100
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>

I have a few questions about how bioperl-run modules work.

1) How does a module define the name of the executable that it uses?
2) Is there a way to test what this is?
3) Does $factory->executable return this, or does it only return the name if
   it successfully found it?

Thanks
Nath

From cjfields at uiuc.edu Thu Oct 19 14:15:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:15:08 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To:
Message-ID: <000001c6f3aa$82845ba0$15327e82@pyrimidine>

Go for it. I haven't got the time to spare at the moment, sucky protein
assays....

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Jason Stajich > Sent: Thursday, October 19, 2006 12:45 PM > To: Hilmar Lapp > Cc: bioperl-l at lists.open-bio.org; Erikjan > Subject: Re: [Bioperl-l] Annotation-DBLink- version numbers repeating > > Yikes - I was worried that it might have been me..... > > Okay I'll look into fixing it -- ChrisF - check in with me before > diving in, in case I've gotten it done and I expect your enzyme > assays might take up the time. > > -jason > On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote: > > > Actually you did that Jason: http://tinyurl.com/ye2edk > > > > Apparently the motivation was to "parse swissprot fields in genpept > > file (dbsource)"? > > > > It clearly looks wrong to add the version. You've probably had a > > reason why you did this at the time but if we (you :) can't recover > > that I guess it's best to just fix it to do the right thing (in > > both places obviously). > > > > -hilmar > > > > On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote: > > > >> Well there is explicit addition of the version to the primary id > >> so it isn't so much a parsing error as a deliberate decision to > >> append it. > >> see Bio::SeqIO::genbank > >> > >> to make the dblink > >> $annotation- > >> >add_Annotation > >> ('dblink', > >> > >> Bio::Annotation::DBLink->new > >> (-primary_id > >> => $id . "." . $version, > >> -version => > >> $version, > >> -database => > >> $db, > >> -tagname => > >> 'dblink')); > >> > >> and the code to print the dblink back out in the writer already > >> assumes the version number is appended... 
> >> > >> foreach my $ref ( $seq->annotation->get_Annotations > >> ('dblink') ) { > >> # if ($ref->comment eq 'DBSOURCE') { > >> $self->_print('DBSOURCE accession ', > >> $ref->primary_id, "\n"); > >> # } > >> } > >> > >> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > >> > >>> Here is the overload code: > >>> > >>> use overload '""' => sub { > >>> (($_[0]->database ? $_[0]->database . ':' : '' ) > >>> . ($_[0]->primary_id ? $_[0]->primary_id : '') > >>> . ($_[0]->version ? '.' . $_[0]->version : '')) > >>> || '' }; > >>> > >>> Except that the last '||' is redundant and unnecessary (it either > >>> does nothing or replaces an empty string with an empty string), I > >>> don't see the potential for duplicating the version number here - > >>> unless primary_id() did that, which I don't see it doing. > >>> > >>> So, to me this seems to come from a parsing error in the > >>> beginning, rather than an erroneous mangling of version into > >>> primary_id later. > >>> > >>> Is someone in the position to confirm this? > >>> > >>> -hilmar > >>> > >>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > >>> > >>>> So I'm unsure what we should do here. > >>>> > >>>> We can certainly fix the problem which you report which is > >>>> relying on > >>>> the "" method -- if you were to do instead: > >>>> print $_->database, ":", $_->primary_id, "\n"; > >>>> > >>>> you'll get the right answer. We at a minimum just fix the auto- > >>>> string converting method to do The Right Thing. > >>>> > >>>> But I am not sure if we should keep the version out of the > >>>> primary_id > >>>> field. This will require some rejiggering in several modules > >>>> when it > >>>> comes to printing DBlinks and I don't want to do this before the > >>>> release. I also am not sure if there was an explicit reason why > >>>> someone did put the version information in the primary_id. (I > >>>> hope it > >>>> wasn't me because I don't think I'm going to remember why). 
> >>>> > >>>> Does anyone else have a strong feeling? > >>>> > >>>> -jason > >>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > >>>> > >>>>> Hello, > >>>>> > >>>>> I noticed a little problem with the Annotation "DBLink" from > >>>>> GenBank entries > >>>>> > >>>>> When I run: > >>>>> > >>>>> perl -MBio::DB::GenBank -e 'my $gi = > >>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my > >>>>> $seqio = > >>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > >>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > >>>>> ("dblink"); > >>>>> for(@annotations) { print $_, "\n";} print $INC{ > >>>>> "Bio/Annotation/DBLink.pm" }, "\n"; ' > >>>>> > >>>>> This yields: > >>>>> > >>>>> GenBank:AL591065.17.17 > >>>>> > >>>>> and the place where the used Bio/Annotation/DBLink.pm resides. > >>>>> > >>>>> Can others repeat this? > >>>>> > >>>>> I have dug into the source a little and Bio::Annotation::DBLink > >>>>> seems to > >>>>> be the place where this happens: it has a concatenation which > >>>>> leads to > >>>>> that repeated version number. > >>>>> > >>>>> It this something that I should fix "client-side", so to speak, or > >>>>> is it > >>>>> worthwhile to add some logic to that concatenation to prevent > >>>>> this? 
> >>>>> > >>>>> > >>>>> Thanks, > >>>>> > >>>>> Eric > >>>>> > >>>>> > >>>>> > >>>>> _______________________________________________ > >>>>> Bioperl-l mailing list > >>>>> Bioperl-l at lists.open-bio.org > >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>>> > >>>> -- > >>>> Jason Stajich, PhD > >>>> Miller Research Fellow > >>>> University of California > >>>> Dept of Plant and Microbial Biology > >>>> 321 Koshland Hall #3102 > >>>> Berkeley, CA 94720-3102 > >>>> lab: 510.642.8441 > >>>> http://pmb.berkeley.edu/~taylor/people/js.html > >>>> > >>>> > >>>> _______________________________________________ > >>>> Bioperl-l mailing list > >>>> Bioperl-l at lists.open-bio.org > >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>>> > >>> > >>> -- > >>> =========================================================== > >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > >>> =========================================================== > >>> > >>> > >>> > >>> > >>> > >> > >> -- > >> Jason Stajich, PhD > >> Miller Research Fellow > >> University of California > >> Dept of Plant and Microbial Biology > >> 321 Koshland Hall #3102 > >> Berkeley, CA 94720-3102 > >> lab: 510.642.8441 > >> http://pmb.berkeley.edu/~taylor/people/js.html > >> > >> > > > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > -- > Jason Stajich, PhD > Miller Research Fellow > University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Thu Oct 19 14:35:08 2006 From: cjfields at uiuc.edu (Chris Fields) 
Date: Thu, 19 Oct 2006 13:35:08 -0500 Subject: [Bioperl-l] bioperl-run executable In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk> Message-ID: <000301c6f3ad$4ec9af10$15327e82@pyrimidine> I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase but I'm not sure. I haven't used them very much myself but plan on making wrappers at some point soon for some programs I use. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: Nathan Haigh [mailto:N.Haigh at sheffield.ac.uk] > Sent: Thursday, October 19, 2006 1:15 PM > To: Chris Fields > Cc: 'bioperl-l' > Subject: bioperl-run executable > > I have a few questions about How bioperl-run modules. > > 1) How do modules define what the name of the executable is that it uses? > 2) Is there a way to test what this is? > 3) Does $factory->executable return this or does it only return the name > if it successfully found it? > > Thanks > Nath From N.Haigh at sheffield.ac.uk Thu Oct 19 14:47:01 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 19:47:01 +0100 Subject: [Bioperl-l] bioperl-run executable In-Reply-To: <000301c6f3ad$4ec9af10$15327e82@pyrimidine> References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine> Message-ID: <1161283620.4537c82501c43@webmail.shef.ac.uk> Quoting Chris Fields : > I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase > but I'm not sure. I haven't used them very much myself but plan on making > wrappers at some point soon for some programs I use. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > On closer inspection of a couple of other modules (Clustalw.pm and TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME and have a sub (program_name) that simply returns this value. 
I'd like to see program_name become a getter/setter so users can change the default and have the string stored in the factory object.

Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core, not bioperl-run? I suppose not, since bioperl-core is a prereq for bioperl-run, but wouldn't it make sense for it to go in bioperl-run?

Nath

From cjfields at uiuc.edu Thu Oct 19 15:07:05 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 14:07:05 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To:
Message-ID: <000701c6f3b1$c5914230$15327e82@pyrimidine>

Jason, Hilmar,

How about changing the default parsed dblink in SeqIO::genbank (line 520) to

    if( $dbsource =~ /^(\S*?)\s*accession\s+(\S+)\.(\d+)/ ) {
        my ($db,$id,$version) = ($1,$2,$3);
        $annotation->add_Annotation
            ('dblink',
             Bio::Annotation::DBLink->new
             (-primary_id => $id,
              -version    => $version,
              -database   => $db || 'GenBank',
              -tagname    => 'dblink'));
    }

It passes tests and catches the optional database ('embl' for the bugzilla report). The output sequence still doesn't print the DB if it isn't GenBank via write_seq(), but that shouldn't be too hard to fix (famous last words).

Okay, okay, back to the assays...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Thu Oct 19 14:48:28 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 11:48:28 -0700
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine> <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
Message-ID: <67650240-D61B-4842-AE7C-75F15F608F6F@bioperl.org>

program_name()
Should return the name of the program.

executable()
Is a function that you don't have to mess with; it tries to find the executable named by program_name() based on your PATH.

-jason

On Oct 19, 2006, at 11:14 AM, Nathan Haigh wrote:

> I have a few questions about How bioperl-run modules.
>
> 1) How do modules define what the name of the executable is that it
> uses?
> 2) Is there a way to test what this is?
> 3) Does $factory->executable return this or does it only return the
> name if it successfully found it?
>
> Thanks
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html

From jason at bioperl.org Thu Oct 19 17:06:43 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 14:06:43 -0700
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161283620.4537c82501c43@webmail.shef.ac.uk>
References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine> <1161283620.4537c82501c43@webmail.shef.ac.uk>
Message-ID:

It can be reset now, but of course this is not a very nice way of doing it:

    $Bio::Tools::Run::Alignment::Clustalw::PROGRAM_NAME = 'clustalw_smp';

I am not sure if there are pros and cons to making it a getter/setter, but if you want to run with it, please do. It has been hard to keep people adhering to a standard across the whole run system (and the standard has changed a bit), so some auditing is warranted.

-jason

On Oct 19, 2006, at 11:47 AM, Nathan Haigh wrote:

> Quoting Chris Fields :
>
>> I think a lot of the bioperl-run modules use
>> Bio::Tools::Run::WrapperBase
>> but I'm not sure. I haven't used them very much myself but plan
>> on making
>> wrappers at some point soon for some programs I use.
>>
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>
> On closer inspection of a couple of other modules (Clustalw.pm and
> TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME
> and have a sub
> (program_name) that simply returns this value. I'd like to see the
> program_name become a getter/setter so users can change the default
> and have the
> string stored in the factory object.
> > Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core > not bioperl-run? I suppose not since bioperl-core is a prerep for > bioperl-run but > wouldn't it make sence to go in bioperl-run? > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From torsten.seemann at infotech.monash.edu.au Thu Oct 19 19:24:03 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Fri, 20 Oct 2006 09:24:03 +1000 Subject: [Bioperl-l] test::more template In-Reply-To: <1161279505.4537b811e143f@webmail.shef.ac.uk> References: <002001c6f3a3$c00b9080$15327e82@pyrimidine> <1161279505.4537b811e143f@webmail.shef.ac.uk> Message-ID: <45380913.3070506@infotech.monash.edu.au> Nathan, > use strict; > use Bio::Root::IO; # cant test for this, might be needed to get Test::More use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway, and File::Spec is "guaranteed" to be installed with Perl 5.6+. > use lib Bio::Root::IO->catfile('t','lib'); Simpler as: use lib 't/lib'; I understand the 'lib.pm' accepts Unix style directories REGARDLESS of native platform. -- Torsten Seemann Victorian Bioinformatics Consortium, Monash University, Australia From prabubio at gmail.com Thu Oct 19 20:11:36 2006 From: prabubio at gmail.com (Prabu Raja) Date: 20 Oct 2006 00:11:36 -0000 Subject: [Bioperl-l] Prabu Raja sent you this link Message-ID: <20061020001136.86586.qmail@x05.namesdatabase.com> Remember your link from Prabu Raja: http://namesdatabase.com/2m.pl?k2=40892642637&s=nsender2 1 -> Use Prabu Raja's link by clicking above. 2 -> Enter your info for a membership connected to Prabu. 
3 -> Share links with other friends, family and co-workers. 4 -> Use the members-only people search tools. Prabu selected you for this on 09-02-2004 22:52 ET. prabubio at gmail.com (Prabu Raja) initiated this to bioperl-l at lists.open-bio.org at 10-05-2006 02:30 on namesdatabase.com from the IP address 59.92.88.99. If you do not know a Prabu Raja, use http://namesdatabase.com/u.pl?bb2=40892642637&s=nsender2 to halt more reminders about this. For reference, the address of The Names Database is 1253 N. Research Way, Suite Q-2500, Orem, UT 84097. From cjfields at uiuc.edu Thu Oct 19 20:29:11 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 19:29:11 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <45380913.3070506@infotech.monash.edu.au> Message-ID: <000f01c6f3de$c3d91170$15327e82@pyrimidine> > Nathan, > > > use strict; > > use Bio::Root::IO; # cant test for this, might be needed to get > Test::More > > use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway, > and File::Spec is "guaranteed" to be installed with Perl 5.6+. > > > use lib Bio::Root::IO->catfile('t','lib'); > > Simpler as: > use lib 't/lib'; > I understand the 'lib.pm' accepts Unix style directories REGARDLESS of > native > platform. > > -- > Torsten Seemann > Victorian Bioinformatics Consortium, Monash University, Australia That is true, at least for WinXP (not sure about older Windows versions out there). I was using 'Root::IO->catfile' but found 'use lib 't/lib' works. I may have a few of the 'catfile' versions floating around out there, which may be where that originated. Note that if you plan on using Test::More with the bioperl-run test suite, you should add it to the bioperl-run CVS distribution directory in 't/lib'. Most people will have it installed, but you never know. 
Chris From cjfields at uiuc.edu Thu Oct 19 20:33:22 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 19:33:22 -0500 Subject: [Bioperl-l] Prabu Raja sent you this link In-Reply-To: <20061020001136.86586.qmail@x05.namesdatabase.com> Message-ID: <001001c6f3df$598a24c0$15327e82@pyrimidine> That Prabu Raja sure gets around... Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Prabu Raja > Sent: Thursday, October 19, 2006 7:12 PM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Prabu Raja sent you this link > > Remember your link from Prabu Raja: > > http://namesdatabase.com/2m.pl?k2=40892642637&s=nsender2 > > > 1 -> Use Prabu Raja's link by clicking above. > > 2 -> Enter your info for a membership connected to Prabu. > > 3 -> Share links with other friends, family and co-workers. > > 4 -> Use the members-only people search tools. > > Prabu selected you for this on 09-02-2004 22:52 ET. > > > prabubio at gmail.com (Prabu Raja) initiated this to bioperl-l at lists.open- > bio.org > at 10-05-2006 02:30 on namesdatabase.com from the IP address 59.92.88.99. > If you do not know a Prabu Raja, use > http://namesdatabase.com/u.pl?bb2=40892642637&s=nsender2 to halt more > reminders about this. > For reference, the address of The Names Database is 1253 N. Research Way, > Suite Q-2500, Orem, UT 84097. 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From keithplayer at hotmail.com Thu Oct 19 22:13:52 2006
From: keithplayer at hotmail.com (Keith Player)
Date: Fri, 20 Oct 2006 02:13:52 +0000 (UTC)
Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning
Message-ID:

I know that there may be some changes resulting from new GFF3 implementations, but thought I would see if the following is useful anyway.

I implemented the R-tree binning schema as used by Bio::DB::GFF::Util::Binning and as mentioned in this article:

I tested the following query on a normal table (no binning), but it assumes that you know the longest range in the table. For example, take a table of human genes, where the longest gene we know of is around 2.4Mb:

SELECT COUNT(*) as count
FROM groups g
WHERE g.start > max(0,[start-2.4Mb])
AND g.start < [end]
AND g.end > [start]
AND g.chromosome = '1'

so for 100Mb:101Mb

SELECT COUNT(*) as count
FROM groups g
WHERE g.start > 97600000
AND g.start < 101000000
AND g.end > 100000000
AND g.chromosome = '1'

where [start] and [end] define the region of interest.

This query outperforms the R-Tree implementation on all tests that I have performed (for lengths of 200bp to 10Mb across a whole chromosome).

Could this be of some practical use?
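
The bounded-start query in the message above can be sketched in Perl roughly as follows. This is only an illustration under stated assumptions: the table and column names (groups, chromosome, start, end) are taken from the message and may need quoting in some SQL dialects, coordinates are assumed to be 1-based, and the helper names overlap_query and count_overlaps are hypothetical. The second sub applies the same filter to an in-memory list so the logic can be checked without a database.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Build the bounded-overlap query and its bind values.
# $max_len is the assumed length of the longest feature in the table;
# it lets the database restrict the scan to a narrow window of starts.
sub overlap_query {
    my ($chr, $qstart, $qend, $max_len) = @_;
    my $lower = max(0, $qstart - $max_len);
    my $sql = 'SELECT COUNT(*) AS count FROM groups '
            . 'WHERE chromosome = ? AND start > ? AND start < ? AND end > ?';
    return ($sql, $chr, $lower, $qend, $qstart);
}

# The same filter in plain Perl, for checking the logic on a small list.
sub count_overlaps {
    my ($features, $chr, $qstart, $qend, $max_len) = @_;
    my $lower = max(0, $qstart - $max_len);
    my @hits = grep {
        $_->{chromosome} eq $chr
            && $_->{start} > $lower
            && $_->{start} < $qend
            && $_->{end}   > $qstart
    } @$features;
    return scalar @hits;
}

# Made-up example features, not real gene coordinates.
my @features = (
    { chromosome => '1', start =>  99_500_000, end => 100_200_000 },  # spans the query start
    { chromosome => '1', start => 100_400_000, end => 100_500_000 },  # inside the query
    { chromosome => '1', start =>  90_000_000, end =>  90_100_000 },  # far upstream
    { chromosome => '2', start => 100_400_000, end => 100_500_000 },  # other chromosome
);

my ($sql, @bind) = overlap_query('1', 100_000_000, 101_000_000, 2_400_000);
print "$sql\n";
print "bind values: @bind\n";
print "overlapping features: ",
    count_overlaps(\@features, '1', 100_000_000, 101_000_000, 2_400_000), "\n";
```

The approach depends on the longest-feature bound being correct: if a feature longer than $max_len is ever loaded, it can be missed silently, which is presumably the trade-off against a binning scheme that needs no such bound.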
From jason at bioperl.org Thu Oct 19 11:50:49 2006 From: jason at bioperl.org (Jason Stajich) Date: Thu, 19 Oct 2006 08:50:49 -0700 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> Message-ID: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org> Well there is explicit addition of the version to the primary id so it isn't so much a parsing error as a deliberate decision to append it. see Bio::SeqIO::genbank to make the dblink $annotation- >add_Annotation ('dblink', Bio::Annotation::DBLink->new (-primary_id => $id . "." . $version, -version => $version, -database => $db, -tagname => 'dblink')); and the code to print the dblink back out in the writer already assumes the version number is appended... foreach my $ref ( $seq->annotation->get_Annotations ('dblink') ) { # if ($ref->comment eq 'DBSOURCE') { $self->_print('DBSOURCE accession ', $ref->primary_id, "\n"); # } } On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > Here is the overload code: > > use overload '""' => sub { > (($_[0]->database ? $_[0]->database . ':' : '' ) > . ($_[0]->primary_id ? $_[0]->primary_id : '') > . ($_[0]->version ? '.' . $_[0]->version : '')) > || '' }; > > Except that the last '||' is redundant and unnecessary (it either > does nothing or replaces an empty string with an empty string), I > don't see the potential for duplicating the version number here - > unless primary_id() did that, which I don't see it doing. > > So, to me this seems to come from a parsing error in the beginning, > rather than an erroneous mangling of version into primary_id later. > > Is someone in the position to confirm this? > > -hilmar > > On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > >> So I'm unsure what we should do here. 
>> >> We can certainly fix the problem which you report which is relying on >> the "" method -- if you were to do instead: >> print $_->database, ":", $_->primary_id, "\n"; >> >> you'll get the right answer. We at a minimum just fix the auto- >> string converting method to do The Right Thing. >> >> But I am not sure if we should keep the version out of the primary_id >> field. This will require some rejiggering in several modules when it >> comes to printing DBlinks and I don't want to do this before the >> release. I also am not sure if there was an explicit reason why >> someone did put the version information in the primary_id. (I hope it >> wasn't me because I don't think I'm going to remember why). >> >> Does anyone else have a strong feeling? >> >> -jason >> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: >> >>> Hello, >>> >>> I noticed a little problem with the Annotation "DBLink" from >>> GenBank entries >>> >>> When I run: >>> >>> perl -MBio::DB::GenBank -e 'my $gi = >>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my >>> $seqio = >>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >>> ("dblink"); >>> for(@annotations) { print $_, "\n";} print $INC{ >>> "Bio/Annotation/DBLink.pm" }, "\n"; ' >>> >>> This yields: >>> >>> GenBank:AL591065.17.17 >>> >>> and the place where the used Bio/Annotation/DBLink.pm resides. >>> >>> Can others repeat this? >>> >>> I have dug into the source a little and Bio::Annotation::DBLink >>> seems to >>> be the place where this happens: it has a concatenation which >>> leads to >>> that repeated version number. >>> >>> It this something that I should fix "client-side", so to speak, or >>> is it >>> worthwhile to add some logic to that concatenation to prevent this? 
>>> >>> >>> Thanks, >>> >>> Eric >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> Jason Stajich, PhD >> Miller Research Fellow >> University of California >> Dept of Plant and Microbial Biology >> 321 Koshland Hall #3102 >> Berkeley, CA 94720-3102 >> lab: 510.642.8441 >> http://pmb.berkeley.edu/~taylor/people/js.html >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From n.haigh at sheffield.ac.uk Fri Oct 20 04:35:03 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 20 Oct 2006 08:35:03 +0000 Subject: [Bioperl-l] test::more template In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> Message-ID: <45388A37.7040505@sheffield.ac.uk> Chris Fields wrote: >> Nathan, >> >> >>> use strict; >>> use Bio::Root::IO; # cant test for this, might be needed to get >>> >> Test::More >> >> use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway, >> and File::Spec is "guaranteed" to be installed with Perl 5.6+. >> >> >>> use lib Bio::Root::IO->catfile('t','lib'); >>> >> Simpler as: >> use lib 't/lib'; >> I understand the 'lib.pm' accepts Unix style directories REGARDLESS of >> native >> platform. 
>>
>> --
>> Torsten Seemann
>> Victorian Bioinformatics Consortium, Monash University, Australia
>>
>
> That is true, at least for WinXP (not sure about older Windows versions out
> there). I was using 'Root::IO->catfile' but found 'use lib 't/lib' works.
> I may have a few of the 'catfile' versions floating around out there, which
> may be where that originated.
>
> Note that if you plan on using Test::More with the bioperl-run test suite,
> you should add it to the bioperl-run CVS distribution directory in 't/lib'.
> Most people will have it installed, but you never know.
>
> Chris
>
>
>

What is the reason for including Test::More in 't/lib' rather than having it as a prereq?

--
> A: Yes.
>> Q: Are you sure?
>>
>>> A: Because it reverses the logical flow of conversation.
>>>
>>>> Q: Why is top posting frowned upon?
>>>>

From n.haigh at sheffield.ac.uk Fri Oct 20 05:27:19 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 20 Oct 2006 10:27:19 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
Message-ID: <45389677.1000709@sheffield.ac.uk>

Is it really necessary to specify the number of tests that are to be conducted in advance? It seems a bit annoying to have to count the number of tests in the script, or to run the test just to see how many tests were done. We could just use:

    use Test::More 'no_plan';

And then it's up to Test::More to keep track of how many tests it has run. The only thing then to worry about is how many tests are in a SKIP block if the skip criteria are met.

That is, unless there is a good reason to specify the number that I am unaware of.
Thanks Nath From bix at sendu.me.uk Fri Oct 20 06:01:09 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 11:01:09 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45389677.1000709@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45389677.1000709@sheffield.ac.uk> Message-ID: <45389E65.6080908@sendu.me.uk> Nathan Haigh wrote: > Is it really necessary to specify the number of tests that are to be > conducted in advance? It seems a bit annoying to have to count the > number of tests in the script or to run the test just to see how many > tests were done, we could just use: > use Test::More 'no_plan'; It's very important to have a plan. That way you know all the tests actually ran and weren't skipped (either due to an actual SKIP block or an if block that returned false due to a bug, or a for/foreach/while that didn't loop enough times due to a bug, or any number of other reasons). From bix at sendu.me.uk Fri Oct 20 06:04:48 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 11:04:48 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45388A37.7040505@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45388A37.7040505@sheffield.ac.uk> Message-ID: <45389F40.5060601@sendu.me.uk> Nathan S. Haigh wrote: > Chris Fields wrote: > >> Note that if you plan on using Test::More with the bioperl-run test suite, >> you should add it to the bioperl-run CVS distribution directory in 't/lib'. >> Most people will have it installed, but you never know. > > What is the reason for including Test::More in 't/lib' rather than > having it as a prereq? Because we want to ensure that the test suite runs and tells you real problems (if any) about the code (Bioperl) that it is testing, not problems about actually running the tests (which are NOT required for using Bioperl, so cannot be considered 'pre-requisites'). 
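
A minimal sketch of the points raised so far in this thread: an explicit plan, a SKIP block gated on BIOPERLDEBUG, and a fall-back to a copy of Test::More bundled in t/lib. The test names, the plan of four tests and the placeholder $remote_ok are assumptions made for illustration; they are not taken from any existing Bioperl test file.

```perl
use strict;
use warnings;

BEGIN {
    # Prefer an installed Test::More; otherwise fall back to a copy shipped
    # in t/lib (assumes the distribution bundles one there).
    eval { require Test::More; };
    if ($@) {
        require lib;
        lib->import('t/lib');    # lib.pm accepts Unix-style paths
        require Test::More;
    }
    # Explicit plan: the harness reports an error if fewer or more
    # than 4 tests actually run.
    Test::More->import(tests => 4);
}

# Two local tests that always run.
ok(1, 'test script compiled');
is(lc('ACGT'), 'acgt', 'local test, no network needed');

SKIP: {
    # The count passed to skip() must match the number of tests in this
    # block, so that the plan still adds up when they are skipped.
    skip('set BIOPERLDEBUG=1 to run remote database tests', 2)
        unless $ENV{BIOPERLDEBUG};

    my $remote_ok = 1;    # hypothetical stand-in for a real remote query
    ok($remote_ok, 'remote server answered');
    ok($remote_ok, 'remote record parsed');
}
```

With 'no_plan' in place of the explicit count, a loop that silently ran too few iterations would still report success, which is the failure mode the plan is meant to catch.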
From n.haigh at sheffield.ac.uk Fri Oct 20 06:54:30 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Fri, 20 Oct 2006 11:54:30 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45389E65.6080908@sendu.me.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk> Message-ID: <4538AAE6.5070600@sheffield.ac.uk> If there are known bugs in a particular version of software, what is the best approach for dealing with tests that would fail due to this bug? Simply skip those tests that would be affected by the bug, or to fail if the affected version is detected and report the reason so the user is informed? Or simply bump the minimum version to one above the affected versions? For example, t/Clustalw has a test for at least version 1.8. It then has some profile alignment tests that are only run if version > 1.82 is installed. It states that versions 1.81 and 1.82 are affected by a profile alignment bug - which i assume would make the tests fail. Cheers Nath From bix at sendu.me.uk Fri Oct 20 07:06:07 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 12:06:07 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <4538AAE6.5070600@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk> <4538AAE6.5070600@sheffield.ac.uk> Message-ID: <4538AD9F.8040003@sendu.me.uk> Nathan Haigh wrote: > If there are known bugs in a particular version of software, what is the > best approach for dealing with tests that would fail due to this bug? > Simply skip those tests that would be affected by the bug, or to fail if > the affected version is detected and report the reason so the user is > informed? Or simply bump the minimum version to one above the affected > versions? > > For example, t/Clustalw has a test for at least version 1.8. 
It then has > some profile alignment tests that are only run if version > 1.82 is > installed. It states that versions 1.81 and 1.82 are affected by a > profile alignment bug - which i assume would make the tests fail. Specific cases like this, I'd discuss on the list/ with the author of the module in question. Maybe there is some great need to allow usage with <1.81? My view, based purely on what you've said above, bump the pre-requisite to a version that works. From cjfields at uiuc.edu Fri Oct 20 08:36:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 07:36:37 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <45388A37.7040505@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45388A37.7040505@sheffield.ac.uk> Message-ID: <80A2D210-B0DB-4CD2-9B56-A38097F4F63F@uiuc.edu> >> ,,, >> > What is the reason for including Test::More in 't/lib' rather than > having it as a prereq? We could do that. Many CPAN modules include it in 't/lib' b/c it is only needed for testing purposes. Chris > > -- >> A: Yes. >>> Q: Are you sure? >>> >>>> A: Because it reverses the logical flow of conversation. >>>> >>>>> Q: Why is top posting frowned upon? >>>>> > Get Thunderbird Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Oct 20 10:44:29 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 15:44:29 +0100 Subject: [Bioperl-l] Updated Makefile.PL Message-ID: <4538E0CD.1030908@sendu.me.uk> Hi, I've just committed an updated Makefile.PL to HEAD for bioperl-live. Could some people test it on multiple platforms and confirm it is ok (try out the different possible options as well)? (NB. 
in the below, 'pre-reqs' are things the makefile considers optional dependencies) Note that some pre-reqs have been removed: # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end up requiring it but only after the user makes an explicit choice by typing 'DBD::mysql' in their own code to supply as an option to Bioperl code) # File::Temp (standard in 5.6.1) This pre-req was wrong: # Data::Stag::Writer and has been replaced with: Data::Stag::XMLWriter Also, I note that very many Bioperl modules need IO::String, including Bio::SeqIO, so I'm not sure to what extent we can pretend it is an optional module. I didn't make any change though. I don't know if these changes affect the Windows ppm Nathan, or anything else (Bundle?)? The INSTALL docs need updating with these new and improved pre-reqs (note that some pre-reqs had wrong/not enough Bioperl modules listed as needing them); does someone want to correct the wiki (based on the new Makefile.PL) and then Chris can re-create the text version? From hlapp at gmx.net Fri Oct 20 11:03:34 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 20 Oct 2006 11:03:34 -0400 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <4538E0CD.1030908@sendu.me.uk> References: <4538E0CD.1030908@sendu.me.uk> Message-ID: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote: > Also, I note that very many Bioperl modules need IO::String, including > Bio::SeqIO, so I'm not sure to what extent we can pretend it is an > optional module. I didn't make any change though. I agree. There's really not that many terribly useful things you can do with Bioperl w/o having IO::String installed, which is in stark contrast to many other dependencies. I don't have a problem with making it (and a few others used all over the place) required, to better contrast them with the dependencies that are really optional (and not needed for 90% of users). 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Oct 20 11:18:32 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 10:18:32 -0500 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <4538E0CD.1030908@sendu.me.uk> Message-ID: <001501c6f45b$019103c0$15327e82@pyrimidine> > Hi, > I've just committed an updated Makefile.PL to HEAD for bioperl-live. > Could some people test it on multiple platforms and confirm it is ok > (try out the different possible options as well)? > > (NB. in the below, 'pre-reqs' are things the makefile considers optional > dependencies) > > Note that some pre-reqs have been removed: > # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end > up requiring it but only after the user makes an explicit choice by > typing 'DBD::mysql' in their own code to supply as an option to Bioperl > code) > # File::Temp (standard in 5.6.1) I'll try it out on WinXP and Mac OS X. BTW, do any of Lincoln's Bio::DB* use DBD::mySQL? Bio::DB::GFF comes to mind. I don't think it should be an absolute requirement, though. If we plan on removing those, then we should also remove them from Bundle::Bioperl (if they are present). > This pre-req was wrong: > # Data::Stag::Writer > and has been replaced with: > Data::Stag::XMLWriter > > > Also, I note that very many Bioperl modules need IO::String, including > Bio::SeqIO, so I'm not sure to what extent we can pretend it is an > optional module. I didn't make any change though. Do they all require IO::String or is it an option? There are a few instances (WebDBSeqI-implementing, for instance) where this is presented as an option for most OS's (along with the default, pipeline, and tempfile). However, it is currently used by default with Windows due to lack of pipe/fork support at the time. 
BTW, the latter may now work with WinXP ActivePerl. ActiveState has been
working on WinXP fork() emulation for a while, but I think it is still
somewhat experimental.

> I don't know if these changes affect the Windows ppm Nathan, or anything
> else (Bundle?)?
>
> The INSTALL docs need updating with these new and improved pre-reqs
> (note that some pre-reqs had wrong/not enough Bioperl modules listed as
> needing them); does someone want to correct the wiki (based on the new
> Makefile.PL) and then Chris can re-create the text version?

Easier to just modify the text version based on what is changed in the
wiki, at least for the time being. The text dumping from elinks/lynx isn't
foolproof re: tables and such, which is one reason I think we should move
the prereqs to a separate file, as it's easier to maintain long-term (this
seems to be where most changes occur anyway).

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From bix at sendu.me.uk Fri Oct 20 11:23:38 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 16:23:38 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk>
References: <45375615.1020603@sheffield.ac.uk> <45379BBB.1040400@sheffield.ac.uk> <1161270180.453793a432e4f@webmail.shef.ac.uk>
Message-ID: <4538E9FA.60701@sendu.me.uk>

Nathan Haigh wrote:
> I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be
> consistent with other tests.
>
> Failing that - Is there a good test writing style I should follow in one of the other test files?
I originally based mine on one of Chris's EUtilities tests, but now
refer to t/ESEfinder.t since it is small and demonstrates all the major
tricky things you might have to do - skip remote tests if no
BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests
under some condition, fall-back to t/lib for Test::More if necessary.

(Though I just spotted an oops in the latter...)

From cjfields at uiuc.edu Fri Oct 20 11:38:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 10:38:02 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <4538E9FA.60701@sendu.me.uk>
Message-ID: <001601c6f45d$bb824350$15327e82@pyrimidine>

> Nathan Haigh wrote:
> > I thought I'd have my first proper try at writing some tests. I was
> wondering if there is a template test file that I should use/study in
> order to be
> > consistent with other tests.
> >
> > Failing that - Is there a good test writing style I should follow in one
> of the other test files?
>
> I originally based mine on one of Chris's EUtilities tests, but now
> refer to t/ESEfinder.t since it is small and demonstrates all the major
> tricky things you might have to do - skip remote tests if no
> BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests
> under some condition, fall-back to t/lib for Test::More if necessary.
>
> (Though I just spotted an oops in the latter...)

I agree. The EUtilities tests are quite long. I plan on eventually cutting
out some of them. Making them somewhat less prone to changes in returned
XML data has also been a pain, as demonstrated by some of the tests from
MAIN now failing... d'oh!

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Oct 20 11:39:32 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 16:39:32 +0100 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <001501c6f45b$019103c0$15327e82@pyrimidine> References: <001501c6f45b$019103c0$15327e82@pyrimidine> Message-ID: <4538EDB4.3030500@sendu.me.uk> Chris Fields wrote: > BTW, do any of Lincoln's Bio::DB* > use DBD::mySQL? Bio::DB::GFF comes to mind. No, just a require on a user-passed variable as I described. >> Also, I note that very many Bioperl modules need IO::String, including >> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an >> optional module. I didn't make any change though. > > Do they all require IO::String or is it an option? Oops, I take that back. Bio::SeqIO doesn't use IO::String. That's what you get for relying on grep output... It's still many modules that use it, but I suppose you could do useful things without. So actually, let's keep it optional. From cjfields at uiuc.edu Fri Oct 20 16:32:32 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 15:32:32 -0500 Subject: [Bioperl-l] Error retrieving sequence from BioSQL Message-ID: <000001c6f486$df508930$15327e82@pyrimidine> Seth, Did you work out the problem here? There was a recent CVS update to OBDA tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests apparently left data from tests in the database, which caused problems with repeated test runs. Chris > > -----Original Message----- > > From: bioperl-l-bounces lists.open-bio.org [mailto:bioperl-l- > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > Sent: Saturday, September 30, 2006 6:35 PM > > To: Hilmar Lapp > > Cc: Chris Fields; Bioperl List > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > Here're complete test details: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > ... 
> > > FAILED tests 10-12
> > Failed 3/12 tests, 75.00% okay
> > Failed Test      Stat Wstat Total Fail  Failed  List of Failed
> > -------------------------------------------------------------------------------
> > t\02species.t                  65    2   3.08%  63 65
> > t\03simpleseq.t     1   256    59  106 179.66%  7-59
> > t\04swiss.t                    52   14  26.92%  25 27-34 38-42
> > t\12ontology.t      2   512   738 1471 199.32%  3-738
> > t\16obda.t                     12    3  25.00%  10-12
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >

--
Best Regards,

Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From olenka.m at gmail.com Fri Oct 20 17:47:15 2006
From: olenka.m at gmail.com (Olena Morozova)
Date: Fri, 20 Oct 2006 14:47:15 -0700
Subject: [Bioperl-l] GO annotations
Message-ID: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com>

Dear all,

Does anyone know an easy way to get GO-BP annotations for ensembl genes?

Thank you very much for your help,
Olena

From sdavis2 at mail.nih.gov Sat Oct 21 11:05:26 2006
From: sdavis2 at mail.nih.gov (Davis, Sean (NIH/NCI) [E])
Date: Sat, 21 Oct 2006 11:05:26 -0400
Subject: [Bioperl-l] GO annotations
References: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com>
Message-ID: <014DBF86B19310419F0DF8910FC56457240CE3@nihcesmlbx10.nih.gov>

You can use the ensembl perl API, or (more simply) use the Ensembl MART
interface:

http://www.ensembl.org/Multi/martview

Sean

-----Original Message-----
From: Olena Morozova [mailto:olenka.m at gmail.com]
Sent: Fri 10/20/2006 5:47 PM
To: bioperl-l
Subject: [Bioperl-l] GO annotations

Dear all,

Does anyone know an easy way to get GO-BP annotations for ensembl genes?

Thank you very much for your help,
Olena
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l

From n.haigh at sheffield.ac.uk Sun Oct 22 06:34:51 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 22 Oct 2006 10:34:51 +0000
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
References: <4538E0CD.1030908@sendu.me.uk> <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
Message-ID: <453B494B.7040702@sheffield.ac.uk>

Hilmar Lapp wrote:
> On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote:
>
>
>> Also, I note that very many Bioperl modules need IO::String, including
>> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
>> optional module. I didn't make any change though.
>>
>
> I agree. There's really not that many terribly useful things you can
> do with Bioperl w/o having IO::String installed, which is in stark
> contrast to many other dependencies.
>
> I don't have a problem with making it (and a few others used all over
> the place) required, to better contrast them with the dependencies
> that are really optional (and not needed for 90% of users).
>
> -hilmar
>
>

Is it possible to make a distinction in Makefile.PL between those
modules that are an absolute must for Bioperl-core and those which are
optional and should go into Bundle::BioPerl?

Once I'm sure what should be "option" I'll do the Bundle::BioPerl
package and PPD's.

Cheers
Nath

From vitacolonna at appliedgenomics.org Sun Oct 22 09:04:48 2006
From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna)
Date: Sun, 22 Oct 2006 15:04:48 +0200
Subject: [Bioperl-l] Submission proposal: ABIF module
Message-ID: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>

Hi everybody,
I would like to submit to CPAN a module for reading and parsing the
ABIF files (with .ab1 suffix) produced by Applied Biosystems
sequencers. The need for such a module arose in our lab because the
existing ABI module we found on CPAN had too limited functionality.
As an example, our module allows us to easily produce analysis
reports similar to the ones generated by the Sequencing Analysis
software.

May I call the module Bio::ABIF? Or should I follow other conventions?

Nicola

From cjfields at uiuc.edu Sun Oct 22 09:54:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 08:54:51 -0500
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
Message-ID:

On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:

> Hi everybody,
> I would like to submit to CPAN a module for reading and parsing the
> ABIF files (with .ab1 suffix) produced by Applied Biosystems
> sequencers. The need for such a module arose in our lab because the
> existing ABI module we found on CPAN had too limited functionality.
> As an example, our module allows us to easily produce analysis
> reports similar to the ones generated by the Sequencing Analysis
> software.
>
> May I call the module Bio::ABIF? Or should I follow other conventions?
> > Nicola It depends. Does it interact with bioperl in any way? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Oct 22 09:57:18 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 22 Oct 2006 08:57:18 -0500 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <453B494B.7040702@sheffield.ac.uk> References: <4538E0CD.1030908@sendu.me.uk> <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> <453B494B.7040702@sheffield.ac.uk> Message-ID: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu> On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote: > Is it possible to make a distinction in Makefile.PL between those > modules that are an absolute must for Bioperl-core and those which are > optional and should go into Bundle::BioPerl? > > Once I'm sure what should be "option" I'll do the Bundle::BioPerl > package and PPD's. > > Cheers > Nath We probably should steer this way eventually. Do you aim on placing prereqs required for bioperl core in the bioperl PPD and the 'optional' ones with the bundle? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From vitacolonna at appliedgenomics.org Sun Oct 22 10:16:26 2006 From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna) Date: Sun, 22 Oct 2006 16:16:26 +0200 Subject: [Bioperl-l] Submission proposal: ABIF module In-Reply-To: References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> Message-ID: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org> On 22/ott/06, at 15:54, Chris Fields wrote: > > On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote: > >> Hi everybody, >> I would like to submit to CPAN a module for reading and parsing the >> ABIF files (with .ab1 suffix) [...] >> May I call the module Bio::ABIF? Or should I follow other >> conventions? > > It depends. Does it interact with bioperl in any way? No. 
Can you suggest a suitable pattern for the name?

Nicola

From cjfields at uiuc.edu Sun Oct 22 10:55:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 09:55:46 -0500
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>
Message-ID:

On Oct 22, 2006, at 9:16 AM, Nicola Vitacolonna wrote:

> On 22/ott/06, at 15:54, Chris Fields wrote:
>
>>
>> On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:
>>
>>> Hi everybody,
>>> I would like to submit to CPAN a module for reading and parsing the
>>> ABIF files (with .ab1 suffix) [...]
>>> May I call the module Bio::ABIF? Or should I follow other
>>> conventions?
>>
>> It depends. Does it interact with bioperl in any way?
>
> No. Can you suggest a suitable pattern for the name?
>
> Nicola

I don't think it will be a problem to name it Bio::ABIF; there is
already a Bio::ASN1::EntrezGene, and Rutger Vos's Bio::Phylo modules
(the latter doesn't require BioPerl either).

Saying that, if you plan on contributing more CPAN modules with similar
functionality (such as parsing other trace files), you might want to
consider using a namespace that isn't limiting but doesn't conflict with
Bioperl core (like Bio::Trace or similar, then name your module
Bio::Trace::ABIF). You can use search.cpan.org to check namespaces for
conflicts.

Just as a note: we have bioperl-ext, which also parses ABI and other
trace file formats. It's a bit old now and needs updating, but is
supposed to be quite fast (it uses the Staden io_lib C library via
PerlXS).

-c

Christopher Fields
Postdoctoral Researcher
Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From arareko at campus.iztacala.unam.mx Sun Oct 22 13:26:37 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Sun, 22 Oct 2006 12:26:37 -0500 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <4538E0CD.1030908@sendu.me.uk> References: <4538E0CD.1030908@sendu.me.uk> Message-ID: <453BA9CD.4060107@campus.iztacala.unam.mx> Works fine on FreeBSD. Mauricio. Sendu Bala wrote: > Hi, > I've just committed an updated Makefile.PL to HEAD for bioperl-live. > Could some people test it on multiple platforms and confirm it is ok > (try out the different possible options as well)? > > (NB. in the below, 'pre-reqs' are things the makefile considers optional > dependencies) > > Note that some pre-reqs have been removed: > # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end > up requiring it but only after the user makes an explicit choice by > typing 'DBD::mysql' in their own code to supply as an option to Bioperl > code) > # File::Temp (standard in 5.6.1) > > > This pre-req was wrong: > # Data::Stag::Writer > and has been replaced with: > Data::Stag::XMLWriter > > > Also, I note that very many Bioperl modules need IO::String, including > Bio::SeqIO, so I'm not sure to what extent we can pretend it is an > optional module. I didn't make any change though. > > > I don't know if these changes affect the Windows ppm Nathan, or anything > else (Bundle?)? > > The INSTALL docs need updating with these new and improved pre-reqs > (note that some pre-reqs had wrong/not enough Bioperl modules listed as > needing them); does someone want to correct the wiki (based on the new > Makefile.PL) and then Chris can re-create the text version? 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From n.haigh at sheffield.ac.uk Sun Oct 22 15:37:07 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 22 Oct 2006 20:37:07 +0100
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>
References: <4538E0CD.1030908@sendu.me.uk> <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> <453B494B.7040702@sheffield.ac.uk> <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>
Message-ID: <453BC863.4090803@sheffield.ac.uk>

Chris Fields wrote:
>
> On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote:
>
>> Is it possible to make a distinction in Makefile.PL between those
>> modules that are an absolute must for Bioperl-core and those which are
>> optional and should go into Bundle::BioPerl?
>>
>> Once I'm sure what should be "option" I'll do the Bundle::BioPerl
>> package and PPD's.
>>
>> Cheers
>> Nath
>
> We probably should steer this way eventually. Do you aim on placing
> prereqs required for bioperl core in the bioperl PPD and the
> 'optional' ones with the bundle?
>

That's correct. However, PPM will always try to update packages to the
latest available. Therefore, if at some point in the future, a
dependency is removed, and thus removed from Bundle::BioPerl, a
situation may arise where an older version of BioPerl is running with a
recent version of Bundle::BioPerl and could have missing dependencies -
not ideal but it is how things currently stand.

The process of making the Bundle::BioPerl PPD would be simplified if
these "optional" dependencies are separated from the "core" dependencies.
If one of the following solutions is possible (I'm not sure if they
are), it would be very useful:

1) Maintain 2 hashes in Makefile.PL that contain the "core" and
"optional" dependencies. I'm unsure of the way dependencies are ordered
during a "make ppd", but it may be possible to pass hash references of
both to PREREQ_PM in WriteMakefile() and have the "optional"
dependencies grouped separately from "core" dependencies in the ppd
file - thus making it easy to strip them out into a Bundle::BioPerl ppd.

2) Again, maintain 2 hashes in Makefile.PL that contain the "core" and
"optional" dependencies. Have some Makefile setup that allows the
generation of a Bundle::BioPerl ppd separately from the main Bioperl ppd.

Like I said, these are just some thoughts and I'm not sure if they are
even viable options.

Nath

From chhalling at alumni.ls.berkeley.edu Sun Oct 22 19:45:33 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Sun, 22 Oct 2006 19:45:33 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
Message-ID: <453C029D.1070708@alumni.ls.berkeley.edu>

I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006
that prevent these modules from being installed:

Data::Stag::Writer (listed as Data::Stag::writer)
HTTP::Request::Common (listed as HTTP::Request::Common-)
Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel)

--
Conrad Halling
chhalling at alumni.ls.berkeley.edu

From cjfields at uiuc.edu Sun Oct 22 22:24:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 21:24:07 -0500
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
Message-ID: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>

Thanks for letting us know! Did PPM4 throw errors or just silently
pass them over?
Chris On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote: > I have found three misspellings in Bundle::BioPerl 2.1.6 of 17- > Oct-2006 > that prevent these modules from being installed: > > Data::Stag::Writer (listed as Data::Stag::writer) > HTTP::Request::Common (listed as HTTP::Request::Common-) > Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel) > > -- > Conrad Halling > chhalling at alumni.ls.berkeley.edu > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Mon Oct 23 02:45:29 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 06:45:29 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> Message-ID: <453C6509.90005@sheffield.ac.uk> Chris Fields wrote: > Thanks for letting us know! Did PPM4 throw errors or just silently > pass them over? > > Chris > > On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote: > > I believe he is talking about the bundle on cpan and not the ppd. I will get this updated as soon as possible. Sendu/Chris - can you confirm to me which Bioperl modules are essential to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any reason for not putting *all* dependencies into the bundle? 
Nath From bix at sendu.me.uk Mon Oct 23 02:43:36 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 07:43:36 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> Message-ID: <453C6498.5@sendu.me.uk> Conrad Halling wrote: > I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 > that prevent these modules from being installed: > > Data::Stag::Writer (listed as Data::Stag::writer) This should be Data::Stag::XMLWriter > HTTP::Request::Common (listed as HTTP::Request::Common-) > Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel) From bix at sendu.me.uk Mon Oct 23 02:52:47 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 07:52:47 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C6509.90005@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> Message-ID: <453C66BF.1060008@sendu.me.uk> Nathan S. Haigh wrote: > Sendu/Chris - can you confirm to me which Bioperl modules are essential > to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any > reason for not putting *all* dependencies into the bundle? AFAIK, there are no essential external dependencies. Everything in %packages in Makefile.PL, for example, is optional. We had the discussion about making all the easy-to-install ones a forced requirement anyway (so that most things work out of the box), but perhaps we'll hold off on making such a change until after 1.5.2. From jyotikshah at gmail.com Mon Oct 23 03:10:43 2006 From: jyotikshah at gmail.com (Jyoti Shah) Date: Mon, 23 Oct 2006 00:10:43 -0700 Subject: [Bioperl-l] short motif searches Message-ID: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com> Hi, I am interested in searching motifs as small as 6 or 7 nucleotides in genomic databases. 
I need exact matches. Is there any bioperl module available which can help me do this? I tried WU BLAST with word size one, but I am getting warning messages such as "WARNING: the maximum achievable score of 7 in context 0 (frame +1) is less than the ungapped cutoff score S2 (=13). Exit code 0...". Any suggestions? Thanks in advance, Jyoti From bix at sendu.me.uk Mon Oct 23 03:55:40 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 08:55:40 +0100 Subject: [Bioperl-l] short motif searches In-Reply-To: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com> References: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com> Message-ID: <453C757C.1010408@sendu.me.uk> Jyoti Shah wrote: > Hi, > > I am interested in searching motifs as small as 6 or 7 nucleotides in > genomic databases. I need exact matches. Is there any bioperl module > available which can help me do this? At 6 or 7bp long doing a simple exact match I should point out you're going to get very many hits; are you sure this is an appropriate thing to do for your purposes? Assuming yes, you can use Bio::SeqIO, Bio::Index or Bio::DB:: to get your genomic sequences of interest, then simply use a normal perl regexp on the resulting $seq->seq strings. If your motifs are anything like transcription factor binding sites, and you have more information than just a single sequence string for the motif, investigate Bio::Matrix::PSM. From bix at sendu.me.uk Mon Oct 23 04:29:52 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 09:29:52 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C7648.8030004@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> Message-ID: <453C7D80.80207@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: >> Nathan S. 
Haigh wrote: >>> Sendu/Chris - can you confirm to me which Bioperl modules are essential >>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any >>> reason for not putting *all* dependencies into the bundle? >> AFAIK, there are no essential external dependencies. Everything in >> %packages in Makefile.PL, for example, is optional. >> >> We had the discussion about making all the easy-to-install ones a >> forced requirement anyway (so that most things work out of the box), >> but perhaps we'll hold off on making such a change until after 1.5.2. > > How are they forced? They're not. Right now they're optional. I'm suggesting we might change that in the future. If you're asking how we /would/ force them, probably by adding PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs successfully (or should!) without its optional dependencies given in PREREQ_PM because make test succeeds (because tests skip ok when the optional dependency isn't there). I don't really know how CPAN discovers dependencies and auto-installs them before a dependent module though. Anyone care to explain? From n.haigh at sheffield.ac.uk Mon Oct 23 06:09:12 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 10:09:12 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C7D80.80207@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> Message-ID: <453C94C8.5040900@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> Sendu Bala wrote: >>> Nathan S. Haigh wrote: >>>> Sendu/Chris - can you confirm to me which Bioperl modules are >>>> essential >>>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any >>>> reason for not putting *all* dependencies into the bundle? 
>>> AFAIK, there are no essential external dependencies. Everything in >>> %packages in Makefile.PL, for example, is optional. >>> >>> We had the discussion about making all the easy-to-install ones a >>> forced requirement anyway (so that most things work out of the box), >>> but perhaps we'll hold off on making such a change until after 1.5.2. > > >> How are they forced? > > They're not. Right now they're optional. I'm suggesting we might > change that in the future. > If you're asking how we /would/ force them, probably by adding > PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs > successfully (or should!) without its optional dependencies given in > PREREQ_PM because make test succeeds (because tests skip ok when the > optional dependency isn't there). > > I don't really know how CPAN discovers dependencies and auto-installs > them before a dependent module though. Anyone care to explain? I thought so! I misunderstood something earlier which confused me. Just to clarify for my own sanity's sake: 1) Currently all dependencies are optional. 2) All dependencies are in %packages. 3) All of these are passed to PREREQ_PM. As far as CPAN discovering dependencies, here is a snip from the CPAN FAQ: --snip-- I installed a Bundle and had a couple of fails. When I retried, everything resolved nicely. Can this be fixed to work on first try? The reason for this is that CPAN does not know the dependencies of all modules when it starts out. To decide about the additional items to install, it just uses data found in the META.yml file or the generated Makefile. An undetected missing piece breaks the process. But it may well be that your Bundle installs some prerequisite later than some depending item and thus your second try is able to resolve everything. Please note, CPAN.pm does not know the dependency tree in advance and cannot sort the queue of things to install in a topologically correct order.
It resolves perfectly well IF all modules declare the prerequisites correctly with the PREREQ_PM attribute to MakeMaker or the |requires| stanza of Module::Build. For bundles which fail and you need to install often, it is recommended to sort the Bundle definition file manually. --snip-- Therefore, recent modifications to Makefile.PL should result in a fully operational Bioperl installation if installed via CPAN, although only Bioperl 1.4 is currently available via CPAN. It is possible to upload a developer release to CPAN which can only be downloaded via CPAN if specifically asked for; this would be good for 1.5.x: --snip-- How do I install a "DEVELOPER RELEASE" of a module? By default, CPAN will install the latest non-developer release of a module. If you want to install a dev release, you have to specify the partial path starting with the author id to the tarball you wish to install, like so: cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz Note that you can use the |ls| command to get this path listed. --snip-- HTH Nath From bix at sendu.me.uk Mon Oct 23 05:41:52 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 10:41:52 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C94C8.5040900@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> Message-ID: <453C8E60.7000105@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: > >> I don't really know how CPAN discovers dependencies and auto-installs >> them before a dependent module though. Anyone care to explain? > > I thought so! I misunderstood something earlier which confused me. Just > to clarify for my own sanity's sake: > > 1) Currently all dependencies are optional.
> 2) All dependencies are in %packages > 3) all these are passed to PREREQ_PM All correct. > As far as CPAN discovering dependencies, here is a snip from the CPAN FAQ's: > --snip-- > > I installed a Bundle and had a couple of fails. When I retried, > everything resolved nicely. Can this be fixed to work on first try? > > The reason for this is that CPAN does not know the dependencies of > all modules when it starts out. To decide about the additional items > to install, it just uses data found in the META.yml file or the > generated Makefile. An undetected missing piece breaks the process. > But it may well be that your Bundle installs some prerequisite later > than some depending item and thus your second try is able to resolve > everything. Please note, CPAN.pm does not know the dependency tree > in advance and cannot sort the queue of things to install in a > topologically correct order. It resolves perfectly well IF all > modules declare the prerequisites correctly with the PREREQ_PM > attribute to MakeMaker or the |requires| stanza of Module::Build. > For bundles which fail and you need to install often, it is > recommended to sort the Bundle definition file manually. > > --snip-- > > Therefore, recent modifications to Makefile.PL should result in a fully > operational Bioperl installation, if installed via CPAN. Right, thanks for that. > Although only Bioperl 1.4 is available via CPAN currently. It is possible to upload a > developer release to CPAN which can only be ownloaded via CPAN if > specifically asked for - would be good for 1.5.x.: > --snip-- > > How do I install a "DEVELOPER RELEASE" of a module? > > By default, CPAN will install the latest non-developer release of a > module. If you want to install a dev release, you have to specify > the partial path starting with the author id to the tarball you wish > to install, like so: > > cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz > > Note that you can use the |ls| command to get this path listed. 
> > --snip-- That's the user point of view - how does the developer actually tell CPAN that something is a developer release so that normal users don't automatically install it? From bix at sendu.me.uk Mon Oct 23 05:59:52 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 10:59:52 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C8E60.7000105@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> Message-ID: <453C9298.9000900@sendu.me.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> As far as CPAN discovering dependencies, here is a snip from the CPAN >> FAQ's: >> --snip-- >> >> I installed a Bundle and had a couple of fails. When I retried, >> everything resolved nicely. Can this be fixed to work on first try? >> >> The reason for this is that CPAN does not know the dependencies of >> all modules when it starts out. To decide about the additional items >> to install, it just uses data found in the META.yml file or the >> generated Makefile. An undetected missing piece breaks the process. >> But it may well be that your Bundle installs some prerequisite later >> than some depending item and thus your second try is able to resolve >> everything. Please note, CPAN.pm does not know the dependency tree >> in advance and cannot sort the queue of things to install in a >> topologically correct order. It resolves perfectly well IF all >> modules declare the prerequisites correctly with the PREREQ_PM >> attribute to MakeMaker or the |requires| stanza of Module::Build. >> For bundles which fail and you need to install often, it is >> recommended to sort the Bundle definition file manually. 
>> >> --snip-- >> >> Therefore, recent modifications to Makefile.PL should result in a fully >> operational Bioperl installation, if installed via CPAN. > > Right, thanks for that. Oh, so this effectively means that our 'optional' dependencies are installed for CPAN users, which matches up to my 'force the optional ones anyway' desire, leaving Bundle::BioPerl without any use. Makefile.PL could be altered again to remove from PREREQ_PM those modules the user didn't already have installed, thus CPAN would only install Bioperl itself and nothing optional. The user could then install Bundle::BioPerl if they wanted a quick way of getting all the optional stuff to work. I'm happy either way; what do other people think? From n.haigh at sheffield.ac.uk Mon Oct 23 07:22:17 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 11:22:17 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C9298.9000900@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> Message-ID: <453CA5E9.1060406@sheffield.ac.uk> Sendu Bala wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> As far as CPAN discovering dependencies, here is a snip from the >>> CPAN FAQ's: >>> --snip-- >>> >>> I installed a Bundle and had a couple of fails. When I retried, >>> everything resolved nicely. Can this be fixed to work on first try? >>> >>> The reason for this is that CPAN does not know the dependencies of >>> all modules when it starts out. To decide about the additional >>> items >>> to install, it just uses data found in the META.yml file or the >>> generated Makefile. An undetected missing piece breaks the process. 
>>> But it may well be that your Bundle installs some prerequisite later >>> than some depending item and thus your second try is able to resolve >>> everything. Please note, CPAN.pm does not know the dependency tree >>> in advance and cannot sort the queue of things to install in a >>> topologically correct order. It resolves perfectly well IF all >>> modules declare the prerequisites correctly with the PREREQ_PM >>> attribute to MakeMaker or the |requires| stanza of Module::Build. >>> For bundles which fail and you need to install often, it is >>> recommended to sort the Bundle definition file manually. >>> >>> --snip-- >>> >>> Therefore, recent modifications to Makefile.PL should result in a fully >>> operational Bioperl installation, if installed via CPAN. >> >> Right, thanks for that. > > Oh, so this effectively means that our 'optional' dependencies are > installed for CPAN users, which matches up to my 'force the optional > ones anyway' desire, leaving Bundle::BioPerl without any use. > > Makefile.PL could be altered again to remove from PREREQ_PM those > modules the user didn't already have installed, thus CPAN would only > install Bioperl itself and nothing optional. The user could then > install Bundle::BioPerl if they wanted a quick way of getting all the > optional stuff to work. > > I'm happy either way; what do other people think? From my point of view, removing them from PREREQ_PM makes building Bundle::BioPerl a bit of a pain :o( I prefer the way it is currently set up - most people have fast internet connections and GB of hard drive space. Other than the reason "why install something I won't ever need" I don't see much point in maintaining Bundle::BioPerl and having "optional" dependencies. I think if there are any modules which are not going to be used by the majority of users, then this could be used as the rationale for moving them out of bioperl-core into another package?
Nath From n.haigh at sheffield.ac.uk Mon Oct 23 07:38:05 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 11:38:05 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C8E60.7000105@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> Message-ID: <453CA99D.9060009@sheffield.ac.uk> >> Although only Bioperl 1.4 is available via CPAN currently. It is >> possible to upload a developer release to CPAN which can only be >> downloaded via CPAN if specifically asked for - would be good for 1.5.x: >> --snip-- >> >> How do I install a "DEVELOPER RELEASE" of a module? >> >> By default, CPAN will install the latest non-developer release of a >> module. If you want to install a dev release, you have to specify >> the partial path starting with the author id to the tarball you wish >> to install, like so: >> >> cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz >> >> Note that you can use the |ls| command to get this path listed. >> >> --snip-- > > That's the user point of view - how does the developer actually tell > CPAN that something is a developer release so that normal users don't > automatically install it? I found this: http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt It says that $VERSION should simply be changed from a naked number into a single-quoted number and this should be recognized by the CPAN indexer.
Nath From bix at sendu.me.uk Mon Oct 23 06:47:38 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 11:47:38 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> Message-ID: <453C9DCA.4020802@sendu.me.uk> Hilmar Lapp wrote: > On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote: > >> For example, I have made no effort to setup biosql-schema but I >> thought that maybe there would be a test that would detect this > > I'm afraid there isn't. Bioperl-db is meaningless without > biosql-schema. Can you suggest a way we might detect if biosql-schema has been installed prior to running the test suite, so we can give some meaningful error message? From bix at sendu.me.uk Mon Oct 23 08:43:30 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 13:43:30 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> Message-ID: <453CB8F2.7070703@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: > >> Makefile.PL could be altered again to remove from PREREQ_PM those >> modules the user didn't already have installed, thus CPAN would only >> install Bioperl itself and nothing optional. The user could then >> install Bundle::BioPerl if they wanted a quick way of getting all the >> optional stuff to work. >> >> I'm happy either way; what do other people think? 
> > From my point of view, removing them from PREREQ_PM means building the > Bundle::BioPerl a bit of a pain :o( Can I ask how you're generating Bundle::BioPerl? That is, how did the typos get in there? Is there a way to certainly avoid typos in the future? From n.haigh at sheffield.ac.uk Mon Oct 23 09:46:17 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 13:46:17 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CB8F2.7070703@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> <453CB8F2.7070703@sendu.me.uk> Message-ID: <453CC7A9.6090609@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> Sendu Bala wrote: >> >>> Makefile.PL could be altered again to remove from PREREQ_PM those >>> modules the user didn't already have installed, thus CPAN would only >>> install Bioperl itself and nothing optional. The user could then >>> install Bundle::BioPerl if they wanted a quick way of getting all the >>> optional stuff to work. >>> >>> I'm happy either way; what do other people think? > > >> From my point of view, removing them from PREREQ_PM means building the >> Bundle::BioPerl a bit of a pain :o( > > Can I ask how you're generating Bundle::BioPerl? That is, how did the > typos get in there? Is there a way to certainly avoid typos in the > future? I just modified the list by hand a while back :o( - I'm sure there must be a better way. 
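One possible way to reduce hand-editing, sketched here under stated assumptions rather than taken from the actual Bioperl tooling: build the CONTENTS section of the Bundle file from the same hash that Makefile.PL already holds, and reject names that are not well-formed package names. The %packages entries and their value format below are illustrative placeholders, and malformed_names() and bundle_contents() are hypothetical helper names. The syntax check would catch something like a trailing '-', but not a plausible-looking misspelling or wrong capitalisation; generating from a single list only helps because the name is then typed once.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the %packages hash in Makefile.PL
# (assumed here to map module name => minimum version; the real
# hash may store its values differently).
my %packages = (
    'Data::Stag::XMLWriter'   => 0,
    'HTTP::Request::Common'   => 0,
    'Spreadsheet::ParseExcel' => 0,
);

# Return the names that are not syntactically valid package names.
sub malformed_names {
    my @names = @_;
    return grep { !/^[A-Za-z_]\w*(?:::\w+)*\z/ } @names;
}

# Build the CONTENTS pod section of a Bundle file: one module per
# entry, followed by its minimum version when one is given.
sub bundle_contents {
    my ($pkgs) = @_;
    my $pod = "=head1 CONTENTS\n\n";
    for my $name (sort keys %$pkgs) {
        my $version = $pkgs->{$name};
        $pod .= $version ? "$name $version\n\n" : "$name\n\n";
    }
    return $pod;
}

my @bad = malformed_names(keys %packages);
die "Malformed module names: @bad\n" if @bad;
print bundle_contents(\%packages);
```

The printed text could then be pasted (or written) into the Bundle module's pod, so the Bundle and Makefile.PL never drift apart.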
From bix at sendu.me.uk Mon Oct 23 08:58:13 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 13:58:13 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CC7A9.6090609@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> <453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk> Message-ID: <453CBC65.2020202@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> Sendu Bala wrote: >>> >>>> Makefile.PL could be altered again to remove from PREREQ_PM those >>>> modules the user didn't already have installed, thus CPAN would only >>>> install Bioperl itself and nothing optional. The user could then >>>> install Bundle::BioPerl if they wanted a quick way of getting all the >>>> optional stuff to work. >>>> >>>> I'm happy either way; what do other people think? >>> >>> From my point of view, removing them from PREREQ_PM means building the >>> Bundle::BioPerl a bit of a pain :o( >> >> Can I ask how you're generating Bundle::BioPerl? That is, how did the >> typos get in there? Is there a way to certainly avoid typos in the >> future? > > I just modified the list by hand a while back :o( - I'm sure there must > be a better way. I'm not sure I understand why removing things from PREREQ_PM would be a problem for you then; the %packages hash would remain unchanged (ie. have everything) so you have something to refer to when manually editing the Bundle. http://www.cpan.org/misc/cpan-faq.html#How_make_bundle might be helpful? I didn't really pay too much attention to the advice - does it offer a typo-avoiding solution? 
From n.haigh at sheffield.ac.uk Mon Oct 23 10:04:12 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 14:04:12 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CBC65.2020202@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> <453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk> <453CBC65.2020202@sendu.me.uk> Message-ID: <453CCBDC.6030904@sheffield.ac.uk> > I'm not sure I understand why removing things from PREREQ_PM would be > a problem for you then; the %packages hash would remain unchanged (ie. > have everything) so you have something to refer to when manually > editing the Bundle. > > http://www.cpan.org/misc/cpan-faq.html#How_make_bundle > might be helpful? I didn't really pay too much attention to the advice > - does it offer a typo-avoiding solution? It's helpful in producing the Bundle PPD as all the XML tags are present in the Bioperl PPD and they simply need to be copied over to a Bundle-BioPerl PPD file. Looks like manual editing of the relevant file is required for making a CPAN bundle. Unfortunately - no typo-avoiding solution. 
:o( From dhoworth at mrc-lmb.cam.ac.uk Mon Oct 23 08:46:29 2006 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Mon, 23 Oct 2006 13:46:29 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CA99D.9060009@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> Message-ID: <453CB9A5.2020409@mrc-lmb.cam.ac.uk> >> That's the user point of view - how does the developer actually tell >> CPAN that something is a developer release so that normal users don't >> automatically install it? > > I found this: > http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt > > Is says that $VERSION should simply be changed from a naked number into > a single quoted number and this should be recognized by the CPAN indexer. Cheers, Dave From hlapp at gmx.net Mon Oct 23 09:40:29 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 23 Oct 2006 09:40:29 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <453C9DCA.4020802@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> <453C9DCA.4020802@sendu.me.uk> Message-ID: <5C22B9C8-CEF0-457B-8565-793D56389A86@gmx.net> You would need a lot of information to make that determination (host, port, db driver, db name, user, password; i.e., the entire connection information, and there is no 'standard'). You might just ask a simple question in Makefile.PL as to whether biosql is installed or not, similar to the DB::GFF tests. 
-hilmar On Oct 23, 2006, at 6:47 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote: >> >>> For example, I have made no effort to setup biosql-schema but I >>> thought that maybe there would be a test that would detect this >> >> I'm afraid there isn't. Bioperl-db is meaningless without >> biosql-schema. > > Can you suggest a way we might detect if biosql-schema has been > installed prior to running the test suite, so we can give some > meaningful error message? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Mon Oct 23 09:59:23 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 14:59:23 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CB9A5.2020409@mrc-lmb.cam.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> Message-ID: <453CCABB.2060308@sendu.me.uk> Dave Howorth wrote: >>> That's the user point of view - how does the developer actually tell >>> CPAN that something is a developer release so that normal users don't >>> automatically install it? >> I found this: >> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >> >> Is says that $VERSION should simply be changed from a naked number into >> a single quoted number and this should be recognized by the CPAN indexer. > > Thanks for that. 
I guess from that the 1.5.2 version number should be: $VERSION = 1.05_02 And 1.6 would be $VERSION = 1.06 But will this cause a problem wrt 1.4? 1.4 has: $VERSION = 1.4; Is 1.4 lower than 1.06? Should we keep to a single digit version, so 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them version fifty and version sixty? 1.50_02, 1.60? From cjfields at uiuc.edu Mon Oct 23 10:12:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 09:12:16 -0500 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C9298.9000900@sendu.me.uk> Message-ID: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> ... > > Right, thanks for that. > > Oh, so this effectively means that our 'optional' dependencies are > installed for CPAN users, which matches up to my 'force the optional > ones anyway' desire, leaving Bundle::BioPerl without any use. > > Makefile.PL could be altered again to remove from PREREQ_PM those > modules the user didn't already have installed, thus CPAN would only > install Bioperl itself and nothing optional. The user could then install > Bundle::BioPerl if they wanted a quick way of getting all the optional > stuff to work. > > I'm happy either way; what do other people think? I think that we should have it so Bioperl installs as-is (no additional reqs) and have Bundle::BioPerl used as a convenient way to install all optional modules for full functionality. The catch is to make sure that any optional installations do not crash tests during a CPAN bioperl installation, otherwise they aren't considered optional by CPAN, and the install won't work without forcing it. Frankly, most users will find themselves wanting to install the Bundle anyway to get full functionality, so we could always 'strongly recommend' preceding the bioperl installation with a Bundle::Bioperl CPAN installation to avoid problems, at least for this release. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 23 10:23:04 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 09:23:04 -0500 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk> Message-ID: <002101c6f6ae$c14d7860$15327e82@pyrimidine> ... > >> Right, thanks for that. > > > > Oh, so this effectively means that our 'optional' dependencies are > > installed for CPAN users, which matches up to my 'force the optional > > ones anyway' desire, leaving Bundle::BioPerl without any use. > > > > Makefile.PL could be altered again to remove from PREREQ_PM those > > modules the user didn't already have installed, thus CPAN would only > > install Bioperl itself and nothing optional. The user could then > > install Bundle::BioPerl if they wanted a quick way of getting all the > > optional stuff to work. > > > > I'm happy either way; what do other people think? > >From my point of view, removing them from PREREQ_PM means building the > Bundle::BioPerl a bit of a pain :o( > > I prefer the way it is currently set up - most people have fast internet > connections and GB of harddrive space. Other than the reason "why > install something I won't ever need" I don't see much point maintaining > Bundle::BioPerl and having "optional" dependencies. I think if there are > any modules which are not going to be used by the majority of users, > then this could be used as the rationale for removing them from > bioperl-core into another package? > > Nath I think you'll likely find it much easier to maintain a Bundle package long-term and indicate that it should be installed along with bioperl, than to have users complain about a particular Bioperl module failing b/c a particular dependency wasn't installed. 
If we have the Bundle around in CPAN and in PPM for Win32 users, and indicate in the INSTALL docs and the wiki our preference that it be installed prior to or along with a Bioperl installation for beginners, we can mitigate most of those problems. Nip it in the bud, to quote a Mr. Barney Fife. My 2c Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 23 10:29:33 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 09:29:33 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CCABB.2060308@sendu.me.uk> Message-ID: <002201c6f6af$a91e4200$15327e82@pyrimidine> > Dave Howorth wrote: > >>> That's the user point of view - how does the developer actually tell > >>> CPAN that something is a developer release so that normal users don't > >>> automatically install it? > >> I found this: > >> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt > >> > >> Is says that $VERSION should simply be changed from a naked number into > >> a single quoted number and this should be recognized by the CPAN > indexer. > > > > 5.8.8/pod/perlmodstyle.pod#Version_numbering> > > Thanks for that. > > I guess from that the 1.5.2 version number should be: > > $VERSION = 1.05_02 > > And 1.6 would be > > $VERSION = 1.06 > > But will this cause a problem wrt 1.4? 1.4 has: > > $VERSION = 1.4; > > Is 1.4 lower than 1.06? Should we keep to a single digit version, so > 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them > version fifty and version sixty? 1.50_02, 1.60? Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax? It would be much simpler to use that. Simon Cozens wrote about this a while back: http://www.perl.com/pub/a/2000/04/whatsnew.html ... Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Mon Oct 23 10:41:24 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 15:41:24 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <002201c6f6af$a91e4200$15327e82@pyrimidine> References: <002201c6f6af$a91e4200$15327e82@pyrimidine> Message-ID: <453CD494.8070905@sendu.me.uk> Chris Fields wrote: >> Dave Howorth wrote: >>>>> That's the user point of view - how does the developer actually tell >>>>> CPAN that something is a developer release so that normal users don't >>>>> automatically install it? >>>> I found this: >>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >>>> >>>> Is says that $VERSION should simply be changed from a naked number into >>>> a single quoted number and this should be recognized by the CPAN >> indexer. >>> > 5.8.8/pod/perlmodstyle.pod#Version_numbering> >> >> Thanks for that. >> >> I guess from that the 1.5.2 version number should be: >> >> $VERSION = 1.05_02 >> >> And 1.6 would be >> >> $VERSION = 1.06 >> >> But will this cause a problem wrt 1.4? 1.4 has: >> >> $VERSION = 1.4; >> >> Is 1.4 lower than 1.06? Should we keep to a single digit version, so >> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them >> version fifty and version sixty? 1.50_02, 1.60? > > Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax? It would be > much simpler to use that. That does not present us with a way to have 1.5.2 marked as a developer release in CPAN. Also, see the discussion here: http://perldoc.perl.org/functions/require.html Since we require 5.6.1 the backwards-compatible issues maybe don't apply to us, but do these ideas work with modules, or just Perl itself? Is CPAN et al. happy with this form of versioning? /Something/ needs to be done about Bioperl versioning, because the current 1.4 or 1.5 is completely inadequate. 
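A minimal sketch of the underscore convention raised in this thread, for readers who want to see it concretely. It assumes the quoted string form ('1.52_01') is what marks a developer release, as the slides linked above describe. The helper name is_developer_release is hypothetical and is not part of Bioperl or any CPAN tool.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Bioperl or CPAN API): returns 1 when a
# version string carries the underscore that marks a developer release.
sub is_developer_release {
    my ($version_string) = @_;
    return $version_string =~ /_/ ? 1 : 0;
}

# Keep the quoted string, so the underscore survives for tools that look
# for it, and derive an underscore-free copy for numeric comparisons.
my $version_string = '1.52_01';
(my $numeric = $version_string) =~ tr/_//d;    # '1.5201'

printf "%s developer=%d numeric=%s\n",
    $version_string, is_developer_release($version_string), $numeric;
```

The point of keeping both forms is that an unquoted literal such as 1.52_01 is read by Perl as the plain number 1.5201, so the underscore is lost before any tool can see it.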
From bix at sendu.me.uk Mon Oct 23 10:51:25 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 15:51:25 +0100 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> Message-ID: <453CD6ED.5050507@sendu.me.uk> Chris Fields wrote: [option 1] >> Oh, so this effectively means that our 'optional' dependencies are >> installed for CPAN users, which matches up to my 'force the >> optional ones anyway' desire, leaving Bundle::BioPerl without any >> use. [option 2] >> Makefile.PL could be altered again to remove from PREREQ_PM those >> modules the user didn't already have installed, thus CPAN would >> only install Bioperl itself and nothing optional. The user could >> then install Bundle::BioPerl if they wanted a quick way of getting >> all the optional stuff to work. >> >> I'm happy either way; what do other people think? > > I think that we should have it so Bioperl installs as-is (no > additional reqs) and have Bundle::BioPerl used as a convenient way to > install all optional modules for full functionality. Note we're specifically considering a CPAN install here. If you download the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is still needed as a convenience if you want to install the optional external dependencies. > The catch is to make sure that any optional installations do not > crash tests during a CPAN bioperl installation, otherwise they aren't > considered optional by CPAN, and the install won't work without > forcing it. I'm pretty sure this isn't a problem, though it would be nice if someone could test it on a clean system: does 'make test' pass all ok with none of the optional modules installed? Anyway, to reiterate the question: Do we care if CPAN users get all the optional external dependencies installed for them automatically, or do we want to force them to install Bundle? 
The current situation is: CPAN users will get all optional external dependencies without using Bundle::BioPerl. Manual installers of bioperl (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to get full functionality. From n.haigh at sheffield.ac.uk Mon Oct 23 12:30:34 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 16:30:34 +0000 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CCABB.2060308@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> Message-ID: <453CEE2A.8000002@sheffield.ac.uk> Sendu Bala wrote: > Dave Howorth wrote: > >>>> That's the user point of view - how does the developer actually tell >>>> CPAN that something is a developer release so that normal users don't >>>> automatically install it? >>>> >>> I found this: >>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >>> >>> Is says that $VERSION should simply be changed from a naked number into >>> a single quoted number and this should be recognized by the CPAN indexer. >>> >> >> > > Thanks for that. > > I guess from that the 1.5.2 version number should be: > > $VERSION = 1.05_02 > > And 1.6 would be > > $VERSION = 1.06 > > But will this cause a problem wrt 1.4? 1.4 has: > > $VERSION = 1.4; > > Is 1.4 lower than 1.06? Should we keep to a single digit version, so > 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them > version fifty and version sixty? 1.50_02, 1.60? 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > I believe the link to the documentation above describes a common CPAN versioning scheme as follows: 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32 Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would be better as 1.52. Then to indicate that the 1.5 series is a developer release, you append the underscore and at least 2 digits. This results in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be 1.52_01. The only thing I'm unsure about is when the _01 gets incremented. I suspect we would probably not increment this number since each release would be an increment of the minor release number e.g. 1.52_01, 1.53_01, 1.54_01 etc. Although I'm still not sure how this versioning would affect bioperl 1.4 since 1.4 uses a non-standard versioning scheme :o( As I understand it, the versioning of the Perl releases uses the x.y.z scheme. But apparently CPAN modules should use the above versioning scheme. Nath From cjfields at uiuc.edu Mon Oct 23 11:36:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 10:36:37 -0500 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <453CD6ED.5050507@sendu.me.uk> Message-ID: <000c01c6f6b9$0781af40$15327e82@pyrimidine> ... > > Note we're specifically considering a CPAN install here. If you download > the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is > still needed as a convenience if you want to install the optional > external dependencies. > Agreed. I don't think the Bundle is dispensable. For instance, it's very easy for us to just tell beginners to install Bundle::Bioperl before installing bioperl itself, as opposed to having them inundate the mail list with requests on why x.pl script didn't work, which could be simply from lack of the required module.
> I'm pretty sure this isn't a problem, though it would be nice if someone > could test it on a clean system: does 'make test' pass all ok with none > of the optional modules installed? So far on WinXP everything passes; I ran a clean perl installation a while ago using nmake and tests passed. > Anyway, to reiterate the question: Do we care if CPAN users get all the > optional external dependencies installed for them automatically, or do > we want to force them to install Bundle? > > The current situation is: CPAN users will get all optional external > dependencies without using Bundle::BioPerl. Manual installers of bioperl > (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to > get full functionality. I don't think forcing is necessary, so a CPAN installation shouldn't force someone to install optional modules. Graph.pm, for instance has a few optional modules, and the tests which use those get skipped and pass so the installation proceeds w/o problems. We could do the same (any tests using those optional modules display the reason why they are skipped). I would strongly state in the INSTALL and INSTALL.WIN docs that (new) users should install Bundle::Bioperl before installing Bioperl core for full functionality. If you are an advanced user and know your way around CPAN/Perl, then you can install the various independent requirements depending on your particular requirements. Chris From n.haigh at sheffield.ac.uk Mon Oct 23 12:38:00 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Mon, 23 Oct 2006 16:38:00 +0000 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <453CD6ED.5050507@sendu.me.uk> References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> <453CD6ED.5050507@sendu.me.uk> Message-ID: <453CEFE8.4000704@sheffield.ac.uk> Sendu Bala wrote: > Chris Fields wrote: > > [option 1] > >>> Oh, so this effectively means that our 'optional' dependencies are >>> installed for CPAN users, which matches up to my 'force the >>> optional ones anyway' desire, leaving Bundle::BioPerl without any >>> use. >>> > > [option 2] > >>> Makefile.PL could be altered again to remove from PREREQ_PM those >>> modules the user didn't already have installed, thus CPAN would >>> only install Bioperl itself and nothing optional. The user could >>> then install Bundle::BioPerl if they wanted a quick way of getting >>> all the optional stuff to work. >>> >>> I'm happy either way; what do other people think? >>> >> I think that we should have it so Bioperl installs as-is (no >> additional reqs) and have Bundle::BioPerl used as a convenient way to >> install all optional modules for full functionality. >> > > Note we're specifically considering a CPAN install here. If you download > the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is > still needed as a convenience if you want to install the optional > external dependencies. > > > >> The catch is to make sure that any optional installations do not >> crash tests during a CPAN bioperl installation, otherwise they aren't >> considered optional by CPAN, and the install won't work without >> forcing it. >> > > I'm pretty sure this isn't a problem, though it would be nice if someone > could test it on a clean system: does 'make test' pass all ok with none > of the optional modules installed? > > I could definitely do this on WinXP and *possibly* on a Linux system. 
> Anyway, to reiterate the question: Do we care if CPAN users get all the > optional external dependencies installed for them automatically, or do > we want to force them to install Bundle? > > I'd prefer any dependencies, whether they are seen as vital to the main functionality of Bioperl or not, to actually be specified in PREREQ_PM (as they currently are). A dependency is a dependency - is it not? If a distinction is to be made based on whether the requiring module is simply adding additional functionality to Bioperl-core, then shouldn't it be moved out of core and into another package as with the run modules if we are to have "optional" dependencies? my 2p Nath > The current situation is: CPAN users will get all optional external > dependencies without using Bundle::BioPerl. Manual installers of bioperl > (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to > get full functionality. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Mon Oct 23 11:39:09 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 10:39:09 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CD494.8070905@sendu.me.uk> Message-ID: <000d01c6f6b9$62033d80$15327e82@pyrimidine> ... > That does not present us with a way to have 1.5.2 marked as a developer > release in CPAN. > > Also, see the discussion here: > http://perldoc.perl.org/functions/require.html > > Since we require 5.6.1 the backwards-compatible issues maybe don't apply > to us, but do these ideas work with modules, or just Perl itself? Is > CPAN et al. happy with this form of versioning? > > /Something/ needs to be done about Bioperl versioning, because the > current 1.4 or 1.5 is completely inadequate. I think using 'require Foo x.y.z' is applicable to modules as well. There is something in Programming Perl about this, I just don't have it on hand...
Not sure about CPAN, so we need to look into it. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Mon Oct 23 11:42:15 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 16:42:15 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CEE2A.8000002@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> Message-ID: <453CE2D7.5080608@sendu.me.uk> Nathan S. Haigh wrote: > I believe the link to the documentation above describes a common CPAN > versioning scheme as follows: > > 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32 > > Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would > be better as 1.52. Then to indicate that the 1.5 series is a developer > release, you append the underscore and at least 2 digits. Thus resulting > in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be > 1.52_01. The only thing i'm unsure about would be when does the _01 get > incremented? I suspect we would probably not increment this number since > each release would be an increment of the minor release number e.g. > 1.52_01, 1.53_01, 1.54_01 etc. > > Although I'm still not sure how this versioning would affect bioperl 1.4 > since 1.4 uses a non-standard versioning scheme :o( Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be treated higher than 1.4? Anyway, we can cross that bridge when we get there, but this seems appropriate now. Cheers, Sendu. 
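The ordering questions in this thread (is 1.4 lower than 1.06, is 1.60 higher than 1.4) can be checked with a small sketch. It assumes versions are compared as plain decimal numbers once any developer-release underscore is dropped. That is an assumption about the installing tools, not a statement of how CPAN actually compares versions, and compare_versions is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two version strings as plain decimal
# numbers after removing any underscore. Returns -1, 0 or 1, like <=>.
sub compare_versions {
    my ($left, $right) = @_;
    tr/_//d for $left, $right;    # '1.52_01' becomes '1.5201'
    return $left <=> $right;
}

# Pairs taken from the version numbers discussed in the thread.
for my $pair (['1.4', '1.06'], ['1.60', '1.4'], ['1.52_01', '1.60']) {
    my ($left, $right) = @$pair;
    printf "%-8s vs %-5s => %2d\n", $left, $right,
        compare_versions($left, $right);
}
# 1.4 vs 1.06     =>  1   (1.4 sorts above 1.06)
# 1.60 vs 1.4     =>  1   (1.60 sorts above 1.4)
# 1.52_01 vs 1.60 => -1   (the developer release sorts below 1.60)
```

Under that assumption a two-digit scheme keeps 1.52_01 and 1.60 above the existing 1.4, whereas 1.06 would sort below it.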
From bix at sendu.me.uk Mon Oct 23 11:59:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 16:59:01 +0100 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <000c01c6f6b9$0781af40$15327e82@pyrimidine> References: <000c01c6f6b9$0781af40$15327e82@pyrimidine> Message-ID: <453CE6C5.6000108@sendu.me.uk> Chris Fields wrote: > ... >> The current situation is: CPAN users will get all optional external >> dependencies without using Bundle::BioPerl. Manual installers of bioperl >> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to >> get full functionality. > > I don't think forcing is necessary, so a CPAN installation shouldn't force > someone to install optional modules. Graph.pm, for instance has a few > optional modules, and the tests which use those get skipped and pass so the > installation proceeds w/o problems. We could do the same (any tests using > those optional modules display the reason why they are skipped). I should clarify and say that that's what happens in Bioperl as well. The 'forcing' that I talk about is simply what I assume will happen if the user has CPAN set to automatically install dependencies. The user could say 'no' to every question regarding the installation of dependencies that CPAN discovers and Bioperl would still install fine. So really the difference between the current situation and, say, the situation when 1.5.1 was released, is that the CPAN user doesn't have to use Bundle::BioPerl for full functionality anymore, but can still choose not to install all the optional external modules. The difference is the possible default behaviour. Those users that auto-install dependencies get all the optional ones, whereas in the past they would not have. I have to point out the benefit of this behaviour: those people that don't care and just want it to work are more likely to get an installation that does just work. People who know what they're doing can still do what they want.
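For comparison, [option 2] from earlier in this thread (dropping from PREREQ_PM the modules the user does not already have) might look roughly like the sketch below. It is not the current Makefile.PL: the module list is a stand-in, 'Hypothetical::Missing::Module' is invented to show the filtering, and installed_only is a hypothetical helper name.

```perl
use strict;
use warnings;

# Stand-in list of optional modules mapped to minimum versions; the real
# list lives in Makefile.PL. The second entry is deliberately fictional.
my %optional = (
    'File::Spec'                    => 0,
    'Hypothetical::Missing::Module' => 0,
);

# Hypothetical helper: keep only the modules that load on this system.
sub installed_only {
    my (%candidates) = @_;
    my %kept;
    for my $module (sort keys %candidates) {
        (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
        $kept{$module} = $candidates{$module} if eval { require $file; 1 };
    }
    return %kept;
}

# The filtered hash is what would be handed to WriteMakefile as PREREQ_PM.
my %prereq_pm = installed_only(%optional);
print join(', ', sort keys %prereq_pm), "\n";
```

With a filter like this, CPAN would find nothing new to install, and Bundle::BioPerl would remain the way to pull in the optional modules.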
Before we decide what to do I guess we need hard confirmation of how CPAN will actually behave with the current Makefile.PL. Any ideas how we can find out? It would also be good to have more options to break the current tie (Nathan is for keeping PREREQ_PM populated, Chris is for having it empty, I can go either way)... From dhoworth at mrc-lmb.cam.ac.uk Mon Oct 23 11:55:42 2006 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Mon, 23 Oct 2006 16:55:42 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CD494.8070905@sendu.me.uk> References: <002201c6f6af$a91e4200$15327e82@pyrimidine> <453CD494.8070905@sendu.me.uk> Message-ID: <453CE5FE.9070001@mrc-lmb.cam.ac.uk> Sendu Bala wrote: > Chris Fields wrote: >>> Dave Howorth wrote: >>>>>> That's the user point of view - how does the developer actually tell >>>>>> CPAN that something is a developer release so that normal users don't >>>>>> automatically install it? >>>>> I found this: >>>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >>>>> >>>>> Is says that $VERSION should simply be changed from a naked number into >>>>> a single quoted number and this should be recognized by the CPAN >>> indexer. >>>> >> 5.8.8/pod/perlmodstyle.pod#Version_numbering> >>> >>> Thanks for that. >>> >>> I guess from that the 1.5.2 version number should be: >>> >>> $VERSION = 1.05_02 I believe so - the underscore is key. Look at your favourite CPAN modules and see what they do. >>> And 1.6 would be >>> >>> $VERSION = 1.06 >>> >>> But will this cause a problem wrt 1.4? 1.4 has: I think it will cause a problem, yes. 1.4 > 1.06 As a workaround, you could remove 1.4 from CPAN and require everybody who installs from CPAN to uninstall it before installing 1.06. >>> $VERSION = 1.4; >>> >>> Is 1.4 lower than 1.06? Should we keep to a single digit version, so >>> 1.5_02 and 1.6? Does this really not work with CPAN? I think that would work but see at the end. 
>> Should we call them >>> version fifty and version sixty? 1.50_02, 1.60? Then you can count 1.50_02, 1.50_03, 1.52, 1.53_01 ... if you wish. >> Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax? It would be >> much simpler to use that. > > That does not present us with a way to have 1.5.2 marked as a developer > release in CPAN. > > Also, see the discussion here: > http://perldoc.perl.org/functions/require.html > > Since we require 5.6.1 the backwards-compatible issues maybe don't apply > to us, but do these ideas work with modules, or just Perl itself? Is > CPAN et al. happy with this form of versioning? I'm not an expert :( It's my understanding that there is an awful lot of flexibility in Perl module version numbering (as you might expect :) However, I believe there are some gotchas. So I would recommend (a) finding an expert and (b) trying an experiment! > /Something/ needs to be done about Bioperl versioning, because the > current 1.4 or 1.5 is completely inadequate. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From n.haigh at sheffield.ac.uk Mon Oct 23 13:37:13 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 17:37:13 +0000 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <453CE6C5.6000108@sendu.me.uk> References: <000c01c6f6b9$0781af40$15327e82@pyrimidine> <453CE6C5.6000108@sendu.me.uk> Message-ID: <453CFDC9.8030107@sheffield.ac.uk> Sendu Bala wrote: > Chris Fields wrote: > >> ... >> >>> The current situation is: CPAN users will get all optional external >>> dependencies without using Bundle::BioPerl. Manual installers of bioperl >>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to >>> get full functionality. >>> >> I don't think forcing is necessary, so a CPAN installation shouldn't force >> someone to install optional modules. 
Graph.pm, for instance has a few >> optional modules, and the tests which use those get skipped and pass so the >> installation proceeds w/o problems. We could do the same (any tests using >> those optional modules display the reason why they are skipped). >> > > I should clarify and say that that's what happens in Bioperl as well. > The 'forcing' that I talk about is simply what I assume will happen if > the user has CPAN set to automatically install dependencies. The user > could say 'no' to every question regarding the installation of > dependencies that CPAN discovers and Bioperl would still install fine. > > So really the difference between the current situation and, say, the > situation when 1.5.1 was released, is that the CPAN user doesn't have to > use Bundle::BioPerl for full functionality anymore, but can still choose > not to install all the optional external modules. > > --snip-- Obviously, we could maintain a Bundle::BioPerl which includes all dependencies required for a fully functional Bioperl. I think the whole idea for a Bundle is to provide a common environment for a particular package. If, for example, someone chooses not to install the dependencies through CPAN (in the current setup), they can easily go back and install Bundle::BioPerl and it would retrieve any missing dependencies for a fully functional Bioperl-core. Nath From n.haigh at sheffield.ac.uk Mon Oct 23 14:06:16 2006 From: n.haigh at sheffield.ac.uk (Nathan S.
Haigh) Date: Mon, 23 Oct 2006 18:06:16 +0000 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CE2D7.5080608@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> Message-ID: <453D0498.8050206@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: > >> I believe the link to the documentation above describes a common CPAN >> versioning scheme as follows: >> >> 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32 >> >> Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would >> be better as 1.52. Then to indicate that the 1.5 series is a developer >> release, you append the underscore and at least 2 digits. Thus resulting >> in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be >> 1.52_01. The only thing i'm unsure about would be when does the _01 get >> incremented? I suspect we would probably not increment this number since >> each release would be an increment of the minor release number e.g. >> 1.52_01, 1.53_01, 1.54_01 etc. >> >> Although I'm still not sure how this versioning would affect bioperl 1.4 >> since 1.4 uses a non-standard versioning scheme :o( >> > > Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be > treated higher than 1.4? Anyway, we can cross that bridge when we get > there, but this seems appropriate now. > > > Cheers, > Sendu. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > Just tried the suggested: perl -MExtUtils::MakeMaker -le 'print MM->parse_version(shift)' bioperl-1-5-2/Bio/Root/Version.pm To see how it parses the various different version schemes - here are the results: 1.5 -> 1.5 1.4 -> 1.4 1.60 -> 1.60 1.05_01 -> 1.0501 1.5_01 -> 1.501 1.50_01 -> 1.5001 Nath From cjfields at uiuc.edu Mon Oct 23 13:15:44 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 12:15:44 -0500 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <453CE6C5.6000108@sendu.me.uk> Message-ID: <002701c6f6c6$e2622c40$15327e82@pyrimidine> ... > I should clarify and say that that's what happens in Bioperl as well. > The 'forcing' that I talk about is simply what I assume will happen if > the user has CPAN set to automatically install dependencies. The user > could say 'no' to every question regarding the installation of > dependencies that CPAN discovers and Bioperl would still install fine. > > So really the difference between the current situation and, say, the > situation when 1.5.1 was released, is that the CPAN user doesn't have to > use Bundle::BioPerl for full functionality anymore, but can still chose > not to install all the optional external modules. > > The difference is the possible default behaviour. Those users that > auto-install dependencies get all the optional ones, whereas in the past > they would not have. I have to point out the benefit of this behaviour: > those people that don't care and just want it to work are more likely to > get an installation that does just work. People who know what they're > doing can still do what they want. OK with me. Any way we go about it, we have to assume that anyone who set CPAN to automatically install dependencies would want this behavior. 
> Before we decide what to do I guess we need hard confirmation of how > CPAN will actually behave with the current Makefile.PL. Any ideas how we > can find out? > > It would also be good to have more options to break the current tie > (Nathan is for keeping PREREQ_PM populated, Chris is for having it > empty, I can go either way)... Frankly I'm for whatever is easiest for the end-user. I think we should continue maintaining Bundle::Bioperl b/c of its convenience (easier for us to say 'install Bundle::Bioperl' as opposed to 'install modules a b c d e f g...' ). I should note that Chris D. maintains Bundle::Bioperl via CPAN and can easily add/remove modules as needed, so all that would be necessary prior to a release is to make sure the various modules present in the Bundle are up-to-date. The only difficulty would be updating the bundle PPM version for Win32; I agree with Nathan that it would be nice if it were easier to maintain. The PPD file generated using 'nmake ppd' needs modifications, likely b/c it is still generated in PPM3-compatible rather than PPM4-compatible form. I also think the idea of having the developer releases available via CPAN is a good one, as long as they are marked as such (which you are taking care of with versioning changes). It makes them a little more official, even if they are interim developer releases. Chris From cjfields at uiuc.edu Mon Oct 23 13:19:08 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 12:19:08 -0500 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <453CFDC9.8030107@sheffield.ac.uk> Message-ID: <002801c6f6c7$5a58ed60$15327e82@pyrimidine> ... > > So really the difference between the current situation and, say, the > > situation when 1.5.1 was released, is that the CPAN user doesn't have to > > use Bundle::BioPerl for full functionality anymore, but can still choose > > not to install all the optional external modules.
> > > > > --snip-- > > Obviously, we could maintain a Bundle::BioPerl which includes all > dependencies required for a fully functional Bioperl. I think the whole > idea for a Bundle is to provide a common environment for a particular > package. If, for example, someone chooses not to install the dependencies > through CPAN (in the current setup), they can easily go back and install > Bundle::BioPerl and it would retrieve any missing dependencies for a > fully functional Bioperl-core. > > Nath Succinctly put; I would've spent five paragraphs describing that! Too much coffee (from lab meetings...) Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 23 13:26:57 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 12:26:57 -0500 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: Message-ID: <002c01c6f6c8$7163dd20$15327e82@pyrimidine> Seth, Did you try this with a clean, taxonomy-installed database? There may be some junk left over from the previous test runs. I'm looking into it this week; it may not make the developer release but we'll try to get it in. BTW, the 03simpleseq.t test failures have to do with a call to gzip. I'll look into a workaround for that. Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but introduces others. One alternative which I found works is cygwin, but there's a catch: DBD-mysql is hard to install. If it isn't one thing it's another... Christopher Fields Postdoctoral Researcher - Switzer Lab Dept.
of Biochemistry University of Illinois Urbana-Champaign

From johnson.biotech at gmail.com Mon Oct 23 12:36:36 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 23 Oct 2006 12:36:36 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <000001c6f486$df508930$15327e82@pyrimidine>
References: <000001c6f486$df508930$15327e82@pyrimidine>
Message-ID:

Chris,

There's definite improvement:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed Test      Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/02species.t                  65    2   3.08%  63 65
t/03simpleseq.t     1   256    59  106 179.66%  7-59
t/04swiss.t                    52   14  26.92%  25 27-34 38-42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There's some weirdness going on during the 'swiss.t' test.
It almost seems to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31):
================================
not ok 25
# Test 25 got: '10097078' (t/04swiss.t at line 79)
#   Expected: '91309150'
ok 26
not ok 27
# Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t at line 85)
#   Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
not ok 28
# Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' (t/04swiss.t at line 86)
#   Expected: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators'
not ok 29
# Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' (t/04swiss.t at line 87)
#   Expected: 'Cell 66 (2), 383-394 (1991)'
not ok 30
# Test 30 got: (t/04swiss.t at line 88)
#   Expected: '91309150'
not ok 31
# Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t at line 85 fail #2)
#   Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.'
not ok 32
# Test 32 got: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' (t/04swiss.t at line 86 fail #2)
#   Expected: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
not ok 33
# Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail #2)
#   Expected: 'Gene 134 (2), 283-287 (1993)'
not ok 34
# Test 34 got: (t/04swiss.t at line 88 fail #2)
#   Expected: '94085792'
ok 35
ok 36
ok 37
not ok 38
# Test 38 got: (t/04swiss.t at line 88 fail #3)
#   Expected: '94253723'
not ok 39
# Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4)
#   Expected: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.'
not ok 40
# Test 40 got: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' (t/04swiss.t at line 86 fail #4)
#   Expected: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein'
not ok 41
# Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail #4)
#   Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
not ok 42
# Test 42 got: (t/04swiss.t at line 88 fail #4)
#   Expected: '99199225'
==============================

On 10/20/06, Chris Fields wrote:
>
> Seth,
>
> Did you work out the problem here? There was a recent CVS update to OBDA
> tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests
> apparently left data from tests in the database, which caused problems
> with repeated test runs.
>
> Chris
>
> > > -----Original Message-----
> > > From: bioperl-l-bounces lists.open-bio.org [mailto: bioperl-l-
> > > bounces lists.open-bio.org] On Behalf Of Seth Johnson
> > > Sent: Saturday, September 30, 2006 6:35 PM
> > > To: Hilmar Lapp
> > > Cc: Chris Fields; Bioperl List
> > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> > >
> > > Here're complete test details:
> > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > ...
> >
> > > FAILED tests 10-12
> > > Failed 3/12 tests, 75.00% okay
> > > Failed Test      Stat Wstat Total Fail  Failed  List of Failed
> > > -------------------------------------------------------------------------------
> > > t\02species.t                  65    2   3.08%  63 65
> > > t\03simpleseq.t     1   256    59  106 179.66%  7-59
> > > t\04swiss.t                    52   14  26.92%  25 27-34 38-42
> > > t\12ontology.t      2   512   738 1471 199.32%  3-738
> > > t\16obda.t                     12    3  25.00%  10-12
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l

From n.haigh at sheffield.ac.uk Mon Oct 23 16:08:00 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 20:08:00 +0000
Subject: [Bioperl-l] CPAN testing Service
Message-ID: <453D2120.9010301@sheffield.ac.uk>

We should also check the CPAN testing service (CPANTS) to see how "good" our package is for CPAN and try to increase the Kwalitee score. There only appears to be details for bioperl-1.2.3 for some reason:
http://cpants.perl.org/dist/bioperl

Nath

From pabloivan at gmail.com Sun Oct 22 15:54:35 2006
From: pabloivan at gmail.com (Pablo Ivan)
Date: Sun, 22 Oct 2006 16:54:35 -0300
Subject: [Bioperl-l] Bioperl installation under Windows
Message-ID:

Hello,

I have been trying to install Bioperl 1.4 on a Windows XP system, but I didn't get too far; my perl installation was made using ActiveState 5.8.8 build 816. I then tried the ppm method of searching for bioperl in the repositories and installing the core package 1.4. It says that the installation was made successfully, but the /Bio folder doesn't show up in /lib, and it's like nothing new was installed at all. I was wondering if using that version of ActiveState could be causing it, but the uninstall option for it isn't showing in Add/Remove, and I'm afraid just deleting the folders and installing version 5.6 of AS could somehow damage and make things worse.
Or should I just forget about it and try using Cygwin?

Thank you,

Pablo.

From cjfields at uiuc.edu Mon Oct 23 17:34:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 16:34:47 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To:
Message-ID: <000401c6f6eb$111df040$15327e82@pyrimidine>

Don't know what that particular error is, but it looks ActivePerl-related (PPM generates HTML from the blib directory). You may need to run 'nmake clean' in between test cycles to get rid of old blib and other files.

The carryover issue from old test runs was a definite problem. Brian fixed that in the bioperl-db CVS recently. Also, I tried Sendu's fixes from CVS head to Bio::Root::Root and they seem to fix the problems with Bio::Root::Root. The issue came down to a use of indirect syntax (a bad perl practice). There are other errors popping up related to Bio::Species, but these seem fixable at least.

I committed a few changes to bioperl-db CVS to fix 03simpleseq.t test failures due to a lack of gzip on WinXP (I didn't see them b/c I had a copy of GNU gzip in my path). These should pass w/o problems now on WinXP.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

_____

From: Seth Johnson [mailto:johnson.biotech at gmail.com]
Sent: Monday, October 23, 2006 4:22 PM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

Chris,

I have not cleaned my test database yet. I'll purge it and redo the tests. This error keeps popping up in unexpected places while running nmake during installation:

"Undefined subroutine &main::UpdateHTML_blib called at -e line 1.
NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code '0xff'"

Is there a way around it??

Seth

From cjfields at uiuc.edu Mon Oct 23 17:53:27 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 16:53:27 -0500
Subject: [Bioperl-l] Bioperl installation under Windows
In-Reply-To:
References:
Message-ID: <9994CFF6-FCA1-4C7F-9A33-31765C6AE255@uiuc.edu>

It won't install in Perl\lib, but in Perl\site\lib. Check there.

We are working intently on the next developer release for BioPerl and plan on having several PPMs available, but we are only supporting ActivePerl 5.8.8.819. I would suggest that you upgrade your ActivePerl installation to that if possible, since PPM has undergone major changes (they use PPM4 now, which has a GUI by default). Most repositories are now moving over to PPM4, so you'll likely be seeing fewer PPM3-compatible packages being made.
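One way to see where a PPM install actually put the Bio/ tree is to walk @INC and report every directory that holds a Bio/ subdirectory. This is only a rough sketch and not something from the original post; the helper name find_bio_dirs is made up for illustration and is not part of BioPerl or PPM. On ActivePerl the hit would be expected under Perl\site\lib, as described above.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return each given directory's Bio/ subdirectory,
# for every directory that actually has one.
sub find_bio_dirs {
    my @search_dirs = @_;
    my @found;
    for my $dir (@search_dirs) {
        next if ref $dir;    # @INC may contain code-ref hooks; skip them
        my $bio = File::Spec->catdir( $dir, 'Bio' );
        push @found, $bio if -d $bio;
    }
    return @found;
}

# Check the library path of the perl that runs this script.
my @hits = find_bio_dirs(@INC);
if (@hits) {
    print "Bio/ found in: $_\n" for @hits;
}
else {
    print "No Bio/ directory found in \@INC\n";
}
```

Run it with the same perl.exe that ppm installed into; if nothing is reported, the package most likely went into a different Perl installation.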
Chris

On Oct 22, 2006, at 2:54 PM, Pablo Ivan wrote:

> Hello,
>
> I have been trying to install Bioperl 1.4 on a Windows XP system, but I
> didn't get too far; my perl installation was made using ActiveState
> 5.8.8 build 816. I then tried the ppm method of searching for bioperl
> in the repositories and installing the core package 1.4. It says that the
> installation was made successfully, but the /Bio folder doesn't show up in
> /lib, and it's like nothing new was installed at all. I was wondering if
> using that version of ActiveState could be causing it, but the uninstall
> option for it isn't showing in Add/Remove, and I'm afraid just deleting the
> folders and installing version 5.6 of AS could somehow damage and make
> things worse. Or should I just forget about it and try using Cygwin?
>
> Thank you,
>
> Pablo.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From johnson.biotech at gmail.com Mon Oct 23 17:22:13 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 23 Oct 2006 17:22:13 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <002c01c6f6c8$7163dd20$15327e82@pyrimidine>
References: <002c01c6f6c8$7163dd20$15327e82@pyrimidine>
Message-ID:

Chris,

I have not cleaned my test database yet. I'll purge it and redo the tests. This error keeps popping up in unexpected places while running nmake during installation:

"Undefined subroutine &main::UpdateHTML_blib called at -e line 1.
NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code '0xff'"

Is there a way around it??

Seth

On 10/23/06, Chris Fields wrote:
>
> Seth,
>
> Did you try this with a clean, taxonomy-installed database? There may be
> some junk left over from the previous test runs.
>
> I'm looking into it this week; it may not make the developer release but
> we'll try to get it in. BTW, the 03simpleseq.t test failures have to do
> with a call to gzip. I'll look into a workaround for that.
>
> Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but
> introduces others. One alternative which I found works is cygwin, but
> there's a catch: DBD-mysql is hard to install. If it isn't one thing it's
> another...
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign

--
Best Regards,

Seth Johnson
Senior Bioinformatics Associate
Ph: (202) 470-0900
Fx: (775) 251-0358

From chhalling at alumni.ls.berkeley.edu Mon Oct 23 21:02:24 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Mon, 23 Oct 2006 21:02:24 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C6509.90005@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk>
Message-ID: <453D6620.5020401@alumni.ls.berkeley.edu>

Sorry, I should know better about giving all the details.

This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a fresh compile) with Mac OS X 10.4.8.

-- Conrad

Nathan S. Haigh wrote:
> Chris Fields wrote:
>
>> Thanks for letting us know! Did PPM4 throw errors or just silently
>> pass them over?
>>
>> Chris
>>
>> On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote:
>>
> I believe he is talking about the bundle on cpan and not the ppd. I will
> get this updated as soon as possible.
>
> Sendu/Chris - can you confirm to me which Bioperl modules are essential
> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
> reason for not putting *all* dependencies into the bundle?
>
> Nath

--
Conrad Halling
chhalling at alumni.ls.berkeley.edu

From n.haigh at sheffield.ac.uk Tue Oct 24 03:05:53 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 24 Oct 2006 08:05:53 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453D6620.5020401@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453D6620.5020401@alumni.ls.berkeley.edu>
Message-ID: <453DBB51.6010505@sheffield.ac.uk>

Conrad Halling wrote:
> Sorry, I should know better about giving all the details.
>
> This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a
> fresh compile) with Mac OS X 10.4.8.
>
> -- Conrad

My apologies Conrad, this was my bad! Are you in need of the corrections being made swiftly or can you wait until the Bioperl 1.5.2 release when I'll ensure the Bundle is updated correctly for that release?

Cheers
Nath

From n.haigh at sheffield.ac.uk Tue Oct 24 05:57:25 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 10:57:25 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CE2D7.5080608@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk>
Message-ID: <453DE385.8010700@sheffield.ac.uk>

--snip--
> Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be
> treated higher than 1.4? Anyway, we can cross that bridge when we get
> there, but this seems appropriate now.
>
> Cheers,
> Sendu.

Just been having a think about this versioning. Does this work well and is it intuitive with versioning the official 1.5.2 developer release and also the 1.6 stable release? I'd like to put forward the following versioning scheme for consideration (most is the same as what it is now, but with some clarification, hopefully):

major-version . minor-version sub-version _ developer-release-version RC-version

The sub-version represents bug fixes and possibly some minor feature enhancements with no API changes.
The minor-version represents some significant feature enhancements/API changes/bug fixes.
The major-version represents significant rewrites of Bioperl.

For an RC of a developer release the version would have _0x (where x = the RC number)
For a non-RC of a developer release the version would have _10
For an RC of a stable release the version would have _0x (where x = the RC number)
For a non-RC of a stable release the version would not have the underscore suffix

Therefore I would see the following $VERSION being applied:
1.5.2 RC1 = 1.52_01
1.5.2 RC2 = 1.52_02
1.5.2 RC3 = 1.52_03
1.5.2 = 1.52_10
1.6 RC1 = 1.60_01
1.6 RC2 = 1.60_02
1.6 = 1.60
1.6.1 RC1 = 1.61_01
1.6.1 = 1.61

This should satisfy the requirement of CPAN for having underscores in versions to indicate a developer release, which here is a Bioperl release with an odd minor version number or any RC, whether it be of a developer release or a stable release. This should mean that we could have the RCs on CPAN, but by default, CPAN would only install the latest "non developer release" (i.e. the last package without an underscore in the version).
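As a rough sketch of how Perl itself treats these values (illustrative only, not BioPerl code; the helper numify_version and the release table are made up here): a bare numeric literal such as 1.52_10 has its underscore ignored by the parser, while the quoted string '1.52_10' stops numifying at the underscore.

```perl
use strict;
use warnings;

# Hypothetical helper: numify a version string the way Perl does in numeric
# context. Parsing stops at the first non-numeric character, so the string
# '1.52_10' yields 1.52.
sub numify_version {
    my ($version) = @_;
    no warnings 'numeric';    # silence "isn't numeric" for '1.52_10'
    return $version + 0;
}

# Bare literals: the parser drops the underscore, so 1.52_10 is 1.5210.
my @releases = (
    [ '1.5.2 RC1', 1.52_01 ],
    [ '1.5.2',     1.52_10 ],
    [ '1.6 RC1',   1.60_01 ],
    [ '1.6 RC2',   1.60_02 ],
    [ '1.6',       1.60    ],
    [ '1.6.1',     1.61    ],
);

# List the releases in plain numeric order.
for my $release ( sort { $a->[1] <=> $b->[1] } @releases ) {
    printf "%-10s %.4f\n", @$release;
}

printf "literal 1.52_10  numifies to %.4f\n", 1.52_10;
printf "string '1.52_10' numifies to %.4f\n", numify_version('1.52_10');
```

Compared purely as numbers, 1.60 sorts below 1.60_01, so it is the underscore in the version, not the numeric value, that has to separate developer releases from stable ones.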
If we are going ahead with the new $VERSION scheme (as it currently is in HEAD), we should, for the sake of clarity, try to talk about Bioperl 1.52 instead of Bioperl 1.5.2 and make an effort to sync the documentation with regards to this.

Nath

From bix at sendu.me.uk Tue Oct 24 06:19:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 11:19:05 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DE385.8010700@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk>
Message-ID: <453DE899.4030603@sendu.me.uk>

Nathan Haigh wrote:
>
> Therefore I would see the following $VERSION being applied:
> 1.5.2 RC1 = 1.52_01
> 1.5.2 RC2 = 1.52_02
> 1.5.2 RC3 = 1.52_03
> 1.5.2 = 1.52_10
> 1.6 RC1 = 1.60_01
> 1.6 RC2 = 1.60_02
> 1.6 = 1.60
> 1.6.1 RC1 = 1.61_01
> 1.6.1 = 1.61
>
> This should satisfy the requirement of CPAN for having underscores in
> versions to indicate a developer release, which here is a Bioperl
> release with an odd minor version number or any RC whether it be of a
> developer release or a stable release. This should mean that we could
> have the RC's on CPAN, but by default, CPAN would only install the
> latest "non developer release" (i.e. the last package without an
> underscore in the version).

That all sounds good to me, except I worry about potential confusion if people look manually at the things available in CPAN, see 1.60_02 and think it is more recent than 1.60 and try to install it manually.

Since
$VERSION = 1.52_10;
is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, final release version should be
$VERSION = 1.6010.

> If we are going ahead with the new $VERSION scheme (as it currently is
> in HEAD), we should, for the sake of clarity, try to talk about Bioperl
> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
> documentation with regards to this.

I might disagree with this though. I think perl people, and perhaps unix people in general, should be used to version numbers like '1.5.2', but then getting '1.52' from the code since such a number allows simple numerical comparisons while the former does not. The former is easier to read and understand. This is just how Perl itself behaves.

Most users who wouldn't expect such a behaviour aren't going to be checking the version number programmatically anyway.

BTW, do we have someone with a CPAN account, or should I get one?

From n.haigh at sheffield.ac.uk Tue Oct 24 07:37:12 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 12:37:12 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DE899.4030603@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk>
Message-ID: <453DFAE8.5050602@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>
>> Therefore I would see the following $VERSION being applied:
>> 1.5.2 RC1 = 1.52_01
>> 1.5.2 RC2 = 1.52_02
>> 1.5.2 RC3 = 1.52_03
>> 1.5.2 = 1.52_10
>> 1.6 RC1 = 1.60_01
>> 1.6 RC2 = 1.60_02
>> 1.6 = 1.60
>> 1.6.1 RC1 = 1.61_01
>> 1.6.1 = 1.61
>>
>> This should satisfy the requirement of CPAN for having underscores in
>> versions to indicate a developer release, which here is a Bioperl
>> release with an odd minor version number or any RC whether it be of a
>> developer release or a stable release. This should mean that we could
>> have the RC's on CPAN, but by default, CPAN would only install the
>> latest "non developer release" (i.e. the last package without an
>> underscore in the version).
>
> That all sounds good to me, except I worry about potential confusion if
> people look manually at the things available in CPAN, see 1.60_02 and
> think it is more recent than 1.60 and try to install it manually.

I'm not sure if this would be a problem. As far as I understand, CPAN treats these packages with underscores in $VERSION as something distinctly different to the other releases (i.e. developer releases). If you look at such a page, it is clearly evident that it is a developer release. For example, if you search on CPAN for the latest version of the CPAN module it shows 1.8802. If you go to that page:
http://search.cpan.org/~andk/CPAN-1.8802/
There is also a link for the latest developer release, released 1 day after 1.8802 with a version of 1.88_57 (which would convert to 1.8857). This too appears to be later than 1.8802, but since it is dealt with as a developer release it doesn't seem to matter: CPAN will only deal with the stable (non-developer) releases, while the developer releases can be used as a convenient way to access developer releases. Although I'm thinking CPAN uses some hocus pocus with release dates too.

> Since
> $VERSION = 1.52_10;
> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release,
> final release version should be
> $VERSION = 1.6010.

Because they are dealt with separately, I don't think this is an issue (see above).
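The selection rule described here (ignore anything with an underscore, then take the highest remaining version) can be sketched as follows. This only illustrates the rule as stated in this thread, not how the CPAN indexer is actually implemented; latest_stable and the version list are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the highest version string that has no
# underscore, i.e. the latest release that is not a developer release.
# Returns undef (or an empty list) when every candidate is a developer release.
sub latest_stable {
    my @versions = @_;
    my @stable   = grep { !/_/ } @versions;
    return unless @stable;
    my ($latest) = sort { $b <=> $a } @stable;
    return $latest;
}

# Made-up set of uploads following the proposed numbering.
my @uploads = qw(1.4 1.52_01 1.52_02 1.52_10 1.60_01 1.60_02 1.60);
print "latest stable: ", latest_stable(@uploads), "\n";
```

Under this rule 1.60_02 never competes with 1.60, and 1.60 still compares higher than 1.4 because the remaining versions are compared as plain numbers.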
>> If we are going ahead with the new $VERSION scheme (as it currently is >> in HEAD), we should, for the sake of clarity, try to talk about Bioperl >> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the >> documentation with regards to this. >> > > I might disagree with this though. I think perl people, and perhaps unix > people in general, should be used to version numbers like '1.5.2', but > then getting '1.52' from the code since such a number allows simple > numerical comparisons while the former does not. The former is easier to > read and understand. This is just how Perl itself behaves. > > Most users who wouldn't expect such a behaviour aren't going to be > checking the version number programatically anyway. > > > BTW. do we have someone with a CPAN account, or should I get one? > It says Ewan Birney is the author of Bioperl - I assume it must be possible to have multiple people have the permissions to update a single package. Nath From chhalling at alumni.ls.berkeley.edu Tue Oct 24 07:15:12 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Tue, 24 Oct 2006 07:15:12 -0400 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453DBB51.6010505@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453D6620.5020401@alumni.ls.berkeley.edu> <453DBB51.6010505@sheffield.ac.uk> Message-ID: <453DF5C0.3040104@alumni.ls.berkeley.edu> Nathan S. Haigh wrote: > Conrad Halling wrote: >> Sorry, I should know better about giving all the details. >> >> This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 >> (a fresh compile) with Mac OS X 10.4.8. >> >> -- Conrad > My apologies Conrad, this was my bad! Are you in need of the > corrections being made swiftly or can you wait until the Bioperl 1.5.2 > release when I'll ensure the Bundle is updated correctly for that > release? > > Cheers > Nath No, I'm fine. 
I used the cpan utility to load the three modules manually. -- Conrad -- Conrad Halling chhalling at alumni.ls.berkeley.edu From bix at sendu.me.uk Tue Oct 24 08:16:54 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 13:16:54 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453DFAE8.5050602@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> Message-ID: <453E0436.3050903@sendu.me.uk> Nathan Haigh wrote: > Sendu Bala wrote: > >> That all sounds good to me, except I worry about potential confusion if >> people look manually at the things available in CPAN, see 1.60_02 and >> think it is more recent than 1.60 and try to install it manually. > > I not sure if this would be a problem. As far as I understand, CPAN > treats these packages with underscores in $VERSION as something > distinctly different to the others releases (i.e. developer releases). > If you look at such a page, it is clearly evident that it is a > developers release. For example, if you search on CPAN for the latest > version of the CPAN module is shows 1.8802. if you go to that page: > http://search.cpan.org/~andk/CPAN-1.8802/ > There is also a link for the latest developer release, released 1 day > after 1.8802 with a version of 1.88_57 (which would convert to 1.8857). [snip] >> Since >> $VERSION = 1.52_10; >> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, >> final release version should be >> $VERSION = 1.6010. 
> > Because they are dealt with separately, I don't think this is an issue > (see above). If you don't notice the dates, or are doing numerical version number comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may not be automatic, but you can still choose to download the developer releases. This means that if we say to someone 'use Bioperl 1.6 or better' they may choose to get the latest version and think it is 1.6002 when in fact 1.60 was the more recent version. 1.6010 solves the problem, is consistent with your 1.50_10 suggestion, and doesn't cause any problems as far as I can see. >>> If we are going ahead with the new $VERSION scheme (as it currently is >>> in HEAD), we should, for the sake of clarity, try to talk about Bioperl >>> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the >>> documentation with regards to this. >>> >> I might disagree with this though. I think perl people, and perhaps unix >> people in general, should be used to version numbers like '1.5.2', but >> then getting '1.52' from the code since such a number allows simple >> numerical comparisons while the former does not. The former is easier to >> read and understand. This is just how Perl itself behaves. >> >> Most users who wouldn't expect such a behaviour aren't going to be >> checking the version number programatically anyway. >> >> >> BTW. do we have someone with a CPAN account, or should I get one? >> > > It says Ewan Birney is the author of Bioperl - I assume it must be > possible to have multiple people have the permissions to update a single > package. How did you get Bundle::BioPerl updated? Did you just ask Chris Dagdigian to do it for you? Or do you have access to his account? I'll ask Ewan about it.
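[Editor's sketch] The numeric-comparison point made above can be checked with a few lines of Perl. This is only an illustration under stated assumptions: the labels and the by_version helper are made up for the example and are not part of Bioperl or of any CPAN tooling. It relies on Perl ignoring underscores inside a numeric literal.

```perl
use strict;
use warnings;

# Candidate $VERSION values from the thread. Perl ignores underscores in a
# numeric literal, so 1.60_02 is the same number as 1.6002.
my @candidates = (
    [ '1.6 RC2 written as 1.60_02',  1.60_02 ],
    [ '1.6 final written as 1.60',   1.60    ],
    [ '1.6 final written as 1.6010', 1.6010  ],
);

# Hypothetical helper: order the candidates the way a plain numeric
# "this version or better" check would see them.
sub by_version {
    return sort { $a->[1] <=> $b->[1] } @_;
}

# Print lowest to highest.
for my $entry ( by_version(@candidates) ) {
    printf "%-28s => %s\n", @$entry;
}
```

Sorted numerically, a final release numbered 1.60 lands below its own RC2, while a final release numbered 1.6010 lands above it, which is the argument for the 1.6010 form.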
From n.haigh at sheffield.ac.uk Tue Oct 24 08:21:56 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Tue, 24 Oct 2006 13:21:56 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0436.3050903@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> <453E0436.3050903@sendu.me.uk> Message-ID: <453E0564.9030302@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: >> Sendu Bala wrote: >> >>> That all sounds good to me, except I worry about potential confusion >>> if people look manually at the things available in CPAN, see 1.60_02 >>> and think it is more recent than 1.60 and try to install it manually. >> >> I not sure if this would be a problem. As far as I understand, CPAN >> treats these packages with underscores in $VERSION as something >> distinctly different to the others releases (i.e. developer releases). >> If you look at such a page, it is clearly evident that it is a >> developers release. For example, if you search on CPAN for the latest >> version of the CPAN module is shows 1.8802. if you go to that page: >> http://search.cpan.org/~andk/CPAN-1.8802/ >> There is also a link for the latest developer release, released 1 day >> after 1.8802 with a version of 1.88_57 (which would convert to 1.8857). > > [snip] > >>> Since >>> $VERSION = 1.52_10; >>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before >>> release, final release version should be >>> $VERSION = 1.6010. 
>> >> Because they are dealt with separately, I don't think this is an issue >> (see above). > > If you don't notice the dates, or are doing numerical version number > comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may > not be automatic, but you can still chose to download the developer > releases. Which means if we say to someone 'use Bioperl 1.6 or better' > they may choose to get the latest version and think it is 1.6002 when > infact 1.60 was the more recent version. 1.6010 solves the problem, is > consistent with your 1.50_10 suggestion, and doesn't cause any > problems as far as I can see. > > I see - you mean for a non-RC release append 10 to the version number and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to the version. --snip-- > > How did you get Bundle::BioPerl updated? Did you just ask Chris > Dagdigian to do it for you? Or do you have access to his account? I'll > ask Ewan about it. I just asked Chris D. to do it for me :o) Nath From bix at sendu.me.uk Tue Oct 24 09:01:22 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 14:01:22 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0564.9030302@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> <453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk> Message-ID: <453E0EA2.6050306@sendu.me.uk> Nathan Haigh wrote: > I see - you mean for a non-RC release append 10 to the version number > and if it's a 
developer release i.e. 1.5, 1.7 series, you append _10 to
> the version.

Precisely.

1.5.2 RC3 will have in Bio::Root::Version :

$VERSION = 1.52_03;
$VERSION = eval $VERSION; # $VERSION is 1.5203

1.5.2 final release would have:

$VERSION = 1.52_10;
$VERSION = eval $VERSION; # $VERSION is 1.5210

1.6.0 RC1 would have:

$VERSION = 1.60_01;
$VERSION = eval $VERSION; # $VERSION is 1.6001

1.6.0 final release would have:

$VERSION = 1.6010;

Nice thing about putting RCs up on CPAN is that I suppose we'd see the test results from cpantesters. The more test results the better :)

From n.haigh at sheffield.ac.uk Tue Oct 24 09:05:54 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 14:05:54 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0EA2.6050306@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> <453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk> <453E0EA2.6050306@sendu.me.uk>
Message-ID: <453E0FB2.4080002@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> I see - you mean for a non-RC release append 10 to the version number
>> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
>> the version.
>
> Precisely.
> > 1.5.2 RC3 will have in Bio::Root::Version : > > $VERSION = 1.52_03; > $VERSION = eval $VERSION; # $VERSION is 1.5203 > > 1.5.2 final release would have: > > $VERSION = 1.52_10; > $VERSION = eval $VERSION; # $VERSION is 1.5210 > > 1.6.0 RC1 would have: > > $VERSION = 1.60_01; > $VERSION = eval $VERSION; # $VERSION is 1.6001 > > 1.6.0 final release would have: > > $VERSION = 1.6010; > > > Nice thing about putting RCs up on CPAN is that I suppose we'd see the > test results from cpantesters. The more test results the better :) Did you see the cpants site I sent earlier: http://cpants.perl.org/dist/bioperl But I'm not sure why 1.4 didn't make it in there instead of 1.2.3 From bix at sendu.me.uk Tue Oct 24 09:14:08 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 14:14:08 +0100 Subject: [Bioperl-l] CPAN testing Service In-Reply-To: <453D2120.9010301@sheffield.ac.uk> References: <453D2120.9010301@sheffield.ac.uk> Message-ID: <453E11A0.20304@sendu.me.uk> Nathan S. Haigh wrote: > We should also check the CPAN testing service (CPANTS) to see how "good" > our package is for CPAN and try to increase the Kwalitee score. There > only appears to be details for bioperl-1.2.3 for some reason: > http://cpants.perl.org/dist/bioperl Yes, but I think it will be pretty similar score this time round. We'll resolve the remaining issues for 1.6. From cjfields at uiuc.edu Tue Oct 24 10:24:44 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 24 Oct 2006 09:24:44 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0436.3050903@sendu.me.uk> Message-ID: <000501c6f778$279cee10$15327e82@pyrimidine> ... > >> Since > >> $VERSION = 1.52_10; > >> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, > >> final release version should be > >> $VERSION = 1.6010. > > > > Because they are dealt with separately, I don't think this is an issue > > (see above). 
> > If you don't notice the dates, or are doing numerical version number > comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may > not be automatic, but you can still chose to download the developer > releases. Which means if we say to someone 'use Bioperl 1.6 or better' > they may choose to get the latest version and think it is 1.6002 when > infact 1.60 was the more recent version. 1.6010 solves the problem, is > consistent with your 1.50_10 suggestion, and doesn't cause any problems > as far as I can see. CPAN looks like it can handle 'x.y.z', at least for Pugs: http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/ >From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': our $VERSION = 6.002013; That's also a very perlish-way to do it. And there are no developer versions of Pugs, since it is always under active development. We could try something like: our $VERSION = 1.005002_01; just to tag it as a developer release or release candidate, if that's what you want; I'm neutral to that point. I don't think it's necessary to post every RC to CPAN, though, unless you feel very strongly about it. It just seems like more hassle than it's worth, esp. since you've been releasing about one per week leading up to a final 1.5.2 (due soon). > >> I might disagree with this though. I think perl people, and perhaps > unix > >> people in general, should be used to version numbers like '1.5.2', but > >> then getting '1.52' from the code since such a number allows simple > >> numerical comparisons while the former does not. The former is easier > to > >> read and understand. This is just how Perl itself behaves. > >> > >> Most users who wouldn't expect such a behaviour aren't going to be > >> checking the version number programatically anyway. > >> > >> > >> BTW. do we have someone with a CPAN account, or should I get one? 
> >> > > > > It says Ewan Birney is the author of Bioperl - I assume it must be > > possible to have multiple people have the permissions to update a single > > package. As a quick response to the above, I would read 'rel. 1.5.2' as the second patched release of the second revision (here in a developer cycle) of the first major release. I would read 'rel 1.52' as the 52nd release of the major release (just can't quite make it to version 2, I guess). I don't think we can use the latter as it is just too confusing, especially since we've adopted the 'major.minor.patch' versioning quite early on. As for CPAN, I believe there is usually a person or group responsible for maintaining each distribution. As Ewan seems to be the point man, you'll have to ask him. I suppose it is possible to add more if needed > How did you get Bundle::BioPerl updated? Did you just ask Chris > Dagdigian to do it for you? Or do you have access to his account? I'll > ask Ewan about it. When I inquired about XML::Simple, I emailed Chris D. via his contact information from CPAN. He let me know that adding it would be pretty easy, so all you need to do is let him know about any errors/additions/deletions. I think his wiki page also has some contact info. Which reminds me, if anyone contacts him, could you make sure that XML::Simple is added? I can't remember if it has been. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 24 10:29:11 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 24 Oct 2006 09:29:11 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0FB2.4080002@sheffield.ac.uk> Message-ID: <000601c6f778$c639f0e0$15327e82@pyrimidine> > Sendu Bala wrote: > > Nathan Haigh wrote: > >> I see - you mean for a non-RC release append 10 to the version number > >> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to > >> the version. > > > > Precisely. 
> > > > 1.5.2 RC3 will have in Bio::Root::Version : > > > > $VERSION = 1.52_03; > > $VERSION = eval $VERSION; # $VERSION is 1.5203 > > > > 1.5.2 final release would have: > > > > $VERSION = 1.52_10; > > $VERSION = eval $VERSION; # $VERSION is 1.5210 > > > > 1.6.0 RC1 would have: > > > > $VERSION = 1.60_01; > > $VERSION = eval $VERSION; # $VERSION is 1.6001 > > > > 1.6.0 final release would have: > > > > $VERSION = 1.6010; > > > > > > Nice thing about putting RCs up on CPAN is that I suppose we'd see the > > test results from cpantesters. The more test results the better :) > Did you see the cpants site I sent earlier: > http://cpants.perl.org/dist/bioperl > > But I'm not sure why 1.4 didn't make it in there instead of 1.2.3 Yes, odd. Another thing to note is that CPAN also list two bugs related to bioperl 1.4. We may need to have some way of either redirecting users from there to bugzilla, or routinely checking the CPAN site. Otherwise we'll miss those. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From JK at novozymes.com Tue Oct 24 10:45:26 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 16:45:26 +0200 Subject: [Bioperl-l] Keeping references around in the objects? Message-ID: <934F95E71B6C9347A873C42AE3C196191299E011@NZT0004E.dknz.nzcorp.net> Hi All. When getting a Bio::Seq object back from a feature it would be really nice to have access to the old objects through the new object as: $featseq->feature()->parent_seq(); Would it be possible to keep the references around for (as an example) to be able to access the global information through the particular feature. Most of the annotation in the general header of a EMBL/Genbank-record also applies to the specific features. Jesper From JK at novozymes.com Tue Oct 24 10:28:22 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 16:28:22 +0200 Subject: [Bioperl-l] Subclassing Bio::Seq ? 
Extending Bio::Perl Message-ID: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> Hi. We're trying to "extend" bioperl in our own setup. We have some functions that we'd like to "always" have available on a Bio::Seq object. As an example, I'd like to have the sequence digest available via ->digest, which just returns a hex-encoded message digest of the sequence in the object. This is really convenient when trying to figure out whether we've got some computations stored in the cache for this particular sequence. Another example is that we have some fields we want to be mandatory in the objects, so adding additional checks in the constructor is necessary. Our approach has been to "subclass" Bio::Seq in a new object (Nz::Seq) and add the functionality there. This generally works fine (->translate() calls ->can_call_new() and instantiates the correct subclassed object). But the logic fails when the ->seq of a feature just instantiates a Bio::PrimarySeq without trying to get the subclassed object. So the question basically is: What is the preferred way of extending/subclassing Bioperl objects with our own methods? Jesper From bix at sendu.me.uk Tue Oct 24 11:26:19 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 16:26:19 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <000501c6f778$279cee10$15327e82@pyrimidine> References: <000501c6f778$279cee10$15327e82@pyrimidine> Message-ID: <453E309B.9090007@sendu.me.uk> Chris Fields wrote: > ... >>>> Since >>>> $VERSION = 1.52_10; >>>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, >>>> final release version should be >>>> $VERSION = 1.6010. >>> Because they are dealt with separately, I don't think this is an issue >>> (see above). >> If you don't notice the dates, or are doing numerical version number >> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may >> not be automatic, but you can still chose to download the developer >> releases.
Which means if we say to someone 'use Bioperl 1.6 or better' >> they may choose to get the latest version and think it is 1.6002 when >> infact 1.60 was the more recent version. 1.6010 solves the problem, is >> consistent with your 1.50_10 suggestion, and doesn't cause any problems >> as far as I can see. > > CPAN looks like it can handle 'x.y.z', at least for Pugs: > > http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/ 'handle'? I think it shows up as '6.2.13' simply because it was uploaded with the filename Perl6-Pugs-6.2.13.tar.gz As you point out, the code has the kind of $VERSION number we've been suggesting in this thread: > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': > > our $VERSION = 6.002013; > > That's also a very perlish-way to do it. And there are no developer > versions of Pugs, since it is always under active development. We could try > something like: > > our $VERSION = 1.005002_01; Yes, this was already like one of my suggestions (1.0502_01), but I brought up the concern that 1.05 might be < 1.4. So then we have a question: do we try and fumble a 1.4 compatible number by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no room for RC numbering, or 1.006000010 (1.6.0.10) - the first final release following some 1.006000_001 (1.6.0.01 == rc1) RCs? > just to tag it as a developer release or release candidate, if that's what > you want; I'm neutral to that point. I don't think it's necessary to post > every RC to CPAN, though, unless you feel very strongly about it. It just > seems like more hassle than it's worth, esp. since you've been releasing > about one per week leading up to a final 1.5.2 (due soon). I don't think it would be a hassle; on the contrary it would be very useful to know the CPAN distribution actually works. I'm very happy with the idea that a release candidate gets fully tested... 
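[Editor's sketch] The concern above, that a three-digits-per-part number such as 1.006000 would compare lower than the existing 1.4 release, can be illustrated as follows. The dotted_to_decimal helper is hypothetical and written only for this example; it is not a Bioperl or CPAN function, and the 1.4 threshold is simply the older release number mentioned in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a dotted 'major.minor.patch' string to the
# three-digits-per-part decimal form, e.g. '6.2.13' -> '6.002013'.
sub dotted_to_decimal {
    my ($dotted) = @_;
    my ( $major, $minor, $patch ) = split /\./, $dotted;
    return sprintf '%d.%03d%03d', $major, $minor || 0, $patch || 0;
}

# Compare each converted number against the older two-part release 1.4.
for my $dotted ( '6.2.13', '1.6.0', '1.5.2' ) {
    my $decimal = dotted_to_decimal($dotted);
    my $note = $decimal < 1.4 ? 'sorts below 1.4' : 'sorts at or above 1.4';
    print "$dotted -> $decimal ($note)\n";
}
```

Under this scheme both 1.5.2 and 1.6.0 compare numerically lower than 1.4, which is why the thread weighs a 1.4-compatible number such as 1.60_10 against a clean break.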
From bix at sendu.me.uk Tue Oct 24 11:39:16 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 16:39:16 +0100 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> Message-ID: <453E33A4.5060004@sendu.me.uk> JK (Jesper Agerbo Krogh) wrote: > Hi. > > We're trying to "extend" bioperl in our own setup. We have some funtions > that we'd like to "allways" have available on a Bio::Seq-object. [snip] > So the question basically is: > What is the preferred way of extending/subclassing Bio-perl -objects > with our own methods? http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit From hlapp at gmx.net Tue Oct 24 12:24:09 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 24 Oct 2006 12:24:09 -0400 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> Message-ID: I think you've generally taken the right path, but see below. First off, object factories are used extensively already but not yet in each and every place where Bioperl creates an object internally. Achieving your goal may entail fixes to Bioperl to use a factory instead of a hard-coded module name. Also be on the lookout for factory() or seq_factory() methods for classes whose work entails creating sequence objects and that already give you control over the type to be created. The problem that hits you here though isn't one of determining the type of the object to be created, because the respective method doesn't create a sequence object. It only returns the sequence object that the feature has a reference to. 
The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your extension of the latter is that the Perl garbage collector can't deal with circular references. The way we've circumvented the problem with sequence objects (which hold references to their feature objects) and feature objects (which need to hold a reference to their sequence object) is to make Bio::Seq a wrapper around Bio::PrimarySeq (i.e., Bio::Seq implements Bio::PrimarySeqI by delegating all the Bio::PrimarySeqI methods to an instance of Bio::PrimarySeq, and then adds implementations of the Bio::SeqI methods), and then make feature objects only hold a reference to the 'base' Bio::PrimarySeq instance. This works because Bio::PrimarySeq doesn't hold features, only Bio::SeqI objects do. Having said all that, note that if all you want to do is define computations on Bio::Seq objects, as opposed to storing values for additional attributes, the best design approach is not to extend the class but to create a class with those computations as static methods (which would accept the seq object on which to compute as an argument; e.g., print $seqComputations->message_digest($seq)). -hilmar On Oct 24, 2006, at 10:28 AM, JK ((Jesper Agerbo Krogh)) wrote: > Hi. > > We're trying to "extend" bioperl in our own setup. We have some > funtions > > that we'd like to "allways" have available on a Bio::Seq-object. As an > example, > I'd like to have the sequence-digest available on ->digest that just > returns > A hex-encoded message-digest of the sequence in the object. This is > really comfortable > when trying to figure out wether we've got some computations stored in > the cache > for this particular sequence. > > Another example is that we have some fields we want to be mandatory in > the objects, > thus adding additional checks in the constructor is nessesary. > > Our approach has been to "subclass" Bio::Seq in a new object: > (Nz::Seq) > and add > the functionality there.
This generally works fine (->translate() > calls > ->can_call_new() > and instantiates the correct subclassed object. > > But the logic fails when the ->seq of a feature just instantiates a > Bio::PrimarySeq > without trying to get the subclassed object. > > So the question basically is: > What is the preferred way of extending/subclassing Bio-perl -objects > with > our own methods? > > Jesper > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 24 12:45:25 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 24 Oct 2006 11:45:25 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E309B.9090007@sendu.me.uk> Message-ID: <000001c6f78b$d1c65a30$15327e82@pyrimidine> ... > > 'handle'? I think it shows up as '6.2.13' simply because it was uploaded > with the filename Perl6-Pugs-6.2.13.tar.gz Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is '6.002013'. So maybe we should follow a similar convention. Seems easier and less confusing to me, at least. > As you point out, the code has the kind of $VERSION number we've been > suggesting in this thread: > > > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': > > > > our $VERSION = 6.002013; > > > > That's also a very perlish-way to do it. And there are no developer > > versions of Pugs, since it is always under active development. We could > try > > something like: > > > > our $VERSION = 1.005002_01; > > Yes, this was already like one of my suggestions (1.0502_01), but I > brought up the concern that 1.05 might be < 1.4. 
> > So then we have a question: do we try and fumble a 1.4 compatible number > by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if > it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no > room for RC numbering, or 1.006000010 (1.6.0.10) - the first final > release following some 1.006000_001 (1.6.0.01 == rc1) RCs? I would go for the clean break if it follows perl/CPAN convention. '1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing. If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6 RC1, 1.6 RC2 etc then that would be consistent and perl-compatible. BTW, the reason I looked at Pugs was to see what some of the Perl6 developers were using. Who knows; they'll probably change it! ... > I don't think it would be a hassle; on the contrary it would be very > useful to know the CPAN distribution actually works. I'm very happy with > the idea that a release candidate gets fully tested... So you obviously feel strongly about it! ;> I don't have a problem as long as we stick with doing this from now on (i.e. have a consistent versioning scheme, release policy, CPAN release policy, etc). Would be nice for Jason/Brian/Hilmar to chime in as to the reasoning behind the older versioning scheme. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From JK at novozymes.com Tue Oct 24 13:59:10 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 19:59:10 +0200 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n et> Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net> > > I think you've generally taken the right path, but see below. > > First off, object factories are used extensively already but not yet > in each and every place where Bioperl creates an object internally. 
> Achieving your goal may entail fixes to Bioperl to use a factory > instead of a hard-coded module name. Also be on the lookout for > factory() or seq_factory() methods for classes whose work entails > creating sequence objects and that already give you control over the > type to be created. Can you elaborate/describe this a bit more? > The problem that hits you here though isn't one of determining the > type of the object to be created, because the respective method > doesn't create a sequence object. It only returns the sequence object > that the feature has a reference to. That is what Data::Dumper told me, but another thing I would like to change is to get a RichSeq object returned every time from Bio::Seq, adding in the stuff that always seems appropriate. > The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your > extension of the latter is that the Perl garbage collector can't deal > with circular references. Doesn't Scalar::Util::weaken solve that? > Having said all that, note that if all what you want to do is > defining computations on Bio::Seq objects, as opposed to storing > values for additional attributes, the best design approach is not to > extend the class but to create a class with those computations as > static methods (which would accept the seq object on which to compute > as an argument; e.g., print $seqComputations->message_digest($seq)). I could, but there is some functionality that, by design, I would like to have available on every sequence in the system. This way I would end up coding the functionality for getting the message_digest every place that I needed to get the value (which would be quite often in this application), whereas by design it belongs in the Bio::Seq stuff. Jesper From JK at novozymes.com Tue Oct 24 13:59:19 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 19:59:19 +0200 Subject: [Bioperl-l] Subclassing Bio::Seq ?
Extending Bio::Perl References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n et> <453E33A4.5060004@sendu.me.uk> Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FD@NZT0004E.dknz.nzcorp.net> > JK (Jesper Agerbo Krogh) wrote: > > Hi. > > > > We're trying to "extend" bioperl in our own setup. We have some funtions > > that we'd like to "allways" have available on a Bio::Seq-object. > [snip] > > So the question basically is: > > What is the preferred way of extending/subclassing Bio-perl -objects > > with our own methods? > > http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit That is definately a way of extending Bio-perl, thanks. Jesper From hlapp at gmx.net Tue Oct 24 14:57:02 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 24 Oct 2006 14:57:02 -0400 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net> References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n et> <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net> Message-ID: On Oct 24, 2006, at 1:59 PM, JK ((Jesper Agerbo Krogh)) wrote: >> >> I think you've generally taken the right path, but see below. >> >> First off, object factories are used extensively already but not yet >> in each and every place where Bioperl creates an object internally. >> Achieving your goal may entail fixes to Bioperl to use a factory >> instead of a hard-coded module name. Also be on the lookout for >> factory() or seq_factory() methods for classes whose work entails >> creating sequence objects and that already give you control over the >> type to be created. > > Can you elaborate/describe this a bit more? See for example the POD of Bio::SeqIO (sorry, the method is called sequence_factory()). > >> The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your >> extension of the latter is that the Perl garbage collector can't deal >> with circular references. 
> > Doesn't Scalar::Util::weaken solve that? You're welcome to test and try. It should be a simple change in Bio::Seq::add_SeqFeature(). You will see that it is this method and not the feature object that makes sure the wrapped primarySeq gets passed as sequence reference. Just change that to creating a new reference to the sequence object and make it a weak reference before passing it to the feature object. (The feature object has no requirement (or knowledge) that the referenced sequence object is a PrimarySeq.) > >> Having said all that, note that if all what you want to do is >> defining computations on Bio::Seq objects, as opposed to storing >> values for additional attributes, the best design approach is not to >> extend the class but to create a class with those computations as >> static methods (which would accept the seq object on which to compute >> as an argument; e.g., print $seqComputations->message_digest($seq)). > > I could but there are some functionality that I'd by design would > like to > have available on every sequence in the system. This way I would > end up > coding the functionality for getting the message_digest every place > that > I needed to get the value (which would be quite often in this > application), > whereas it by design belongs into the Bio::Seq-stuff. I'm not following you why this would make any difference (it would be $seq->message_digest() compared to $seqCompute->message_digest ($seq)), unless what you are saying is that you would like to cache the result of the computation. 
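The weak-reference idea discussed above can be sketched in isolation, without touching Bioperl. This is only a minimal illustration under stated assumptions: My::Seq, My::Feature and add_feature() are hypothetical stand-ins, not Bioperl classes or the actual Bio::Seq::add_SeqFeature() code. Note that a copy of a weak reference is a strong one again, so the weakening has to be applied to the slot that stores the back-reference.

```perl
use strict;
use warnings;
use Scalar::Util ();

# Hypothetical stand-in for a sequence object (NOT a Bioperl class).
package My::Seq;
our $destroyed = 0;
sub new {
    my ($class, %args) = @_;
    return bless { id => $args{id}, features => [] }, $class;
}
sub add_feature {
    my ($self, $feat) = @_;
    push @{ $self->{features} }, $feat;     # strong reference: seq -> feature
    $feat->{seq} = $self;                   # back-reference: feature -> seq
    Scalar::Util::weaken($feat->{seq});     # weaken the stored slot, so no cycle keeps both alive
    return $self;
}
sub DESTROY { $destroyed++ }

# Hypothetical stand-in for a feature object (NOT a Bioperl class).
package My::Feature;
sub new { return bless {}, shift }
sub seq { return $_[0]{seq} }

package main;

my $feat = My::Feature->new;
{
    my $seq = My::Seq->new(id => 'demo');
    $seq->add_feature($feat);
    print "feature sees seq: ", $feat->seq->{id}, "\n";
}
# $seq is out of scope here; only the weak back-reference pointed at it.
print "seq destroyed: $My::Seq::destroyed\n";
print "back-reference now: ", (defined $feat->seq ? 'defined' : 'undef'), "\n";
```

The sketch also shows the trade-off: once the sequence goes away, a feature that outlives it is left with an undefined back-reference, so calling code would have to cope with that.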
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Wed Oct 25 06:36:27 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 25 Oct 2006 11:36:27 +0100 Subject: [Bioperl-l] Lagan environment variable Message-ID: <453F3E2B.2040309@sendu.me.uk> Notification to say I'm changing the environmental variable that Bio::Tools::Run::Alignment::Lagan expects to define the location of the lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the default variable that the lagan installation and scripts themselves look for. I hope this isn't too much of a burden, but it seems like the sensible approach to getting Bio::Tools::Run::Alignment::Lagan to actually work. Thank you, Sendu. From n.haigh at sheffield.ac.uk Wed Oct 25 09:07:47 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 25 Oct 2006 13:07:47 +0000 Subject: [Bioperl-l] Lagan environment variable In-Reply-To: <453F3E2B.2040309@sendu.me.uk> References: <453F3E2B.2040309@sendu.me.uk> Message-ID: <453F61A3.4090904@sheffield.ac.uk> Sendu Bala wrote: > Notification to say I'm changing the environmental variable that > Bio::Tools::Run::Alignment::Lagan expects to define the location of the > lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the > default variable that the lagan installation and scripts themselves look > for. > > I hope this isn't too much of a burden, but it seems like the sensible > approach to getting Bio::Tools::Run::Alignment::Lagan to actually work. > > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > Woudn't it make more sense to change the test? 
That is what I've just done for t/Genscan.t It seemed to fit in with the ENV variable syntax that other modules in Bioperl-run used. Nath -- > A: Yes. >> Q: Are you sure? >> >>> A: Because it reverses the logical flow of conversation. >>> >>>> Q: Why is top posting frowned upon? >>>> Get Thunderbird From bix at sendu.me.uk Wed Oct 25 08:12:00 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 25 Oct 2006 13:12:00 +0100 Subject: [Bioperl-l] Lagan environment variable In-Reply-To: <453F61A3.4090904@sheffield.ac.uk> References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk> Message-ID: <453F5490.7060808@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: >> Notification to say I'm changing the environmental variable that >> Bio::Tools::Run::Alignment::Lagan expects to define the location of the >> lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the >> default variable that the lagan installation and scripts themselves look >> for. >> >> I hope this isn't too much of a burden, but it seems like the sensible >> approach to getting Bio::Tools::Run::Alignment::Lagan to actually work. > > Woudn't it make more sense to change the test? That is what I've just > done for t/Genscan.t For Genscan.t, the test script looked at the wrong environment variable. Here I'm talking about lagan itself (the thing you get from http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with Bioperl) needing the environment variable LAGAN_DIR to be set in order to work. Since you need to set LAGAN_DIR to make lagan work, it makes sense that the Bioperl front-end to lagan also use the same variable. From n.haigh at sheffield.ac.uk Wed Oct 25 09:16:16 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Wed, 25 Oct 2006 13:16:16 +0000 Subject: [Bioperl-l] Lagan environment variable In-Reply-To: <453F5490.7060808@sendu.me.uk> References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk> <453F5490.7060808@sendu.me.uk> Message-ID: <453F63A0.7040609@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> Sendu Bala wrote: >>> Notification to say I'm changing the environmental variable that >>> Bio::Tools::Run::Alignment::Lagan expects to define the location of >>> the lagan executables from LAGANDIR to LAGAN_DIR, since the latter >>> is the default variable that the lagan installation and scripts >>> themselves look for. >>> >>> I hope this isn't too much of a burden, but it seems like the >>> sensible approach to getting Bio::Tools::Run::Alignment::Lagan to >>> actually work. >> >> Woudn't it make more sense to change the test? That is what I've just >> done for t/Genscan.t > > For Genscan.t, the test script looked at the wrong environment variable. > > Here I'm talking about lagan itself (the thing you get from > http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with > Bioperl) needing the environment variable LAGAN_DIR to be set in order > to work. > > Since you need to set LAGAN_DIR to make lagan work, it makes sense > that the Bioperl front-end to lagan also use the same variable. > Ah, OK! :-[ teach me for speak up about something I know nothing about! :-) FYI, I've been busy this morning installing as much Bioperl-run external software as I could (those that have tests). Will be posting results shorty. 
Nath

From massimo.ubaldi at gmail.com Wed Oct 25 10:28:52 2006
From: massimo.ubaldi at gmail.com (Massimo Ubaldi)
Date: Wed, 25 Oct 2006 16:28:52 +0200
Subject: [Bioperl-l] blastxml format
Message-ID: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>

Hi,

I'm using the script below to parse blastn output for multiple sequences. I got the output from the BLAST web interface, asking for XML-formatted output. Everything works fine except that I cannot print the name of each input sequence (see below). That is, using the line $result->query_description I get just the name of the first sequence. In fact this is defined by the <BlastOutput_query-def> tag. What I really want is to extract the name that is defined by the <Iteration_query-def> tag. I have searched the bioperl mailing list and other sources but did not find anything to solve this. Can somebody help me?

Thanks a lot,
Massimo

This is an example of the output I got:

MRDNA_probe
46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor form B (LOC562171), mRNA 68354945 XM_685568
81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA 68420187 XM_684078

This is what I'd like to get:

MRDNA_probe
46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor form B (LOC562171), mRNA 68354945 XM_685568
VDRacterm_probe
81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633
ARalpcterm_probe
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA 68420187 XM_684078

This is the script:

#!/usr/bin/perl
use strict;
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast',
                           -file   => 'Blastn_danio.bls');
open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, stopped";
my $result = $in->next_result;
print OUTFILE $result->algorithm, "\n";
print OUTFILE $result->database_name, "\n";

print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers", "\t", "GenBank Accession", "\n";

while($result = $in->next_result ) {
    print OUTFILE
$result->query_description, "\n"; while( my $hit = $result->next_hit ) { while( my $hsp = $hit->next_hsp ) { my $acc=$hit->name; my $description= $hit->description; $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/; print OUTFILE $hit->raw_score, "\t", # Score $hit->description, "\t", # Description $1, "\t", $2, "\n"; } } } From cjfields at uiuc.edu Wed Oct 25 11:04:14 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 25 Oct 2006 10:04:14 -0500 Subject: [Bioperl-l] blastxml format In-Reply-To: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com> Message-ID: <000301c6f846$d6227760$15327e82@pyrimidine> Iterations (which are related to PSIBLAST) aren't currently handled in blastxml, which is why the tag isn't being parsed. I'll give it a look but I don't think it will be properly fixed anytime soon, since we're gearing up for a developer release and are sorting out various bugs in relation to that. In the meantime, you could always try changing the relevant tag in the %MAPPING hash in your local copy of Bio::SearchIO::blastxml from 'BlastOutput_query-def' to 'Iteration_query-def', which may do the trick for you. I'm a bit reluctant to change this in CVS as it would be better to add this in when iterations are handled properly by blastxml, and I'm not sure all BLAST XML varieties have the tag. If you want you can add this to the bioperl bugzilla as an enhancement request to remind us: http://bugzilla.open-bio.org/ Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Massimo Ubaldi > Sent: Wednesday, October 25, 2006 9:29 AM > To: bioperl-l List > Subject: [Bioperl-l] blastxml format > > Hi > I'm using the script below to parse a blastn output to multiple sequences > I got the output from the blast web interface asking for xml formatted > output. 
> Everything work fine except that I cannot print the name of each input > sequence (see below). > That is, using the line (see below) $result->query_description I got just > the name of the first sequence. Infact this is defined by the > tag. > What I really want is to extract the name that is defined by the > tag. > Now I digged out the bioperl mailing list and other sources but I did not > find anything to solve this. > Can somebody help me? > Thanks alot > Massimo > > > This is an example of ouput I got > > MRDNA_probe > 46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor form > B > (LOC562171), mRNA 68354945 XM_685568 > 81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633 > PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA > 68420187 XM_684078 > > This what I'd like to get > MRDNA_probe > 46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor form > B > (LOC562171), mRNA 68354945 XM_685568 > VDRacterm_probe > 81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633 > ARalpcterm_probe > PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA > 68420187 XM_684078 > > This is the script > #!/usr/bin/perl > use strict; > use Bio::SearchIO; > my $in = new Bio::SearchIO(-format => 'blast', > -file => 'Blastn_danio.bls'); > open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, > stopped"; > my $result = $in->next_result; > print OUTFILE $result->algorithm, "\n"; > print OUTFILE $result->database_name, "\n"; > > print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers", > "\t", "GenBank Accession", "\n"; > > while($result = $in->next_result ) { > print OUTFILE $result->query_description, "\n"; > while( my $hit = $result->next_hit ) { > while( my $hsp = $hit->next_hsp ) { > > my $acc=$hit->name; > my $description= $hit->description; > > $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/; > > print OUTFILE > > $hit->raw_score, "\t", # Score > $hit->description, "\t", # Description > > $1, "\t", 
$2, "\n"; > } > } > } > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From massimo.ubaldi at gmail.com Wed Oct 25 11:20:49 2006 From: massimo.ubaldi at gmail.com (Massimo Ubaldi) Date: Wed, 25 Oct 2006 17:20:49 +0200 Subject: [Bioperl-l] blastxml format In-Reply-To: <000301c6f846$d6227760$15327e82@pyrimidine> References: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com> <000301c6f846$d6227760$15327e82@pyrimidine> Message-ID: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com> Thanks for the reply. I've already tried this but I got exactly the same results as before. What other can I try? Massimo On 10/25/06, Chris Fields wrote: > > Iterations (which are related to PSIBLAST) aren't currently handled in > blastxml, which is why the tag isn't being parsed. I'll give it a look > but > I don't think it will be properly fixed anytime soon, since we're gearing > up > for a developer release and are sorting out various bugs in relation to > that. > > In the meantime, you could always try changing the relevant tag in the > %MAPPING hash in your local copy of Bio::SearchIO::blastxml from > 'BlastOutput_query-def' to 'Iteration_query-def', which may do the trick > for > you. I'm a bit reluctant to change this in CVS as it would be better to > add > this in when iterations are handled properly by blastxml, and I'm not sure > all BLAST XML varieties have the tag. > > If you want you can add this to the bioperl bugzilla as an enhancement > request to remind us: > > http://bugzilla.open-bio.org/ > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. 
of Biochemistry > University of Illinois Urbana-Champaign > > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > bounces at lists.open-bio.org] On Behalf Of Massimo Ubaldi > > Sent: Wednesday, October 25, 2006 9:29 AM > > To: bioperl-l List > > Subject: [Bioperl-l] blastxml format > > > > Hi > > I'm using the script below to parse a blastn output to multiple > sequences > > I got the output from the blast web interface asking for xml formatted > > output. > > Everything work fine except that I cannot print the name of each input > > sequence (see below). > > That is, using the line (see below) $result->query_description I got > just > > the name of the first sequence. Infact this is defined by the > > tag. > > What I really want is to extract the name that is defined by the > > tag. > > Now I digged out the bioperl mailing list and other sources but I did > not > > find anything to solve this. > > Can somebody help me? > > Thanks alot > > Massimo > > > > > > This is an example of ouput I got > > > > MRDNA_probe > > 46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor > form > > B > > (LOC562171), mRNA 68354945 XM_685568 > > 81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633 > > PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA > > 68420187 XM_684078 > > > > This what I'd like to get > > MRDNA_probe > > 46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor > form > > B > > (LOC562171), mRNA 68354945 XM_685568 > > VDRacterm_probe > > 81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633 > > ARalpcterm_probe > > PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA > > 68420187 XM_684078 > > > > This is the script > > #!/usr/bin/perl > > use strict; > > use Bio::SearchIO; > > my $in = new Bio::SearchIO(-format => 'blast', > > -file => 'Blastn_danio.bls'); > > open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, > > stopped"; > > my 
$result = $in->next_result; > > print OUTFILE $result->algorithm, "\n"; > > print OUTFILE $result->database_name, "\n"; > > > > print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers", > > "\t", "GenBank Accession", "\n"; > > > > while($result = $in->next_result ) { > > print OUTFILE $result->query_description, "\n"; > > while( my $hit = $result->next_hit ) { > > while( my $hsp = $hit->next_hsp ) { > > > > my $acc=$hit->name; > > my $description= $hit->description; > > > > $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/; > > > > print OUTFILE > > > > $hit->raw_score, "\t", # Score > > $hit->description, "\t", # Description > > > > $1, "\t", $2, "\n"; > > } > > } > > } > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at uiuc.edu Wed Oct 25 12:56:46 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 25 Oct 2006 11:56:46 -0500 Subject: [Bioperl-l] blastxml format In-Reply-To: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com> Message-ID: <000001c6f856$8ee44bc0$15327e82@pyrimidine> > Thanks for the reply. I've already tried this but I got exactly the same > > results as before. > What other can I try? > Massimo If you don't mind me asking, what version of perl and Bioperl are you using, and what version of BLAST is used? I want to point out there are a number of problems with your script, now I have had a chance to look at it. 1) You have the SearchIO format set to 'blast'. It should be 'blastxml' if you are parsing XML format. 2) Every time you call next_result() you iterate through each BLAST report. 
In effect, you're doing something like this:

    my $result = $in->next_result();
    # ... do something here (in first BLAST report)
    while ($result = $in->next_result()) { # change to second BLAST report
        # more stuff here (in second BLAST report, if there is one)
    }

I don't know if it's intentional, but it's something to point out.

3) You also use raw_score(), which doesn't return a value for me (this may be related to the bioperl version, which is why I asked above). If you use $hit->bits() or $hit->significance() you can get the bit score or the hit e-value, respectively.

4) Also, I didn't see a difference between the two XML tags when using BLAST 2.2.15 output (WebBLAST at NCBI), which makes sense since they should originate from the same query sequence anyway. This could be related to the BLAST version.

Here's my version of your script, using WinXP and bioperl-live (CVS):

    use Bio::SearchIO;

    my $file = shift @ARGV;
    my $in = new Bio::SearchIO(-format => 'blastxml',
                               -file   => $file);
    # 'or' binds more loosely than '||', so die actually fires if open fails
    open OUTFILE, ">parsed_blastn_danio.txt"
        or die "Could not open file, stopped";
    while (my $result = $in->next_result) {
        print OUTFILE $result->algorithm, "\n";
        print OUTFILE $result->database_name, "\n";
        print OUTFILE "Score", "\t", "Description", "\t",
                      "NCBI gi identifiers", "\t", "GenBank Accession", "\n";
        print OUTFILE $result->query_description, "\n";
        while (my $hit = $result->next_hit) {
            while (my $hsp = $hit->next_hsp) {
                my $acc = $hit->name;
                my $description = $hit->description;
                if ($acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/) {
                    print OUTFILE
                        $hit->bits, "\t",        # Score
                        $hit->description, "\t", # Description
                        $1, "\t", $2, "\n";
                }
            }
        }
    }

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

...
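The hit-name pattern used in the script above can be checked on its own, without a BLAST report. The sketch below is only an illustration: parse_hit_name() is a hypothetical helper, and the sample names are assumptions built from the gi numbers and accessions mentioned in this thread (the "ref"/"gb" database fields and the ".1" version suffix are guesses). It also shows why guarding the match matters: after a failed match, $1 and $2 still hold the values from the last successful one.

```perl
use strict;
use warnings;

# Sample hit names: assumed format, assembled from IDs quoted in the thread.
my @names = (
    'gi|68354945|ref|XM_685568.1|',
    'gi|68132043|gb|DQ017633.1|',
    'some_local_id',                 # no gi/accession, must not match
);

# Hypothetical helper: returns (gi, accession), or an empty list on no match.
sub parse_hit_name {
    my ($name) = @_;
    if ($name =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/) {
        return ($1, $2);
    }
    return;
}

for my $name (@names) {
    my ($gi, $acc) = parse_hit_name($name);
    if (defined $gi) {
        print "$name\t$gi\t$acc\n";
    }
    else {
        print "$name\t(no gi/accession found)\n";
    }
}
```

Returning an empty list for non-matching names keeps stale capture values from leaking into the output, which is what an unguarded use of $1 and $2 would risk.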
From n.haigh at sheffield.ac.uk Thu Oct 26 04:47:27 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 09:47:27 +0100
Subject: [Bioperl-l] More extensive Bioperl-run 1.5.2RC2 tests
Message-ID: <4540761F.6010904@sheffield.ac.uk>

Oops, I posted this to the Biojava list the other day by mistake!

I have recently installed some more software for which there are bioperl-run tests, and ran the test suite with several versions of the software I could find. I've added info to http://www.bioperl.org/wiki/Release_1.5.2#bioperl-run. If there were any failures in any of the versions I tested, I've noted them together with versions that were ok (if any).

There may be another 6 or so programs I'm trying to get hold of to run further tests - I'll update when I get them.

Nath

From n.haigh at sheffield.ac.uk Thu Oct 26 05:14:07 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 10:14:07 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
Message-ID: <45407C5F.40104@sheffield.ac.uk>

I'm thinking that it's not wise to test for things like overall_percentage_identity etc. in alignments that are generated by external software like T-Coffee, Clustalw etc. Changes to software algorithms/efficiency, bug fixes etc. may well alter the quality of the alignment produced in different versions and thus affect the value returned by such methods. Therefore, I think these methods should only be tested on alignments loaded directly from t/data.
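To illustrate why exact-value checks on externally generated alignments are brittle, here is a rough sketch. It uses a deliberately simplified, hypothetical measure (the percentage of columns in which every row has the same non-gap residue); it is not Bioperl's overall_percentage_identity. The rows are ten-column fragments of the kind of gap shift reported for two T-Coffee versions in this thread.

```perl
use strict;
use warnings;

# Hypothetical, simplified identity measure; NOT Bioperl's implementation.
# Counts columns where all rows share the same non-gap residue.
sub percent_identical_columns {
    my @rows = @_;
    my $len = length $rows[0];
    my $identical = 0;
    for my $i (0 .. $len - 1) {
        my $first = substr($rows[0], $i, 1);
        next if $first eq '-';               # gap columns never count
        my $same = 1;
        for my $row (@rows) {
            $same = 0 if substr($row, $i, 1) ne $first;
        }
        $identical++ if $same;
    }
    return 100 * $identical / $len;
}

# Same two sequences, gaps placed differently by two aligner versions.
my @version_a = ('gk-nm---cg', 'drrnh---cg');
my @version_b = ('gkn----mcg', 'drrnh---cg');

printf "version A: %.1f%%\n", percent_identical_columns(@version_a);
printf "version B: %.1f%%\n", percent_identical_columns(@version_b);
```

The residues are identical in both cases; only the gap placement differs, yet the computed value changes. A test that hard-codes such a value therefore ends up testing the aligner version rather than the wrapper.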
Nath From bix at sendu.me.uk Thu Oct 26 05:48:37 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 26 Oct 2006 10:48:37 +0100 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45407C5F.40104@sheffield.ac.uk> References: <45407C5F.40104@sheffield.ac.uk> Message-ID: <45408475.30903@sendu.me.uk> Nathan Haigh wrote: > I'm thinking that it's not wise to test for things like > overall_percentage_identity etc in alignments that are generated by > external software like T-Coffee, Clustalw etc. Changes to software > algorithms/efficiency, bug fixes etc may well alter the quality of the > alignment produced in different versions and thus affect the value > returned by such methods. Therefore, I think these methods should only > be tested from alignments loaded directly from t/data. Did you discover some specific problem cases? From n.haigh at sheffield.ac.uk Thu Oct 26 06:04:54 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 26 Oct 2006 11:04:54 +0100 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45408475.30903@sendu.me.uk> References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> Message-ID: <45408846.1050001@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: >> I'm thinking that it's not wise to test for things like >> overall_percentage_identity etc in alignments that are generated by >> external software like T-Coffee, Clustalw etc. Changes to software >> algorithms/efficiency, bug fixes etc may well alter the quality of the >> alignment produced in different versions and thus affect the value >> returned by such methods. Therefore, I think these methods should only >> be tested from alignments loaded directly from t/data. > > Did you discover some specific problem cases? My messages seem to be taking a while to come through, but, yes. 
It may be due to the software changing default parameters, but it makes testing the output for specific details pretty difficult and inconsistent. For example, running T-Coffee, the following command from t/TCoffee.t results in slightly different alignment: $aln = $factory->run('-type' => 'profile', '-profile' => $aln1, '-seq' => Bio::Root::IO->catfile("t","data","cysprot1b.fa")); Of particular note, is the gaps on the last line of the sequences. In 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in CATH_RAT/1-333 ------mwtalpllcagawllsagat----------aeltvnaiek------------fh ftswmkqhqktyss-reyshrlqvfannwrkiqahn----qrnhtfkmglnqfsdmsfae ikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqgacgscwtfs ttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqafeyilynk gimgedsypyigkngqckfnpekavafvknvv-nitlndeaamveavalynpvsfafevt -edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivknswgsnwgnn gyfliergk-nm---cglaacasypipqv >CATL_HUMAN/1-333 --------------------------------mnptlilaafclgiasatltfdhsleaq wtkwkamhnrlygmnee-gwrravweknmkmielhnqeyregkhsftmamnafgdmtsee frqvmngfqnrkpr----kgkvfqeplfyeaprsvdwrekg-yvtpvknqgqcgscwafs atgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdyafqyvqdng gldseesypyeateesckynpkysvandtgfv-dip-kqekalmkavatvgpisvaidag hesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvknswgeewgmg gyvkmakdrrnh---cgiasaasyptv-- >CATL_RAT/1-334 --------------------------------mtpllllavlclgtalatpkfdqtfnaq whqwksthrrlygtnee-ewrravweknmrmiqlhngeysngkhgftmemnafgdmtnee frqivngyrhqkhk----kgrlfqeplmlqipktvdwrekg-cvtpvknqgqcgscwafs asgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfafqyikeng gldseesypyeakdgsckyraeyavandtgfv-dip-qqekalmkavatvgpisvamdas hpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvknswgkewgmd gyikiakdrnnh---cglataasypivn- >PAPA_CARPA/1-345 mamipsiskllfvaiclfvymglsfg-------------dfsivgysqndltsterliql feswmlkhnkiyknidekiyrfeifkdnlkyidetn----kknnsywlglnvfadmsnde fkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgscgscwafs 
avvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsalqlvaqy- gihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysian-qpvsvvleaa gkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yiliknswgtgwgen gyirikrgtgnsygvcglytssfypvkn- >ALEU_HORVU/1-362 maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtrhalr farfavrygksyesaaevrrrfrifsesleevrstn----rkglpyrlginrfsdmswee fqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqahcgscwtfs ttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqafeyikyng gidteesypykgvngvchykaenaavqvldsv-nitlnaedelknavglvrpvsvafqvi -dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywliknswgadwgdn gyfkmemgk-nm---caiatcasypvvaa >CATH_HUMAN/1-335 ------mwatlpllcagawllg--------vpvcgaaelsvnslek------------fh fkswmskhrktys-teeyhhrlqtfasnwrkinahn----ngnhtfkmalnqfsdmsfae ikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqgacgscwtfs ttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqafeyilynk gimgedtypyqgkdgyckfqpgkaigfvkdva-nitiydeeamveavalynpvsfafevt -qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivknswgpqwgmn gyfliergk-nm---cglaacasypiplv >CYS1_DICDI/1-343 -----mkvillfvlavftvfvs---------------srgippeeq------------sq flefqdkfnkkys-heeylerfeifksnlgkieelnliainhkadtkfgvnkfadlssde fknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgqcgscwsfs ttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpnaynyiikng giqtessypytaetgtqcnfnsanigakisnf-tmipknetvmagyivstgplaiaadav -e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivknswgadwgeq gyiylrrgk-nt---cgvsnfvstsii-- While T-Coffee <4.45 returned: >CATH_RAT/1-333 ----------mwtalpllcagawllsagat----------aeltvnaiek---------- --fhftswmkqhqktyss-reyshrlqvfannwrkiqahn----q----rnhtfkmglnq fsdmsfaeikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqga cgscwtfsttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqa feyilynkgimgedsypyigkngqckfnpekavafvknvvn-itlndeaamveavalynp vsfafevt-edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivkns wgsnwgnngyfliergkn----mcglaacasypipqv >PAPA_CARPA/1-345 mamipsiskllfvaiclfvymglsfgdfsivgysqndltsterliqlfeswml------- 
-------------khnkiyknidekiyrf-----eifkdnlkyidetnkknnsywlglnv fadmsndefkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgs cgscwafsavvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsa lq-lvaqygihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysia-nqp vsvvleaagkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yilikns wgtgwgengyirikrgtgnsygvcglytssfypvkn- >CATL_HUMAN/1-333 -----------------------------------------mnptlilaafclgiasatl tfdhsleaqwtkwkamhnrlygmneegwrravweknmkmielhnqeyregkhsftmamna fgdmtseefrqvmngfqnrkprkgkvfqeplf----yeaprsvdwrekg-yvtpvknqgq cgscwafsatgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdya fqyvqdnggldseesypyeateesckynpkysvandtgfvd--ipkqekalmkavatvgp isvaidaghesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvkns wgeewgmggyvkmakdrrnh---cgiasaasyptv-- >CATL_RAT/1-334 -----------------------------------------mtpllllavlclgtalatp kfdqtfnaqwhqwksthrrlygtneeewrravweknmrmiqlhngeysngkhgftmemna fgdmtneefrqivngyrhqkhkkgrlfqeplm----lqipktvdwrekg-cvtpvknqgq cgscwafsasgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfa fqyikenggldseesypyeakdgsckyraeyavandtgfvd--ipqqekalmkavatvgp isvamdashpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvkns wgkewgmdgyikiakdrnnh---cglataasypivn- >ALEU_HORVU/1-362 ----maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtr halrfarfavrygksyesaaevrrrfrifsesleevrstn----r----kglpyrlginr fsdmsweefqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqah cgscwtfsttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqa feyikynggidteesypykgvngvchykaenaavqvldsvn-itlnaedelknavglvrp vsvafqvi-dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywlikns wgadwgdngyfkmemgkn----mcaiatcasypvvaa >CATH_HUMAN/1-335 ----------mwatlpllcagawllg--------vpvcgaaelsvnslek---------- --fhfkswmskhrktys-teeyhhrlqtfasnwrkinahn----n----gnhtfkmalnq fsdmsfaeikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqga cgscwtfsttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqa feyilynkgimgedtypyqgkdgyckfqpgkaigfvkdvan-itiydeeamveavalynp vsfafevt-qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivkns 
wgpqwgmngyfliergkn----mcglaacasypiplv >CYS1_DICDI/1-343 ---------mkvillfvlavftvfvs---------------srgippeeq---------- --sqflefqdkfnkkys-heeylerfeifksnlgkieelnliain----hkadtkfgvnk fadlssdefknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgq cgscwsfsttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpna ynyiiknggiqtessypytaetgtqcnfnsanigakisnft-mipknetvmagyivstgp laiaadav-e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivkns wgadwgeqgyiylrrgkn----tcgvsnfvstsii-- From sanges at biogem.it Thu Oct 26 06:26:36 2006 From: sanges at biogem.it (Remo Sanges) Date: Thu, 26 Oct 2006 11:26:36 +0100 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45408846.1050001@sheffield.ac.uk> References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> <45408846.1050001@sheffield.ac.uk> Message-ID: <45408D5C.1000305@biogem.it> Nathan Haigh wrote: > Sendu Bala wrote: > >> Nathan Haigh wrote: >> >>> I'm thinking that it's not wise to test for things like >>> overall_percentage_identity etc in alignments that are generated by >>> external software like T-Coffee, Clustalw etc. Changes to software >>> algorithms/efficiency, bug fixes etc may well alter the quality of the >>> alignment produced in different versions and thus affect the value >>> returned by such methods. Therefore, I think these methods should only >>> be tested from alignments loaded directly from t/data. >>> >> Did you discover some specific problem cases? >> > My messages seem to be taking a while to come through, but, yes. It may > be due to the software changing default parameters, but it makes testing > the output for specific details pretty difficult and inconsistent. 
For > example, running T-Coffee, the following command from t/TCoffee.t > results in slightly different alignment: > $aln = $factory->run('-type' => 'profile', > '-profile' => $aln1, > '-seq' => > Bio::Root::IO->catfile("t","data","cysprot1b.fa")); > > Of particular note, is the gaps on the last line of the sequences. In > 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in > I'm not a T-coffee user but usually you can come across these problems when you use different scoring parameters when align sequences. Could it be possible that they have simply changed the default parameters for gap penalties and that kind of stuff? It is possible to set them? If so you can just run the test by defining the scores in the param hash without using the default. HTH Remo From n.haigh at sheffield.ac.uk Thu Oct 26 06:33:55 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 26 Oct 2006 11:33:55 +0100 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45408D5C.1000305@biogem.it> References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> <45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it> Message-ID: <45408F13.9020209@sheffield.ac.uk> Remo Sanges wrote: > Nathan Haigh wrote: >> Sendu Bala wrote: >> >>> Nathan Haigh wrote: >>> >>>> I'm thinking that it's not wise to test for things like >>>> overall_percentage_identity etc in alignments that are generated by >>>> external software like T-Coffee, Clustalw etc. Changes to software >>>> algorithms/efficiency, bug fixes etc may well alter the quality of the >>>> alignment produced in different versions and thus affect the value >>>> returned by such methods. Therefore, I think these methods should only >>>> be tested from alignments loaded directly from t/data. >>>> >>> Did you discover some specific problem cases? >>> >> My messages seem to be taking a while to come through, but, yes. 
It may >> be due to the software changing default parameters, but it makes testing >> the output for specific details pretty difficult and inconsistent. For >> example, running T-Coffee, the following command from t/TCoffee.t >> results in slightly different alignment: >> $aln = $factory->run('-type' => 'profile', >> '-profile' => $aln1, >> '-seq' => >> Bio::Root::IO->catfile("t","data","cysprot1b.fa")); >> >> Of particular note, is the gaps on the last line of the sequences. In >> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in >> > > I'm not a T-coffee user but usually you can come across > these problems when you use different scoring parameters > when align sequences. > > Could it be possible that they have simply changed the > default parameters for gap penalties and that kind of > stuff? It is possible to set them? > > If so you can just run the test by defining > the scores in the param hash without using the default. > > HTH > > Remo That is true, but it depends on the whether the wrapper is complete enough to be able to set all the parameters provided by the software. Nath From n.haigh at sheffield.ac.uk Thu Oct 26 12:13:03 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 26 Oct 2006 17:13:03 +0100 Subject: [Bioperl-l] Bio::Restriction::Enzyme Message-ID: <4540DE8F.7070501@sheffield.ac.uk> I'm in the middle of writing some code that uses Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using Bioperl from HEAD. I seem to find that $enzyme->is_palindromic always seems to return true. Can anyone verify this? If needs be, I can send some code. 
Thanks Nathan From bosborne11 at verizon.net Thu Oct 26 12:37:06 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Thu, 26 Oct 2006 12:37:06 -0400 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk> Message-ID: Nathan, Perhaps because most restriction sites are palindromes. Anyway, I added tests for palindromic() and is_palindromic() where the site is not a palindrome, these tests pass (t/RestrictionAnalyis.t). Brian O. On 10/26/06 12:13 PM, "Nathan Haigh" wrote: > I'm in the middle of writing some code that uses > Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using > Bioperl from HEAD. > > I seem to find that $enzyme->is_palindromic always seems to return true. > Can anyone verify this? If needs be, I can send some code. > > Thanks > Nathan > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Thu Oct 26 12:49:48 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 26 Oct 2006 17:49:48 +0100 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: References: Message-ID: <4540E72C.5020800@sheffield.ac.uk> Brian Osborne wrote: > Nathan, > > Perhaps because most restriction sites are palindromes. Anyway, I added > tests for palindromic() and is_palindromic() where the site is not a > palindrome, these tests pass (t/RestrictionAnalyis.t). > > Brian O.
> > > On 10/26/06 12:13 PM, "Nathan Haigh" wrote: > > >> I'm in the middle of writing some code that uses >> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >> Bioperl from HEAD. >> >> I seem to find that $enzyme->is_palindromic always seems to return true. >> Can anyone verify this? If needs be, I can send some code. >> >> Thanks >> Nathan >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > Ok, thanks - nice to know :-) From cjfields at uiuc.edu Thu Oct 26 12:58:34 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 26 Oct 2006 11:58:34 -0500 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk> Message-ID: <001301c6f91f$f9611770$15327e82@pyrimidine> > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Nathan Haigh > Sent: Thursday, October 26, 2006 11:13 AM > To: Bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Bio::Restriction::Enzyme > > I'm in the middle of writing some code that uses > Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using > Bioperl from HEAD. > > I seem to find that $enzyme->is_palindromic always seems to return true. > Can anyone verify this? If needs be, I can send some code. > > Thanks > Nathan You should file a bug report if you have found a test case where this method isn't working as it should, especially if Brian's tests pass and you're still getting the wrong results. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Thu Oct 26 12:57:32 2006 From: jason at bioperl.org (Jason Stajich) Date: Thu, 26 Oct 2006 09:57:32 -0700 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45408F13.9020209@sheffield.ac.uk> References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> <45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it> <45408F13.9020209@sheffield.ac.uk> Message-ID: Nathan - I agree - the values tend to change with different versions of the applications unfortunately. It would make sense to just test that you get out sequences that are in valid alignment format and perhaps have as many ending sequences as you started with. The more restrictive tests probably aren't reliable with mixing and matching versions. One thing we do for PAML is condition tests on the version used - but of course when a new version comes out we have to add more stuff to the tests (or just have some code that skips those tests). -jason On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote: > Remo Sanges wrote: >> Nathan Haigh wrote: >>> Sendu Bala wrote: >>> >>>> Nathan Haigh wrote: >>>> >>>>> I'm thinking that it's not wise to test for things like >>>>> overall_percentage_identity etc in alignments that are >>>>> generated by >>>>> external software like T-Coffee, Clustalw etc. Changes to software >>>>> algorithms/efficiency, bug fixes etc may well alter the quality >>>>> of the >>>>> alignment produced in different versions and thus affect the value >>>>> returned by such methods. Therefore, I think these methods >>>>> should only >>>>> be tested from alignments loaded directly from t/data. >>>>> >>>> Did you discover some specific problem cases? >>>> >>> My messages seem to be taking a while to come through, but, yes. 
>>> It may >>> be due to the software changing default parameters, but it makes >>> testing >>> the output for specific details pretty difficult and >>> inconsistent. For >>> example, running T-Coffee, the following command from t/TCoffee.t >>> results in slightly different alignment: >>> $aln = $factory->run('-type' => 'profile', >>> '-profile' => $aln1, >>> '-seq' => >>> Bio::Root::IO->catfile("t","data","cysprot1b.fa")); >>> >>> Of particular note, is the gaps on the last line of the >>> sequences. In >>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in >>> >> >> I'm not a T-coffee user but usually you can come across >> these problems when you use different scoring parameters >> when align sequences. >> >> Could it be possible that they have simply changed the >> default parameters for gap penalties and that kind of >> stuff? It is possible to set them? >> >> If so you can just run the test by defining >> the scores in the param hash without using the default. >> >> HTH >> >> Remo > That is true, but it depends on the whether the wrapper is complete > enough to be able to set all the parameters provided by the software. > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From cjfields at uiuc.edu Thu Oct 26 18:01:08 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 26 Oct 2006 17:01:08 -0500 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: Message-ID: <000301c6f94a$3e2a3f10$15327e82@pyrimidine> I have been running into similar issues with EUtilities tests. 
Since the data on the server is constantly updated, I have to try to future-proof the tests so they don't constantly fail. I have been using Test::More and like/unlike or cmp_ok to get around some of those 'fuzzy data' issues. If some methods consistently return a particular type of value, such as an integer, you could use: like($foo->get_value, qr{^\d+$}, 'value test'); #integer or similar. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > Nathan - > > I agree - the values tend to change with different versions of the > applications unfortunately. It would make sense to just test that > you get out sequences that are in valid alignment format and perhaps > have as many ending sequences as you started with. The more > restrictive tests probably aren't reliable with mixing and matching > versions. > > One thing we do for PAML is condition tests on the version used - but > of course when a new version comes out we have to add more stuff to > the tests (or just have some code that skips those tests). > > -jason > On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote: > > > Remo Sanges wrote: > >> Nathan Haigh wrote: > >>> Sendu Bala wrote: > >>> > >>>> Nathan Haigh wrote: > >>>> > >>>>> I'm thinking that it's not wise to test for things like > >>>>> overall_percentage_identity etc in alignments that are > >>>>> generated by > >>>>> external software like T-Coffee, Clustalw etc. Changes to software > >>>>> algorithms/efficiency, bug fixes etc may well alter the quality > >>>>> of the > >>>>> alignment produced in different versions and thus affect the value > >>>>> returned by such methods. Therefore, I think these methods > >>>>> should only > >>>>> be tested from alignments loaded directly from t/data. > >>>>> > >>>> Did you discover some specific problem cases? > >>>> > >>> My messages seem to be taking a while to come through, but, yes.
> >>> It may > >>> be due to the software changing default parameters, but it makes > >>> testing > >>> the output for specific details pretty difficult and > >>> inconsistent. For > >>> example, running T-Coffee, the following command from t/TCoffee.t > >>> results in slightly different alignment: > >>> $aln = $factory->run('-type' => 'profile', > >>> '-profile' => $aln1, > >>> '-seq' => > >>> Bio::Root::IO->catfile("t","data","cysprot1b.fa")); > >>> > >>> Of particular note, is the gaps on the last line of the > >>> sequences. In > >>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in > >>> >>> > >> I'm not a T-coffee user but usually you can come across > >> these problems when you use different scoring parameters > >> when align sequences. > >> > >> Could it be possible that they have simply changed the > >> default parameters for gap penalties and that kind of > >> stuff? It is possible to set them? > >> > >> If so you can just run the test by defining > >> the scores in the param hash without using the default. > >> > >> HTH > >> > >> Remo > > That is true, but it depends on the whether the wrapper is complete > > enough to be able to set all the parameters provided by the software. 
> > > > Nath > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich, PhD > Miller Research Fellow > University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From gbazykin at Princeton.EDU Thu Oct 26 18:49:56 2006 From: gbazykin at Princeton.EDU (Georgii A Bazykin) Date: Thu, 26 Oct 2006 18:49:56 -0400 Subject: [Bioperl-l] about PAML running within bioperl In-Reply-To: <001901c6dbcf$9af4de50$0915020a@zchou> References: <001901c6dbcf$9af4de50$0915020a@zchou> Message-ID: <185431468.20061026184956@princeton.edu> I just had the exact same problem, which was also (as in Caleb Davis's case) was solved by switching to PAML 3.14 from 3.15. ------------------------------ Tuesday, September 19, 2006, 5:40:07 AM, you wrote: > Hello, every one, > I use code in the PAML HOWTO (running PAML fom within Bioperl) on > my Linux OS. And I set ENV as described by instructions. At the > beginning, it seems that ClustalW run smoothly. However, when the > programme run to call method "get_MLmatrix", somethign happened. The > following information was listed as follows: (What reason or How to solve these problems?) > ........ > Sequences (2:3) Aligned. Score: 87 > Sequences (2:4) Aligned. Score: 88 > Sequences (2:5) Aligned. Score: 87 > Sequences (2:6) Aligned. Score: 87 > Sequences (2:7) Aligned. Score: 87 > Sequences (2:8) Aligned. Score: 87 > Sequences (3:4) Aligned. Score: 93 > Sequences (3:5) Aligned. Score: 93 > Sequences (3:6) Aligned. Score: 93 > Sequences (3:7) Aligned. Score: 92 > Sequences (3:8) Aligned. Score: 92 > Sequences (4:5) Aligned. 
Score: 99 > Sequences (4:6) Aligned. Score: 99 > Sequences (4:7) Aligned. Score: 98 > Sequences (4:8) Aligned. Score: 98 > Sequences (5:6) Aligned. Score: 100 > Sequences (5:7) Aligned. Score: 99 > Sequences (5:8) Aligned. Score: 99 > Sequences (6:7) Aligned. Score: 99 > Sequences (6:8) Aligned. Score: 99 > Sequences (7:8) Aligned. Score: 100 > Guide tree file created: > [/home/zchou/TMPDIR/8QEqLivAKY/JU833u8OTP.dnd] > Start of Multiple Alignment > There are 7 groups > Aligning... > Group 1: Sequences: 2 Score:5875 > Group 2: Sequences: 2 Score:5877 > Group 3: Sequences: 4 Score:5864 > Group 4: Sequences: 5 Score:5537 > Group 5: Sequences: 6 Score:5727 > Group 6: Sequences: 7 Score:5608 > Group 7: Sequences: 8 Score:5607 > Alignment Score 43650 > GCG-Alignment file created > [/home/zchou/TMPDIR/8QEqLivAKY/CussPD56rZ] > aligned aa sequences were: Bio::SimpleAlign=HASH(0x87b93f4) > Can't call method "get_MLmatrix" on an undefined value at > originalpaml.pl line 57, line 332. > Zhuocheng Hou > Department of Animal Genetics and Breeding > China Agricultural University From himanshu.ardawatia at bccs.uib.no Thu Oct 26 21:54:36 2006 From: himanshu.ardawatia at bccs.uib.no (Himanshu Ardawatia) Date: Fri, 27 Oct 2006 03:54:36 +0200 Subject: [Bioperl-l] Query on tree bootstrap values Message-ID: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> Hi, 2 questions : 1. I have a phylogenetic tree and I wish to set (or modify or query) bootstrap values for all internal nodes. How do I do that using BioPerl ? 2. I tried the example script attached below for general purpose for the example newick tree with bootstrap values (also attached below) and It gives strange results even for branch length. It shows Parent ID as 0.71 which actually is the bootstrap value for the last ancestral node for human and chimp and It shows the Child node ID as 'Human' ! Am I missing something in the tree formatting ? Results also attached below. 
Also how to extract / modify / add bootstrap values in this tree ?

Thanks
Himanshu

EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
#################################
(
('Chimp' : 0.052,
'Human' : 0.042) 0.71 : 0.007,
'Gorilla' : 0.060,
('Gibbon' : 0.124,
'Orangutan' : 0.0971) 1 : 0.038
);
#################################

EXAMPLE SCRIPT:
#################################
#!/usr/bin/perl -w
use Bio::Seq;
# use Bio::TreeIO;
use Bio::Tree::TreeI;

# get a Tree::NodeI somehow
# like from a TreeIO
use Bio::TreeIO;
# read in a clustalw NJ in phylip/newick format
my $treeio = new Bio::TreeIO(-format => 'newick', -file => 'example_newick_tree.newick');
my $tree = $treeio->next_tree; # we'll assume it worked for demo purposes
                               # you might want to test that it was defined
my $rootnode = $tree->get_root_node;

# process just the next generation
foreach my $node ( $rootnode->each_Descendent() ) {
    print "branch len is ", $node->branch_length, "\n";
}

# process all the children
my $example_leaf_node;
foreach my $node ( $rootnode->get_Descendents() ) {
    if( $node->is_Leaf ) {
        print "node is a leaf ... ";
        # for example use below
        $example_leaf_node = $node unless defined $example_leaf_node;
    }
    print "branch len is ", $node->branch_length, "\n";
}

# The ancestor() method points to the parent of a node
# A node can only have one parent
my $parent = $example_leaf_node->ancestor;

# parent won't likely have an description because it is an internal node
# but child will because it is a leaf
print "Parent id: ", $parent->id," child id: ", $example_leaf_node->id, "\n";
##########################################

RESULTS:
branch len is 0.007
branch len is 0.060
branch len is 0.038
node is a leaf ... branch len is 0.042
node is a leaf ... branch len is 0.052
branch len is 0.007
node is a leaf ... branch len is 0.060
node is a leaf ... branch len is 0.0971
node is a leaf ...
branch len is 0.124 branch len is 0.038 Parent id: _0.71_ child id: ___'Human'__ From n.haigh at sheffield.ac.uk Fri Oct 27 04:42:23 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 08:42:23 +0000 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: References: Message-ID: <4541C66F.1020404@sheffield.ac.uk> Hi Brian, I wonder if i'm using is_prototype() correctly as I don't seem to get any returning true: my $enz_coll = Bio::Restriction::EnzymeCollection->new(); my $prototype = 0; foreach my $enz ($enz_coll->each_enzyme) { $prototype++ if $enz->is_prototype; } print "$prototype have unique recognition sites\n"; prints: 0 have unique recognition sites Thanks Nath Brian Osborne wrote: > Nathan, > > Perhaps because most restriction sites are palindromes. Anyway, I added > tests for palindromic() and is_palindromic() where the site is not a > palindrome, these tests pass (t/RestrictionAnalyis.t). > > Brian O. > > > On 10/26/06 12:13 PM, "Nathan Haigh" wrote: > > >> I'm in the middle of writing some code that uses >> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >> Bioperl from HEAD. >> >> I seem to find that $enzyme->is_palindromic always seems to return true. >> Can anyone verify this? If needs be, I can send some code. >> >> Thanks >> Nathan >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > -- > A: Yes. >> Q: Are you sure? >> >>> A: Because it reverses the logical flow of conversation. >>> >>>> Q: Why is top posting frowned upon? >>>> Get Thunderbird From n.haigh at sheffield.ac.uk Fri Oct 27 04:47:21 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Fri, 27 Oct 2006 08:47:21 +0000 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <001301c6f91f$f9611770$15327e82@pyrimidine> References: <001301c6f91f$f9611770$15327e82@pyrimidine> Message-ID: <4541C799.4090507@sheffield.ac.uk> Chris Fields wrote: >> -----Original Message----- >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >> bounces at lists.open-bio.org] On Behalf Of Nathan Haigh >> Sent: Thursday, October 26, 2006 11:13 AM >> To: Bioperl-l at lists.open-bio.org >> Subject: [Bioperl-l] Bio::Restriction::Enzyme >> >> I'm in the middle of writing some code that uses >> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >> Bioperl from HEAD. >> >> I seem to find that $enzyme->is_palindromic always seems to return true. >> Can anyone verify this? If needs be, I can send some code. >> >> Thanks >> Nathan >> > > You should file a bug report if you have found a test case where this method > isn't working as it should, especially if Brian's tests pass and you're > still getting the wrong results. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > I was doing some filtering of the default set of enzymes and happened to removed the 2 that are not palindromic before I used is_palindromic(). Thus, I didn't see any that were not palindromic - if that makes sense! Since I know very little about restriction enzymes, I'll trust that these are correct :-) and I'm getting the correct results. Thanks Nath From n.haigh at sheffield.ac.uk Fri Oct 27 05:04:40 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Fri, 27 Oct 2006 09:04:40 +0000 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <000301c6f94a$3e2a3f10$15327e82@pyrimidine> References: <000301c6f94a$3e2a3f10$15327e82@pyrimidine> Message-ID: <4541CBA8.10006@sheffield.ac.uk> Chris Fields wrote: > I have been running into similar issues with EUtilities tests. Since the > data on the server is constantly updated, I have to try to future-proof the > tests so they don't constantly fail. > > I have been using Test::More and like/unlike or cmp_ok to get around some of > those 'fuzzy data' issues. If some methods consistently return a particular > type of value, such as an integer, you could use: > > like($foo->get_value, qr{^\d+$}, 'value test'); #integer > > or similar. > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > >> Nathan - >> >> I agree - the values tend to change with different versions of the >> applications unfortunately. It would make sense to just test that >> you get out sequences that are in valid alignment format and perhaps >> have as many ending sequences as you started with. The more >> restrictive tests probably aren't reliable with mixing and matching >> versions. >> >> One thing we do for PAML is condition tests on the version used - but >> of course when a new version comes out we have to add more stuff to >> the tests (or just have some code that skips those tests). >> >> -jason >> On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote: >> >> I think it makes sense to test that data of the expected type was returned by the external resource but not to test the specifics of what was returned. If specifics are tested we are then in the realm of testing whether we believe the data returned by the external resource or not.
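A minimal sketch of the type-only checks discussed in this thread, assuming Test::More is available. The %result hash, its keys and its values are hypothetical stand-ins for whatever a wrapper hands back; they are not BioPerl API and not real T-Coffee output.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical stand-in for values read back from an externally generated
# alignment; in a real test these would come from the wrapper's result object.
my %result = (
    no_sequences        => 8,
    length              => 332,
    percentage_identity => 41.27,
);
my $n_input_seqs = 8;    # assumed number of sequences handed to the aligner

# Check the type and plausible range of each value, not its exact figure,
# so a new version of the external program does not break the test.
like( $result{length}, qr{^\d+$}, 'alignment length is an integer' );
cmp_ok( $result{percentage_identity}, '>=', 0,   'identity is not negative' );
cmp_ok( $result{percentage_identity}, '<=', 100, 'identity is at most 100' );

# Jason's suggestion: as many sequences come out as went in.
is( $result{no_sequences}, $n_input_seqs, 'sequence count preserved' );
```

Exact figures such as overall_percentage_identity would then only be asserted for alignments loaded from t/data, as proposed at the start of the thread.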
We should assume that the domain experts for these resources know what they are doing - in some cases this might not be true :-) but I think we should stick to testing that the objects created hold the expected type of data. I like what Chris had to say (above) but wonder whether tests would/should be tested for in the module itself - i.e. testing that a stored value is an integer and warn/throw if not? Nath From bix at sendu.me.uk Fri Oct 27 05:08:18 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 27 Oct 2006 10:08:18 +0100 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> Message-ID: <4541CC82.2040705@sendu.me.uk> Himanshu Ardawatia wrote: > Hi, > > 2 questions : > > 1. I have a phylogenetic tree and I wish to set (or modify or query) > bootstrap values for all internal nodes. How do I do that using BioPerl ? Does bootstrap() not do what you need? > 2. I tried the example script attached below for general purpose for the > example newick tree with bootstrap values (also attached below) and It gives > strange results even for branch length. It shows Parent ID as 0.71 which > actually is the bootstrap value for the last ancestral node for human and > chimp and It shows the Child node ID as 'Human' ! Am I missing something in > the tree formatting ? Results also attached below. Also how to extract / > modify/ add bootstrap values in this tree ? [snip] > EXAMPLE TREE (Newick with bootstrap values and branch lengths) : > ################################# > ( > ('Chimp' : 0.052, > 'Human' : 0.042) 0.71 : 0.007, > 'Gorilla' : 0.060, > ('Gibbon' : 0.124, > 'Orangutan' : 0.0971) 1 : 0.038 > ); > ################################# Are you sure this is in the correct format? 
For example, with the tree: ( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, 'Gorilla':0.060, ('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038); and your script (with a print "--\n" between the two printing loops for clarity) I get... > ########################################## > > RESULTS: > branch len is 0.007 > branch len is 0.060 > branch len is 0.038 > node is a leaf ... branch len is 0.042 > node is a leaf ... branch len is 0.052 > branch len is 0.007 > node is a leaf ... branch len is 0.060 > node is a leaf ... branch len is 0.0971 > node is a leaf ... branch len is 0.124 > branch len is 0.038 > Parent id: _0.71_ child id: ___'Human'__ ... branch len is 0.007 branch len is 0.060 branch len is 0.038 -- branch len is 0.007 node is a leaf ... branch len is 0.052 node is a leaf ... branch len is 0.042 node is a leaf ... branch len is 0.060 branch len is 0.038 node is a leaf ... branch len is 0.124 node is a leaf ... branch len is 0.0971 Parent id: 'Human_Chimp_Ancestor' child id: 'Chimp' This seems reasonable to me. What were you expecting? From n.haigh at sheffield.ac.uk Fri Oct 27 07:36:10 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 11:36:10 +0000 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <4541CC82.2040705@sendu.me.uk> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> Message-ID: <4541EF2A.4050600@sheffield.ac.uk> Sendu Bala wrote: > Himanshu Ardawatia wrote: > >> Hi, >> >> 2 questions : >> >> 1. I have a phylogenetic tree and I wish to set (or modify or query) >> bootstrap values for all internal nodes. How do I do that using BioPerl ? >> > > Does bootstrap() not do what you need? > > > >> 2. I tried the example script attached below for general purpose for the >> example newick tree with bootstrap values (also attached below) and It gives >> strange results even for branch length. 
It shows Parent ID as 0.71 which >> actually is the bootstrap value for the last ancestral node for human and >> chimp and It shows the Child node ID as 'Human' ! Am I missing something in >> the tree formatting ? Results also attached below. Also how to extract / >> modify/ add bootstrap values in this tree ? >> > [snip] > >> EXAMPLE TREE (Newick with bootstrap values and branch lengths) : >> ################################# >> ( >> ('Chimp' : 0.052, >> 'Human' : 0.042) 0.71 : 0.007, >> 'Gorilla' : 0.060, >> ('Gibbon' : 0.124, >> 'Orangutan' : 0.0971) 1 : 0.038 >> ); >> ################################# >> > > Are you sure this is in the correct format? > He/she may have a tree that already contains bootstrap values output from another program. If this is so, which program did you use? Without reminding myself of the formats, you should look up the Newick format and whether it is possible to store bootstraps in it. In addition you should also look up the NHX format. > For example, with the tree: > ( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, > 'Gorilla':0.060, > ('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038); > > This tree does not contain any bootstrap values - only branch lengths. Sorry I can't be much more help at the moment - if I get a spare 10 mins I'll have a closer look. Nath From bix at sendu.me.uk Fri Oct 27 07:16:08 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 27 Oct 2006 12:16:08 +0100 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <4541EF2A.4050600@sheffield.ac.uk> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> <4541EF2A.4050600@sheffield.ac.uk> Message-ID: <4541EA78.3050404@sendu.me.uk> Nathan S.
Haigh wrote: > Sendu Bala wrote: >> Himanshu Ardawatia wrote: >>> >>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) : >>> ################################# >>> ( >>> ('Chimp' : 0.052, >>> 'Human' : 0.042) 0.71 : 0.007, >>> 'Gorilla' : 0.060, >>> ('Gibbon' : 0.124, >>> 'Orangutan' : 0.0971) 1 : 0.038 >>> ); >>> ################################# >>> >> Are you sure this is in the correct format? >> > > He/she may have a tree that already contains bootstrap values output > from another program. If this is so, which program did you use? Without > reminding myself of the formats, you should look up the Newick format and > whether it is possible to store bootstraps in it. In addition you should > also look up the NHX format. Ah, well from a brief google it seemed like some software does store bootstrap values for internal nodes as the node ids when outputting in Newick format. I don't think Bioperl should be able to tell the difference between a normal id and a bootstrap value, so you'll have to detect that yourself and manually use bootstrap() when you get an id that looks like a number. Or should Bioperl be making this assumption for you? Is that a safe thing to do? Maybe as an option only? From n.haigh at sheffield.ac.uk Fri Oct 27 08:24:49 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 12:24:49 +0000 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <4541EA78.3050404@sendu.me.uk> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> <4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk> Message-ID: <4541FA91.3040505@sheffield.ac.uk> --snip-- > > Ah, well from a brief google it seemed like some software does store > bootstrap values for internal nodes as the node ids when outputting in > Newick format.
I don't think Bioperl should be able to tell the > difference between a normal id and a bootstrap value, so you'll have > to detect that yourself and manually use bootstrap() when you get an > id that looks like a number. If I remember rightly, in programs like Clustal you can specify where bootstrap values are stored - node or branch. I can't remember which is the default way, but TreeView can only see bootstraps in they are stored using the "non-default" setting. This "could" be the same issue here. > > Or should Bioperl be making this assumption for you? Is that a safe > thing to do? Maybe as an option only? I don't know without a closer look - i'd also need to look at the newick format definition as to whether this is an "extension" to the format or if something is just flouting the newick rules. Nath From n.haigh at sheffield.ac.uk Fri Oct 27 08:59:51 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 12:59:51 +0000 Subject: [Bioperl-l] Caching sequences Message-ID: <454202C7.1040701@sheffield.ac.uk> I have a script that is capable of downloading sequences from GenBank based on GI numbers. I retrieve them if fasta format in order to save bandwidth, but I'd like to take this one step further and cache the sequences in case the user want to rerun the script using some of the GI's they used previously. Does anyone have any guidance on how best to do this? Cheers Nath From bix at sendu.me.uk Fri Oct 27 08:35:13 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 27 Oct 2006 13:35:13 +0100 Subject: [Bioperl-l] Caching sequences In-Reply-To: <454202C7.1040701@sheffield.ac.uk> References: <454202C7.1040701@sheffield.ac.uk> Message-ID: <4541FD01.6090803@sendu.me.uk> Nathan S. Haigh wrote: > I have a script that is capable of downloading sequences from GenBank > based on GI numbers. 
I retrieve them if fasta format in order to save > bandwidth, but I'd like to take this one step further and cache the > sequences in case the user want to rerun the script using some of the > GI's they used previously. > > Does anyone have any guidance on how best to do this? You'd probably write the sequences out in some suitable format and access them via Bio::Index Or, I'm sure bioperl-db excels at this kind of thing, but is a little more involved if this is only a simple situation. From bosborne11 at verizon.net Fri Oct 27 09:09:30 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 27 Oct 2006 09:09:30 -0400 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <4541C66F.1020404@sheffield.ac.uk> Message-ID: Nathan, I don't know how this is supposed to work, there would be different ways to make is_prototype true. One way would be to make the enzyme with the first occurrence of a given restriction site the prototype (and the next enzymes with the same site are isoschizomers). Or, one could wait until one site had appeared twice, with 2 different enzymes, then make the first the prototype, etc. I would have done it the first way myself but I took a quick look at IO/withrefm.pm and it looks like it's doing it the second way. That means one can read an enzyme file and end up with no duplicated restriction sites, or prototypes and isoschizomers. Brian O. On 10/27/06 4:42 AM, "Nathan S. Haigh" wrote: > Hi Brian, > > I wonder if i'm using is_prototype() correctly as I don't seem to get > any returning true: > > my $enz_coll = Bio::Restriction::EnzymeCollection->new(); > my $prototype = 0; > foreach my $enz ($enz_coll->each_enzyme) { > $prototype++ if $enz->is_prototype; > } > print "$prototype have unique recognition sites\n"; > > prints: > 0 have unique recognition sites > > Thanks > Nath > > Brian Osborne wrote: >> Nathan, >> >> Perhaps because most restriction sites are palindromes. 
Anyway, I added >> tests for palindromic() and is_palindromic() where the site is not a >> palindrome, these tests pass (t/RestrictionAnalyis.t). >> >> Brian O. >> >> >> On 10/26/06 12:13 PM, "Nathan Haigh" wrote: >> >> >>> I'm in the middle of writing some code that uses >>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >>> Bioperl from HEAD. >>> >>> I seem to find that $enzyme->is_palindromic always seems to return true. >>> Can anyone verify this? If needs be, I can send some code. >>> >>> Thanks >>> Nathan >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> >> > From n.haigh at sheffield.ac.uk Fri Oct 27 10:19:02 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 14:19:02 +0000 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: References: Message-ID: <45421556.9060300@sheffield.ac.uk> Brian Osborne wrote: > Nathan, > > I don't know how this is supposed to work, there would be different ways to > make is_prototype true. One way would be to make the enzyme with the first > occurrence of a given restriction site the prototype (and the next enzymes > with the same site are isoschizomers). Or, one could wait until one site had > appeared twice, with 2 different enzymes, then make the first the prototype, > etc. I would have done it the first way myself but I took a quick look at > IO/withrefm.pm and it looks like it's doing it the second way. That means > one can read an enzyme file and end up with no duplicated restriction sites, > or prototypes and isoschizomers. > > Brian O. > > Hmm, I'd have done it the first way also. Doing it the second way would mean you only ended up with something as a prototype if there were multiple enzymes with the same restriction site - is that correct biologically? 
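For what it's worth, the "first way" can be sketched in a few lines of plain Perl. This is only an illustration and not what IO/withrefm.pm does: the input list and the assign_prototypes() helper are made up, and "first in the file" is assumed to stand in for "prototype".

```perl
use strict;
use warnings;

# Hypothetical input: [name, recognition site] pairs in file order.
my @enzymes = (
    [ MboI   => 'GATC'   ],
    [ EcoRI  => 'GAATTC' ],
    [ Sau3AI => 'GATC'   ],
    [ DpnII  => 'GATC'   ],
);

# assign_prototypes() is a made-up helper, not part of Bio::Restriction.
# The first enzyme seen for a site becomes the prototype; later enzymes
# with the same site are recorded as its isoschizomers.
sub assign_prototypes {
    my @list = @_;
    my (%prototype_for, %isoschizomers_of);
    for my $entry (@list) {
        my ($name, $site) = @$entry;
        if (exists $prototype_for{$site}) {
            push @{ $isoschizomers_of{ $prototype_for{$site} } }, $name;
        }
        else {
            $prototype_for{$site}    = $name;
            $isoschizomers_of{$name} = [];
        }
    }
    return (\%prototype_for, \%isoschizomers_of);
}

my ($proto, $iso) = assign_prototypes(@enzymes);
for my $site (sort keys %$proto) {
    my $name = $proto->{$site};
    printf "%-7s %-6s prototype; isoschizomers: %s\n",
        $site, $name, join(', ', @{ $iso->{$name} }) || 'none';
}
```

Here EcoRI still counts as a prototype even though nothing else in the list shares its site, which is exactly where this differs from the second approach.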
Nath

From n.haigh at sheffield.ac.uk Fri Oct 27 10:23:20 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 14:23:20 +0000
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
Message-ID: <45421658.5000103@sheffield.ac.uk>

As you may be aware by now, I'm working with Bio::Restriction::Analysis and friends. I'm doing restriction analysis on large sequences (chromosomes). I need to identify an appropriate enzyme based on the total length of fragments that fall within a certain size range (e.g. 100 - 500 bp). However, the amount of memory used by Bio::Restriction::Analysis::fragments() is prohibitive.

I have the following code (bottom) which downloads 2 thaliana chromosomes (mito and chloro, so pretty small), runs an analysis, and then loops through the fragments for all enzymes in the default collection. My memory usage just keeps on climbing, and none seems to get freed up even when a $ra goes out of scope (when I start dealing with the next sequence). Is this a memory leak of some sort? Is there a way to free up memory as I go?

I'd appreciate any help/advice on how to reduce the amount of memory being consumed, as I'd like to use all the thaliana chromosomes (not just mito and chloro), which at the moment probably won't work.
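One idea for keeping memory down, sketched here in plain Perl: if the cut positions for an enzyme are available as a list of integers, the fragment lengths can be derived from neighbouring positions without ever building the fragment strings. The helper name, the example positions, and the assumption of a linear sequence are all illustrative; this is not a description of how Bio::Restriction::Analysis works internally.

```perl
use strict;
use warnings;

# Hypothetical helper. Given the sequence length and the positions after
# which an enzyme cuts (1-based, linear sequence assumed), sum the lengths
# of the fragments whose size falls within [$min, $max].
sub total_length_in_range {
    my ($seq_len, $min, $max, @cuts) = @_;
    my $total = 0;
    my $prev  = 0;
    # Walk the sorted cut positions, treating the sequence end as a final cut.
    for my $pos ((sort { $a <=> $b } @cuts), $seq_len) {
        my $len = $pos - $prev;
        $prev = $pos;
        next if $len < $min || $len > $max;
        $total += $len;
    }
    return $total;
}

# Made-up cut positions on a 2000 bp sequence.
my $total = total_length_in_range(2000, 100, 500, 150, 400, 1200, 1500);
print "total = $total\n";
```

Only integers are held at any time, so memory stays proportional to the number of cut sites rather than to the sequence length. A circular molecule would additionally need its first and last fragments joined, which this sketch does not attempt.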
Cheers
Nath

use strict;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012, 26556996 );
my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
    print "Getting GI: $gi\n";
    push @seq_objs, $db->get_Seq_by_id($gi);
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
    print "Processing ", $seq->primary_id, "\n";
    my $ra = Bio::Restriction::Analysis->new(
        -seq     => $seq,
        -enzymes => $enz_Coll,
    );
    my @all_enzymes = $ra->cutters->each_enzyme;
    print " Calc total length of fragments in range: $min_fragment_size - $max_fragment_size\n";
    foreach my $enzyme ( @all_enzymes ) {
        # reset the running total for each enzyme
        my $tot_size = 0;
        # fragments() is a real memory hog
        foreach my $frag ($ra->fragments($enzyme)) {
            next if $min_fragment_size && (length($frag) < $min_fragment_size);
            next if $max_fragment_size && (length($frag) > $max_fragment_size);
            $tot_size += length($frag);
        }
        # do something based on value of $tot_size
        #print " ", $enzyme->name, " total = $tot_size\n";
    }
    print "DONE\n";
}

From avilella at gmail.com Fri Oct 27 09:39:41 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:39:41 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
In-Reply-To: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
References: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
Message-ID: <358f4d650610270639q14870a6erae2e3c4e9063105d@mail.gmail.com>

I respond to myself: I think I found the way:

my $tree = $treeio->next_tree;
my $total_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless (defined($branch_length));  # e.g. the root
    $total_branch_length += $branch_length;
}
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless (defined($branch_length));
    $node->branch_length($branch_length/$total_branch_length);
}
# sanity check: the scaled lengths should now sum to 1
my $new_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    next unless (defined($node->branch_length));
    $new_branch_length += $node->branch_length;
}

On 10/27/06, Albert Vilella wrote:
> Hi all,
>
> I am in need of a method that would scale the different branch lengths
> of a tree so that after the scaling they all sum up to exactly 1.
>
> Any pointers? Has anyone done that before?
>
> Thanks in advance,
>
> Albert.
>

From cjfields at uiuc.edu Fri Oct 27 10:35:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 09:35:35 -0500
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <4541CBA8.10006@sheffield.ac.uk>
Message-ID: <001501c6f9d5$2e33e120$15327e82@pyrimidine>

...
> I think it makes sense to test that data of the expected type was
> returned by the external resource but not to test the specifics of what
> was returned. If specifics are tested we are then in the realm of testing
> whether we believe the data returned by the external resource or not. We
> should assume that the domain experts for these resources know what they
> are doing - in some cases this might not be true :-) but I think we
> should stick to testing that the objects created hold the expected type
> of data.
>
> I like what Chris had to say (above) but wonder whether tests
> would/should be tested for in the module itself - i.e. testing that a
> stored value is an integer and warn/throw if not?
>
> Nath

Yeah, sorry about the top post (stupid Outlook always sticks the sig at the top of the page!).

Testing in the module would be best but can be tricky for the very same reasons that writing tests entail, even more so. For instance, for NCBI esummary data, I parse the data in a very generic way in order to have access to as much data as possible. For tests, I have to assume that NCBI will always return a particular type of value (string, integer, date). I can test for each of those with a regex in the module fairly simply and throw/warn, as you indicate.
However, if they decide to add new data with a data tag other that the ones I test for in the module (i.e. String, Integer, Date), I suddenly have warns/throws showing up and cluttering/clobbering the code for perfectly valid data. However, if these are caught in tests and the tests fail, no big loss. The actual module still works, even if the tests are failing based on an new unknown value being returned. For me, failed tests are sort of a warning light to let me know that something has changed, but it doesn't necessarily mean a module doesn't work. I generally use throw/warn for something truly catastrophic, like no response from the server or an error in the XML, which affects downstream methods. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Oct 27 11:09:36 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 10:09:36 -0500 Subject: [Bioperl-l] Caching sequences In-Reply-To: <454202C7.1040701@sheffield.ac.uk> Message-ID: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> > I have a script that is capable of downloading sequences from GenBank > based on GI numbers. I retrieve them if fasta format in order to save > bandwidth, but I'd like to take this one step further and cache the > sequences in case the user want to rerun the script using some of the > GI's they used previously. > > Does anyone have any guidance on how best to do this? > > Cheers > Nath There is Bio::DB::InMemoryCache, which is really an interface but appears to have several methods defined; you could look for modules which implement it. Sendu's suggestion of the Bio::Index modules and bioperl-db are also good starting points. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Oct 27 11:21:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 10:21:49 -0500 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <45421556.9060300@sheffield.ac.uk> Message-ID: <001701c6f9db$9f90d160$15327e82@pyrimidine> > Brian Osborne wrote: > > Nathan, > > > > I don't know how this is supposed to work, there would be different ways > to > > make is_prototype true. One way would be to make the enzyme with the > first > > occurrence of a given restriction site the prototype (and the next > enzymes > > with the same site are isoschizomers). Or, one could wait until one site > had > > appeared twice, with 2 different enzymes, then make the first the > prototype, > > etc. I would have done it the first way myself but I took a quick look > at > > IO/withrefm.pm and it looks like it's doing it the second way. That > means > > one can read an enzyme file and end up with no duplicated restriction > sites, > > or prototypes and isoschizomers. > > > > Brian O. > > > > > Hmm, I'd have done it the first way also. Doing it the second way would > mean you only ended up with something as a prototype if there were > multiple enzymes with the same restriction site - is that correct > biologically? > > Nath I had a look at all the Restriction::IO modules a while back; most need serious updating! It just hasn't been a top priority unfortunately. I think the prototype issue may depend on the IO format and whether or not one is defined explicitly in the file being parsed or is just chosen based on what Brian said (order in the file, similar cutting site). By the strictest definition (and cheating by looking at the Fermentas web site), the prototype is supposed to be the first enzyme discovered which cleaves a unique sequence, so it may not be the first enzyme found in the file. Isoschizomers are those discovered to cleave the same sequence subsequent to the prototype. 
Neoschizomers cleave the same sequence as a prototype but at a different site. So this calls into question whether the prototype should be defined at all unless it is specifically indicated in the file. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Fri Oct 27 12:47:53 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 16:47:53 +0000 Subject: [Bioperl-l] Caching sequences In-Reply-To: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com> References: <454202C7.1040701@sheffield.ac.uk> <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com> Message-ID: <45423839.9040503@sheffield.ac.uk> Jason Stajich wrote: > Bio::DB::FileCache does one better and lets you cache the data in a > persistent file. Not sure this index is shareable among users though > - bioperl-db is a better soln when that is desired. Thanks I'll have a look into it. No need for being sharable among users - not unless the script becomes heavily used. Thanks Nath From cjfields at uiuc.edu Fri Oct 27 12:15:00 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 11:15:00 -0500 Subject: [Bioperl-l] StandAloneFasta.t bioperl-run tests Message-ID: <000101c6f9e3$0e5e95d0$15327e82@pyrimidine> Nathan, The test fails you posted on the wiki seem to indicate that using the wrapper works but the order of the returned hits is off. Does the order of the returned hits match the actual FASTA report order? If it does then the tests need to be fixed in a way to make it more flexible, to account for some data 'fuzziness' due to variations in output based on different versions. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Fri Oct 27 12:50:54 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 27 Oct 2006 09:50:54 -0700
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EA78.3050404@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> <4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk>
Message-ID: <1230E110-01AB-4D4E-842F-20B939555299@bioperl.org>

I've answered to this effect multiple times in the past on the mailing list. Newick format does not distinguish between internal ids and bootstrap values (or whatever else you want to attach there). Different programs have different conventions. When both values are present and encoded so that we can parse out the bootstrap like this: [BOOTSTRAP] the parser grabs it out.

If you know all the internal ids are bootstraps you can just copy the values over manually very simply:

for my $node ( grep { ! $_->is_Leaf } $tree->get_nodes ) { # get all the internal nodes
    $node->bootstrap($node->id) if defined $node->id && length($node->id); # copy id to bootstrap
    $node->id(''); # set internal id to empty
}

If someone can make this clearer on a wiki page that would be great.

On Oct 27, 2006, at 4:16 AM, Sendu Bala wrote:

> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Himanshu Ardawatia wrote:
>>>>
>>>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>>>> #################################
>>>> (
>>>> ('Chimp' : 0.052,
>>>> 'Human' : 0.042) 0.71 : 0.007,
>>>> 'Gorilla' : 0.060,
>>>> ('Gibbon' : 0.124,
>>>> 'Orangutan' : 0.0971) 1 : 0.038
>>>> );
>>>> #################################
>>>>
>>> Are you sure this is in the correct format?
>>>
>>
>> He/she may have a tree that already contains bootstrap values output
>> from another program. If this is so, which program did you use?
>> Without >> reminding myself of the formats, you should lookup newick format and >> whther it is possible to store bootstraps in it. In addition you >> should >> also look up the nhx format. > > Ah, well from a brief google it seemed like some software do store > boostrap values for internal nodes as the node ids when outputting in > Newick format. I don't think Bioperl should be able to tell the > difference between a normal id and a bootstrap value, so you'll > have to > detect that yourself and manually use bootstrap() when you get an id > that looks like a number. > > Or should Bioperl be making this assumption for you? Is that a safe > thing to do? Maybe as an option only? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From avilella at gmail.com Fri Oct 27 09:23:07 2006 From: avilella at gmail.com (Albert Vilella) Date: Fri, 27 Oct 2006 14:23:07 +0100 Subject: [Bioperl-l] scale branch lengths of a tree to sum 1 Message-ID: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com> Hi all, I am in need of a method that would scale the different branch lengths of a tree so that after the scaling they all sum up to exactly 1. Any pointers? Has anyone done that before? Thanks in advance, Albert. 
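A minimal sketch of the arithmetic, in plain Perl: divide each branch length by the sum of all defined branch lengths. The hash below is a made-up stand-in for a tree's nodes (undef models a root without a branch length), and scale_to_unit_sum() is a hypothetical helper; with Bio::Tree objects the same two passes would presumably go through each node's branch_length().

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Made-up branch lengths keyed by node name; the root has none.
my %branch_length = (
    root      => undef,
    Chimp     => 0.052,
    Human     => 0.042,
    Gorilla   => 0.060,
    Gibbon    => 0.124,
    Orangutan => 0.0971,
);

# Hypothetical helper: returns a new hash with the defined lengths
# divided by their total, so that they sum to 1. Undefined lengths
# are passed through untouched.
sub scale_to_unit_sum {
    my %length = @_;
    my $total = sum(0, grep { defined $_ } values %length);
    die "total branch length is zero\n" unless $total > 0;
    my %scaled;
    for my $node (keys %length) {
        $scaled{$node} = defined $length{$node}
                       ? $length{$node} / $total
                       : undef;
    }
    return %scaled;
}

my %scaled = scale_to_unit_sum(%branch_length);
for my $node (sort keys %scaled) {
    next unless defined $scaled{$node};
    printf "%-10s %.4f\n", $node, $scaled{$node};
}
printf "sum = %.4f\n", sum(grep { defined $_ } values %scaled);
```

Because of floating point rounding the scaled values sum to 1 only within a small tolerance, so any check for "exactly 1" is better written as a comparison against a tiny epsilon.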
From cjfields at uiuc.edu Fri Oct 27 14:34:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 13:34:57 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
Message-ID: <000001c6f9f6$9ab12710$15327e82@pyrimidine>

I am working on refactoring the AlignIO::stockholm parser to get it reading and writing Pfam/Rfam alignments, and noticed that many alignments have EMBL-like annotations attached, which pertain to the entire alignment:

# STOCKHOLM 1.0
#=GF ID ykkC-yxkD
#=GF AC RF00442
#=GF DE ykkC-yxkD element
#=GF AU Moxon SJ
#=GF GA 20.0
#=GF NC 0.1
#=GF TC 59.4
#=GF SE Barrick JE, Breaker RR
#=GF SS Predicted; Barrick JE, Breaker RR
#=GF TP Cis-reg; riboswitch;
#=GF BM cmbuild CM SEED
#=GF BM cmsearch -W 175 CM SEQDB
#=GF RN [1]
#=GF RM 15096624
#=GF RT New RNA motifs suggest an expanded scope for riboswitches in
#=GF RT bacterial genetic control.
#=GF RA Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee
#=GF RA M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR;
#=GF RL Proc Natl Acad Sci U S A 2004;101:6421-6426.
#=GF CC This family represents the bacterial ykkC/yxkD element. The function of
#=GF CC this family is unclear although it has been suggested that it may function
#=GF CC to switch on efflux pumps and detoxification systems in response to harmful
#=GF CC environmental molecules [1]. The Thermoanaerobacter tengcongensis sequence
#=GF CC EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that the two
#=GF CC riboswitches may work in conjunction to regulate the the upstream gene
#=GF CC which codes for Swiss:Q8RC62, a member of Pfam:PF00860 (Personal obs. Moxon
#=GF CC SJ).
#=GF SQ 16

SimpleAlign, as implemented, seemingly doesn't have a way to store this information.

I'll work on getting the core alignment IO working, but would there be any interest in having a way to store annotations in Bio::SimpleAlign? I'm guessing the methods would be similar to the various Bio::Seq Annotation methods.
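To make the idea concrete, here is a rough sketch in plain Perl of collecting the #=GF lines into a per-tag structure that an alignment object could hold. Nothing here is existing SimpleAlign or AlignIO API; parse_gf_lines() and the hash-of-arrays layout are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: group Stockholm '#=GF <tag> <text>' lines by tag,
# keeping the order in which values appear. All other lines are ignored.
sub parse_gf_lines {
    my @lines = @_;
    my %annotation;
    for my $line (@lines) {
        next unless $line =~ /^#=GF\s+(\S+)\s+(.*?)\s*$/;
        push @{ $annotation{$1} }, $2;
    }
    return \%annotation;
}

# A few header lines taken from the example alignment above.
my @header = (
    '# STOCKHOLM 1.0',
    '#=GF ID ykkC-yxkD',
    '#=GF AC RF00442',
    '#=GF DE ykkC-yxkD element',
    '#=GF BM cmbuild CM SEED',
    '#=GF BM cmsearch -W 175 CM SEQDB',
    '#=GF SQ 16',
);

my $ann = parse_gf_lines(@header);
print "ID: $ann->{ID}[0]\n";
print "BM: $_\n" for @{ $ann->{BM} };
```

Multi-line tags such as RT, RA, and CC simply become several entries under the same key; a real implementation would probably want to join them or map them onto annotation objects.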
Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Fri Oct 27 16:23:46 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 27 Oct 2006 16:23:46 -0400 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <000001c6f9f6$9ab12710$15327e82@pyrimidine> References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> Message-ID: You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose this is what you meant by the 'various Bio::Seq Annotation methods' too.) Just to make sure I'm not misunderstanding, I suppose the annotation pertains to the entire alignment? -hilmar On Oct 27, 2006, at 2:34 PM, Chris Fields wrote: > I am working an refactoring the AlignIO::stockholm parser to get it > reading > and writing Pfam/Rfam alignments, and noticed that many alignments > have > EMBL-like annotations attached, which pertain to the entire alignment: > > # STOCKHOLM 1.0 > #=GF ID ykkC-yxkD > #=GF AC RF00442 > #=GF DE ykkC-yxkD element > #=GF AU Moxon SJ > #=GF GA 20.0 > #=GF NC 0.1 > #=GF TC 59.4 > #=GF SE Barrick JE, Breaker RR > #=GF SS Predicted; Barrick JE, Breaker RR > #=GF TP Cis-reg; riboswitch; > #=GF BM cmbuild CM SEED > #=GF BM cmsearch -W 175 CM SEQDB > #=GF RN [1] > #=GF RM 15096624 > #=GF RT New RNA motifs suggest an expanded scope for > riboswitches in > #=GF RT bacterial genetic control. > #=GF RA Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, > Collins J, > Lee > #=GF RA M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR; > #=GF RL Proc Natl Acad Sci U S A 2004;101:6421-6426. > #=GF CC This family represents the bacterial ykkC/yxkD element. The > function of > #=GF CC this family is unclear although it has been suggested > that it may > function > #=GF CC to switch on efflux pumps and detoxification systems in > response > to harmful > #=GF CC environmental molecules [1]. 
The Thermoanaerobacter > tengcongensis > sequence > #=GF CC EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that > the two > #=GF CC riboswitches may work in conjunction to regulate the the > upstream > gene > #=GF CC which codes for Swiss:Q8RC62, a member of Pfam:PF00860 > (Personal > obs. Moxon > #=GF CC SJ). > #=GF SQ 16 > > SimpleAlign, as implemented, seemingly doesn't have a way to store > this > information. > > I'll work on getting the core alignment IO working, but would there > be any > interest in having a way to store annotations in Bio::SimpleAlign? > I'm > guessing the methods would be similar to the various Bio::Seq > Annotation > methods. > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Oct 27 16:38:17 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 15:38:17 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: Message-ID: <000001c6fa07$d8659990$15327e82@pyrimidine> Hilmar Lapp wrote: > You could make SimpleAlign be a Bio::AnnotationHolderI. (I > suppose this is what you meant by the 'various Bio::Seq Annotation > methods' too.) > > Just to make sure I'm not misunderstanding, I suppose the > annotation pertains to the entire alignment? > > -hilmar ... Yes, that's correct. I would probably use Bio::Seq::Meta for the sequence-specific markup lines. I would have to add another new method to deal with non-sequence-based consensus data (like sec. structure) for now. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Fri Oct 27 11:38:05 2006 From: jason at bioperl.org (Jason Stajich) Date: Fri, 27 Oct 2006 08:38:05 -0700 Subject: [Bioperl-l] Caching sequences In-Reply-To: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> References: <454202C7.1040701@sheffield.ac.uk> <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> Message-ID: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com> Bio::DB::FileCache does one better and lets you cache the data in a persistent file. Not sure this index is shareable among users though - bioperl-db is a better soln when that is desired. -jason On 10/27/06, Chris Fields wrote: > > > I have a script that is capable of downloading sequences from GenBank > > based on GI numbers. I retrieve them if fasta format in order to save > > bandwidth, but I'd like to take this one step further and cache the > > sequences in case the user want to rerun the script using some of the > > GI's they used previously. > > > > Does anyone have any guidance on how best to do this? > > > > Cheers > > Nath > > There is Bio::DB::InMemoryCache, which is really an interface but appears > to > have several methods defined; you could look for modules which implement > it. > Sendu's suggestion of the Bio::Index modules and bioperl-db are also good > starting points. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. 
of Biochemistry > University of Illinois Urbana-Champaign > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich jason at bioperl.org http://www.duke.edu/~jes12/ From cjfields at uiuc.edu Fri Oct 27 21:57:58 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 20:57:58 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> Message-ID: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu> On Oct 27, 2006, at 3:23 PM, Hilmar Lapp wrote: > You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose > this is what you meant by the 'various Bio::Seq Annotation methods' > too.) > > Just to make sure I'm not misunderstanding, I suppose the annotation > pertains to the entire alignment? > > -hilmar BTW, was that supposed to be Bio::AnnotatableI, or Bio::AnnotationHolderI? The latter isn't present in CVS HEAD. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From eric.ross at neuro.utah.edu Sat Oct 28 17:24:30 2006 From: eric.ross at neuro.utah.edu (Eric Ross) Date: Sat, 28 Oct 2006 15:24:30 -0600 Subject: [Bioperl-l] PAML References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object. I am able to extract other data from the report, but there seems to be a conflict in the documentation. One doc implies that there should be a get_posteriors method. (It's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object. 
I have been trying various methods, in the event I'm just "confused", but I've had no luck, thus far. Anyone have suggestions?

code:

----begin code-------
#!/usr/bin/perl -w

use strict;

use Bio::Tools::Phylo::PAML;
my $parser = Bio::Tools::Phylo::PAML->new(-file => "mlc");
my $result = $parser->next_result;
my @posteriors = $result->get_posteriors();

print "@posteriors";

exit(0);

---------end code-------------

---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab

From avilella at gmail.com Sun Oct 29 05:52:04 2006
From: avilella at gmail.com (Albert Vilella)
Date: Sun, 29 Oct 2006 10:52:04 +0000
Subject: [Bioperl-l] PAML
In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>

I don't know if this method is implemented. I can't grep-find it. Maybe it's simply not there yet, but was planned when the documentation was written.

On 10/28/06, Eric Ross wrote:
> I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>
> I am able to extract other data from the report, but there seems to be a conflict in the documentation. One doc implies that there should be a get_posteriors method. (It's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object.
>
>
> I have been trying various methods, in the event I'm just "confused", but I've had no luck, thus far. Anyone have suggestions?
> > > code: > > ----begin code------- > #!/usr/bin/perl -w > > use strict; > > > use Bio::Tools::Phylo::PAML; > my $parser = new Bio::Tools::Phylo::PAML > (-file => "mlc"); > my $result = $parser->next_result; > my @posteriors = $result->get_posteriors(); > > print "@posteriors"; > > exit(0); > > ---------end code------------- > > > > --------------- > Eric Ross > Computer Analyst II > ejr at neuro.utah.edu > Howard Hughes Medical Institute > University of Utah > S?nchez Lab > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Sun Oct 29 09:23:45 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 29 Oct 2006 08:23:45 -0600 Subject: [Bioperl-l] PAML In-Reply-To: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> Message-ID: <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu> Does the data show up in the object using Data::Dumper? This should be filed as a bug since the docs imply the method exists. This could be written up fairly quickly if one had test data and and a script to work with (hint hint...) Chris On Oct 29, 2006, at 4:52 AM, Albert Vilella wrote: > I don't know if this method is implemented. I can't grep-find it. > Maybe it's simply not there yet, but was planned when the > documentation was written. > > On 10/28/06, Eric Ross wrote: >> I am trying to extract the "Naive Empirical Bayes (NEB) >> probabilities" from a Bio::Tools::Phylo::PAML::Result object. >> >> I am able to extract other data from the report, but there seems >> to be a conflict in the documentation. One doc implies that there >> should be a get_posteriors method. 
(It's used as an example in the >> Bio::Tools::Phylo::PAML doc), but the method does not appear to >> exist in the Bio::Tools::Phylo::PAML::Result object. >> >> >> I have been trying various methods, in the event I'm just >> "confused", but I've had no luck, thus far. Anyone have suggestions? >> >> >> code: >> >> ----begin code------- >> #!/usr/bin/perl -w >> >> use strict; >> >> >> use Bio::Tools::Phylo::PAML; >> my $parser = new Bio::Tools::Phylo::PAML >> (-file => "mlc"); >> my $result = $parser->next_result; >> my @posteriors = $result->get_posteriors(); >> >> print "@posteriors"; >> >> exit(0); >> >> ---------end code------------- >> >> >> >> --------------- >> Eric Ross >> Computer Analyst II >> ejr at neuro.utah.edu >> Howard Hughes Medical Institute >> University of Utah >> S?nchez Lab >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From eric.ross at neuro.utah.edu Sun Oct 29 12:06:54 2006 From: eric.ross at neuro.utah.edu (Eric Ross) Date: Sun, 29 Oct 2006 10:06:54 -0700 Subject: [Bioperl-l] PAML References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu> Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu> Thanks for all the help. I've been looking at the code for the PAML rst parser. It's a bit tricky. 
We have written a parser specific to our needs, but it looks to be a pretty complicated matter to make it generic.

The output of PAML can vary a lot depending upon your options, and this section can be repeated multiple times. I'm sure someone with a good grasp of the potential output of PAML could come up with something, but I'll admit to being at a loss.

---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab

-----Original Message-----
From: Chris Fields [mailto:cjfields at uiuc.edu]
Sent: Sun 2006-10-29 7:23 AM
To: Albert Vilella
Cc: Eric Ross; Bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] PAML

Does the data show up in the object using Data::Dumper? This should be filed as a bug since the docs imply the method exists.

This could be written up fairly quickly if one had test data and a script to work with (hint hint...)

Chris

On Oct 29, 2006, at 4:52 AM, Albert Vilella wrote:

> I don't know if this method is implemented. I can't grep-find it.
> Maybe it's simply not there yet, but was planned when the
> documentation was written.
>
> On 10/28/06, Eric Ross wrote:
>> I am trying to extract the "Naive Empirical Bayes (NEB)
>> probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>>
>> I am able to extract other data from the report, but there seems
>> to be a conflict in the documentation. One doc implies that there
>> should be a get_posteriors method. (It's used as an example in the
>> Bio::Tools::Phylo::PAML doc), but the method does not appear to
>> exist in the Bio::Tools::Phylo::PAML::Result object.
>>
>>
>> I have been trying various methods, in the event I'm just
>> "confused", but I've had no luck, thus far. Anyone have suggestions?
>> >> >> code: >> >> ----begin code------- >> #!/usr/bin/perl -w >> >> use strict; >> >> >> use Bio::Tools::Phylo::PAML; >> my $parser = new Bio::Tools::Phylo::PAML >> (-file => "mlc"); >> my $result = $parser->next_result; >> my @posteriors = $result->get_posteriors(); >> >> print "@posteriors"; >> >> exit(0); >> >> ---------end code------------- >> >> >> >> --------------- >> Eric Ross >> Computer Analyst II >> ejr at neuro.utah.edu >> Howard Hughes Medical Institute >> University of Utah >> S?nchez Lab >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Sun Oct 29 12:43:20 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Sun, 29 Oct 2006 17:43:20 +0000 Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage In-Reply-To: <45421658.5000103@sheffield.ac.uk> References: <45421658.5000103@sheffield.ac.uk> Message-ID: <4544E838.7090400@sheffield.ac.uk> Sorry for the repeat post but I haven't had a response. Just wondered if anyone had any idea about this? Thanks Nath Nathan S. Haigh wrote: > As you may be aware by now, i'm working with Bio::Restriction::Analysis > and friends. > > I'm doing restriction analysis on large sequences - chromosomes. I need > to identify an appropriate enzyme based on the total length of fragments > that are of a certain size (e.g. 100 - 500 bp). However, the amount of > memory used by Bio::Restriction::Analysis::fragments() is prohibative. 
I > have the following code (bottom) which downloads 2 thaliana chromosomes > (mito and chloro - so pretty small) and runs an analysis and then loops > through the fragments for all enzymes in the default collection. > > My memory usage just keep on climbing and none seems to get freed up > even when a $ra goes out of scope (start dealing with the next > sequence). Is this a memory leak of some sort, is there a way to free up > memory as I go? I'd appreciate any help/advice on how to reduce the > amount of memory being consumed as I'd like to use all the thaliana > chromosomes (not just mito and chloro), which at the moment probably > won't work. > > Cheers > Nath > > use strict; > use Bio::DB::GenBank; > use Bio::Restriction::Analysis; > use Bio::Restriction::EnzymeCollection; > > my @seq_objs; > my @gis = ( 7525012, 26556996 ); > > my $db = Bio::DB::GenBank->new(-format => "fasta"); > foreach my $gi (@gis) { > print "Getting GI: $gi\n"; > push @seq_objs, $db->get_Seq_by_id($gi) > } > > my $min_fragment_size = 100; > my $max_fragment_size = 500; > my $enz_Coll = Bio::Restriction::EnzymeCollection->new(); > > foreach my $seq (@seq_objs) { > my $tot_size = 0; > print "Processing ", $seq->primary_id,"\n"; > my $ra = Bio::Restriction::Analysis->new( > -seq=>$seq, > -enzymes=>$enz_Coll, > ); > > my @all_enzymes = $ra->cutters->each_enzyme; > print " Calc total length of fragments in range: $min_fragment_size - > $max_fragment_size\n"; > foreach my $enzyme ( @all_enzymes ) { > # fragments() is a real memory hog > foreach my $frag ($ra->fragments($enzyme)) { > next if $min_fragment_size && (length $frag < $min_fragment_size); > next if $max_fragment_size && (length $frag > $max_fragment_size); > $tot_size += length $frag; > } > # do something based on value of $tot_size > #print " ", $enzyme->name, " total = $tot_size\n"; > } > print "DONE\n"; > } > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > 
http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Sun Oct 29 13:09:54 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 29 Oct 2006 12:09:54 -0600 Subject: [Bioperl-l] PAML In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu> References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu> Message-ID: On Oct 29, 2006, at 11:06 AM, Eric Ross wrote: > Thanks for all the help. > > I've been looking at the code for the PAML rst parser. It's a bit > tricky. > > We have written a parser specific for our needs, but it looks to be > a pretty complicated matter to make it generic. > > The output of PAML can vary a lot depending upon your options and > this section can be repeated multiple times. I'm sure someone with > a good grasp of the potential output of PAML could come up with > something, but I'll admit to being at a loss. Eric, I planned on looking at ways to integrate the protein-based PAML programs but I'm working on a different area at the moment. I agree it may be hard to adequately genericize parsing/methods to accomplish this, but if you have any ideas feel free to post them. Again, I would suggest adding any proposed enhancements or bugs to Bugzilla: http://bugzilla.open-bio.org/ Suggestions or bug reports on the list sometimes get lost in the shuffle, esp. since we're planning on a new developer release soon. Chris Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Oct 29 13:16:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 29 Oct 2006 12:16:37 -0600 Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage In-Reply-To: <4544E838.7090400@sheffield.ac.uk> References: <45421658.5000103@sheffield.ac.uk> <4544E838.7090400@sheffield.ac.uk> Message-ID: <6D9EAA04-199C-4BDD-AA60-4833BC1CE250@uiuc.edu> On Oct 29, 2006, at 11:43 AM, Nathan S. Haigh wrote: > Sorry for the repeat post but I haven't had a response. Just > wondered if > anyone had any idea about this? > > Thanks > Nath ... I think Warnock applies here. Likely no one is really sure, hence they aren't answering. It probably bears investigating by submitting and tracking as a bug. My guess is something isn't garbage-collected properly (i.e. there are circular references present), leading to a memory leak. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From chhalling at alumni.ls.berkeley.edu Sun Oct 29 14:16:36 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Sun, 29 Oct 2006 14:16:36 -0500 Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage In-Reply-To: <4544E838.7090400@sheffield.ac.uk> References: <45421658.5000103@sheffield.ac.uk> <4544E838.7090400@sheffield.ac.uk> Message-ID: <4544FE14.7030701@alumni.ls.berkeley.edu> Nathan S. Haigh wrote: > Sorry for the repeat post but I haven't had a response. Just wondered if > anyone had any idea about this? > > Thanks > Nath > > Nathan S. Haigh wrote: > >> As you may be aware by now, i'm working with Bio::Restriction::Analysis >> and friends. >> >> I'm doing restriction analysis on large sequences - chromosomes. I need >> to identify an appropriate enzyme based on the total length of fragments >> that are of a certain size (e.g. 100 - 500 bp). 
However, the amount of >> memory used by Bio::Restriction::Analysis::fragments() is prohibative. I >> have the following code (bottom) which downloads 2 thaliana chromosomes >> (mito and chloro - so pretty small) and runs an analysis and then loops >> through the fragments for all enzymes in the default collection. >> >> My memory usage just keep on climbing and none seems to get freed up >> even when a $ra goes out of scope (start dealing with the next >> sequence). Is this a memory leak of some sort, is there a way to free up >> memory as I go? I'd appreciate any help/advice on how to reduce the >> amount of memory being consumed as I'd like to use all the thaliana >> chromosomes (not just mito and chloro), which at the moment probably >> won't work. >> >> Cheers >> Nath >> >> use strict; >> use Bio::DB::GenBank; >> use Bio::Restriction::Analysis; >> use Bio::Restriction::EnzymeCollection; >> >> my @seq_objs; >> my @gis = ( 7525012, 26556996 ); >> >> my $db = Bio::DB::GenBank->new(-format => "fasta"); >> foreach my $gi (@gis) { >> print "Getting GI: $gi\n"; >> push @seq_objs, $db->get_Seq_by_id($gi) >> } >> >> my $min_fragment_size = 100; >> my $max_fragment_size = 500; >> my $enz_Coll = Bio::Restriction::EnzymeCollection->new(); >> >> foreach my $seq (@seq_objs) { >> my $tot_size = 0; >> print "Processing ", $seq->primary_id,"\n"; >> my $ra = Bio::Restriction::Analysis->new( >> -seq=>$seq, >> -enzymes=>$enz_Coll, >> ); >> >> my @all_enzymes = $ra->cutters->each_enzyme; >> print " Calc total length of fragments in range: $min_fragment_size - >> $max_fragment_size\n"; >> foreach my $enzyme ( @all_enzymes ) { >> # fragments() is a real memory hog >> foreach my $frag ($ra->fragments($enzyme)) { >> next if $min_fragment_size && (length $frag < $min_fragment_size); >> next if $max_fragment_size && (length $frag > $max_fragment_size); >> $tot_size += length $frag; >> } >> # do something based on value of $tot_size >> #print " ", $enzyme->name, " total = $tot_size\n"; 
>> } >> print "DONE\n"; >> } >> >> Try this code, which creates a new Bio::Restriction::Analysis object for each digest. On my PowerBook, this doesn't use more than 13 Mb of memory. Reading the code for Bio::Restriction::Analysis reveals that the fragments() method calls the cut() method. The documentation for the cut method states: Note: cut doesn't now re-initialize everything before figuring out cuts. This is so that you can do multiple digests, or add more data or whatever. You'll have to use new to reset everything. This means there is no memory leak; it's just that the Bio::Restriction::Analysis object is retaining cut information for each enzyme, which takes a lot of memory. use strict; use warnings; use Bio::DB::GenBank; use Bio::Restriction::Analysis; use Bio::Restriction::EnzymeCollection; my @seq_objs; my @gis = ( 7525012, 26556996 ); my $db = Bio::DB::GenBank->new(-format => "fasta"); foreach my $gi (@gis) { print "Getting GI: $gi\n"; push @seq_objs, $db->get_Seq_by_id($gi) } my $min_fragment_size = 100; my $max_fragment_size = 500; my $enz_Coll = Bio::Restriction::EnzymeCollection->new(); foreach my $seq (@seq_objs) { print "Processing ", $seq->primary_id, "\n"; foreach my $enzyme ( $enz_Coll->each_enzyme() ) { my $ra = Bio::Restriction::Analysis->new( -seq => $seq, -enzymes => $enzyme ); my $tot_size = 0; print " Calc total length of fragments in range: $min_fragment_size -" . " $max_fragment_size\n"; foreach my $frag ($ra->fragments($enzyme)) { next if $min_fragment_size && (length $frag < $min_fragment_size); next if $max_fragment_size && (length $frag > $max_fragment_size); $tot_size += length $frag; } # do something based on value of $tot_size print " ", $enzyme->name, " total = $tot_size\n"; } print "DONE\n"; } -- Conrad Halling chhalling at alumni.ls.berkeley.edu From n.haigh at sheffield.ac.uk Mon Oct 30 03:51:49 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Mon, 30 Oct 2006 08:51:49 +0000 Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id() Message-ID: <4545BD25.3030107@sheffield.ac.uk> In my script I retrieve sequences from GenBank in FASTA format by GI numbers and optionally store the sequence in a cache using Bio::DB::Fasta. On subsequent runs of the script, the cache is first checked for the GI and returns the sequence if it is found or the sequence is obtained from GenBank as above. I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have returned a Bio::Seq object but rather it returns a Bio::PrimarySeq object which is defined within the Bio::DB::Fasta file. This is annoying, since $seq_obj in my script would be either a Bio::Seq if it was obtained from GenBank or a Bio::PrimarySeq if obtained from the cache and calling primary_id() on it doesn't do the expected thing with Bio::PrimarySeq: ID: Bio::PrimarySeq::Fasta=HASH(0x89b4508) Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object? Nath From yuhki at ncifcrf.gov Mon Oct 30 08:57:35 2006 From: yuhki at ncifcrf.gov (Naoya Yuhki) Date: Mon, 30 Oct 2006 08:57:35 -0500 Subject: [Bioperl-l] bptutorial.pl 0 Message-ID: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov> Hello, I run perl bptutorial.pl 0 and I got the following error. -------------------- WARNING --------------------- MSG: id (ROA1_HUMAN) does not exist --------------------------------------------------- Can't call method "display_id" on an undefined value at bptutorial.pl line 3945. other tests all worked. I thank any suggestions from you. NAOYA YUHKI. From cjfields at uiuc.edu Mon Oct 30 12:42:21 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 30 Oct 2006 11:42:21 -0600 Subject: [Bioperl-l] bptutorial.pl 0 In-Reply-To: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov> Message-ID: <000601c6fc4a$c3e43450$15327e82@pyrimidine> > Hello, > I run > > perl bptutorial.pl 0 > > and I got the following error. 
>
> -------------------- WARNING ---------------------
> MSG: id (ROA1_HUMAN) does not exist
> ---------------------------------------------------
> Can't call method "display_id" on an undefined value at bptutorial.pl
> line 3945.
>
> other tests all worked.
>
> I thank any suggestions from you.
>
> NAOYA YUHKI.

What version of Bioperl are you running?

As a warning, the bptutorial.pl script has been removed from CVS and will not be included in future versions of Bioperl. It can be found on the bioperl wiki instead:

http://www.bioperl.org/wiki/Bptutorial

chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Mon Oct 30 13:08:15 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 30 Oct 2006 10:08:15 -0800
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <29F47393-D134-4093-8751-E948BF521843@bioperl.org>

Bio::PrimarySeq makes sense because Fasta databases only provide sequences without features. But you are actually getting a Bio::PrimarySeq::Fasta object, which is a proxy object, since the module won't pull a whole sequence into memory unless seq() is requested.

The problem is really why you are getting something useless set for primary_id. What do you want it to be - the GI number? You'll need to set it explicitly, because DB::Fasta has no concept of GI numbers encoded in the header line. AFAIK you also cannot set the primary_id to a value of your liking, because this is a proxy object.

The best bet is to create a Bio::Seq object out of one of these and set the primary_id and display_id to values that you can compute from the display_id. At least that has been my strategy when using this - maybe someone wants to code something new into the object itself.

-jason

On Oct 30, 2006, at 12:51 AM, Nathan S.
Haigh wrote: > In my script I retrieve sequences from GenBank in FASTA format by GI > numbers and optionally store the sequence in a cache using > Bio::DB::Fasta. On subsequent runs of the script, the cache is first > checked for the GI and returns the sequence if it is found or the > sequence is obtained from GenBank as above. > > I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have > returned a Bio::Seq object but rather it returns a Bio::PrimarySeq > object which is defined within the Bio::DB::Fasta file. This is > annoying, since $seq_obj in my script would be either a Bio::Seq if it > was obtained from GenBank or a Bio::PrimarySeq if obtained from the > cache and calling primary_id() on it doesn't do the expected thing > with > Bio::PrimarySeq: > ID: Bio::PrimarySeq::Fasta=HASH(0x89b4508) > > Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object? > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From golharam at umdnj.edu Mon Oct 30 15:11:51 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Mon, 30 Oct 2006 15:11:51 -0500 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? Message-ID: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> I'm trying to parse some blast output w/o actually creating the output file. Instead, I'm capturing the output in a variable and would like to use IO::String to represent the file: $_ = `megablast -d somedatabase -i somesequence -D 2`; my $blast_file = new IO::String($_); my $searchio = new Bio::SearchIO(-format => 'blast', -fh => $blast_file); my $results = $searchio->next_result; my $hit = $results->next_hit; if (! 
defined($hit)) { warn "No BLAST hit for $accession on chr $chr for Seq/$orth_id/$organism\n\n"; return; } Now, when Bio::SearchIO tries to read the output line by line, instead it reads the entire output as 1 line. If I provide the output in a file and use: my $searchio = new Bio::SearchIO(-format => 'blast', -file => '/tmp/somefile.blast'); This works...so is it possible to use IO::String to provide Bio::SearchIO with BLAST output? Ryan From golharam at umdnj.edu Mon Oct 30 15:54:29 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Mon, 30 Oct 2006 15:54:29 -0500 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com> Message-ID: <00e801c6fc65$9849aee0$e6028a0a@GOLHARMOBILE1> Thanks. How are you getting the output? system()? BTW- I'm using v1.5.1... > -----Original Message----- > From: Bernd Web [mailto:bernd.web at gmail.com] > Sent: Monday, October 30, 2006 3:45 PM > To: golharam at umdnj.edu > Cc: bioperl-l > Subject: Re: [Bioperl-l] Is it possible to parse BLAST output > using IO:String? > > > Hi Ryan, > > I parse blastn output using IO::String w/o problems: > > my $stringfh = new IO::String($input); > my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh); > > however this is input does not come via backticks. > > > bernd > > On 10/30/06, Ryan Golhar wrote: > > I'm trying to parse some blast output w/o actually creating > the output > > file. Instead, I'm capturing the output in a variable and > would like > > to use IO::String to represent the file: > > > > $_ = `megablast -d somedatabase -i somesequence -D 2`; > > my $blast_file = new IO::String($_); > > my $searchio = new Bio::SearchIO(-format => 'blast', -fh => > > $blast_file); > > my $results = $searchio->next_result; > > my $hit = $results->next_hit; > > if (! 
defined($hit)) { > > warn "No BLAST hit for $accession on chr $chr for > > Seq/$orth_id/$organism\n\n"; > > return; > > } > > > > Now, when Bio::SearchIO tries to read the output line by > line, instead > > it reads the entire output as 1 line. > > > > If I provide the output in a file and use: > > > > my $searchio = new Bio::SearchIO(-format => > 'blast', -file => > > '/tmp/somefile.blast'); > > > > This works...so is it possible to use IO::String to provide > > Bio::SearchIO with BLAST output? > > > > Ryan > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > From bix at sendu.me.uk Mon Oct 30 16:27:58 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 30 Oct 2006 21:27:58 +0000 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> Message-ID: <45466E5E.9000504@sendu.me.uk> Ryan Golhar wrote: > I'm trying to parse some blast output w/o actually creating the output > file. Instead, I'm capturing the output in a variable and would like to > use IO::String to represent the file: > > $_ = `megablast -d somedatabase -i somesequence -D 2`; > my $blast_file = new IO::String($_); > my $searchio = new Bio::SearchIO(-format => 'blast', -fh => > $blast_file); > my $results = $searchio->next_result; > my $hit = $results->next_hit; > if (! defined($hit)) { > warn "No BLAST hit for $accession on chr $chr for > Seq/$orth_id/$organism\n\n"; > return; > } > > Now, when Bio::SearchIO tries to read the output line by line, instead > it reads the entire output as 1 line. > > If I provide the output in a file and use: > > my $searchio = new Bio::SearchIO(-format => 'blast', -file => > '/tmp/somefile.blast'); > > This works...so is it possible to use IO::String to provide > Bio::SearchIO with BLAST output? 
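The question quoted above comes down to handing a parser a filehandle that yields the program's output line by line. A minimal sketch using only core Perl, with no BioPerl calls: since Perl 5.8, open() accepts a reference to a scalar and returns an in-memory filehandle, and a list-form piped open ('-|') reads a child process's output directly. The command below is a stand-in that just prints three lines (an assumption for illustration, not megablast), and whether a given BioPerl version accepts either handle through -fh is not verified here.

```perl
use strict;
use warnings;

# Stand-in command (hypothetical): the running perl prints three lines.
my @cmd = ($^X, '-e', 'print "line 1\nline 2\nline 3\n"');

# 1) Piped open, list form: no shell is involved, and the child's
#    output is read through an ordinary filehandle.
open(my $pipe, '-|', @cmd) or die "Cannot start command: $!";
my @from_pipe = <$pipe>;
close($pipe) or die "Command failed: $! (exit status $?)";

# 2) In-memory filehandle over a string that was already captured.
my $captured = join '', @from_pipe;
open(my $mem, '<', \$captured) or die "Cannot open in-memory handle: $!";
my @from_string = <$mem>;
close($mem);

printf "pipe: %d lines, string: %d lines\n",
    scalar @from_pipe, scalar @from_string;
```

The list form of a piped open is a Unix-oriented choice; on older Windows Perls it may not be available, in which case the single-string form of the command would be needed.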
Why must it be IO::String? Why not just open() your megablast and provide $searchio the real filehandle? It would be faster that way as well.

Read the docs for `. Your usage above is inappropriate.

From golharam at umdnj.edu Mon Oct 30 16:54:45 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 16:54:45 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String?
In-Reply-To:
Message-ID: <00f901c6fc6e$03916460$e6028a0a@GOLHARMOBILE1>

Hmmm. Yes, I suppose I could. I did it with the backtick because I based my code on the "To and From a String" section of the SeqIO HOWTO...

-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich
Sent: Monday, October 30, 2006 4:44 PM
To: Sendu Bala
Cc: golharam at umdnj.edu; 'bioperl-l'
Subject: Re: [Bioperl-l] Is it possible to parse BLAST output using IO:String?

right - can't you just do:

my $fh;
open($fh, "megablast -d ... | ") || die $!;
my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh);

On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote:

Ryan Golhar wrote:

I'm trying to parse some blast output w/o actually creating the output file. Instead, I'm capturing the output in a variable and would like to use IO::String to represent the file:

$_ = `megablast -d somedatabase -i somesequence -D 2`;
my $blast_file = new IO::String($_);
my $searchio = new Bio::SearchIO(-format => 'blast', -fh => $blast_file);
my $results = $searchio->next_result;
my $hit = $results->next_hit;
if (! defined($hit)) {
warn "No BLAST hit for $accession on chr $chr for Seq/$orth_id/$organism\n\n";
return;
}

Now, when Bio::SearchIO tries to read the output line by line, instead it reads the entire output as 1 line.

If I provide the output in a file and use:

my $searchio = new Bio::SearchIO(-format => 'blast', -file => '/tmp/somefile.blast');

This works...so is it possible to use IO::String to provide Bio::SearchIO with BLAST output?

Why must it be IO::String?
Why not just open() your megablast and provide $searchio the real filehandle? It would be faster that way as well. Read the docs for `. Your usage above is inappropriate. _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From bernd.web at gmail.com Mon Oct 30 15:44:31 2006 From: bernd.web at gmail.com (Bernd Web) Date: Mon, 30 Oct 2006 21:44:31 +0100 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> Message-ID: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com> Hi Ryan, I parse blastn output using IO::String w/o problems: my $stringfh = new IO::String($input); my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh); however this is input does not come via backticks. bernd On 10/30/06, Ryan Golhar wrote: > I'm trying to parse some blast output w/o actually creating the output > file. Instead, I'm capturing the output in a variable and would like to > use IO::String to represent the file: > > $_ = `megablast -d somedatabase -i somesequence -D 2`; > my $blast_file = new IO::String($_); > my $searchio = new Bio::SearchIO(-format => 'blast', -fh => > $blast_file); > my $results = $searchio->next_result; > my $hit = $results->next_hit; > if (! defined($hit)) { > warn "No BLAST hit for $accession on chr $chr for > Seq/$orth_id/$organism\n\n"; > return; > } > > Now, when Bio::SearchIO tries to read the output line by line, instead > it reads the entire output as 1 line. 
> > If I provide the output in a file and use: > > my $searchio = new Bio::SearchIO(-format => 'blast', -file => > '/tmp/somefile.blast'); > > This works...so is it possible to use IO::String to provide > Bio::SearchIO with BLAST output? > > Ryan > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From jason at bioperl.org Mon Oct 30 16:44:18 2006 From: jason at bioperl.org (Jason Stajich) Date: Mon, 30 Oct 2006 13:44:18 -0800 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <45466E5E.9000504@sendu.me.uk> References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> <45466E5E.9000504@sendu.me.uk> Message-ID: right - can't you just do: my $fh; open($fh, "megablast -d ... | ") || die $!; my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh); On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote: > Ryan Golhar wrote: >> I'm trying to parse some blast output w/o actually creating the >> output >> file. Instead, I'm capturing the output in a variable and would >> like to >> use IO::String to represent the file: >> >> $_ = `megablast -d somedatabase -i somesequence -D 2`; >> my $blast_file = new IO::String($_); >> my $searchio = new Bio::SearchIO(-format => 'blast', -fh => >> $blast_file); >> my $results = $searchio->next_result; >> my $hit = $results->next_hit; >> if (! defined($hit)) { >> warn "No BLAST hit for $accession on chr $chr for >> Seq/$orth_id/$organism\n\n"; >> return; >> } >> >> Now, when Bio::SearchIO tries to read the output line by line, >> instead >> it reads the entire output as 1 line. >> >> If I provide the output in a file and use: >> >> my $searchio = new Bio::SearchIO(-format => 'blast', -file => >> '/tmp/somefile.blast'); >> >> This works...so is it possible to use IO::String to provide >> Bio::SearchIO with BLAST output? > > Why must it be IO::String? 
Why not just open() your megablast and
> provide $searchio the real filehandle? It would be faster that way
> as well.
>
> Read the docs for `. Your usage above is inappropriate.
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html

From lstein at cshl.edu Mon Oct 30 13:59:29 2006
From: lstein at cshl.edu (Lincoln Stein)
Date: Mon, 30 Oct 2006 13:59:29 -0500
Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase
Message-ID: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>

Hi All,

I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not to validate. I have committed a new version to live and to the release candidate branch. I hope it isn't too late to get this into the release.

Lincoln

--
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu

From huangyi1 at hkusua.hku.hk Tue Oct 31 00:46:20 2006
From: huangyi1 at hkusua.hku.hk (Huang Yi)
Date: Tue, 31 Oct 2006 13:46:20 +0800
Subject: [Bioperl-l] bioperl1.5 and GD2.35
Message-ID: <200610310546.k9V5kQGT010481@hkusua.hku.hk>

Hi,

I just installed bioperl 1.4 from CPAN on my Gentoo Linux computer, but the installation failed and I had to force the install.

However, the GD module couldn't be installed, for some unknown reason.

I therefore used Gentoo's "emerge" tool to get bioperl and GD again. They installed fine. The version of bioperl was upgraded to 1.5 and GD was 2.35.
However, when I tested it by using the program in HOWTO wiki page (http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me: Can't locate object method "png" via package "GD::Image" at /usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line 799, <> line 9. In my other computer, bioperl1.4 and GD2.34 work fine. I therefore want to remove the CPAN bioperl from the system and re-install it, but it seems to be impossible. Would you please give me some advices on how to let my GD and bioperl work. Thanks! Huang Yi From bix at sendu.me.uk Tue Oct 31 03:20:21 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 31 Oct 2006 08:20:21 +0000 Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase In-Reply-To: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com> References: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com> Message-ID: <45470745.1050605@sendu.me.uk> Lincoln Stein wrote: > Hi All, > > I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not > to validate. I have committed a new version to live and to the release > candidate branch. I hope it isn't too late to get this into the release. It isn't too late, thank you. From avilella at gmail.com Tue Oct 31 08:54:39 2006 From: avilella at gmail.com (Albert Vilella) Date: Tue, 31 Oct 2006 13:54:39 +0000 Subject: [Bioperl-l] catfile and catdir Message-ID: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com> Hi, I was testing the bioperl-run/t/PAML.t and stumbled upon this a catdir/catfile error: Can't locate object method "catdir" via package "Bio::Root::IO" at /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line 113. BEGIN failed--compilation aborted at /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line 143. Compilation failed in require at t/PAML.t line 64. BEGIN failed--compilation aborted at t/PAML.t line 64. Should be be using File::Spec for catdir and catfile instead of Root::IO? Cheers, Albert. 
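For reference, the File::Spec alternative asked about above can be sketched as follows. This is only a minimal illustration, not the actual change made to Yn00.pm; the directory and file names are hypothetical placeholders.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical path components, chosen only for illustration.
my $dir  = File::Spec->catdir('t', 'data');          # portable directory path
my $file = File::Spec->catfile($dir, 'yn00.ctl');    # portable file path

print "$file\n";
```

File::Spec ships with core Perl and picks the right separator for the platform, so a module that needs catdir() can call it directly instead of relying on a method that Bio::Root::IO may not provide.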
From Kevin.M.Brown at asu.edu Tue Oct 31 10:34:34 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Tue, 31 Oct 2006 08:34:34 -0700 Subject: [Bioperl-l] bioperl1.5 and GD2.35 Message-ID: <1A4207F8295607498283FE9E93B775B4023B5F3C@EX02.asurite.ad.asu.edu> Not really a Bioperl issue per se, but sounds like when you had Gentoo emerge GD it didn't include libpng and so didn't build the needed parts to create PNG type graphics. > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Huang Yi > Sent: Monday, October 30, 2006 10:46 PM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] bioperl1.5 and GD2.35 > > Hi, > > > > I just installed bioperl 1.4 from CPAN to my Gentoo linux > computer. But the > installation was failed. I had to install by force. > > > > However, the GD module couldn't be installed for some unknown reasons. > > > > I therefore use "emerge" tool of Gentoo to get bioperl and GD > again. They > are fine. The version of bioperl became upgrade to1.5 and GD was 2.35. > > > > However, when I tested it by using the program in HOWTO wiki page > (http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me: > > > > Can't locate object method "png" via package "GD::Image" at > /usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line > 799, <> line 9. > > > > In my other computer, bioperl1.4 and GD2.34 work fine. I > therefore want to > remove the CPAN bioperl from the system and re-install it, > but it seems to > be impossible. > > > > Would you please give me some advices on how to let my GD and > bioperl work. > > > > Thanks! 
>
>
>
> Huang Yi
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From hlapp at gmx.net Tue Oct 31 11:21:40 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 11:21:40 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
Message-ID: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>

On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:

> BTW, was that supposed to be Bio::AnnotatableI, or
> Bio::AnnotationHolderI?

Sorry, the former. I guess I got confused with FeatureHolders. Too
bad Featureable isn't an English word.

-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From hlapp at gmx.net Tue Oct 31 12:01:44 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:01:44 -0500
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net>

The only thing I would add to Jason's reply is that it is easy to do

if (! $seq->isa("Bio::SeqI")) {
    my $bioseq = Bio::Seq->new();
    $bioseq->primary_seq($seq);
    $seq = $bioseq;
}

and from that point on all your objects are Bio::SeqI compliant
regardless of whether they were obtained that way or not.

Aside from that I wonder why there isn't a -primary_seq option in
Bio::Seq::new - this would shorten the above into a (more perl'ish)
single line:

$seq = Bio::Seq->new(-primary_seq=>$seq) unless $seq->isa("Bio::SeqI");

Any takers to add that capability?

-hilmar

On Oct 30, 2006, at 3:51 AM, Nathan S.
Haigh wrote: > In my script I retrieve sequences from GenBank in FASTA format by GI > numbers and optionally store the sequence in a cache using > Bio::DB::Fasta. On subsequent runs of the script, the cache is first > checked for the GI and returns the sequence if it is found or the > sequence is obtained from GenBank as above. > > I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have > returned a Bio::Seq object but rather it returns a Bio::PrimarySeq > object which is defined within the Bio::DB::Fasta file. This is > annoying, since $seq_obj in my script would be either a Bio::Seq if it > was obtained from GenBank or a Bio::PrimarySeq if obtained from the > cache and calling primary_id() on it doesn't do the expected thing > with > Bio::PrimarySeq: > ID: Bio::PrimarySeq::Fasta=HASH(0x89b4508) > > Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object? > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 31 12:08:56 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 11:08:56 -0600 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net> Message-ID: <001401c6fd0f$4239aa50$15327e82@pyrimidine> >> BTW, was that supposed to be Bio::AnnotatableI, or >> Bio::AnnotationHolderI? > > Sorry, the former. I guess I got confused with > FeatureHolders. Too bad Featureable isn't an English word. > > -hilmar Having SimpleAlign be AnnotatableI shouldn't be too much of a burden, since the only additional implemented method is annotation(). So, I think all the various Stockholm tags can be placed somewhere. 
A bit OT: were we planning on getting rid of the various *_tag_* methods in AnnotatableI at some point? I'm a bit confused as to why they were added. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Tue Oct 31 12:09:26 2006 From: jason at bioperl.org (Jason Stajich) Date: Tue, 31 Oct 2006 09:09:26 -0800 Subject: [Bioperl-l] catfile and catdir In-Reply-To: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com> References: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com> Message-ID: <1AD4DB38-E08D-4E47-8A59-6539068474CB@bioperl.org> Yep. Unless we want this to also exist in Root::IO and delegate to File::Spec. -jason On Oct 31, 2006, at 5:54 AM, Albert Vilella wrote: > Hi, > > I was testing the bioperl-run/t/PAML.t and stumbled upon this a > catdir/catfile error: > > Can't locate object method "catdir" via package "Bio::Root::IO" at > /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line > 113. > BEGIN failed--compilation aborted at > /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line > 143. > Compilation failed in require at t/PAML.t line 64. > BEGIN failed--compilation aborted at t/PAML.t line 64. > > Should be be using File::Spec for catdir and catfile instead of > Root::IO? > > Cheers, > > Albert. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From jason at bioperl.org Tue Oct 31 12:10:51 2006 From: jason at bioperl.org (Jason Stajich) Date: Tue, 31 Oct 2006 09:10:51 -0800 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net> References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu> <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net> Message-ID: <65F92B54-33FD-4D8F-90B7-49E2697CDBA2@bioperl.org> It just needs to have an annotation collection - so it would be Bio::AnnotateableI On Oct 31, 2006, at 8:21 AM, Hilmar Lapp wrote: > > On Oct 27, 2006, at 9:57 PM, Chris Fields wrote: > >> BTW, was that supposed to be Bio::AnnotatableI, or >> Bio::AnnotationHolderI? > > Sorry, the former. I guess I got confused with FeatureHolders. Too > bad Featureable isn't an English word. 
>
> -hilmar
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html

From hlapp at gmx.net Tue Oct 31 12:44:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:44:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To:
References:
Message-ID:

Well isn't this a result of conflating some of the SeqFeatureI
methods into the annotation collection?

If I'm not mistaken on this then those methods were introduced in
1.5.0 and hence can go away without deprecation.

-hilmar

On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote:

> Chris,
>
> I don't think the intent was to remove the methods, rather we'd
> just call deprecated(). Example from AnnotatableI:
>
> sub remove_tag {
>   my ($self,@args) = @_;
>
>   #uncomment in 1.6
>   #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');
>
>   return $self->annotation->remove_Annotations(@args);
> }
>
> With regards to "why", I can't reconstruct the entire rationale
> myself but I can say that the newer names make more sense. Take that
> example above - its function is to remove entire Annotations not just
> to remove tags, so remove_Annotations is a better name.
>
> Brian O.
>
>
> On 10/31/06 1:08 PM, "Chris Fields" wrote:
>
>> A bit OT: were we planning on getting rid of the various *_tag_*
>> methods in AnnotatableI at some point? I'm a bit confused as to why
>> they were added.
>
>
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From bosborne11 at verizon.net Tue Oct 31 11:37:01 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 31 Oct 2006 12:37:01 -0400
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <001401c6fd0f$4239aa50$15327e82@pyrimidine>
Message-ID:

Chris,

I don't think the intent was to remove the methods, rather we'd just call
deprecated(). Example from AnnotatableI:

sub remove_tag {
  my ($self,@args) = @_;

  #uncomment in 1.6
  #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');

  return $self->annotation->remove_Annotations(@args);
}

With regards to "why", I can't reconstruct the entire rationale myself but I
can say that the newer names make more sense. Take that example above - its
function is to remove entire Annotations not just to remove tags, so
remove_Annotations is a better name.

Brian O.

On 10/31/06 1:08 PM, "Chris Fields" wrote:

> A bit OT: were we planning on getting rid of the various *_tag_* methods in
> AnnotatableI at some point? I'm a bit confused as to why they were added.

From cjfields at uiuc.edu Tue Oct 31 13:44:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 12:44:02 -0600
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To:
Message-ID: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>

Hilmar Lapp wrote:
> Well isn't this a result of conflating some of the
> SeqFeatureI methods into the annotation collection?
>
> If I'm not mistaken on this then those methods were
> introduced in 1.5.0 and hence can go away without deprecation.
>
> -hilmar
>
> On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote:
>
>> Chris,
>>
>> I don't think the intent was to remove the methods, rather we'd just
>> call deprecated().
>> Example from AnnotatableI:
>>
>> sub remove_tag {
>>   my ($self,@args) = @_;
>>
>>   #uncomment in 1.6
>>   #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');
>>
>>   return $self->annotation->remove_Annotations(@args);
>> }
>>
>> With regards to "why", I can't reconstruct the entire rationale
>> myself but I can say that the newer names make more sense. Take that
>> example above - its function is to remove entire Annotations not
>> just to remove tags, so remove_Annotations is a better name.
>>
>> Brian O.
>>
>>
>> On 10/31/06 1:08 PM, "Chris Fields" wrote:
>>
>>> A bit OT: were we planning on getting rid of the various *_tag_*
>>> methods in AnnotatableI at some point? I'm a bit confused as to why
>>> they were added.

Sorry Brian, what I meant was, based on CVS history, the various *tag*
methods in AnnotatableI were added all at once, with deprecations already
present in the commit. So the methods weren't there to begin with, then
added only to be deprecated later? Hence the confusion...

I think Hilmar's right; the CVS history indicates these were added just
prior to rel. 1.5 by Allen and seem to be related to SeqFeatureI. I'm sure
the intent was good, but they contradict methods in the Feature/Annotation
HOWTO on retrieving Annotation objects via the Annotation::Collection
object. I think that agrees with your point about the various Annotation*
method names being the more appropriate ones.

Does everybody agree we should just remove them?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 31 13:53:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 12:53:16 -0600 Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id() In-Reply-To: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net> Message-ID: <000001c6fd1d$d4359c80$15327e82@pyrimidine> > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Tuesday, October 31, 2006 11:02 AM > To: n.haigh at sheffield.ac.uk > Cc: Bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id() > > The only thing I would add to Jason's reply is that it is easy to do > > if (! $seq->isa("Bio::SeqI")) { > my $bioseq = Bio::Seq->new(); > $bioseq->primary_seq($seq); > $seq = $bioseq; > } > > and from that point on all your objects are Bio::SeqI > compliant regardless of whether they were obtained that way or not. > > Aside from that I wonder why there isn't a -primary_seq > option in Bio::Seq::new - this would shorten the above into a > (more perl'ish) single line: > > $seq = Bio::Seq->new(-primary_seq=>$seq) unless > $seq->isa("Bio::SeqI"); > > Anyone takers to add that capability? > > -hilmar Sounds good to me! Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From nhansen at nhgri.nih.gov Tue Oct 31 14:51:23 2006 From: nhansen at nhgri.nih.gov (Nancy Hansen) Date: Tue, 31 Oct 2006 14:51:23 -0500 (EST) Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling Message-ID: Hello, As sequencing centers begin to deposit trace data from "Medical Sequencing" projects into the public archives, there is now the need to "anonymize" sequence trace files by removing embedded information which might be used to identify the individual who was the original source of the DNA being sequenced. 
I was hoping I might be able to use Bio::SeqIO to manipulate the comments contained in an SCF-formatted trace file, but I'm finding that Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information. Since SCF is a widely-accepted standard for trace files, would it be reasonable to include fields like "scf_comments" and "scf_header" in a Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them? Likewise, it would be great if write_seq could pull these values right from a SequenceTrace object rather than requiring them as arguments. I'd be happy to help in this effort if necessary. Thanks, --Nancy ************************************* Nancy F. Hansen, PhD nhansen at nhgri.nih.gov Bioinformatics Group NIH Intramural Sequencing Center (NISC) 5625 Fishers Lane Rockville, MD 20852 Phone: (301) 435-1560 Fax: (301) 435-6170 From lincoln.stein at gmail.com Tue Oct 31 15:24:17 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 31 Oct 2006 15:24:17 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <000001c6f78b$d1c65a30$15327e82@pyrimidine> References: <453E309B.9090007@sendu.me.uk> <000001c6f78b$d1c65a30$15327e82@pyrimidine> Message-ID: <6dce9a0b0610311224x79256b29sf102eb5c35865caf@mail.gmail.com> Are you going to go ahead with 1.52_XX ? If so, I will code GBrowse to look for 1.52 or higher. Lincoln On 10/24/06, Chris Fields wrote: > > .. > > > > 'handle'? I think it shows up as '6.2.13' simply because it was uploaded > > with the filename Perl6-Pugs-6.2.13.tar.gz > > Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is > '6.002013'. So maybe we should follow a similar convention. Seems easier > and less confusing to me, at least. > > > As you point out, the code has the kind of $VERSION number we've been > > suggesting in this thread: > > > > > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': > > > > > > our $VERSION = 6.002013; > > > > > > That's also a very perlish-way to do it. 
And there are no developer > > > versions of Pugs, since it is always under active development. We > could > > try > > > something like: > > > > > > our $VERSION = 1.005002_01; > > > > Yes, this was already like one of my suggestions (1.0502_01), but I > > brought up the concern that 1.05 might be < 1.4. > > > > So then we have a question: do we try and fumble a 1.4 compatible number > > by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if > > it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no > > room for RC numbering, or 1.006000010 (1.6.0.10) - the first final > > release following some 1.006000_001 (1.6.0.01 == rc1) RCs? > > I would go for the clean break if it follows perl/CPAN convention. > '1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing. > > If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6 > RC1, 1.6 RC2 etc then that would be consistent and perl-compatible. > > BTW, the reason I looked at Pugs was to see what some of the Perl6 > developers were using. Who knows; they'll probably change it! > > .. > > > I don't think it would be a hassle; on the contrary it would be very > > useful to know the CPAN distribution actually works. I'm very happy with > > the idea that a release candidate gets fully tested... > > So you obviously feel strongly about it! ;> > > I don't have a problem as long as we stick with doing this from now on ( > i.e. > have a consistent versioning scheme, release policy, CPAN release policy, > etc). Would be nice for Jason/Brian/Hilmar to chime in as to the > reasoning > behind the older versioning scheme. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From hlapp at gmx.net Tue Oct 31 16:53:58 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 31 Oct 2006 16:53:58 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine> References: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine> Message-ID: On Oct 31, 2006, at 1:44 PM, Chris Fields wrote: > Does everybody agree we should just remove them? I wish you could but I'm afraid that would break stuff? Otherwise why were they added in the first place? I thought Bio::SeqFeature::Annotated needs them maybe? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 31 17:41:17 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 16:41:17 -0600 Subject: [Bioperl-l] AnnotatableI tag methods, was Rfam/Pfam annotations and SimpleAlign In-Reply-To: Message-ID: <000001c6fd3d$ae37c240$15327e82@pyrimidine> > On Oct 31, 2006, at 1:44 PM, Chris Fields wrote: > > > Does everybody agree we should just remove them? > > I wish you could but I'm afraid that would break stuff? > Otherwise why were they added in the first place? I thought > Bio::SeqFeature::Annotated needs them maybe? > > -hilmar > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== Yep, removing them clobbers a ton of tests, including anything that requires SeqIO::FTHelper. Looks like SeqFeature::Generic and a few others use them. 
I could understand if these were meant to be permanent methods, but why add these in if they were to be deprecated in 1.6? Something that was meant to be a transition but wasn't finished? That seems to be indicated in the commented out lines for all the *tag* methods: #uncomment in 1.6 #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()'); Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From lincoln.stein at gmail.com Tue Oct 31 18:18:07 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 31 Oct 2006 18:18:07 -0500 Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning In-Reply-To: References: Message-ID: <6dce9a0b0610311518l3bec852q5d04a9b488621377@mail.gmail.com> Hi Keith, The current Bio/DB/GFF/Util/Binning.pm file just contains the hierarchical binning system that I implemented some time ago. Where is the R-tree system that you describe? How much of an improvement did the R-tree scheme give over the hierarchical scheme? FTYI the GFF3 implementation uses a different binning scheme in which there is a fixed-size bin. Every time a feature overlaps a bin, it creates a new row in a table. So big features will have multiple rows and little features that fit inside a bin will have only one row. The query for this is simpler and seems to give the same relative speedup as the hierarchical binning system. I'd really like to get these queries to go as fast as possible and would love to work with you on this if you're interested. Lincoln On 10/19/06, Keith Player wrote: > > I know that there may be some changes resulting from new GFF3 > implementations, > but thought I would see if the following is useful anyway. > > I implemented the R-tree binning schema as used by > Bio::DB::GFF::Util::Binning > and as mention in this article: > > I tested the following query on a normal table (no binning), but it > assumes > that you know the longest range in the table. 
So for example with a table > of > human genes, where the longest gene we know of is around 2.4Mb. > > SELECT COUNT(*) as count FROM groups WHERE start > max(0,[start-2.4Mb]) > AND > g.start < [end] AND g.end > [start] AND g.chromosome = '1' > > so for 100Mb:101Mb > > SELECT COUNT(*) as count FROM groups WHERE start > 97600000 AND g.start < > 101000000 AND g.end > 100000000 AND g.chromosome = '1' > > > where [start] and [end] define the region of interest. This query > outperforms > the R-Tree implementation on all tests that I have performed (for lengths > of > 200bp to 10Mb across a whole chromsome). Could this be of some practical > use? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From bosborne11 at verizon.net Tue Oct 31 21:31:49 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 31 Oct 2006 22:31:49 -0400 Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling In-Reply-To: Message-ID: Nancy, It looks like a good place to start would be the get_header() and _get_header methods in Bio::SeqIO::scf. If you read t/scf.t you can see that the author, at some point, wanted get_header to return meaningful information but stepping through the test shows it returning a lot of UNDEF. Now I don't know if this is due to the method or the source SCF file, but you might be able to get these methods to work yourself. But to answer your questions, yes, it certainly sounds reasonable that these values would be extracted by Bio::SeqIO::scf. Brian O. 
On 10/31/06 3:51 PM, "Nancy Hansen" wrote: > > Hello, > > As sequencing centers begin to deposit trace data from "Medical > Sequencing" projects into the public archives, there is now the need to > "anonymize" sequence trace files by removing embedded information which > might be used to identify the individual who was the original source of > the DNA being sequenced. > > I was hoping I might be able to use Bio::SeqIO to manipulate the > comments contained in an SCF-formatted trace file, but I'm finding that > Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information. > Since SCF is a widely-accepted standard for trace files, would it be > reasonable to include fields like "scf_comments" and "scf_header" in a > Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them? > Likewise, it would be great if write_seq could pull these values right > from a SequenceTrace object rather than requiring them as arguments. > > I'd be happy to help in this effort if necessary. > > Thanks, > --Nancy > > ************************************* > Nancy F. 
Hansen, PhD nhansen at nhgri.nih.gov > Bioinformatics Group > NIH Intramural Sequencing Center (NISC) > 5625 Fishers Lane > Rockville, MD 20852 > Phone: (301) 435-1560 Fax: (301) 435-6170 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Sun Oct 1 13:05:25 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 1 Oct 2006 12:05:25 -0500 Subject: [Bioperl-l] Tests involving remote databases In-Reply-To: <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net> References: <000001c6e3e6$81630010$15327e82@pyrimidine> <6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net> <79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu> <451E3707.4090400@sendu.me.uk> <0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu> <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net> Message-ID: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu> On Sep 30, 2006, at 4:43 PM, Hilmar Lapp wrote: > > On Sep 30, 2006, at 10:57 AM, Chris Fields wrote: > >> There should be a failed test to let us know of the problem. As >> currently set up, the XEMBL server failure doesn't show up in >> Test::Harness test summaries. Biblio_biofetch.t had the similar >> problems before Brian's fixes. > > Just keep in mind that you may not want somebody's CPAN installation > to fail (or require a 'forced' install) just because some server > happens to be down for maintenance. > > -hilmar I don't think this would be a problem unless users specifically set BIOPERLDEBUG to 1, which is something most people don't bother with before installation (and probably not something we should promote for normal installation anyway). So, for CPAN installation we would suggest that BIOPERLDEBUG be 0 or not set at all, and outline the reasons why. The idea is to retain current behavior (remote DB access will not be run unless BIOPERLDEBUG is set to 1) and apply it to all tests requiring such access. 
Otherwise, just those tests are skipped (and not the rest of the tests, which occurs currently). If BIOPERLDEBUG is set, the next tests would check the URL, which passes/fails (based on the specific value of $@), and runs/skips tests based on the mere presence of $@, which indicates some URL issue. You can do this with Test::More, but I'm not sure this can be done with Test.pm or Test::Simple. The current behavior just skips all tests based on a single failed URL. Then, Test::Harness, as currently set, shows skipped tests as passed. The last run I posted previously where XEMBL_DB.t remote DB tests failed, I also ran all tests (make test) and get this, which doesn't tell us that the remote URL failed: ----------------------------------------- ... t/WABA.......................ok t/XEMBL_DB...................ok t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ztr.t tests ok All tests successful, 5 subtests skipped. ----------------------------------------- Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Oct 1 13:17:24 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 1 Oct 2006 12:17:24 -0500 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: References: <7A592EAB-A869-4A6C-BFA8-F73F3DFD8F5B@gmx.net> <09FB1EB0-2C1C-4FCF-8339-E78556EFEFF2@uiuc.edu> <8D75FE6D-C02D-4A86-93FA-B7256050AF11@uiuc.edu> <40155903-555A-4662-BCCE-38E5E3784118@uiuc.edu> <54E79A5F-5446-4D8E-AD26-B70894048D60@gmx.net> <1D69005A-DF0E-4F37-93FE-7577A32CC625@gmx.net> Message-ID: The '-w' flag on the shebang line is the source of those errors. I never set it anymore on Windows due to this; I just use the 'use warnings' pragma. If you use 'perl -I. t/test.t' you can normally get around the '-w' assumed by using 'make test'. 
I will try running tests on bioperl-db and bioperl tomorrow on WinXP to confirm these. Chris On Sep 30, 2006, at 6:10 PM, Seth Johnson wrote: > How do I get rid of all of the warnings for "redefined subroutines" > during > the test?? It clutters the output and I can't see the errors. > > On 9/30/06, Hilmar Lapp wrote: >> >> It doesn't shed more light but it does raise an alert flag. All tests >> are supposed to pass. The fact that they don't means the problems you >> are seeing have nothing to do with your specific data or script. >> >> First off - can anyone else confirm those errors using the latest >> Bioperl-db and Bioperl? >> >> Second - Seth could you run those tests individually, e.g., using >> >> $ make test test_02species TEST_VERBOSE=1 >> >> and similarly for the other tests that have failures and post the >> output. Let's start with 02species and 03simpleseq. >> >> -hilmar >> >> On Sep 30, 2006, at 5:44 PM, Seth Johnson wrote: >> >>> There are errors during the test. Here's their summary: >>> ____________________________ >>> Failed Test Stat Wstat Total Fail Failed List of Failed >>> ------------------------------------------------------------- >>> t\02species.t 65 2 3.08% 63 65 >>> t\03simpleseq.t 1 256 59 106 179.66% 7-59 >>> t\04swiss.t 52 14 26.92% 25 27-34 38-42 >>> t\12ontology.t 2 512 738 1471 199.32% 3-738 >>> t\16obda.t 12 3 25.00% 10-12 >>> ____________________________ >>> >>> May be that can shed some light on the problem?!?! >>> >>> On 9/29/06, Hilmar Lapp < hlapp at gmx.net> wrote:This may in fact be >>> a knock-on effect of the fixes? >>> >>> Seth, did you run the test suite that comes with bioperl-db, and did >>> you get any errors? >>> >>> -hilmar >>> >>> On Sep 28, 2006, at 2:26 PM, Chris Fields wrote: >>> >>>> Seth, >>>> >>>> The organism issue is a bug and has been reported, though I thought >>>> it was fixed. 
>>>> >>>> The lack of the date and the version is a bit odd, but there have >>>> been a lot of changes lately to bioperl-live (core bioperl in CVS), >>>> and a few to bioperl-db. How old is your bioperl and bioperl-db >>>> installation. Hilmar, any additional thoughts? >>>> >>>> Chris >>>> >>>> On Sep 28, 2006, at 11:10 AM, Seth Johnson wrote: >>>> >>>>> Thank you. That takes care of that, however, I do have another >>>>> gripe. When >>>>> running my script, quoted before, with "my $out = >>>>> Bio::SeqIO->newFh('-format' => 'genbank');", I have several key >>>>> pieces of >>>>> information missing. The most important one is the version >>>>> number. There's >>>>> also a date missing, and source organism name is corrupted. >>>>> Here's what I >>>>> get: >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> LOCUS NM_014580 2145 bp dna linear >>>>> UNK >>>>> DEFINITION Homo sapiens solute carrier family 2, (facilitated >>>>> glucose >>>>> transporter) member 8 (SLC2A8), mRNA. >>>>> ACCESSION NM_014580 >>>>> SOURCE sapiens. >>>>> ORGANISM sapiens >>>>> Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; >>>>> Bilateria; >>>>> Coelomata; Deuterostomia; Chordata; Craniata; >>> Vertebrata; >>>>> Gnathostomata; Teleostomi; Euteleostomi; >>>>> Sarcopterygii; >>>>> Tetrapoda; >>>>> Amniota; Mammalia; Theria; Eutheria; Euarchontoglires; >>>>> Primates; >>>>> Haplorrhini; Simiiformes; Catarrhini; Hominoidea; >>>>> Hominidae; >>>>> Homo/Pan/Gorilla group; Homo. >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> All of the missing information is stored in BioSQL and >>>>> theoretically should >>>>> be in the outpu. Here's how NCBI genbank file looks: >>>>> >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> LOCUS NM_014580 2145 bp mRNA linear >>>>> PRI 17-OCT-2005 >>>>> DEFINITION Homo sapiens solute carrier family 2, (facilitated >>>>> glucose >>>>> transporter) member 8 (SLC2A8), mRNA. 
>>>>> ACCESSION NM_014580 >>>>> VERSION NM_014580.3 GI:51870928 >>>>> KEYWORDS . >>>>> SOURCE Homo sapiens (human) >>>>> ORGANISM Homo sapiens >>>>> >>>>> Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; >>>>> Euteleostomi; >>>>> Mammalia; Eutheria; Euarchontoglires; Primates; >>>>> Haplorrhini; >>>>> Catarrhini; Hominidae; Homo. >>>>> >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> >>>>> On 9/28/06, Chris Fields wrote: >>>>>> >>>>>> Those are from the excessively paranoid '-w' flag on the shebang >>>>>> line. If you remove the flag but add the 'use warnings' pragma >>> the >>>>>> 'subroutine x redefined' warnings go away. This, BTW, is one >>> of the >>>>>> quirks of the ActivePerl distribution; other OSs don't have the >>> same >>>>>> problem. >>>>>> >>>>>> The 'solution' described on that page is actually a workaround, >>>>>> not a >>>>>> bugfix. It causes problems with stack traces with error handling >>>>>> but >>>>>> seems harmless beyond that. I haven't been able to find a >>>>>> satisfactory fix which works on all OS's. >>>>>> >>>>>> Chris >>>>>> >>>>>> >>>>>> On Sep 28, 2006, at 10:42 AM, Seth Johnson wrote: >>>>>> >>>>>>> This is under Windows, but using ActiveState Komodo 3.5 and >>>>>>> their >>>>>>> latest Perl for Windows and latest BioPerl & BioPerl-db from >>>>>>> CVS. >>>>>>> >>>>>>> I actually just stumbled upon a solution. It's described in the >>>>>>> "Installing Bioperl on Windows" by adding a comma after >>> $class: in >>>>>>> Bio::Root::Root throw() subroutine. Thanks for hinting me about >>>>>>> what I run it on. >>>>>>> >>>>>>> The code works now, BUT it spews whole bunch of warnings about >>>>>>> "Subroutine .... redefined": >>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\BioEntry >>>>>>> .pm line 88. >>>>>>> Subroutine object_id redefined at c:/Perl/site/lib/Bio >>> \BioEntry.pm >>>>>>> line 128. 
>>>>>>> Subroutine version redefined at c:/Perl/site/lib/Bio\BioEntry.pm >>>>>>> line 150. >>>>>>> Subroutine authority redefined at c:/Perl/site/lib/Bio >>> \BioEntry.pm >>>>>>> line 171. >>>>>>> Subroutine namespace redefined at c:/Perl/site/lib/Bio >>> \BioEntry.pm >>>>>>> line 192. >>>>>>> Subroutine display_name redefined at c:/Perl/site/lib/Bio >>>>>>> \BioEntry.pm line 217. >>>>>>> Subroutine description redefined at c:/Perl/site/lib/Bio >>>>>>> \BioEntry.pm line 241. >>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\Root.pm >>> line >>>>>>> 201. >>>>>>> Subroutine verbose redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \Root.pm >>>>>>> line 234. >>>>>>> Subroutine _register_for_cleanup redefined at c:/Perl/site/lib/ >>> Bio >>>>>>> \Root\Root.pm line 246. >>>>>>> Subroutine _unregister_for_cleanup redefined at c:/Perl/site/ >>>>>>> lib/ >>>>>>> Bio >>>>>>> \Root\Root.pm line 256. >>>>>>> Subroutine _cleanup_methods redefined at c:/Perl/site/lib/Bio >>> \Root >>>>>>> \Root.pm line 263. >>>>>>> Subroutine throw redefined at c:/Perl/site/lib/Bio\Root\Root.pm >>>>>>> line 316. >>>>>>> Subroutine debug redefined at c:/Perl/site/lib/Bio\Root\Root.pm >>>>>>> line 379. >>>>>>> Subroutine _load_module redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \Root.pm line 398. >>>>>>> Subroutine DESTROY redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \Root.pm >>>>>>> line 426. >>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\RootI.pm >>> line >>>>>>> 117. >>>>>>> Subroutine _initialize redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \RootI.pm line 128. >>>>>>> ... >>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>>>> >>>>>>> >>>>>>> On 9/28/06, Chris Fields wrote: I had >>> problems >>>>>>> with bioperl-db on native WinXP (not cygwin), but I >>>>>>> did manage to get it running in cygwin with some effort. The >>> issue >>>>>>> on native WinXP was related to Bio::Root::Root::throw(), though. 
>>>>>>> >>>>>>> There is a bug and workaround filed on Bugzilla, but I haven't >>>>>>> worked >>>>>>> on it in a while (and the workaround has some problems as >>> well). I >>>>>>> may try running it again to see what happens. >>>>>>> >>>>>>> http://bugzilla.open-bio.org/show_bug.cgi?id=1938 >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On Sep 28, 2006, at 9:04 AM, Hilmar Lapp wrote: >>>>>>> >>>>>>>> Very odd. This is under Windows, presumably using Cygwin? >>>>>>>> >>>>>>>> The method Bio::Root::Root::throw() clearly exists, and >>>>>>>> PersistentObject inherits from it. The exception it was >>> trying to >>>>>>>> throw has nothing to do with failure or success to find the >>>>>>>> database >>>>>>>> row (actually it did succeed since otherwise it wouldn't >>> construct >>>>>>>> the object) but with dynamically loading a class, presumably >>>>>>>> Bio::DB::Persistent::Seq. >>>>>>>> >>>>>>>> Are you using the 1.5.x release of bioperl? >>>>>>>> >>>>>>>> Does anyone on the list have any experience with these sorts of >>>>>>>> things on Windows? >>>>>>>> >>>>>>>> (Seth, I've moved this thread to the bioperl list, since >>>>>>>> this is >>>>>>> what >>>>>>>> the problem is about.) >>>>>>>> >>>>>>>> -hilmar >>>>>>>> >>>>>>>> On Sep 27, 2006, at 1:39 PM, Seth Johnson wrote: >>>>>>>> >>>>>>>>> Hello guys, >>>>>>>>> >>>>>>>>> I successfully populated the biosql database, thanks to you. >>>>>>>>> Now, >>>>>>>>> I'm >>>>>>>>> trying to retrieve a sequence from it following the example >>> from >>>>>>>>> BOSC2003 >>>>>>>>> slides and ran into uninformative error (at least to me it >>>>>>>>> doesn't >>>>>>>>> mean >>>>>>>>> anyting). I suspect that I'm missing something and hope you >>> can >>>>>>>>> point me in >>>>>>>>> the right direction. 
Here's my source code: >>>>>>>>> >>>>>>> >>> ------------------------------------------------------------------- >>>>>>> -- >>>>>>>>> - >>>>>>>>> --- >>>>>>>>> #!/usr/bin/perl -w >>>>>>>>> use strict; >>>>>>>>> use warnings; >>>>>>>>> >>>>>>>>> use Bio::Seq; >>>>>>>>> use Bio::Seq::SeqFactory; >>>>>>>>> use Bio::DB::SimpleDBContext; >>>>>>>>> use Bio::DB::BioDB; >>>>>>>>> >>>>>>>>> my $dbc = Bio::DB::SimpleDBContext->new( >>>>>>>>> -driver => 'mysql', >>>>>>>>> -dbname => 'BioSQL_1', >>>>>>>>> -host => ' 192.168.1.3', >>>>>>>>> -user => 'xxxxx', >>>>>>>>> -pass => 'xxxxxx' >>>>>>>>> ); >>>>>>>>> >>>>>>>>> my $db = Bio::DB::BioDB->new(-database => 'biosql', >>>>>>>>> -dbcontext => $dbc); >>>>>>>>> >>>>>>>>> my $seq = Bio::Seq->new(-accession_number => 'NM_014580', - >>>>>>>>> namespace => >>>>>>>>> 'refseq_H_sapiens'); >>>>>>>>> my $seqfact = Bio::Seq::SeqFactory->new(-type => 'Bio::Seq'); >>>>>>>>> my $adp = $db->get_object_adaptor($seq); >>>>>>>>> my $dbseq = $adp->find_by_unique_key($seq, -obj_factory => >>>>>>> $seqfact); >>>>>>>>> >>>>>>>>> my $out = Bio::SeqIO->newFh('-format' => 'EMBL'); >>>>>>>>> print $out $dbseq; >>>>>>>>> >>>>>>>>> exit; >>>>>>>>> >>> ----------------------------------------------------------------- >>>>>>>>> >>>>>>>>> Just when the "find_by_unique_key" function is executed I >>> get the >>>>>>>>> following >>>>>>>>> error: >>>>>>>>> >>>>>>>>> ================================ >>>>>>>>> Undefined subroutine &Bio::Root::Root::throw called at >>>>>>>>> c:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm line >>> 199. >>>>>>>>> ================================ >>>>>>>>> >>>>>>>>> The sequence does exist in the database. I checked that. Any >>>>>>>>> ideas??? 
>>>>>>>>> >>>>>>>>> -- >>>>>>>>> Best Regards, >>>>>>>>> >>>>>>>>> >>>>>>>>> Seth Johnson >>>>>>>>> Senior Bioinformatics Associate >>>>>>>>> _______________________________________________ >>>>>>>>> BioSQL-l mailing list >>>>>>>>> BioSQL-l at lists.open-bio.org >>>>>>>>> http://lists.open-bio.org/mailman/listinfo/biosql-l >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> =========================================================== >>>>>>>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>>>>>>> =========================================================== >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Bioperl-l mailing list >>>>>>>> Bioperl-l at lists.open-bio.org >>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>>>> >>>>>>> Christopher Fields >>>>>>> Postdoctoral Researcher >>>>>>> Lab of Dr. Robert Switzer >>>>>>> Dept of Biochemistry >>>>>>> University of Illinois Urbana-Champaign >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Best Regards, >>>>>>> >>>>>>> >>>>>>> Seth Johnson >>>>>>> Senior Bioinformatics Associate >>>>>>> >>>>>>> Ph: (202) 470-0900 >>>>>>> Fx: (775) 251-0358 >>>>>> >>>>>> Christopher Fields >>>>>> Postdoctoral Researcher >>>>>> Lab of Dr. Robert Switzer >>>>>> Dept of Biochemistry >>>>>> University of Illinois Urbana-Champaign >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Best Regards, >>>>> >>>>> >>>>> Seth Johnson >>>>> Senior Bioinformatics Associate >>>>> >>>>> Ph: (202) 470-0900 >>>>> Fx: (775) 251-0358 >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> Christopher Fields >>>> Postdoctoral Researcher >>>> Lab of Dr. 
Robert Switzer >>>> Dept of Biochemistry >>>> University of Illinois Urbana-Champaign >>>> >>>> >>>> >>> >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >>> >>> >>> >>> -- >>> Best Regards, >>> >>> >>> Seth Johnson >>> Senior Bioinformatics Associate >>> >>> Ph: (202) 470-0900 >>> Fx: (775) 251-0358 >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> > > > -- > Best Regards, > > > Seth Johnson > Senior Bioinformatics Associate > > Ph: (202) 470-0900 > Fx: (775) 251-0358 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From osborne1 at optonline.net Sun Oct 1 17:49:47 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Sun, 01 Oct 2006 17:49:47 -0400 Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset In-Reply-To: <20061001183214.GB12075@iucha.net> Message-ID: Florin, This is fixed in CVS now. What had happened is that the DIP file had some minimal protein (node) entries where the only id available was DIP's internal identifier. Not ideal to have to use these as accessions but there's no other choice. Thank you for the note, and in the future write to bioperl-l since there may be others who are interested in hearing about what you've encountered. Brian O. On 10/1/06 2:32 PM, "Florin Iucha" wrote: > Hello, > > I have downloaded a CVS snapshot [1] of your module, bioperl-network, and > I am using it to read the 20060402 edition release of the DIP [2] dataset. 
> > Starting with the simple program you show in the man page: > > my $io = Bio::Network::IO->new(-format => 'psi', > -file => $ARGV[0]); > > my $network = $io->next_network; > > I get 772 instances of: > > Use of uninitialized value in string eq at > /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 326. > > I don't know if it is just an annoyance or something bad, so you might > want to take a look at it. > > Thank you for your work, > florin > > [1] http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-network/ > [2] http://dip.doe-mbi.ucla.edu/ From osborne1 at optonline.net Sun Oct 1 17:56:39 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Sun, 01 Oct 2006 17:56:39 -0400 Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset In-Reply-To: <20061001211844.GC12075@iucha.net> Message-ID: Florin, I'm not seeing any segmentation fault using the same file you're using as input (dip20060402.mif). I'm assuming you don't see this error when you use smaller files as input, like those in the t/data directory. When I watch the script in top I see Perl using about 135Mb (RSIZE) right before the script exits. How much memory do you use? Thank you for the note, and in the future write to bioperl-l since there may be others who are interested in hearing about what you've encountered. Brian O. On 10/1/06 5:18 PM, "Florin Iucha" wrote: > On Sun, Oct 01, 2006 at 01:32:14PM -0500, Florin Iucha wrote: >> I have downloaded a CVS snapshot [1] of your module, bioperl-network, and >> I am using it to read the 20060402 edition release of the DIP [2] dataset. > > Using the attached script, I am getting a segmentation fault at the > end, right after printing "That's all, Folks!" Maybe some cleanup is > going off in a wrong direction. 
>
> florin

From florin at iucha.net Sun Oct 1 20:24:03 2006
From: florin at iucha.net (Florin Iucha)
Date: Sun, 1 Oct 2006 19:24:03 -0500
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset
In-Reply-To:
References: <20061001211844.GC12075@iucha.net>
Message-ID: <20061002002403.GD12075@iucha.net>

On Sun, Oct 01, 2006 at 05:56:39PM -0400, Brian Osborne wrote:
> I'm not seeing any segmentation fault using the same file you're using as
> input (dip20060402.mif). I'm assuming you don't see this error when you use
> smaller files as input, like those in the t/data directory.

The t/data files are fine. Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the MINT [1] database does not produce the crash. It has a new warning, however:

Can't call method "text" on an undefined value at /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290.

> When I watch the script in top I see Perl using about 135Mb (RSIZE) right
> before the script exits. How much memory do you use?

"ps ux" tells me VSZ = 272788 and RSZ = 254992. This is on x86-64 with 64 bit perl. The box has 2 GB of physical memory so these numbers don't seem to be a concern.

> Thank you for the note, and in the future write to bioperl-l since there may
> be others who are interested in hearing about what you've encountered.

D'oh! You have the list address loud and clear in three places, but I got your contact info from the AUTHORS. Will use the proper channel from now on!

Thanks,
florin

[1] ftp://mint.bio.uniroma2.it/pub/release/psi1/

--
If we wish to count lines of code, we should not regard them as lines produced but as lines spent.
-- Edsger Dijkstra

From cjfields at uiuc.edu Mon Oct 2 00:35:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 23:35:22 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To:
Message-ID: <000001c6e5dc$2eceabe0$15327e82@pyrimidine>

Seth,

What version of MySQL and perl are you using? I'm using MySQL 5.0.18 (but am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819. I ran into a few problems with bioperl-db tests which were unrelated to the ones below, but I'm wondering if it is a difference in MySQL versions.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Seth Johnson
> Sent: Saturday, September 30, 2006 6:35 PM
> To: Hilmar Lapp
> Cc: Chris Fields; Bioperl List
> Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
>
> Here're complete test details:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
> FAILED tests 10-12
> Failed 3/12 tests, 75.00% okay
> Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> -------------------------------------------------------------------------------
> t\02species.t                 65    2   3.08%  63 65
> t\03simpleseq.t    1   256    59  106 179.66%  7-59
> t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> t\12ontology.t     2   512   738 1471 199.32%  3-738
> t\16obda.t                    12    3  25.00%  10-12
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From torsten.seemann at infotech.monash.edu.au Mon Oct 2 02:06:50 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Mon, 02 Oct 2006 16:06:50 +1000
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
References: <451C8ED8.2060003@infotech.monash.edu.au> <451CC40D.2030401@sendu.me.uk> <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
Message-ID: <4520AC7A.1050009@infotech.monash.edu.au>

>>> I have removed all use/@ISA Bio::Root::Object references from
>>> bioperl-live, except for those in Bio::Root::* itself:

>> So I'd say they're both relics that can be removed. In fact I was
>> planning on getting rid of all references to both of these modules
>> before you did, so thanks! :)

> I think they can go. It's probably a pre-1.0 deprecation that somehow
> was never followed through on.

Today I did a fresh CVS checkout of bioperl-live and deleted the following modules and tests, and all tests passed with BIOPERLDEBUG=0:

* Bio::Root::Err
* Bio::Root::Global
* Bio::Root::IOManager
* Bio::Root::Object
* Bio::Root::Storable
* Bio::Root::Utilities # may be used by third parties?
* Bio::Root::Vector
* Bio::Root::Xref
* t/Root-Utilities.t # need to keep if we keep Utilities.pm
* t/RootStorable.t

Should we schedule them for deprecation, or deprecate immediately, as Hilmar suggested they were meant to be deprecated long ago?
-- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From bix at sendu.me.uk Mon Oct 2 05:40:02 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 02 Oct 2006 10:40:02 +0100 Subject: [Bioperl-l] Tests involving remote databases In-Reply-To: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu> References: <000001c6e3e6$81630010$15327e82@pyrimidine> <6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net> <79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu> <451E3707.4090400@sendu.me.uk> <0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu> <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net> <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu> Message-ID: <4520DE72.4000603@sendu.me.uk> Chris Fields wrote: > > The idea is to retain current behavior (remote DB access will not be > run unless BIOPERLDEBUG is set to 1) and apply it to all tests > requiring such access. Otherwise, just those tests are skipped (and > not the rest of the tests, which occurs currently). If BIOPERLDEBUG > is set, the next tests would check the URL, which passes/fails (based > on the specific value of $@), and runs/skips tests based on the mere > presence of $@, which indicates some URL issue. You can do this with > Test::More, but I'm not sure this can be done with Test.pm or > Test::Simple. Firstly, BIOPERLDEBUG should not be abused; it should be used only when you want to see extra debugging messages. There should be another variable that you can set to choose if network-requiring tests are run, and it should also be a configurable choice when you run perl Makefile.PL. (But changing this isn't going to happen for 1.5.2) When the server problem is ambiguous we should not fail the test. Just make the skip message visible and pass all ok... > The current behavior just skips all tests based on a single failed > URL. Then, Test::Harness, as currently set, shows skipped tests as > passed. 
The last run I posted previously where XEMBL_DB.t remote DB
> tests failed, I also ran all tests (make test) and get this, which
> doesn't tell us that the remote URL failed:
>
> -----------------------------------------
>
> ...
> t/WABA.......................ok
> t/XEMBL_DB...................ok
> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext
> is not installed or is installed incorrectly - skipping ztr.t tests
> ok
> All tests successful, 5 subtests skipped.

All you have to do to make it visible is start the skip message with the word 'Skip':

skip('Skip server may be down',1);

...
t/WABA.......................ok
t/XEMBL_DB...................ok
        1/9 skipped: server may be down
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ztr.t tests
t/ztr........................ok

It's nicer when using Test::More.

From bix at sendu.me.uk Mon Oct 2 05:55:27 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 10:55:27 +0100
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au>
References: <451C8ED8.2060003@infotech.monash.edu.au> <451CC40D.2030401@sendu.me.uk> <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net> <4520AC7A.1050009@infotech.monash.edu.au>
Message-ID: <4520E20F.6040406@sendu.me.uk>

Torsten Seemann wrote:
> >>> I have removed all use/@ISA Bio::Root::Object references from
> >>> bioperl-live, except for those in Bio::Root::* itself:
>
> >> So I'd say they're both relics that can be removed. In fact I was
> >> planning on getting rid of all references to both of these modules
> >> before you did, so thanks! :)
>
>> I think they can go. It's probably a pre-1.0 deprecation that somehow
>> was never followed through on.
> > Today I did a fresh CVS checkout of bioperl-live, and deleted the > following modules and tests, and all tests passed with BIOPERLDEBUG=0 > > * Bio::Root::Err > * Bio::Root::Global > * Bio::Root::IOManager > * Bio::Root::Object > * Bio::Root::Storable > * Bio::Root::Utilities # may be used by third parties? > * Bio::Root::Vector > * Bio::Root::Xref > * t/Root-Utilities.t # need to keep if we keep Utilities.pm > * t/RootStorable.t > > Should we schedule for deprecation, or deprecate immediately as Hilmar > suggested they were meant to be deprecated long ago ? I'm happy to get rid of them all straight away. Does anyone object? From florin at iucha.net Sun Oct 1 21:40:07 2006 From: florin at iucha.net (Florin Iucha) Date: Sun, 1 Oct 2006 20:40:07 -0500 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 Message-ID: <20061002014007.GG12075@iucha.net> Hello, I am trying to install bioperl-network from CVS. I found this to require bioperl from CVS, which requires bioperl-ext from CVS. I have compiled and installed io_lib 1.10.1. 
After running "perl Makefile.PL; make test" in bioperl-ext I see a lot of sources being compiled, then:

cc -c -I./libs -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -DVERSION=\"1.5.1\" -DXS_VERSION=\"1.5.1\" -fPIC "-I/usr/lib/perl/5.8/CORE" -DPOSIX -DNOERROR Align.c
Running Mkbootstrap for Bio::Ext::Align ()
chmod 644 Align.bs
rm -f ../blib/arch/auto/Bio/Ext/Align/Align.so
cc -shared -L/usr/local/lib Align.o -o ../blib/arch/auto/Bio/Ext/Align/Align.so libs/libsw.a \
   -lm \

/usr/bin/ld: libs/libsw.a(aln.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
libs/libsw.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[1]: *** [../blib/arch/auto/Bio/Ext/Align/Align.so] Error 1
make[1]: Leaving directory `/scratch/dmbio/tools/bioperl-ext/Bio/Ext/Align'
make: *** [subdirs] Error 2

This is on a Debian AMD64 box:

florin at zeus $ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release x86_64-linux-gnu Thread model: posix gcc version 4.1.2 20060901 (prerelease) (Debian 4.1.1-13) florin at zeus $ perl -V Summary of my perl5 (revision 5 version 8 subversion 8) configuration: Platform: osname=linux, osvers=2.6.16-1-vserver-amd64-k8, archname=x86_64-linux-gnu-thread-multi uname='linux excelsior 2.6.16-1-vserver-amd64-k8 #2 smp tue apr 4 03:40:49 utc 2006 x86_64 gnulinux ' config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.8 -Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=define use64bitall=define uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2', cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe 
-I/usr/local/include' ccversion='', gccversion='4.1.2 20060729 (prerelease) (Debian 4.1.1-10)', gccosandvers='' intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16 ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='cc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt perllibs=-ldl -lm -lpthread -lc -lcrypt libc=/lib/libc-2.3.6.so, so=so, useshrplib=true, libperl=libperl.so.5.8.8 gnulibc_version='2.3.6' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_64_BIT_ALL USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES USE_PERLIO USE_REENTRANT_API The compiler command line for aln.o is lacking -fPIC: cc -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DPOSIX -DNOERROR -c -o aln.o aln.c Adding -fPIC to the CCFLAGS variable in Bio/Ext/Align/Makefile and Makefile seems to take build further, but it fails with a similar error in Bio/SeqIO/staden/_Inline/build/Bio/SeqIO/staden/read. That Makefile seems to be regenerated every time I run 'make test' in the top level directory. 
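The perl -V output above reports cccdlflags='-fPIC', while the aln.o command line does not carry that flag. A minimal sketch, assuming only the core Config module, of how one might compare a compiler command line against the flags this perl expects for code that ends up in a shared object. The helper name missing_flags is hypothetical and is not part of bioperl-ext or MakeMaker.

```perl
use strict;
use warnings;
use Config;

# Show the build settings that matter for shared objects on this perl.
for my $key (qw(cc ccflags cccdlflags lddlflags)) {
    my $value = defined $Config{$key} ? $Config{$key} : '(undefined)';
    printf "%-12s %s\n", $key, $value;
}

# Hypothetical helper: return the flags from $required that do not
# appear as separate words in the given compiler command line.
sub missing_flags {
    my ($command, $required) = @_;
    $required = '' unless defined $required;
    my %present = map { $_ => 1 } split ' ', $command;
    return grep { !$present{$_} } split ' ', $required;
}

# The aln.o command from the build log, shortened for the example.
my $aln_cmd = 'cc -D_REENTRANT -D_GNU_SOURCE -DPOSIX -DNOERROR -c -o aln.o aln.c';
my @missing = missing_flags($aln_cmd, $Config{cccdlflags});
print "missing for aln.o: ", (@missing ? "@missing" : '(none)'), "\n";
```

On a perl built like the one shown, this would be expected to list -fPIC as missing; on platforms where cccdlflags is empty it reports nothing.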
The error in ../staden/read is:

rm -f blib/arch/auto/Bio/SeqIO/staden/read/read.so
cc -shared -L/usr/local/lib read.o -o blib/arch/auto/Bio/SeqIO/staden/read/read.so \
   -L/usr/local/lib -lread -lz \

/usr/bin/ld: /usr/local/lib/libread.a(libread_a-Read.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
/usr/local/lib/libread.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [blib/arch/auto/Bio/SeqIO/staden/read/read.so] Error 1

So, the questions appear to be:
- should "-fPIC" be appended to CFLAGS in the generated Makefiles?
- is there anything wrong with io_lib flags?
- has anybody built bioperl-ext on AMD64?

I can help with debugging or testing if given a gentle nudge in the right direction, but I have little experience with the interactions between perl and static libraries on 64 bit.

Thanks,
florin

--
If we wish to count lines of code, we should not regard them as lines produced but as lines spent.
-- Edsger Dijkstra

From bix at sendu.me.uk Mon Oct 2 06:52:47 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 11:52:47 +0100
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64
In-Reply-To: <20061002014007.GG12075@iucha.net>
References: <20061002014007.GG12075@iucha.net>
Message-ID: <4520EF7F.40908@sendu.me.uk>

Florin Iucha wrote:
> Hello,
>
> I am trying to install bioperl-network from CVS. I found this to
> require bioperl from CVS, which requires bioperl-ext from CVS.

I can't help with the compile problems you encountered (other than to say I also have problems under AMD64), but from where did you get the idea that bioperl (live/core) requires bioperl-ext? It doesn't, though recent changes to Makefile.PL may give that impression...
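For illustration, the ztr.t output quoted earlier in this archive ("Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ztr.t tests") is what an optional dependency looks like from the test side. A rough sketch of that kind of check follows; it is not the actual ztr.t code, and the helper name module_available is made up for the example.

```perl
use strict;
use warnings;

# Returns 1 if the named module can be loaded, 0 otherwise.
# The module name is turned into a file path so require can be
# called at run time without a string eval.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ('Bio::SeqIO::staden::read', 'File::Spec') {
    printf "%-28s %s\n", $module,
        module_available($module) ? 'available' : 'not installed, skipping';
}
```

A test written this way can skip its bioperl-ext-dependent checks while the rest of the core suite still runs.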
From cjfields at uiuc.edu Mon Oct 2 08:26:57 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 07:26:57 -0500 Subject: [Bioperl-l] Tests involving remote databases In-Reply-To: <4520DE72.4000603@sendu.me.uk> References: <000001c6e3e6$81630010$15327e82@pyrimidine> <6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net> <79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu> <451E3707.4090400@sendu.me.uk> <0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu> <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net> <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu> <4520DE72.4000603@sendu.me.uk> Message-ID: On Oct 2, 2006, at 4:40 AM, Sendu Bala wrote: > Chris Fields wrote: >> >> The idea is to retain current behavior (remote DB access will not be >> run unless BIOPERLDEBUG is set to 1) and apply it to all tests >> requiring such access. Otherwise, just those tests are skipped (and >> not the rest of the tests, which occurs currently). If BIOPERLDEBUG >> is set, the next tests would check the URL, which passes/fails (based >> on the specific value of $@), and runs/skips tests based on the mere >> presence of $@, which indicates some URL issue. You can do this with >> Test::More, but I'm not sure this can be done with Test.pm or >> Test::Simple. > > Firstly, BIOPERLDEBUG should not be abused; it should be used only > when > you want to see extra debugging messages. There should be another > variable that you can set to choose if network-requiring tests are > run, > and it should also be a configurable choice when you run perl > Makefile.PL. > > (But changing this isn't going to happen for 1.5.2) > > When the server problem is ambiguous we should not fail the test. Just > make the skip message visible and pass all ok... I agree, as well as with your assessment of BIOPERLDEBUG (which I alluded to in a previous post). Torsten suggested creating a new env. variable for network tests. It's obvious this won't be done before 1.5.2, but we can make plans towards the next release. 
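
The per-test skipping discussed here can be sketched with a Test::More SKIP block. This is a hedged illustration only: the environment variable name BIOPERL_NETWORK_TESTS and the helper fetch_remote() are made up for the example (the thread agrees a separate variable is wanted but does not name one), and the helper stands in for a real remote database call.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical switch for network tests, kept separate from BIOPERLDEBUG.
my $want_network = $ENV{BIOPERL_NETWORK_TESTS};

# Tests that need no network always run.
ok(1, 'local test');

SKIP: {
    # Only the two remote tests are skipped, not the whole script.
    skip('Skip network tests not requested', 2) unless $want_network;

    # eval traps an exception from the remote call and leaves it in $@.
    my $seq = eval { fetch_remote('J00522') };
    skip("Skip server may be down: $@", 2) if $@;

    ok(defined $seq, 'remote fetch returned something');
    is($seq, 'ACGT', 'remote fetch returned the expected data');
}

# Hypothetical stand-in for a remote query; a real test would contact a server.
sub fetch_remote {
    my ($id) = @_;
    return 'ACGT';
}
```

With the variable unset, the script reports one passing test and two skipped ones, so an unreachable server never turns into a failed install.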
>> The current behavior just skips all tests based on a single failed
>> URL. Then, Test::Harness, as currently set, shows skipped tests as
>> passed. The last run I posted previously where XEMBL_DB.t remote DB
>> tests failed, I also ran all tests (make test) and get this, which
>> doesn't tell us that the remote URL failed:
>>
>> -----------------------------------------
>>
>> ...
>> t/WABA.......................ok
>> t/XEMBL_DB...................ok
>> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext
>> is not installed or is installed incorrectly - skipping ztr.t tests
>> ok
>> All tests successful, 5 subtests skipped.
>
> All you have to do to make it visible is start the skip message
> with the word 'Skip':
>
> skip('Skip server may be down',1);
>
> ...
> t/WABA.......................ok
> t/XEMBL_DB...................ok
>         1/9 skipped: server may be down
> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is
> not installed or is installed incorrectly - skipping ztr.t tests
> t/ztr........................ok
>
> It's nicer when using Test::More.

Okay, if Test::Harness picks that up it would be okay. We could use
skip blocks to skip subsets of tests that require remote access (like
SeqFeature.t) as opposed to skipping all tests.

I think we want to avoid promoting running tests with BIOPERLDEBUG (or
similar) for everyday installation anyway (such as from CPAN, which
Hilmar points out). It's not something everybody installing a new
BioPerl should be running unless they run into problems.

Christopher Fields
Postdoctoral Researcher
Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From florin at iucha.net Mon Oct 2 08:15:06 2006 From: florin at iucha.net (Florin Iucha) Date: Mon, 2 Oct 2006 07:15:06 -0500 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 In-Reply-To: <4520EF7F.40908@sendu.me.uk> References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk> Message-ID: <20061002121506.GB14409@iucha.net> On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote: > Florin Iucha wrote: > > I am trying to install bioperl-network from CVS. I found this to > > require bioperl from CVS, which requires bioperl-ext from CVS. > > I can't help with the compile problems you encountered (other than to > say I also have problems under AMD64), but from where did you get the > idea that bioperl (live/core) requires bioperl-ext? It doesn't, though > recent changes to Makefile.PL may give that impression... Running the tests for bioperl-live mention in some places that 'this test has been skipped since $foo is not available' and I found the 'foos' in bioperl-ext. florin -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: From bix at sendu.me.uk Mon Oct 2 10:05:11 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 02 Oct 2006 15:05:11 +0100 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 In-Reply-To: <20061002121506.GB14409@iucha.net> References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk> <20061002121506.GB14409@iucha.net> Message-ID: <45211C97.2060800@sendu.me.uk> Florin Iucha wrote: > On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote: >> Florin Iucha wrote: >>> I am trying to install bioperl-network from CVS. 
I found this to >>> require bioperl from CVS, which requires bioperl-ext from CVS. >> I can't help with the compile problems you encountered (other than to >> say I also have problems under AMD64), but from where did you get the >> idea that bioperl (live/core) requires bioperl-ext? It doesn't, though >> recent changes to Makefile.PL may give that impression... > > Running the tests for bioperl-live mention in some places that 'this > test has been skipped since $foo is not available' and I found the > 'foos' in bioperl-ext. Right, yes. The idea is, you'd only need to install bioperl-ext if you wanted to use the modules that the complaining tests test. So if none of the things that were skipped matter to you, don't install ext. I guess this needs to be clarified in documentation somewhere. From cjfields at uiuc.edu Mon Oct 2 10:13:56 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 09:13:56 -0500 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au> Message-ID: <001801c6e62d$02c883d0$15327e82@pyrimidine> > >>> I have removed all use/@ISA Bio::Root::Object references from > >>> bioperl-live, except for those in Bio::Root::* itself: > > >> So I'd say they're both relics that can be removed. In fact I was > >> planning on getting rid off all references to both of these modules > >> before you did, so thanks! :) > > > I think they can go. It's probably a pre-1.0 deprecation that somehow > > was never followed through on. > > Today I did a fresh CVS checkout of bioperl-live, and deleted the > following modules and tests, and all tests passed with BIOPERLDEBUG=0 > > * Bio::Root::Err > * Bio::Root::Global > * Bio::Root::IOManager > * Bio::Root::Object > * Bio::Root::Storable > * Bio::Root::Utilities # may be used by third parties? 
> * Bio::Root::Vector > * Bio::Root::Xref > * t/Root-Utilities.t # need to keep if we keep Utilities.pm > * t/RootStorable.t > > Should we schedule for deprecation, or deprecate immediately as Hilmar > suggested they were meant to be deprecated long ago ? I vote for quick deprecation; I had also noticed that these were superfluous and added them as possible deprecations to the wiki page. However, we need to be careful about that 'third-party use' caveat you have for Bio::Root::Utilities; there's another one with Bio::Root::Storable and Ensembl: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2924/focus=2924 and it seems to have it's users: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/8242/focus=8242 The others (including Bio::Root::Utilities) haven't had any major threads on the mail lists in a very long time. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -- > Dr Torsten Seemann http://www.vicbioinformatics.com > Victorian Bioinformatics Consortium, Monash University, Australia > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Mon Oct 2 10:16:31 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 09:16:31 -0500 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-exton AMD64 In-Reply-To: <20061002121506.GB14409@iucha.net> Message-ID: <001901c6e62d$5c4fac80$15327e82@pyrimidine> They're not absolutely necessary; the tests are skipped w/o failure because bioperl-ext is optional. These are only necessary if you want the ability to read sequence trace files. BTW, you might have a rough time on trying to install bioperl-ext depending on your platform. Note the following bug report: http://bugzilla.open-bio.org/show_bug.cgi?id=2074 Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Florin Iucha > Sent: Monday, October 02, 2006 7:15 AM > To: bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Failure to compile the CVS snapshot of bioperl- > exton AMD64 > > On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote: > > Florin Iucha wrote: > > > I am trying to install bioperl-network from CVS. I found this to > > > require bioperl from CVS, which requires bioperl-ext from CVS. > > > > I can't help with the compile problems you encountered (other than to > > say I also have problems under AMD64), but from where did you get the > > idea that bioperl (live/core) requires bioperl-ext? It doesn't, though > > recent changes to Makefile.PL may give that impression... > > Running the tests for bioperl-live mention in some places that 'this > test has been skipped since $foo is not available' and I found the > 'foos' in bioperl-ext. > > florin > > -- > If we wish to count lines of code, we should not regard them as lines > produced but as lines spent. -- Edsger Dijkstra From osborne1 at optonline.net Mon Oct 2 10:14:13 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Mon, 02 Oct 2006 10:14:13 -0400 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: <4520E20F.6040406@sendu.me.uk> Message-ID: Sendu, No objection but someone should check the scripts in examples/root to make sure that they are not used there. Brian O. On 10/2/06 5:55 AM, "Sendu Bala" wrote: > Torsten Seemann wrote: >>>>> I have removed all use/@ISA Bio::Root::Object references from >>>>> bioperl-live, except for those in Bio::Root::* itself: >> >>>> So I'd say they're both relics that can be removed. In fact I was >>>> planning on getting rid off all references to both of these modules >>>> before you did, so thanks! :) >> >>> I think they can go. 
It's probably a pre-1.0 deprecation that somehow >>> was never followed through on. >> >> Today I did a fresh CVS checkout of bioperl-live, and deleted the >> following modules and tests, and all tests passed with BIOPERLDEBUG=0 >> >> * Bio::Root::Err >> * Bio::Root::Global >> * Bio::Root::IOManager >> * Bio::Root::Object >> * Bio::Root::Storable >> * Bio::Root::Utilities # may be used by third parties? >> * Bio::Root::Vector >> * Bio::Root::Xref >> * t/Root-Utilities.t # need to keep if we keep Utilities.pm >> * t/RootStorable.t >> >> Should we schedule for deprecation, or deprecate immediately as Hilmar >> suggested they were meant to be deprecated long ago ? > > I'm happy to get rid of them all straight away. Does anyone object? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From johnson.biotech at gmail.com Mon Oct 2 10:21:50 2006 From: johnson.biotech at gmail.com (Seth Johnson) Date: Mon, 2 Oct 2006 10:21:50 -0400 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: <000001c6e5dc$2eceabe0$15327e82@pyrimidine> References: <000001c6e5dc$2eceabe0$15327e82@pyrimidine> Message-ID: I'm using MySQL 5.0.19 and Perl v5.8.7 [MSWin32-x86-multi-thread] On 10/2/06, Chris Fields wrote: > > Seth, > > What version of MySQL and perl are you using? I'm using MySQL 5.0.18 (but > am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819. > > I ran into a few problems with bioperl-db tests which were unrelated the > ones below, but I'm wondering if it is a difference in MySQL versions. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. 
of Biochemistry > University of Illinois Urbana-Champaign > > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > bounces at lists.open-bio.org] On Behalf Of Seth Johnson > > Sent: Saturday, September 30, 2006 6:35 PM > > To: Hilmar Lapp > > Cc: Chris Fields; Bioperl List > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > Here're complete test details: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > ... > > > FAILED tests 10-12 > > Failed 3/12 tests, 75.00% okay > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > -------------------------------------------------------------------------- > > ----- > > t\02species.t 65 2 3.08% 63 65 > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > t\16obda.t 12 3 25.00% 10-12 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- Best Regards, Seth Johnson Senior Bioinformatics Associate Ph: (202) 470-0900 Fx: (775) 251-0358 From osborne1 at optonline.net Mon Oct 2 10:08:50 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Mon, 02 Oct 2006 10:08:50 -0400 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 In-Reply-To: <20061002014007.GG12075@iucha.net> Message-ID: Florian, Minor correction here, the Bioperl package does not require bioperl-ext. However we see there is a problem compiling bioperl-ext... Brian O. On 10/1/06 9:40 PM, "Florin Iucha" wrote: > I am trying to install bioperl-network from CVS. I found this to > require bioperl from CVS, which requires bioperl-ext from CVS. From JK at novozymes.com Mon Oct 2 10:05:34 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Mon, 2 Oct 2006 16:05:34 +0200 Subject: [Bioperl-l] Blast parser. 
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net> Hi. I've tried to use the blast-parser but I cannot get the original alignment out of the parser. Is it possible to get that out of the Bio::Search::HSP::BlastHSP in some way. It seems quite odd to get a clustalw alignment out when it isn't that type of alignment people are used to get from blast. Thanks Jesper From cjfields at uiuc.edu Mon Oct 2 10:36:31 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 09:36:31 -0500 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: Message-ID: <001d01c6e630$27792fb0$15327e82@pyrimidine> > Sendu, > > No objection but someone should check the scripts in examples/root to make > sure that they are not used there. > > Brian O. I suppose it's also possible that the other bioperl distributions (like bioperl-run) could use them as well. If they do we can take care of them as they pop up. These are really old and haven't been revised in a long time. The only one I worry about is Bio::Root::Storable b/c of Ensembl. Does anyone know where Will Spooner is? He's the maintainer for Bio::Root::Storable. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 2 11:01:44 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 10:01:44 -0500 Subject: [Bioperl-l] Blast parser. In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net> Message-ID: <000001c6e633$ad0a6ce0$15327e82@pyrimidine> The alignment that you get should come from GenericHSP, not BLASTHSP. Either way, the HSP alignment that is retrieved using $hsp->get_aln() should be a Bio::SimpleAlign object. You can then output that to the proper AlignIO format using an AlignIO stream object or use the Bio::SimpleAlign methods for further analysis. 
my $aln = $hsp->get_aln();
my $alnout = Bio::AlignIO->new(-format => 'msf',
                               -fh     => \*STDOUT);
$alnout->write_aln($aln);

Quick note: not all AlignIO formats have write_aln() support at this
time, but most do.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of JK (Jesper Agerbo Krogh)
> Sent: Monday, October 02, 2006 9:06 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Blast parser.
>
>
> Hi.
>
> I've tried to use the blast-parser but I cannot get the original alignment
> out of the parser. Is it possible to get that out of the
> Bio::Search::HSP::BlastHSP in some way. It seems quite odd to get a
> clustalw alignment out when it isn't that type of alignment people are
> used to get from blast.
>
> Thanks
>
> Jesper
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From whs at ebi.ac.uk Mon Oct 2 12:00:19 2006
From: whs at ebi.ac.uk (Will Spooner)
Date: Mon, 2 Oct 2006 17:00:19 +0100 (BST)
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <001d01c6e630$27792fb0$15327e82@pyrimidine>
References: <001d01c6e630$27792fb0$15327e82@pyrimidine>
Message-ID:

On Mon, 2 Oct 2006, Chris Fields wrote:

>> Sendu,
>>
>> No objection but someone should check the scripts in examples/root to make
>> sure that they are not used there.
>>
>> Brian O.
>
> I suppose it's also possible that the other bioperl distributions (like
> bioperl-run) could use them as well.
>
> If they do we can take care of them as they pop up. These are really old
> and haven't been revised in a long time.
>
> The only one I worry about is Bio::Root::Storable b/c of Ensembl. Does
> anyone know where Will Spooner is?
He's the maintainer for > Bio::Root::Storable. > Hi Chris, I'm still lurking... If the tests for Bio::Root::Storable still pass (I assume that they do), then the module is working as advertised. The idea behind Storable is very simple; object instances of any inhereting class can be serialised/retrieved from disk. BioPerl objects will probably not want this functionality by default, but it is trival to implement if needed. Will From cjfields at uiuc.edu Mon Oct 2 13:58:15 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 12:58:15 -0500 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: Message-ID: <000601c6e64c$5746f990$15327e82@pyrimidine> > On Mon, 2 Oct 2006, Chris Fields wrote: > > >> Sendu, > >> > >> No objection but someone should check the scripts in examples/root to > make > >> sure that they are not used there. > >> > >> Brian O. > > > > I suppose it's also possible that the other bioperl distributions (like > > bioperl-run) could use them as well. > > > > If they do we can take care of them as they pop up. These are really > old > > and haven't been revised in a long time. > > > > The only one I worry about is Bio::Root::Storable b/c of Ensembl. Does > > anyone know where Will Spooner is? He's the maintainer for > > Bio::Root::Storable. > > > > Hi Chris, > > I'm still lurking... > > If the tests for Bio::Root::Storable still pass (I assume that they do), > then the module is working as advertised. > > The idea behind Storable is very simple; object instances of any > inhereting class can be serialised/retrieved from disk. BioPerl objects > will probably not want this functionality by default, but it is trival to > implement if needed. > > Will Okay, nice to know you're listening in! Based on that we should keep it in. The rest that Torsten mentioned could probably be removed right away. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry
University of Illinois Urbana-Champaign

From osborne1 at optonline.net Mon Oct 2 13:59:58 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 13:59:58 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset
In-Reply-To: <20061002002403.GD12075@iucha.net>
Message-ID:

Florin,

OK, this is fixed in CVS now. The problem is that there's some
variability in how the PSI MI "standard" is used. In this case there
was a species that was not given a value for its scientific name
("fullName"), so I had to use the common name in its place. Fortunately
there's an NCBI taxon id behind all this.

Thanks again,

Brian O.

On 10/1/06 8:24 PM, "Florin Iucha" wrote:

> Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the
> MINT [1] database does not produce the crash. It has a new warning, however:
>
> Can't call method "text" on an undefined value at
> /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290.

From mmacho at gmail.com Mon Oct 2 13:43:13 2006
From: mmacho at gmail.com (ende)
Date: Mon, 2 Oct 2006 19:43:13 +0200
Subject: [Bioperl-l] Variable scope
Message-ID: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>

Hi,

this may be a general Perl topic and so off-topic for this list; my
apologies for any inconvenience. It is an annoying problem that is
making me waste a lot of time.

I have a package with its new object, etc., and constants in it like:

#-----
use constant False => 0;
use constant True => 1;

our %CLRFG = (
    PLASMIDO => RED,
    POLY_A => GREEN,
    RESTR_SITES => BLUE,
    CONECTORS => MAGENTA,
    CONTAMINANTS => CYAN,
);

our %CLRBG = (
    PLASMIDO => "",
    POLY_A => "",
    RESTR_SITES => "",
    CONECTORS => "",
    CONTAMINANTS => "",
);
#------

These constants are included with require "h.pl" from the main package
file.

I use this module from the main command line driver to test it,
"using" it. In the command line driver I can use the constants False
and True directly with no complaint, for example "return True", without
any reference to the origin of that constant.

But with respect to the variables %CLRFG and %CLRBG (I would like them
to be constants too, but how?), I can't find a way of referring to them
in the module. Finally I gave up and _copied_ the definitions to where
I needed them (in the sub where I print ANSI terminal colouring
sequences...). I can't find how to refer to those variables from
outside the module.

I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Any help?

--
Juan Falgueras
Profesor del Depto. de Lenguajes y Ciencias de la Computación
Universidad de Málaga

From cjfields at uiuc.edu Mon Oct 2 16:52:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 15:52:11 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To:
Message-ID: <000001c6e664$a25538d0$15327e82@pyrimidine>

I have updated the Deprecation page with the Bio::Root::* modules that
we plan on deprecating (note that I have them being removed for rel.
1.5.2). I have left out Bio::Root::Storable for now based on Will's
response.

http://www.bioperl.org/wiki/Deprecated_modules

I'll update the DEPRECATED doc in CVS as well. There is a tentative
schedule for when warnings are added for modules before they are
removed.

In relation to the recent trend for house-cleaning, I noticed that the
Bio::Tools::BP* BLAST-related modules are all still present but haven't
been modified or had deprecation warnings added. BPlite was marked for
deprecation around rel 1.5, as were the others, since the functionality
is present in Bio::SearchIO. Judging by the mail list, no one has used
these in quite a while, and everyone has been redirected to use
Bio::SearchIO instead. Based on that I have added warnings in CVS for
deprecation to BPlite and the related modules BPpsilite and BPbl2seq.
Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Brian Osborne > Sent: Monday, October 02, 2006 9:14 AM > To: Sendu Bala; bioperl-l > Subject: Re: [Bioperl-l] Do we need Bio::Root::Object anymore? > > Sendu, > > No objection but someone should check the scripts in examples/root to make > sure that they are not used there. > > Brian O. > > > On 10/2/06 5:55 AM, "Sendu Bala" wrote: > > > Torsten Seemann wrote: > >>>>> I have removed all use/@ISA Bio::Root::Object references from > >>>>> bioperl-live, except for those in Bio::Root::* itself: > >> > >>>> So I'd say they're both relics that can be removed. In fact I was > >>>> planning on getting rid off all references to both of these modules > >>>> before you did, so thanks! :) > >> > >>> I think they can go. It's probably a pre-1.0 deprecation that somehow > >>> was never followed through on. > >> > >> Today I did a fresh CVS checkout of bioperl-live, and deleted the > >> following modules and tests, and all tests passed with BIOPERLDEBUG=0 > >> > >> * Bio::Root::Err > >> * Bio::Root::Global > >> * Bio::Root::IOManager > >> * Bio::Root::Object > >> * Bio::Root::Storable > >> * Bio::Root::Utilities # may be used by third parties? > >> * Bio::Root::Vector > >> * Bio::Root::Xref > >> * t/Root-Utilities.t # need to keep if we keep Utilities.pm > >> * t/RootStorable.t > >> > >> Should we schedule for deprecation, or deprecate immediately as Hilmar > >> suggested they were meant to be deprecated long ago ? > > > > I'm happy to get rid of them all straight away. Does anyone object? 
> > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From florin at iucha.net Mon Oct 2 16:47:01 2006 From: florin at iucha.net (Florin Iucha) Date: Mon, 2 Oct 2006 15:47:01 -0500 Subject: [Bioperl-l] Variable scope In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> Message-ID: <20061002204701.GG14409@iucha.net> On Mon, Oct 02, 2006 at 07:43:13PM +0200, ende wrote: > It is a annoying problem that is making me waste lot of time. > > I have a package with its new object, etc... and constants in it like: > > #----- > use constant False => 0; > use constant True => 1; > > our %CLRFG = ( > PLASMIDO => RED, > POLY_A => GREEN, > RESTR_SITES => BLUE, > CONECTORS => MAGENTA, > CONTAMINANTS => CYAN, > ); > > our %CLRBG = ( > PLASMIDO => "", > POLY_A => "", > RESTR_SITES => "", > CONECTORS => "", > CONTAMINANTS => "", > ); > #------ > > this constants are include with require "h.pl" from the main package > file. > > I use this module from the mail command line driver to test it > "using" it. In the command line driver I can use with no gripe the > constants False and True directly, for example "return True", etc > without any reference to the origin of that constant. It is possible you get them from somewhere else. > But, with respect to the variables (I would like they also were > constants.. but how?), %CLRFG and %CLRBG I can't find the way of > refering those int the module. Finally I have desisted and _copy_ > the definitions where I have needed it (in the sub were I print Ansi > terminal colouring seqs...). I don't find how to refer those > variables out of the module. 
>
> I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Did you actually declare a package name in "h.pl"? Is there any
reason you don't call the file ".pm" and load it with "use"?

I have attached a small example of importing that works.

florin

--
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent. -- Edsger Dijkstra
-------------- next part --------------
A non-text attachment was scrubbed...
Name: one.pm
Type: text/x-perl
Size: 118 bytes
Desc: not available
-------------- next part --------------
A non-text attachment was scrubbed...
Name: two.pl
Type: text/x-perl
Size: 69 bytes
Desc: not available

From Kevin.M.Brown at asu.edu Mon Oct 2 19:44:50 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 2 Oct 2006 16:44:50 -0700
Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module
Message-ID: <1A4207F8295607498283FE9E93B775B4021960CD@EX02.asurite.ad.asu.edu>

Well, for anyone that wants to know, I found a way to capture the
output of ClustalW to get at things like the score.

Copy STDOUT to another handle

open(OUTCOPY, ">&STDOUT") or die "Couldn't dup STDOUT: $!";

Change where STDOUT goes

open(STDOUT, ">log.test") or die "Couldn't open log.test: $!";

Run the alignment and its output will be captured by the STDOUT
redirection

$aln = $factory->align(\@seq);

Restore STDOUT to its normal location for the rest of the script

close STDOUT;
open(STDOUT, ">&OUTCOPY");

I guess I can understand why most of this is just dropped by the
ClustalW.pm module since there doesn't seem to be a way to hold it all
in a SimpleAlign object.
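
The redirection steps above can be tried without ClustalW installed. The sketch below uses only core Perl: a child perl process stands in for the external program, and both the log file name and the "Alignment Score" line it prints are made-up placeholders, not actual ClustalW output. In a real script the system() call would be replaced by the $factory->align(\@seq) call.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $log = 'clustalw_stdout.log';    # hypothetical log file name

# Duplicate the current STDOUT so it can be restored later.
open(my $saved_out, '>&', \*STDOUT) or die "Couldn't dup STDOUT: $!";

# Re-point STDOUT at the log file; child processes inherit the redirection.
open(STDOUT, '>', $log) or die "Couldn't open $log: $!";

# Stand-in for the external program run by the alignment factory.
system($^X, '-e', 'print "Alignment Score 1234\n"') == 0
    or die "child process failed: $?";

# Restore STDOUT for the rest of the script.
close(STDOUT) or die "Couldn't close redirected STDOUT: $!";
open(STDOUT, '>&', $saved_out) or die "Couldn't restore STDOUT: $!";

# Read the captured text back and pick out the score.
open(my $in, '<', $log) or die "Couldn't read $log: $!";
my ($score) = map { /Score\s+(\d+)/ ? $1 : () } <$in>;
close($in);
unlink $log;

print "captured score: $score\n";
```

Using a lexical handle for the saved copy and checking every open/close are small departures from the original steps; the mechanism is the same.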
> -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Kevin Brown > Sent: Thursday, September 28, 2006 2:48 PM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module > > I've gotten a very simple script to run using bioperl that creates an > alignment using clustalw of two sequences. I see that clustal outputs > to stdout information like the score, but I don't see any way to store > that or retrieve that from the alignment object that is > returned (unless > I'm just blind). What follows is my very basic script which used code > found in the Wiki. > > print $aln->score() spits out an error about using an uninitialized > value. > > > #!/usr/bin/perl -w > > use strict; > use Bio::SeqIO; > use Bio::Perl; > use Bio::AlignIO; > use Getopt::Long qw(:config no_ignore_case bundling pass_through); > use POSIX; > use Bio::Tools::Run::Alignment::Clustalw; > > my $fileName = ""; # filename(s) to be parsed for > information > my $output_dir = ""; > my $format = 'fasta'; # default format for SeqIO module > > GetOptions( > 'file=s' => \$fileName, > 'output=s' => \$output_dir, > ); > > # Parse the input file for the needed information > # SeqIO supports several normal formats including , and > > > my @files = split(/\|/, $fileName); > my @seq_array; > > my $stream_out = > Bio::AlignIO->new(-file => '>test.msf', -format => 'msf', -flush => > 0); > > foreach my $fileName (@files) > { > my $file = Bio::SeqIO->new(-format => $format, -file => > $fileName); > my $seq; > while ($seq = $file->next_seq()) > { > push(@seq_array, $seq); > } > } > > my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM'); > my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params); > my $ktuple = 3; > $factory->ktuple($ktuple); # change the parameter before executing > # where @seq_array is an array of {{PM|Bio::Seq}} objects > > open my $out, ">seq.txt"; > > for (my $i = 
1 ; $i <= $#seq_array ; $i++)
> {
>     my @seq = ($seq_array[0], $seq_array[$i]);
>     my $aln = $factory->align(\@seq);
>     $stream_out->write_aln($aln);
>     print $aln->score;
>     for my $seq ($aln->each_seq) {
>         print $out $seq->display_id() ."\t". $seq->seq()."\n";
>     }
> }
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From bix at sendu.me.uk Mon Oct 2 19:48:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 00:48:34 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
Message-ID: <4521A552.60301@sendu.me.uk>

Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll
upload tar.gz files when I have access to the server, then reply here
with links.

In the meantime, see http://www.bioperl.org/wiki/Release_1.5.2 for
instructions on getting and testing this RC.

Developers:
Make sure you're in the AUTHORS file in all 4 packages, as appropriate.

Users:
Even though 1.5.2 is a 'developer' release, we consider it the most
stable and capable version of Bioperl, and recommend that you use it in
all but the most critical production environments. Please try it out
and let us know of any problems or difficulties you run into.

Thank you,
Sendu.

From lincoln.stein at gmail.com Mon Oct 2 17:53:38 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 2 Oct 2006 21:53:38 +0000
Subject: [Bioperl-l] Variable scope
In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
Message-ID: <6dce9a0b0610021453va2132c7u73747b9253211a66@mail.gmail.com>

Hi,

Read the documentation for Exporter. It is much better to formally
export constants, variables and functions and to import them with "use"
than to use "require". Also be sure that you understand how namespaces
and modules work.

This is not a BioPerl topic and should have been directed to a general
Perl discussion list, such as Perl Monks.
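
For anyone following along, here is a minimal sketch of the "export formally, import with use" approach. It rests on some assumptions: the module name My::Colors is hypothetical, the colour values are placeholder strings rather than the RED/GREEN constants from the original question, and the package is inlined in a BEGIN block only so that the example runs as a single file. In a real project the package would live in its own .pm file and be loaded with "use".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical module, inlined so the sketch runs as one file.
# Normally this would be the contents of My/Colors.pm.
BEGIN {
    package My::Colors;
    use Exporter 'import';                  # borrow Exporter's import()
    our @EXPORT_OK = qw(%CLRFG True False); # names offered on request
    use constant { False => 0, True => 1 };
    our %CLRFG = (
        PLASMIDO => 'red',                  # placeholder values
        POLY_A   => 'green',
    );
}

# With a separate file, this line would simply read:
#     use My::Colors qw(%CLRFG True);
BEGIN { My::Colors->import(qw(%CLRFG True)) }

# The imported hash and constant are now visible in this package...
print "POLY_A is $CLRFG{POLY_A}\n" if True;

# ...and the fully qualified name keeps working as well.
print "PLASMIDO is $My::Colors::CLRFG{PLASMIDO}\n";
```

The variables are declared with "our" so that they are package variables; a "my" variable in the module would be invisible to both Exporter and the fully qualified name.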
Lincoln

On 10/2/06, ende wrote:
>
> Hi
>
> This may be a general Perl question and so off-topic for this list.
> My apologies for any inconvenience.
>
> It is an annoying problem that is making me waste a lot of time.
>
> I have a package with its new object, etc... and constants in it like:
>
> #-----
> use constant False => 0;
> use constant True => 1;
>
> our %CLRFG = (
>     PLASMIDO => RED,
>     POLY_A => GREEN,
>     RESTR_SITES => BLUE,
>     CONECTORS => MAGENTA,
>     CONTAMINANTS => CYAN,
> );
>
> our %CLRBG = (
>     PLASMIDO => "",
>     POLY_A => "",
>     RESTR_SITES => "",
>     CONECTORS => "",
>     CONTAMINANTS => "",
> );
> #------
>
> These constants are included with require "h.pl" from the main package
> file.
>
> I use this module from the main command-line driver to test it
> ("using" it). In the command-line driver I can use the constants False
> and True directly without complaint, for example "return True", without
> any reference to the origin of those constants.
>
> But with respect to the variables %CLRFG and %CLRBG (I would like them
> to be constants too, but how?), I can't find a way of referring to them
> in the module. In the end I gave up and _copied_ the definitions to
> where I needed them (in the sub where I print ANSI terminal colouring
> sequences). I can't find how to refer to those variables from outside
> the module.
>
> I have tried %modulename::CLRFG, for example, but Perl gives me errors.
>
> Any help?
>
> --
> Juan Falgueras
> Profesor del Depto. de Lenguajes y Ciencias de la Computación
> Universidad de Málaga
>

--
Lincoln D.
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From florin at iucha.net Mon Oct 2 22:30:31 2006 From: florin at iucha.net (Florin Iucha) Date: Mon, 2 Oct 2006 21:30:31 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <20061003023031.GI14409@iucha.net> On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. > > In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. [I won't create a wiki account just to report this.] Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG not set. Lots of warnings about missing packages and all, but this looks interesting: Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. Otherwise: Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay. The failed test is: t/ESEfinder..................dubious Test returned status 255 (wstat 65280, 0xff00) DIED. FAILED test 15 florin -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra From cjfields at uiuc.edu Mon Oct 2 23:50:47 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 22:50:47 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu> So far all tests pass on Mac OS X. I'll add this to the release page. 
This RC will throw warnings for four tests I didn't remove in time (BPlite.t, BPpsilite, BPbl2seq.t, and RestrictionEnzyme.t), which correspond to their namesake deprecated Bio::Tools modules. These are no longer in CVS HEAD so should be gone by the next RC, and the relevant modules marked for deprecation. I can verify the Bio::DB::SeqFeature.t warning on Mac OS X that Florin reported, but ESEFinder.t works fine: t/BioDBSeqFeature............Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. ok .... I'll report WinXP tests tomorrow on the wiki. Chris On Oct 2, 2006, at 6:48 PM, Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. > > In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > Make sure you're in the AUTHORS file in all 4 packages, as > appropriate. > > Users: > Even though 1.5.2 is a 'developer' release, we consider it the most > stable and capable version of Bioperl, and recommend that you use > it in all but the most critical production environments. Please > try it out and let us know of any problems or difficulties you run > into. > > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 2 23:54:29 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 22:54:29 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <20061003023031.GI14409@iucha.net> References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> Message-ID: > [I won't create a wiki account just to report this.] 
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG > not set. Lots of warnings about missing packages and all, but this > looks interesting: > > Argument "+" isn't numeric in numeric lt (<) at Bio/DB/ > SeqFeature/Segment.pm line 423. This is verified on Mac OS X. > Otherwise: > > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, > 99.99% okay. > > The failed test is: > > t/ESEfinder..................dubious > Test returned status 255 (wstat 65280, 0xff00) > DIED. FAILED test 15 What do you get when you run that set of tests using 'perl -I. -w t/ ESEFinder.t'? The bad status code is odd and could be a remote server issue. Chris > > florin > > -- > If we wish to count lines of code, we should not regard them as lines > produced but as lines spent. -- Edsger Dijkstra > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From torsten.seemann at infotech.monash.edu.au Tue Oct 3 00:30:06 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Tue, 03 Oct 2006 14:30:06 +1000 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm Message-ID: <4521E74E.1040404@infotech.monash.edu.au> My understanding is that all Bioperl-compliant classes should inherit from Bio::Root::Root, not Bio::Root::RootI. Additionally, if functions such as throw() or _rearrange() are to be used without a class instance reference, they are to be used as class methods via Bio::Root::Root, not Bio::Root::RootI. Is this correct? 
My naive audit of bioperl-live CVS brought up the following statistics:

# Root.pm
/cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l
26
/cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l
346

# RootI.pm
/cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l
9
/cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l
79

My guess would be that all RootI should be changed to plain Root?

Any help appreciated,

--
Dr Torsten Seemann http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia

From jason at bioperl.org Tue Oct 3 02:03:17 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 2 Oct 2006 23:03:17 -0700
Subject: [Bioperl-l] t/ESEFinder.t fixed on branch
Message-ID: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>

Looks like good work everyone.

All tests pass for me on OSX perl 5.8.6, with and without
BIOPERLDEBUG=1, with RC1 except for the t/ESEFinder problem which I've
fixed.

It skipped too few tests when BIOPERLDEBUG=0.

Don't forget to merge branch changes back to head for this test when it
is done. I don't want to muddy the water so I'm holding off migrating
the changes to main trunk as the file is substantially different (I
presume pre-Test::More adoption?).

-jason

From bix at sendu.me.uk Tue Oct 3 03:28:48 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:28:48 +0100
Subject: [Bioperl-l] t/ESEFinder.t fixed on branch
In-Reply-To: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>
References: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>
Message-ID: <45221130.2060405@sendu.me.uk>

Jason Stajich wrote:
> Looks like good work everyone.
>
> All tests pass for me on OSX perl 5.8.6. w and w/o BIOPERLDEBUG=1
> with RC1 except for the t/ESEFinder problem which I've fixed.
>
> It skipped too few tests when BIOPERLDEBUG=0.
>
> Don't forget to merge branch changes back to head for this test when
> it is done.
I don't want to muddy water so I'm holding off > migrating the changes to main trunk as the files is substantially > different (I presume pre-Test::More adoption?). Actually, it was the same until Torsten made his own (different) fixes to HEAD but not to branch. It was my mistake and I've corrected in yet a third way, and now branch and HEAD match. No harm done :) From bix at sendu.me.uk Tue Oct 3 03:31:10 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 08:31:10 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu> References: <4521A552.60301@sendu.me.uk> <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu> Message-ID: <452211BE.6080107@sendu.me.uk> Chris Fields wrote: > So far all tests pass on Mac OS X. I'll add this to the release page. > > This RC will throw warnings for four tests I didn't remove in time > (BPlite.t, BPpsilite, BPbl2seq.t, and RestrictionEnzyme.t), which > correspond to their namesake deprecated Bio::Tools modules. These > are no longer in CVS HEAD so should be gone by the next RC, and the > relevant modules marked for deprecation. Thanks Chris. Sorry I missed these. From bix at sendu.me.uk Tue Oct 3 03:32:08 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 08:32:08 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <20061003023031.GI14409@iucha.net> References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> Message-ID: <452211F8.8040104@sendu.me.uk> Florin Iucha wrote: > On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote: >> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll >> upload tar.gz files when I have access to the server, then reply here >> with links. >> >> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for >> instructions on getting and testing this RC. > > [I won't create a wiki account just to report this.] 
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG > not set. Lots of warnings about missing packages and all, but this > looks interesting: > > Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. > > Otherwise: > > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay. > > The failed test is: > > t/ESEfinder..................dubious > Test returned status 255 (wstat 65280, 0xff00) > DIED. FAILED test 15 Thanks for your feedback Florin. The ESEfinder fail will be fixed in the next RC. From bix at sendu.me.uk Tue Oct 3 04:29:37 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 09:29:37 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <45221F71.40206@sendu.me.uk> Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. 
Live/core: http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.gz http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.bz2 http://bioperl.org/DIST/bioperl-1.5.2-RC1.zip Run: http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.gz http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.bz2 http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.zip DB: http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.gz http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.bz2 http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.zip Network: http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.gz http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.bz2 http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.zip Md5 checksums are in: http://bioperl.org/DIST/SIGNATURES.md5 From jason at bioperl.org Tue Oct 3 02:11:30 2006 From: jason at bioperl.org (Jason Stajich) Date: Mon, 2 Oct 2006 23:11:30 -0700 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm Message-ID: <87F9B64E-8BDA-464B-814D-3F117AA646A1@bioperl.org> I only briefly saw your question - but RootI is for interfaces, Root.pm is for instantiated objects. From florin at iucha.net Tue Oct 3 07:39:12 2006 From: florin at iucha.net (Florin Iucha) Date: Tue, 3 Oct 2006 06:39:12 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> Message-ID: <20061003113912.GJ14409@iucha.net> On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote: > >Otherwise: > > > > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, > >99.99% okay. > > > >The failed test is: > > > > t/ESEfinder..................dubious > > Test returned status 255 (wstat 65280, 0xff00) > > DIED. FAILED test 15 $ perl -I. 
-w t/ESEfinder.t 1..15 ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder; ok 2 - use Data::Dumper; ok 3 - use Bio::PrimarySeq; ok 4 - use Bio::Seq; ok 5 ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test # Looks like you planned 15 tests but only ran 14. $ grep Id t/ESEfinder.t # $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $ florin -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra From hlapp at gmx.net Tue Oct 3 08:27:46 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 3 Oct 2006 08:27:46 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au> References: <4521E74E.1040404@infotech.monash.edu.au> Message-ID: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net> The interface classes (those ending in 'I') should actually inherit from RootI, not Root. In reality this recommendation is more theoretical than it makes that much of a difference I think. The motivation is that interface classes should not determine the actual implementation of a class (hash ref, array ref, whatever), and since Root.pm contains lots of implementation using a hash ref that decision will basically have been made. 
On the contrary though, RootI contains implementation too, although I'm not sure it would prescribe the object implementation as opposed to merely implementing static methods (like throw(), warn(), etc). That would need to be checked. -hilmar On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote: > My understanding is that all Bioperl-compliant classes should inherit > from Bio::Root::Root, not Bio::Root::RootI. > > Additionally, if functions such as throw() or _rearrange() are to be > used without a class instance reference, they are to be used as class > methods via Bio::Root::Root, not Bio::Root::RootI. > > Is this correct? > > My naive audit of bioperl-live CVS brought up the following > statistics: > > # Root.pm > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > 26 > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l > 346 > > # RootI.pm > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > 9 > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l > 79 > > My guess would be that all RootI should be changed to plain Root ? 
> > Any help appreciated, > > -- > Dr Torsten Seemann http://www.vicbioinformatics.com > Victorian Bioinformatics Consortium, Monash University, Australia > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 3 08:33:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 3 Oct 2006 07:33:37 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <20061003113912.GJ14409@iucha.net> References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> <20061003113912.GJ14409@iucha.net> Message-ID: <44724E16-74CD-4778-B04F-529475B47E37@uiuc.edu> Florin, Looks like this is fixed and should be working in the next release. Chris On Oct 3, 2006, at 6:39 AM, Florin Iucha wrote: > On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote: >>> Otherwise: >>> >>> Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, >>> 99.99% okay. >>> >>> The failed test is: >>> >>> t/ESEfinder..................dubious >>> Test returned status 255 (wstat 65280, 0xff00) >>> DIED. FAILED test 15 > > $ perl -I. 
-w t/ESEfinder.t > 1..15 > ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder; > ok 2 - use Data::Dumper; > ok 3 - use Bio::PrimarySeq; > ok 4 - use Bio::Seq; > ok 5 > ok 6 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 7 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 8 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 9 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 10 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 11 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 12 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 13 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 14 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > # Looks like you planned 15 tests but only ran 14. > $ grep Id t/ESEfinder.t > # $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $ > > florin > > -- > If we wish to count lines of code, we should not regard them as lines > produced but as lines spent. -- Edsger Dijkstra > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 3 10:29:51 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 3 Oct 2006 09:29:51 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net> Message-ID: <002101c6e6f8$67b4ae10$15327e82@pyrimidine> > The interface classes (those ending in 'I') should actually inherit > from RootI, not Root. 
>
> In reality this recommendation is more theoretical than it makes that
> much of a difference I think. The motivation is that interface
> classes should not determine the actual implementation of a class
> (hash ref, array ref, whatever), and since Root.pm contains lots of
> implementation using a hash ref that decision will basically have
> been made.
>
> On the contrary though, RootI contains implementation too, although
> I'm not sure it would prescribe the object implementation as opposed
> to merely implementing static methods (like throw(), warn(), etc).
> That would need to be checked.
>
> -hilmar

The constructor in Bio::Root::RootI lets one know that its use is
deprecated, so you shouldn't have any cases of 'our
qw(Bio::Root::RootI)'; there should be some way of inheriting Root
directly or indirectly. I would say that any direct use of RootI is not
good practice, though. For the current implementation we should only
inherit Bio::Root::Root, which implements RootI.

Is there any reason to shut off the warning with BIOPERLDEBUG?

From RootI:

    sub new {
        my $class = shift;
        my @args = @_;
        unless ( $ENV{'BIOPERLDEBUG'} ) {
            carp("Use of new in Bio::Root::RootI is deprecated. Please use Bio::Root::Root instead");
        }
        eval "require Bio::Root::Root";
        return Bio::Root::Root->new(@args);
    }

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

>
> On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote:
>
> > My understanding is that all Bioperl-compliant classes should inherit
> > from Bio::Root::Root, not Bio::Root::RootI.
> >
> > Additionally, if functions such as throw() or _rearrange() are to be
> > used without a class instance reference, they are to be used as class
> > methods via Bio::Root::Root, not Bio::Root::RootI.
> >
> > Is this correct?
> > > > My naive audit of bioperl-live CVS brought up the following > > statistics: > > > > # Root.pm > > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > > 26 > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l > > 346 > > > > # RootI.pm > > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > > 9 > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l > > 79 > > > > My guess would be that all RootI should be changed to plain Root ? > > > > Any help appreciated, > > > > -- > > Dr Torsten Seemann http://www.vicbioinformatics.com > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From slenk at emich.edu Tue Oct 3 13:31:47 2006 From: slenk at emich.edu (Stephen Gordon Lenk) Date: Tue, 03 Oct 2006 13:31:47 -0400 Subject: [Bioperl-l] Perl 6 has 'roles' - may be cleanly applicable to the Root/RootI issue Message-ID: <5147da5514e402.514e4025147da5@emich.edu> I looked at the Perl6 site, there is an RFC on interfaces: http://dev.perl.org/perl6/rfc/265.html Roles seem to be the Perl 6 answer to the Root/RootI issue in Bioperl. Maybe it is too early to suggest this. http://dev.perl.org/perl6/doc/design/apo/A12.html: The primary role of a class is to manage instances, that is, objects. So a class must worry about object creation and destruction, and everything that happens in between. 
Classes have a secondary role as units of software reuse, in that they can be inherited from or delegated to. However, because this is a secondary role, and because of weaknesses in models of inheritance, composition, and delegation, Perl 6 will split out the notion of software reuse into a separate class-like entity called a "role". Roles are an abstraction mechanism for use by classes that don't care about the secondary aspects of software reuse, or that (looking at it the other way) care so much about it that they want to encapsulate any decisions about implementation, composition, delegation, and maybe even inheritance. Sounds fancy, but just think of them as includes of partial classes, with some safety checks. Roles don't manage objects. They manage interfaces and other abstract behavior (like default implementations), and they help classes manage objects. As such, a role may only be composed into a class or into another role, never inherited from or delegated to. That's what classes are for. From slenk at emich.edu Tue Oct 3 12:45:15 2006 From: slenk at emich.edu (Stephen Gordon Lenk) Date: Tue, 03 Oct 2006 12:45:15 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm Message-ID: <5120d6a511f5a7.511f5a75120d6a@emich.edu> The separation of interface and implementation is generally regarded as a good idea. Right now the Bioperl community is doing this as part of the implementation of Bioperl. I suggest that this is an example of something which you might want to have as part of the Perl implementation. If Perl 6 (or even Perl 5) does not have this as a core part of the language or as a standard package (reusable by all in a common fashion), you may want to suggest to the Perl implementers that a way for interface/implementation distinctions be made part of the core language. My 2 cents, as you people are the experts on your own code. 
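
As a point of reference for the discussion above, the interface/implementation split is usually approximated in Perl 5 by convention rather than by a language feature. The following is only a minimal sketch under stated assumptions: all package names (My::ShapeI, My::Rect, My::Square, My::Broken) are hypothetical and are not Bioperl classes, and a plain die stands in for whatever exception mechanism real code would use. It shows an interface class that makes no assumption about object storage, one implementation backed by a hash ref and one backed by an array ref.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical interface: method names only, no assumption about storage.
package My::ShapeI;
sub area {
    my $self = shift;
    die ref($self) . " must implement area()\n";
}
sub describe {    # shared behaviour written purely against the interface
    my $self = shift;
    return sprintf "%s with area %.2f", ref($self), $self->area;
}

# Implementation backed by a hash ref.
package My::Rect;
our @ISA = ('My::ShapeI');
sub new  { my ($class, %a) = @_; return bless { w => $a{w}, h => $a{h} }, $class }
sub area { my $self = shift; return $self->{w} * $self->{h} }

# Implementation backed by an array ref; the interface does not care.
package My::Square;
our @ISA = ('My::ShapeI');
sub new  { my ($class, $side) = @_; return bless [$side], $class }
sub area { my $self = shift; return $self->[0] ** 2 }

# A class that forgets to override area() fails loudly when it is called.
package My::Broken;
our @ISA = ('My::ShapeI');
sub new { return bless {}, shift }

package main;
print My::Rect->new(w => 2, h => 3)->describe, "\n";
print My::Square->new(4)->describe, "\n";
eval { My::Broken->new->area };
print "error: $@";
```

The check happens only at call time, which is the main weakness of doing this by convention: nothing verifies at compile time that a class fulfils the interface.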
----- Original Message ----- From: Chris Fields Date: Tuesday, October 3, 2006 10:29 am Subject: Re: [Bioperl-l] Use of Root.pm versus RootI.pm > > The interface classes (those ending in 'I') should actually inherit > > from RootI, not Root. > > > > In reality this recommendation is more theoretical than it makes > that> much of a difference I think. The motivation is that interface > > classes should not determine the actual implementation of a class > > (hash ref, array ref, whatever), and since Root.pm contains lots of > > implementation using a hash ref that decision will basically have > > been made. > > > > On the contrary though, RootI contains implementation too, although > > I'm not sure it would prescribe the object implementation as opposed > > to merely implementing static methods (like throw(), warn(), etc). > > That would need to be checked. > > > > -hilmar > > The constructor in Bio::Root::RootI lets one know that its use is > deprecated, so you shouldn't have any cases of 'our > qw(Bio::Root::RootI)';there should be some way of inheriting Root > directly or indirectly. I would > say that any direct use of RootI is not good practice, though. > For the > current implementation we should only inherit Bio::Root::Root, which > implements RootI. > > Is there any reason to shut off the warning with BIOPERLDEBUG? > > >From RootI: > > sub new { > my $class = shift; > my @args = @_; > unless ( $ENV{'BIOPERLDEBUG'} ) { > carp("Use of new in Bio::Root::RootI is deprecated. Please use > Bio::Root::Root instead"); > } > eval "require Bio::Root::Root"; > return Bio::Root::Root->new(@args); > } > > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > > > > > On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote: > > > > > My understanding is that all Bioperl-compliant classes should > inherit> > from Bio::Root::Root, not Bio::Root::RootI. 
> > > > > > Additionally, if functions such as throw() or _rearrange() are > to be > > > used without a class instance reference, they are to be used > as class > > > methods via Bio::Root::Root, not Bio::Root::RootI. > > > > > > Is this correct? > > > > > > My naive audit of bioperl-live CVS brought up the following > > > statistics: > > > > > > # Root.pm > > > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > > > 26 > > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | > wc -l > > > 346 > > > > > > # RootI.pm > > > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > > > 9 > > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | > wc -l > > > 79 > > > > > > My guess would be that all RootI should be changed to plain > Root ? > > > > > > Any help appreciated, > > > > > > -- > > > Dr Torsten Seemann http://www.vicbioinformatics.com > > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Tue Oct 3 13:49:35 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 3 Oct 2006 12:49:35 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <5120d6a511f5a7.511f5a75120d6a@emich.edu> Message-ID: 
<000001c6e714$4c2cbb80$15327e82@pyrimidine>

Perl 6 has already added flexibility for separating implementation
from interface (I believe they are called roles).

http://dev.perl.org/perl6/doc/design/syn/S12.html

To tell the truth, I'm not sure about Perl 5, except for the way the
Bioperl devs have set up the distinction between interface and
implementation. However, I find the way we use interfaces is very
simple (set up the interface with some or all methods unimplemented,
use the module as an abstract base class, then override the
unimplemented methods). It works for me.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: Stephen Gordon Lenk [mailto:slenk at emich.edu]
> Sent: Tuesday, October 03, 2006 11:45 AM
> To: Chris Fields
> Cc: 'Hilmar Lapp'; 'Torsten Seemann'; 'bioperl-l'
> Subject: Re: [Bioperl-l] Use of Root.pm versus RootI.pm
>
> The separation of interface and implementation is generally
> regarded as a good idea. Right now the Bioperl community is
> doing this as part of the implementation of Bioperl. I suggest
> that this is an example of something which you might want to
> have as part of the Perl implementation. If Perl 6 (or even
> Perl 5) does not have this as a core part of the language or
> as a standard package (reusable by all in a common fashion),
> you may want to suggest to the Perl implementers that a way
> for interface/implementation distinctions be made part of the
> core language. My 2 cents, as you people are the experts on
> your own code.
>
>
> ----- Original Message -----
> From: Chris Fields
> Date: Tuesday, October 3, 2006 10:29 am
> Subject: Re: [Bioperl-l] Use of Root.pm versus RootI.pm
>
> > > The interface classes (those ending in 'I') should actually inherit
> > > from RootI, not Root.
> > >
> > > In reality this recommendation is more theoretical than it makes that much of a difference I think.
The motivation is that interface > > > classes should not determine the actual implementation of a class > > > (hash ref, array ref, whatever), and since Root.pm contains lots of > > > implementation using a hash ref that decision will basically have > > > been made. > > > > > > On the contrary though, RootI contains implementation too, although > > > I'm not sure it would prescribe the object implementation as > opposed > > > to merely implementing static methods (like throw(), warn(), etc). > > > That would need to be checked. > > > > > > -hilmar > > > > The constructor in Bio::Root::RootI lets one know that its use is > > deprecated, so you shouldn't have any cases of 'our > > qw(Bio::Root::RootI)';there should be some way of inheriting Root > > directly or indirectly. I would > > say that any direct use of RootI is not good practice, though. > > For the > > current implementation we should only inherit Bio::Root::Root, which > > implements RootI. > > > > Is there any reason to shut off the warning with BIOPERLDEBUG? > > > > >From RootI: > > > > sub new { > > my $class = shift; > > my @args = @_; > > unless ( $ENV{'BIOPERLDEBUG'} ) { > > carp("Use of new in Bio::Root::RootI is deprecated. Please use > > Bio::Root::Root instead"); > > } > > eval "require Bio::Root::Root"; > > return Bio::Root::Root->new(@args); > > } > > > > > > Christopher Fields > > Postdoctoral Researcher - Switzer Lab > > Dept. of Biochemistry > > University of Illinois Urbana-Champaign > > > > > > > > > > > > On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote: > > > > > > > My understanding is that all Bioperl-compliant classes should > > inherit> > from Bio::Root::Root, not Bio::Root::RootI. > > > > > > > > Additionally, if functions such as throw() or _rearrange() are > > to be > > > > used without a class instance reference, they are to be used > > as class > > > > methods via Bio::Root::Root, not Bio::Root::RootI. > > > > > > > > Is this correct? 
> > > > > > > > My naive audit of bioperl-live CVS brought up the following > > > > statistics: > > > > > > > > # Root.pm > > > > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > > > > 26 > > > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | > > wc -l > > > > 346 > > > > > > > > # RootI.pm > > > > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > > > > 9 > > > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | > > wc -l > > > > 79 > > > > > > > > My guess would be that all RootI should be changed to plain > > Root ? > > > > > > > > Any help appreciated, > > > > > > > > -- > > > > Dr Torsten Seemann http://www.vicbioinformatics.com > > > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > > > > > _______________________________________________ > > > > Bioperl-l mailing list > > > > Bioperl-l at lists.open-bio.org > > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > > > > -- > > > =========================================================== > > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > > =========================================================== > > > > > > > > > > > > > > > > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cmlapid at up.edu.ph Tue Oct 3 22:06:06 2006 From: cmlapid at up.edu.ph (Carlo Lapid) Date: Wed, 4 Oct 2006 10:06:06 +0800 Subject: [Bioperl-l] genbank mirror Message-ID: Hi, I'm trying to set up a local mirror of a large part of the Genbank database. 
For users to access the local database, I need to create a web-based search tool, much like Entrez of NCBI or SRS of EBI, that can parse the Genbank flat files I've downloaded based on a query entered by the user. I'm trying to use Bioperl to create this from scratch, but I'm having a very hard time, especially since I want the user to have reasonable flexibility in customizing his search. The best that I've been able to accomplish is a search function that retrieves genbank sequence objects based on their primary IDs or accession numbers, by using the fetch method of the Bio::Index::GenBank module. But this doesn't help users who don't know the exact IDs for the sequences they want. Can anybody suggest a way to use Bioperl to search for an ordinary word or phrase, like "16S gene", which could be matched against the description field, or the entire genbank entry? (Alternatively, is there some other freely available tool or software that can do this?) I've been scouring the Bioperl documentation, but I couldn't find anything. I just need to be pointed in the right direction. What I thought was a relatively simple problem has been driving me crazy for days; if anybody has any suggestions I would really, really appreciate it. From torsten.seemann at infotech.monash.edu.au Tue Oct 3 22:58:03 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Wed, 04 Oct 2006 12:58:03 +1000 Subject: [Bioperl-l] genbank mirror In-Reply-To: References: Message-ID: <4523233B.7030505@infotech.monash.edu.au> > I'm trying to set up a local mirror of a large part of the Genbank database. > For users to access the local database, I need to create a web-based search > tool, much like Entrez of NCBI or SRS of EBI, that can parse the Genbank > flat files I've downloaded based on a query entered by the user. Have you considered bioperl-db / BioSQL?
http://www.bioperl.org/wiki/BioPerl_db http://lists.open-bio.org/pipermail/biosql-l/ -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From osborne1 at optonline.net Tue Oct 3 23:16:20 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Tue, 03 Oct 2006 23:16:20 -0400 Subject: [Bioperl-l] genbank mirror In-Reply-To: Message-ID: Carlo, You might want to look at the Bio::DB::Query::GenBank module: http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_dat abase However this works through NCBI's own eutils API, setting it up to query a local mirror may be very difficult. Brian O. On 10/3/06 10:06 PM, "Carlo Lapid" wrote: > Hi, > > I'm trying to set up a local mirror of a large part of the Genbank database. > For users to access the local database, I need to create a web-based search > tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank > flat files I've downloaded based on a query entered by the user. > > I'm trying to use Bioperl to create this from scratch, but I'm having a very > hard time, especially since I want the user to have reasonable flexibility > in customizing his search. The best that I've been able to accomplish is a > search function that retrieves genbank sequence objects based on their > primary IDs or accession numbers; by using the fetch method of the > Bio::Index::GenBank module. But this doesn't help users who don't know the > exact IDs for the sequences they want. > > Can anybody suggest a way to use Bioperl to search for an ordinary word or > phrase, like "16S gene", which could be matched against the description > field, or the entire genbank entry? (Alternatively, is there some other > freely available tool or software that can do this?) I've been scouring the > Bioperl documentation, but I couldn't find anything. I just need to be > pointed in the right direction. 
What I thought was a relatively simple > problem has been driving me crazy for days; if anybody has any suggestions I > would really, really appreciate it. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From osborne1 at optonline.net Tue Oct 3 23:28:06 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Tue, 03 Oct 2006 23:28:06 -0400 Subject: [Bioperl-l] genbank mirror In-Reply-To: <4523233B.7030505@infotech.monash.edu.au> Message-ID: Torsten and Carlo, Right. For some simple examples of using Bio::DB::Query::BioQuery to query a BioSQL db take a look at Bio::DB::BioSQL::OBDA. You may also want to take a look at NCBI's eutils API; it's quite powerful but not local. Or the ENSEMBL API; people have set up their own local ENSEMBL dbs. There's an example of this API here: http://www.bioperl.org/wiki/Getting_Genomic_Sequences Brian O. On 10/3/06 10:58 PM, "Torsten Seemann" wrote: >> I'm trying to set up a local mirror of a large part of the Genbank database. >> For users to access the local database, I need to create a web-based search >> tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank >> flat files I've downloaded based on a query entered by the user. > > Have you considered bioperl-db / BioSQL? > > http://www.bioperl.org/wiki/BioPerl_db > http://lists.open-bio.org/pipermail/biosql-l/ From torsten.seemann at infotech.monash.edu.au Wed Oct 4 01:21:24 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Wed, 04 Oct 2006 15:21:24 +1000 Subject: [Bioperl-l] Clean-up of Bio::Root::IO Message-ID: <452344D4.8070908@infotech.monash.edu.au> Hi all, Now that we have Perl 5.6.1 as a minimum, the following modules are standard: File::Spec, File::Temp, File::Path. Bio::Root::IO has functions catfile(), tmpdir(), tempfile(), rmtree() which currently dispatch to the File:: version, or try to emulate it.
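For reference, here is a minimal sketch of calling the core File:: modules directly instead of going through wrapper methods. It uses only File::Spec, File::Temp and File::Path; the directory and file names (bioperl_demo_*, seqs, test.fa, bpXXXXXX) are made up for illustration and are not taken from Bio::Root::IO.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempfile);
use File::Path qw(mkpath rmtree);

# Portable path building: what a catfile()/tmpdir() wrapper would delegate to.
my $tmpdir = File::Spec->tmpdir();
my $work   = File::Spec->catdir($tmpdir, "bioperl_demo_$$");   # hypothetical name
my $nested = File::Spec->catdir($work, 'seqs');
mkpath($nested);

my $path = File::Spec->catfile($nested, 'test.fa');            # hypothetical name
print "would write to: $path\n";

# File::Temp::tempfile() takes the template as an optional first
# positional argument, followed by key/value options such as DIR.
my ($fh, $tmpname) = tempfile('bpXXXXXX', DIR => $work);
print {$fh} ">seq1\nACGT\n";
close $fh;
print "temp file exists: ", (-e $tmpname ? "yes" : "no"), "\n";

# Remove the whole scratch tree, temp file included.
rmtree($work);
print "scratch dir removed: ", (-d $work ? "no" : "yes"), "\n";
```

Note that the template is positional in File::Temp, which is why a wrapper accepting a TEMPLATE key has to translate it before delegating.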
We don't need to emulate anymore. Jason Stajich suggested in a previous post that they should be deprecated, and that users should use the File:: functions directly. I have an uncommitted simplified version of Bio::Root::IO which does this, and "all tests pass". The functions currently (silently) dispatch directly to their native counterparts. The only tricky function is tempfile() which is *mostly* like File::Temp::tempfile(), but does some voodoo of converting (TEMPLATE=>'xxx') to the non-hash first parameter of the File:: version, so I'm hesitant to commit. It may do other magic - Hilmar? Comments? -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From gianluca.debellis at itb.cnr.it Wed Oct 4 05:25:26 2006 From: gianluca.debellis at itb.cnr.it (Gianluca De Bellis) Date: Wed, 04 Oct 2006 11:25:26 +0200 Subject: [Bioperl-l] Bioperl under WinXP Message-ID: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> I'm trying to use Bioperl under WinXP-SP2 (novice). Bioperl has just been downloaded (v 1.2.3). Even the simplest program with a single command (use Bio::Perl;) ends in a Perl interpreter error with these details: AppName: perl.exe AppVer: 5.8.8.819 ModName: win32.dll ModVer: 0.0.0.0 Offset: 00003294, coming from the Windows reporting system. Where is the problem? Thanks in advance From epsteinj at mail.nih.gov Wed Oct 4 07:25:57 2006 From: epsteinj at mail.nih.gov (Epstein, Jonathan A (NIH/NICHD) [E]) Date: Wed, 4 Oct 2006 07:25:57 -0400 Subject: [Bioperl-l] genbank mirror References: Message-ID: <42504F69898FE546B3F0238C9BD03275532603@NIHCESMLBX7.nih.gov> There's Seqhound: http://seqhound.blueprint.org/report.html We set this up locally, and it's probably the most comprehensive free solution out there, but it's non-trivial to set up.
Also, since the Blueprint&BIND have lost most of their funding, I'm not sure how long you can count on SeqHound to remain operational (although for now it is being updated). Jonathan -----Original Message----- From: Carlo Lapid [mailto:cmlapid at up.edu.ph] Sent: Tue 10/3/2006 10:06 PM To: bioperl-l at bioperl.org Subject: [Bioperl-l] genbank mirror Hi, I'm trying to set up a local mirror of a large part of the Genbank database. For users to access the local database, I need to create a web-based search tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank flat files I've downloaded based on a query entered by the user. I'm trying to use Bioperl to create this from scratch, but I'm having a very hard time, especially since I want the user to have reasonable flexibility in customizing his search. The best that I've been able to accomplish is a search function that retrieves genbank sequence objects based on their primary IDs or accession numbers; by using the fetch method of the Bio::Index::GenBank module. But this doesn't help users who don't know the exact IDs for the sequences they want. Can anybody suggest a way to use Bioperl to search for an ordinary word or phrase, like "16S gene", which could be matched against the description field, or the entire genbank entry? (Alternatively, is there some other freely available tool or software that can do this?) I've been scouring the Bioperl documentation, but I couldn't find anything. I just need to be pointed in the right direction. What I thought was a relatively simple problem has been driving me crazy for days; if anybody has any suggestions I would really, really appreciate it. 
_______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Wed Oct 4 09:19:45 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 04 Oct 2006 14:19:45 +0100 Subject: [Bioperl-l] Bioperl under WinXP In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> References: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> Message-ID: <4523B4F1.3010305@sendu.me.uk> Gianluca De Bellis wrote: > I'm trying to use Bioperl under WinXP-SP2 (novice) > > Bioperl has been just downloaded (v 1.2.3) > > Even the simplest program with a single command (use Bio::Perl;) ends up in > an error of the Perl interpreter with these details > > AppName: perl.exe AppVer: 5.8.8.819 ModName: win32.dll > > ModVer: 0.0.0.0 Offset: 00003294 > > Coming from the windos reporting system > > Where is the problem? Hard to say. Do non-bioperl scripts work? Make sure to follow the Bioperl installation instructions carefully: http://bioperl.org/wiki/Installing_Bioperl_on_Windows And make sure to install at least version 1.4. 1.2.3 is ancient and effectively unsupported. From cjfields at uiuc.edu Wed Oct 4 10:03:34 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 4 Oct 2006 09:03:34 -0500 Subject: [Bioperl-l] Bioperl under WinXP In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> Message-ID: <000601c6e7bd$e22ad190$15327e82@pyrimidine> If you're using PPM, you can install a (much) newer version of BioPerl from here: http://www.gmod.org/ggb/ppm/ Add that as one of your repositories in PPM4 (seeing that you are using ActivePerl 5.8.8.819), then search for bioperl. The version should be 1.512. In a few weeks we'll be releasing a new developer release. A WinXP PPM is expected, as well as a bundled package to install all prerequisites. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Gianluca De Bellis > Sent: Wednesday, October 04, 2006 4:25 AM > To: bioperl-l at bioperl.org > Subject: [Bioperl-l] Bioperl under WinXP > > I'm trying to use Bioperl under WinXP-SP2 (novice) > > Bioperl has been just downloaded (v 1.2.3) > > Even the simplest program with a single command (use Bio::Perl;) ends up > in > an error of the Perl interpreter with these details > > AppName: perl.exe AppVer: 5.8.8.819 ModName: win32.dll > > ModVer: 0.0.0.0 Offset: 00003294 > > Coming from the windos reporting system > > Where is the problem? > > > > Thanks in advance > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at gmx.net Wed Oct 4 10:25:23 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 4 Oct 2006 10:25:23 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <002101c6e6f8$67b4ae10$15327e82@pyrimidine> References: <002101c6e6f8$67b4ae10$15327e82@pyrimidine> Message-ID: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net> On Oct 3, 2006, at 10:29 AM, Chris Fields wrote: > The constructor in Bio::Root::RootI lets one know that its use is > deprecated, so you shouldn't have any cases of 'our qw > (Bio::Root::RootI)'; Don't confuse the constructor with the inheritance tree. Interface classes should never be instantiated, hence the constructor, consistent with the documentation, should never get executed. > there should be some way of inheriting Root directly or > indirectly. I would > say that any direct use of RootI is not good practice, though. I don't know what you mean by 'directly' or 'indirectly' but inheritance from interfaces, and interfaces extending (inheriting from) other interfaces, is certainly standard practice. 
I'm not sure at all why it would be a bad one. > For the current implementation we should only inherit > Bio::Root::Root, which > implements RootI. For the implementation classes, yes. For the interface classes, no. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Wed Oct 4 10:43:54 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 4 Oct 2006 10:43:54 -0400 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <452344D4.8070908@infotech.monash.edu.au> References: <452344D4.8070908@infotech.monash.edu.au> Message-ID: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> On Oct 4, 2006, at 1:21 AM, Torsten Seemann wrote: > Bio::Root::IO has functions catfile(), tmpdir(), tmpfile(), rmtree() > which currently dispatch to the File:: version, or try to emulate > it. We > don't need to emulate anymore. Jason Stajich suggested in a previous > post that they should be deprecated, and that users should use > directly > the File:: functions themselves. I don't think there's a need to deprecate - if the methods just plain delegate to whatever File:: module is appropriate their implementation (supposedly) will become very simple and hence won't pose a maintenance burden anymore. One can still recommend for all new scripts or modules or code written to use the File:: modules directly, just I'm not sure there's a need to tell users that they should start changing their existing stuff. > > I have an uncommitted simplified version of Bio::Root::IO which does > this, and "all tests pass". The functions currently (silently) > dispatch > directly to their native counterparts. > > The only tricky function is tempfile() which is *mostly* like > File::Temp::tempfile(), but does some voodoo of converting > (TEMPLATE=>'xxx') to the non-hash first parameter of the File:: > version, > so I'm hesitant to commit. 
It may do other magic - Hilmar? Not that I would know of. If the tests pass (without having to change them!) I'd give it a try. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Oct 4 11:35:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 4 Oct 2006 10:35:16 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net> Message-ID: <001901c6e7ca$b12fd5b0$15327e82@pyrimidine> ... > Don't confuse the constructor with the inheritance tree. > > Interface classes should never be instantiated, hence the > constructor, consistent with the documentation, should never get > executed. I know that interfaces shouldn't be instantiated. I had noticed there are cases of 'our @ISA = qw(Bio::Root::RootI)' where it is completely acceptable to inherit the interface. Makes sense to me now. > > there should be some way of inheriting Root directly or > > indirectly. I would > > say that any direct use of RootI is not good practice, though. > > I don't know what you mean by 'directly' or 'indirectly' but > inheritance from interfaces, and interfaces extending (inheriting > from) other interfaces, is certainly standard practice. I'm not sure > at all why it would be a bad one. I was talking specifically about inheriting RootI, and not about all Bioperl interfaces in general. I completely understand the use of interface/implementation in Bioperl. However, I missed one small fact until yesterday (of course AFTER I posted my reply), which was that interfaces may inherit RootI directly. My oops. I had understood that, in general, any Bioperl implementation should not inherit the RootI interface directly (they should inherit Root, since that implements RootI). The 'constructor' present in RootI is essentially to make sure that no one inherits from the wrong class.
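The interface/implementation split discussed in this thread can be sketched in plain Perl, without BioPerl installed. The package names (My::RootI, My::Root, My::SeqI, My::Seq) are hypothetical stand-ins for Bio::Root::RootI, Bio::Root::Root, an interface class and its implementation, and the throw_not_implemented() shown is a simplified imitation rather than the BioPerl code.

```perl
use strict;
use warnings;

# Stand-in for Bio::Root::RootI: shared behaviour only, no object layout.
package My::RootI;
sub throw_not_implemented {
    my $self = shift;
    my $sub  = (caller(1))[3];    # fully qualified name of the abstract method
    die "Abstract method $sub not implemented by "
      . (ref($self) || $self) . "\n";
}

# Stand-in for Bio::Root::Root: implements the root interface and
# fixes the object representation (a blessed hash ref).
package My::Root;
our @ISA = ('My::RootI');
sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

# Interface class: inherits from the root *interface* and declares
# methods without deciding how objects are stored.
package My::SeqI;
our @ISA = ('My::RootI');
sub seq     { shift->throw_not_implemented }
sub seq_len { shift->throw_not_implemented }

# Implementation class: takes its constructor from My::Root and
# overrides the abstract methods declared in My::SeqI.
package My::Seq;
our @ISA = ('My::Root', 'My::SeqI');
sub seq     { $_[0]{seq} }
sub seq_len { length $_[0]{seq} }

package main;

my $seq = My::Seq->new(seq => 'ACGT');
print $seq->seq, " ", $seq->seq_len, "\n";

# Calling an abstract method on the interface itself fails loudly.
my $err = eval { My::SeqI->seq; 1 } ? '' : $@;
print $err;
```

In this sketch every class reaches the root interface, either as the direct parent (My::SeqI) or through the inheritance tree (My::Seq via My::Root), while only the implementation side commits to a hash-based object.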
Probably a bad use of the terms 'direct' and 'indirect', so maybe I didn't get that across very well. What I meant was that all classes inherit Root in some way, either 'directly' (as the direct parent class) or 'indirectly' (through the inheritance tree). Probably comes from being primarily a molecular microbiologist and not a computer scientist. OT, but it would be nice to have an updated class diagram to sort out the inheritance hierarchy a bit easier. In the meantime, the Deobfuscator does help quite a bit. > > For the current implementation we should only inherit > > Bio::Root::Root, which > > implements RootI. > > For the implementation classes, yes. For the interface classes, no. I agree (see above). That's the one small bit about interfaces I missed along the way. Makes sense; they use throw_not_implemented(), which is a RootI method. > -hilmar Chris From pmiguel at purdue.edu Wed Oct 4 15:38:51 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Wed, 04 Oct 2006 15:38:51 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <45240DCB.2080204@purdue.edu> Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. > > In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > Make sure you're in the AUTHORS file in all 4 packages, as > appropriate. > > Users: > Even though 1.5.2 is a 'developer' release, we consider it the most > stable and capable version of Bioperl, and recommend that you use > it in all but the most critical production environments. Please > try it out and let us know of any problems or difficulties you run > into. > > > Thank you, > Sendu. > I didn't see any tests done under solaris, so I asked our sys admin to do the install on one of our machines. 
Just another data point: He installed this release candidate on a Sun E450 box running solaris. uname -a gives: SunOS descartes 5.10 Generic_118833-18 sun4u sparc SUNW,Ultra-4 perl -v gives: This is perl, v5.8.8 built for sun4-solaris (etc.) $ time make test PERL_DL_NONLAZY=1 /usr/local/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t t/AAChange...................ok t/AAReverseMutate............ok t/abi........................Bio::SeqIO::staden::read from bioperl-ext is not installed or is installed incorrectly - skipping abi.t tests t/abi........................ok t/ace........................ok t/AlignIO....................ok t/AlignStats.................ok t/AlignUtil..................ok t/alignUtilities.............ok t/Allele.....................ok t/Alphabet...................ok t/Annotation.................ok t/AnnotationAdaptor..........ok t/asciitree..................ok t/Assembly...................ok 1/19 skipped: t/Biblio.....................ok t/Biblio_biofetch............ok t/Biblio_eutils..............ok t/BiblioReferences...........ok t/BioDBGFF...................ok t/BioDBSeqFeature............ok 1/46Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. t/BioDBSeqFeature............ok t/BioDBSeqFeature_BDB........ok t/BioDBSeqFeature_mysql......ok 3/46prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) 
statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT sequence,offset FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname= ? AND offset >= ? AND offset <= ? ORDER BY offset ) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT sequence,offset FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname= ? AND offset >= ? AND offset <= ? ORDER BY offset ) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 t/BioDBSeqFeature_mysql......ok t/BioFetch_DB................ok t/BioGraphics................ok t/BlastIndex.................ok 1/13 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BlastIndex.................ok t/BPbl2seq................... 
-------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPbl2seq...................ok 1/108 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPbl2seq...................ok t/BPlite.....................ok 1/97 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPlite.....................ok 52/97 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPlite.....................ok 
88/97 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead STACK Bio::Tools::BPlite::new /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/Tools/BPlite.pm:197 STACK toplevel t/BPlite.t:127 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPlite.....................ok t/BPpsilite.................. -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPpsilite..................ok 4/11 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPpsilite..................ok t/bsml_sax...................ok t/Chain......................ok t/chaosxml...................ok t/cigarstring................ok t/ClusterIO..................ok t/Coalescent.................ok t/CodonTable.................ok t/Compatible.................ok t/consed.....................ok t/CoordinateGraph............ok t/CoordinateMapper...........ok t/Correlate..................ok t/ctf........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ctf.t tests t/ctf........................ok t/CytoMap....................ok t/DB.........................skipped all skipped: Skipping all tests since they require network access, set BIOPERLDEBUG=1 to test t/DBCUTG.....................ok 11/34 skipped: Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test t/DBFasta....................ok t/DNAMutation................ok t/Domcut.....................ok t/ECnumber...................ok t/ELM........................ok 1/13 -------------------- WARNING 
--------------------- MSG: sleeping for 1 seconds --------------------------------------------------- t/ELM........................ok t/embl.......................ok t/EMBL_DB....................ok t/EMBOSS_Tools...............ok t/EncodedSeq.................ok t/entrezgene.................ok 491/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok 695/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok 723/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok 824/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok t/ePCR.......................ok t/ESEfinder..................ok 1/15# Looks like you planned 15 tests but only ran 14. t/ESEfinder..................dubious Test returned status 255 (wstat 65280, 0xff00) DIED. 
FAILED test 15 Failed 1/15 tests, 93.33% okay (less 9 skipped tests: 5 okay, 33.33%) t/est2genome.................ok t/EUtilities.................skipped all skipped: Set BIOPERLDEBUG=1 to run tests t/Exception..................ok t/Exonerate..................ok t/exp........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping exp.t tests t/exp........................ok t/fasta......................ok t/FeatureIO..................ok 7/33 -------------------- WARNING --------------------- MSG: '##feature-ontology' directive handling not yet implemented --------------------------------------------------- -------------------- WARNING --------------------- MSG: '##attribute-ontology' directive handling not yet implemented --------------------------------------------------- -------------------- WARNING --------------------- MSG: '##source-ontology' directive handling not yet implemented --------------------------------------------------- t/FeatureIO..................ok t/flat.......................ok t/FootPrinter................ok t/game.......................ok t/GbrowseGFF.................ok t/gcg........................ok t/GDB........................ok t/Gel........................ok t/genbank....................ok t/GeneCoordinateMapper.......ok t/Geneid.....................ok t/Genewise...................ok 2/51 skipped: t/Genomewise.................ok t/Genpred....................ok t/GFF........................ok t/GOR4.......................ok t/GOterm.....................ok t/GraphAdaptor...............ok t/GuessSeqFormat.............ok t/hmmer......................ok t/hmmer_pull.................ok t/HNN........................ok t/HtSNP......................ok t/Index......................ok t/InstanceSite...............ok t/interpro...................ok t/InterProParser.............ok t/IUPAC......................ok t/kegg.......................ok t/largefasta.................ok 
t/LargeLocatableSeq..........ok t/largepseq..................ok t/lasergene..................ok t/LinkageMap.................ok t/LiveSeq....................ok t/LocatableSeq...............ok t/Location...................ok t/LocationFactory............ok t/LocusLink..................ok t/lucy.......................ok t/Map........................ok t/MapIO......................ok t/masta......................ok t/Matrix.....................ok t/Measure....................ok t/MeSH.......................ok t/metafasta..................ok t/MetaSeq....................ok t/MicrosatelliteMarker.......ok t/MiniMIMentry...............ok t/MitoProt...................ok t/Molphy.....................ok t/MultiFile..................ok t/multiple_fasta.............ok t/Mutation...................ok t/Mutator....................ok t/NetPhos....................ok 10/14 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test t/Node.......................ok t/obo_parser.................ok t/OddCodes...................ok t/OMIMentry..................ok t/OMIMentryAllelicVariant....ok t/OMIMparser.................ok t/Ontology...................ok t/OntologyEngine.............ok t/OntologyStore..............ok t/PAML.......................ok t/Perl.......................ok t/phd........................ok t/Phenotype..................ok t/PhylipDist.................ok t/PhysicalMap................ok t/pICalculator...............ok t/Pictogram..................ok t/pir........................ok t/pln........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping pln.t tests t/pln........................ok t/PopGen.....................ok 2/89 skipped: t/PopGenSims.................ok t/primaryqual................ok t/PrimarySeq.................ok t/primedseq..................ok t/Primer.....................ok t/primer3....................ok t/Promoterwise...............ok t/ProtDist...................ok 
t/protgraph..................ok t/ProtMatrix.................ok t/ProtPsm....................ok t/Pseudowise.................ok t/psm........................ok t/QRNA.......................ok t/qual.......................ok t/RandDistFunctions..........ok t/RandomTreeFactory..........ok t/Range......................ok t/RangeI.....................ok t/raw........................ok t/RefSeq.....................ok t/Registry...................ok t/Relationship...............ok t/RelationshipType...........ok t/RemoteBlast................ok 11/13 skipped: to avoid timeout t/RepeatMasker...............ok t/RestrictionAnalysis........ok t/RestrictionEnzyme..........ok 1/14 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::RestrictionEnzyme is deprecatedUse Bio::Restriction classes instead --------------------------------------------------- t/RestrictionEnzyme..........ok t/RestrictionIO..............ok t/RNAChange..................ok t/rnamotif...................ok t/RootI......................ok t/RootIO.....................ok 2/27 skipped: various reasons t/RootStorable...............ok t/Scansite...................ok t/scf........................ok t/SearchDist.................ok t/SearchIO...................ok t/Seg........................ok t/Seq........................ok t/seq_quality................ok t/SeqAnalysisParser..........ok t/SeqBuilder.................ok t/SeqDiff....................ok t/SeqFeatCollection..........ok t/SeqFeature.................ok t/seqfeaturePrimer...........ok t/SeqHound_DB................ok 4/14Writing into 'shoundlog' log file. 
t/SeqHound_DB................ok t/SeqIO......................ok t/SeqPattern.................ok t/seqread_fail...............ok t/SeqStats...................ok t/SequenceFamily.............ok t/sequencetrace..............ok t/SeqUtils...................ok t/SeqVersion.................ok t/seqwithquality.............ok t/SeqWords...................ok t/Sigcleave..................ok t/Signalp....................ok t/Sim4.......................ok t/SimilarityPair.............ok t/SimpleAlign................ok t/simpleGOparser.............ok t/singlet....................ok t/sirna......................ok t/SiteMatrix.................ok t/SNP........................ok t/Sopma......................ok t/Species....................ok 5/20 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test t/Spidey.....................ok t/splicedseq.................ok t/StandAloneBlast............ok t/StructIO...................ok t/Structure..................ok t/swiss......................ok t/Symbol.....................ok t/tab........................ok t/table......................ok t/TagHaplotype...............ok t/Taxonomy...................ok 44/98 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test t/TaxonTree..................ok t/Tempfile...................ok t/Term.......................ok t/tigrxml....................ok t/tinyseq....................ok t/Tmhmm......................ok t/Tools......................ok t/Tree.......................ok t/TreeBuild..................ok t/TreeIO.....................ok t/trim.......................ok t/tRNAscanSE.................ok t/UCSCParsers................ok t/Unflattener................ok t/Unflattener2...............ok t/UniGene....................ok t/Variation_IO...............ok t/WABA.......................ok t/XEMBL_DB...................ok 1/9 skipped: server may be down t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is 
installed incorrectly - skipping ztr.t tests t/ztr........................ok Failed Test Stat Wstat Total Fail Failed List of Failed ------------------------------------------------------------------------------- t/ESEfinder.t 255 65280 15 2 13.33% 15 2 tests and 98 subtests skipped. Failed 1/240 test scripts, 99.58% okay. 1/11910 subtests failed, 99.99% okay. *** Error code 29 make: Fatal error: Command failed for target `test_dynamic' real 13m10.064s user 11m14.891s sys 0m45.417s $ TEST_VERBOSE=1 perl t/ESEfinder.t 1..15 ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder; ok 2 - use Data::Dumper; ok 3 - use Bio::PrimarySeq; ok 4 - use Bio::Seq; ok 5 ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test # Looks like you planned 15 tests but only ran 14. From bix at sendu.me.uk Thu Oct 5 03:19:39 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 08:19:39 +0100 Subject: [Bioperl-l] EUtilities term handling Message-ID: <4524B20B.5010703@sendu.me.uk> This is actually a general question and not limited to EUtilities. As I see it EUtiltiies lets you do queries in Bioperl that you can do on a website. The question is, should a Bioperl module always work with queries that the website it is a front-end to works with? 
So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is
essentially a frontend onto:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=

With a web-browser you can complete that url by supplying a term. For
example, the term 'BRCA2+9606[taxid]' works and returns results:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid]

If you supply the exact same term to EUtilities::esearch like so:

my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
"gene", -term "BRCA2+9606[taxid]");

The search fails. From my 'user' perspective this is highly unexpected.
Chris (the author) and I both understand /why/ it fails, but Chris
doesn't think it is a bug, or at least something that can/should be
changed. What do other people think? At the very least, if something
unexpected happens, I'd suggest making a note of it in the POD
somewhere. Eg. "Do not use + in term strings, even though they might
work on the website".

Chris: what is the disadvantage of always submitting '+' as '+' to the
server?

From bix at sendu.me.uk Thu Oct 5 03:24:45 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 08:24:45 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4524B20B.5010703@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
Message-ID: <4524B33D.9070607@sendu.me.uk>

Sendu Bala wrote:
>
> With a web-browser you can complete that url by supplying a term. For
> example, the term 'BRCA2+9606[taxid]' works and returns results:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid]
>
>
> If you supply the exact same term to EUtilities::esearch like so:
>
> my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
> "gene", -term "BRCA2+9606[taxid]");

*cough*
my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
"gene", -term => "BRCA2+9606[taxid]");

> The search fails.
From m.weimer at dkfz-heidelberg.de Thu Oct 5 08:15:53 2006 From: m.weimer at dkfz-heidelberg.de (Marc Weimer) Date: Thu, 05 Oct 2006 14:15:53 +0200 Subject: [Bioperl-l] Bio::DB::SwissProt Error Message-ID: <1160050554.18691.11.camel@localhost> When running -------------------------------------------------------------- #! /usr/bin/perl -w use strict; use Bio::DB::SwissProt; my $db_obj = new Bio::DB::SwissProt(-verbose=>1); my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); ------------------------------------------------------------- using Bioperl 1.4-1 I get the error message --------------------------------------------------------------------------------- request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch Content-Length: 45 Content-Type: application/x-www-form-urlencoded format=swissprot&db=swall&style=raw&id=P43780 ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: swissprot stream with no ID. Not swissprot in my book STACK: Error::throw STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328 STACK Bio::SeqIO::swiss::next_seq /usr/share/perl5/Bio/SeqIO/swiss.pm:179 STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/share/perl5/Bio/DB/WebDBSeqI.pm:187 STACK: ./putativeGele.pl:8 ----------------------------------------------------------- -------------------------------------------------------------------------------- Any suggestions? Thanks, Marc From bix at sendu.me.uk Thu Oct 5 09:21:23 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 14:21:23 +0100 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <1160050554.18691.11.camel@localhost> References: <1160050554.18691.11.camel@localhost> Message-ID: <452506D3.5050501@sendu.me.uk> Marc Weimer wrote: [snip] > my $db_obj = new Bio::DB::SwissProt(-verbose=>1); > > my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); [snip] > using Bioperl 1.4-1 I get the error message [snip] > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: swissprot stream with no ID. 
Not swissprot in my book [snip] > Any suggestions? It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most recent official release), but 1.5.2 does (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS (http://bioperl.org/wiki/Getting_BioPerl#CVS). From m.weimer at dkfz-heidelberg.de Thu Oct 5 09:35:06 2006 From: m.weimer at dkfz-heidelberg.de (Marc Weimer) Date: Thu, 05 Oct 2006 15:35:06 +0200 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <452506D3.5050501@sendu.me.uk> References: <1160050554.18691.11.camel@localhost> <452506D3.5050501@sendu.me.uk> Message-ID: <1160055306.18691.14.camel@localhost> Works fine with 1.5.2 Thanks, Marc > Marc Weimer wrote: > [snip] > > my $db_obj = new Bio::DB::SwissProt(-verbose=>1); > > > > my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); > [snip] > > using Bioperl 1.4-1 I get the error message > [snip] > > ------------- EXCEPTION: Bio::Root::Exception ------------- > > MSG: swissprot stream with no ID. Not swissprot in my book > [snip] > > Any suggestions? > > It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most > recent official release), but 1.5.2 does > (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS > (http://bioperl.org/wiki/Getting_BioPerl#CVS). -- ######################################## Dr. Marc Weimer German Cancer Research Center Central Unit Biostatistics Im Neuenheimer Feld 280 D-69120 Heidelberg Phone: +49 (0) 6221/42-2387 Fax: +49 (0) 6221/42-2397 ######################################## From hlapp at gmx.net Thu Oct 5 09:55:58 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 09:55:58 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4524B20B.5010703@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> Message-ID: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote: > This is actually a general question and not limited to EUtilities. 
> As I
> see it EUtiltiies lets you do queries in Bioperl that you can do on a
> website. The question is, should a Bioperl module always work with
> queries that the website it is a front-end to works with?

I think yes, but stick to this definition.

Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez
website it will actually not work. Hence, it should be no surprise that
it doesn't work either using Bio::DB::EUtilities.

The URL you are using to make your point is much more an example for
using a web-service (SOAP, REST, or not) than it is for using a website.
Using the web-service URL with a space in place of the '+' works, but
yields a different result (just searches for BRCA2), so if tested for
correct result the test fails.

I.e., you don't expect an input form on a website to accept
URL-encoded input. Instead, you expect it to do any URL-encoding for you
that needs to be done. Conversely, if you are using a URL to retrieve
stuff using e.g. wget or curl, it is clear that you will need to do URL
encoding yourself unless there is a command line option that lets you
instruct the querying program to do so.

I would be careful with mangling the two definitions into one, resulting
in a module that needs to serve two masters. You could consider
providing an option though that lets you turn off the URL encoding on
demand.

Aside from that, one of the advantages of having the service wrapped in
Bioperl is in fact that you can have it accept a wider variety of
parameters than the actual service would allow you to have, e.g.,
arrays, hashes, or whatever seems appropriate.

My $0.02.
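As an illustration of the "accept arrays" idea in the paragraph above, here is a minimal sketch in plain Perl. It is not code from BioPerl or from this thread: the helper name join_terms is hypothetical, and joining the terms with an explicit AND is an assumption about how a module might choose to combine them.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: combine a list of search
# terms into one Entrez-style query string using an explicit boolean.
# It is not part of Bio::DB::EUtilities or any other BioPerl module.
sub join_terms {
    my ($terms, $op) = @_;
    $op = 'AND' unless defined $op;    # assumed default operator
    # Drop undefined or empty entries before joining.
    my @clean = grep { defined($_) && length($_) } @$terms;
    return join " $op ", @clean;
}

my $term = join_terms(['BRCA2', '9606[taxid]']);
print "$term\n";    # BRCA2 AND 9606[taxid]
```

The caller never types a '+' or a space as a separator, so the question of how to encode the separator is left entirely to the module.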
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Thu Oct 5 10:08:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 15:08:01 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> Message-ID: <452511C1.5020709@sendu.me.uk> Hilmar Lapp wrote: > > On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote: > >> This is actually a general question and not limited to EUtilities. As I >> see it EUtiltiies lets you do queries in Bioperl that you can do on a >> website. The question is, should a Bioperl module always work with >> queries that the website it is a front-end to works with? > > I think yes, but stick to this definition. > > Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez > website it will actually not work. Hence, it should be no surprise that > it doesn't work either using Bio::DB::EUtilities. On the contrary, I find it a surprise because EUtilities is an interface to NCBI's eutils, not the entrez website. If I had previously read instructions on using eutils: http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls I might (do) expect that I /should/ use + in my term. > Aside from that, one of the advantages of having the service wrapped in > Bioperl is in fact that you can have it accept a wider variety of > parameters that the actual service would allow you to have, e.g., > arrays, hashes, or whatever seems appropriate. I was going to suggest that terms be supplied as an array, leaving Bioperl code to decide how to 'AND' all the terms (elements in the array) together. 
It would also further force the user not to think of
how eutils normally works, but to only consider the Bioperl instructions
on how to form a query. But I'm not sure of the value of all that.

From cjfields at uiuc.edu Thu Oct 5 10:06:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:06:50 -0500
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <452506D3.5050501@sendu.me.uk>
References: <1160050554.18691.11.camel@localhost> <452506D3.5050501@sendu.me.uk>
Message-ID: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>

On Oct 5, 2006, at 8:21 AM, Sendu Bala wrote:

> Marc Weimer wrote:
> [snip]
>> my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
>>
>> my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
> [snip]
>> using Bioperl 1.4-1 I get the error message
> [snip]
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: swissprot stream with no ID. Not swissprot in my book
> [snip]
>> Any suggestions?
>
> It works with the latest Bioperl. I'm not sure if 1.5.1 works (the
> most
> recent official release), but 1.5.2 does
> (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS
> (http://bioperl.org/wiki/Getting_BioPerl#CVS).

Marc, you'll have to update to 1.5.2 or CVS, as Sendu suggested. There
were server changes for biofetch which were fixed about 4-6 months ago
(post rel. 1.5.1); I think several changes were made to
Bio::SeqIO::swiss as well during this period.

I think the error here results from Bio::SeqIO::swiss trying to parse an
empty byte stream. Sendu, do you think that Bio::SeqIO::swiss (and other
SeqIO parsers) should throw a more specific message for getting an empty
byte stream? Or is it more trouble than it's worth?

Christopher Fields
Postdoctoral Researcher
Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Thu Oct 5 10:14:40 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 15:14:40 +0100 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu> References: <1160050554.18691.11.camel@localhost> <452506D3.5050501@sendu.me.uk> <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu> Message-ID: <45251350.5030608@sendu.me.uk> Chris Fields wrote: > >>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>> MSG: swissprot stream with no ID. Not swissprot in my book [snip] > I think the error here results from Bio::SeqIO::swiss trying to parse an > empty byte stream. Sendu, do you think that Bio::SeqIO::swiss (and > other SeqIO parsers) should throw a more specific message for getting an > empty byte stream? Or is it more trouble than it's worth? Trouble wise, I've no idea without looking into it. Generally speaking though I can say that the error message is pretty useless and I'm always in favour of better error messages. From hlapp at gmx.net Thu Oct 5 10:21:49 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 10:21:49 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <452511C1.5020709@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> Message-ID: On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote: >> >> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote: >> >>> This is actually a general question and not limited to >>> EUtilities. As I >>> see it EUtiltiies lets you do queries in Bioperl that you can do >>> on a >>> website. The question is, should a Bioperl module always work with >>> queries that the website it is a front-end to works with? >> >> I think yes, but stick to this definition. >> >> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez >> website it will actually not work. 
Hence, it should be no surprise >> that >> it doesn't work either using Bio::DB::EUtilities. > > On the contrary, I find it a surprise because EUtilities is an > interface > to NCBI's eutils, not the entrez website. > > If I had previously read instructions on using eutils: > http://www.ncbi.nlm.nih.gov/books/bv.fcgi? > rid=coursework.section.constructing-urls > I might (do) expect that I /should/ use + in my term. This is my point - stick to your definitions. Are you wrapping a query form on a website or are you wrapping a web service (i.e., a URL)? The examples you give are about wrapping a web-service. Your original question was about wrapping a website. Yet another question is what the author of Bio::DB::EUtilities intended to wrap. The other thing to consider is user-friendliness. If you are wrapping a web-service, do you still make not URL-encoding the user input the default? What will 90% of the users probably want or expect to be able to do? URL-encode all input themselves or expect the module to do this for them unless they turn it off? As far as I'm concerned, I'll happily count myself among those who are lazy and ignorant, don't read NCBI's documentation, don't want to know how to URL encode and why this needs to be done, but just want it to work. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Thu Oct 5 10:31:06 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 09:31:06 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4524B20B.5010703@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> Message-ID: On Oct 5, 2006, at 2:19 AM, Sendu Bala wrote: > This is actually a general question and not limited to EUtilities. > As I > see it EUtiltiies lets you do queries in Bioperl that you can do on a > website. 
The question is, should a Bioperl module always work with
> queries that the website it is a front-end to works with?
>
> So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is
> essentially a frontend onto:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?
> retmode=xml&db=gene&term=
>
> With a web-browser you can complete that url by supplying a term. For
> example, the term 'BRCA2+9606[taxid]' works and returns results:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?
> retmode=xml&db=gene&term=BRCA2+9606[taxid]
>
> If you supply the exact same term to EUtilities::esearch like so:
>
> my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
> "gene", -term "BRCA2+9606[taxid]");
>
> The search fails. From my 'user' perspective this is highly
> unexpected.
> Chris (the author) and I both understand /why/ it fails, but Chris
> doesn't think it is a bug, or at least something than can/should be
> changed. What do other people think? At the very least, if something
> unexpected happens, I'd suggest making a note of it in the POD
> somewhere. Eg. "Do not use + in term strings, even though they might
> work on the website".
>
> Chris: what is the disadvantage of always submitting '+' as '+' to the
> server?

A few reasons:

1) According to NCBI, you can use '+' in queries, but not as a boolean.
Global changes of '+' to a space may change the meaning of the query in
a few rare occasions. So, if you really wanted to search for the string
'BRCA2+ATG', NCBI looks for that term literally.

2) '+' is a URI reserved symbol for a space delimiter. Therefore, any
parameters containing '+' are URI-encoded into %2B, which is decoded on
NCBI's end back to '+' (this is demonstrable with current EUtilities
output and the returned XML data).

3) Why not just use a space (implicit AND)? Or an explicit boolean? Or
'&' (which apparently works but is not specified in the NCBI Entrez
docs)?

The bug is in the query and not in the code, i.e.
it is a user-generated bug, not an EUtilities bug. And it shouldn't be
unexpected, as NCBI has very specific rules for building queries for
Entrez (just like any other database). If I were to use nonstandard
queries for MySQL, BioFetch, UCSC, or anything else, I would expect to
get bad results. As the old saying goes, garbage in, garbage out.

The following link has their updated rules:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.chapter.EntrezHelp

Here is their old one:

http://www.ncbi.nlm.nih.gov/entrez/query/static/help/helpdoc.html

We could, of course, put something in POD, but you never presented that
option to me before. I'll grant that the EUtilities API needs some
cleaning up, not easy to do when the returned data varies from each
utility. But it does get the URL encoding correct, at least in this
case.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From bix at sendu.me.uk Thu Oct 5 10:32:49 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:32:49 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To:
References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk>
Message-ID: <45251791.9040409@sendu.me.uk>

Hilmar Lapp wrote:
>
> On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote:
>
>> On the contrary, I find it a surprise because EUtilities is an interface
>> to NCBI's eutils, not the entrez website.
>>
>> If I had previously read instructions on using eutils:
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
>>
>> I might (do) expect that I /should/ use + in my term.
>
> This is my point - stick to your definitions. Are you wrapping a query
> form on a website or are you wrapping a web service (i.e., a URL)?
>
> The examples you give are about wrapping a web-service. Your original
> question was about wrapping a website.

Right...
I don't see that that changes the answer to my question though does it? "The question is, should a Bioperl module always work with queries that the web-service it is a front-end to works with?" For me, the answer is still yes. > As far as I'm concerned, I'll happily count myself among those who are > lazy and ignorant, don't read NCBI's documentation, don't want to know > how to URL encode and why this needs to be done, but just want it to work. That's a reasonable attitude to take. Which comes back to the question I asked of Chris - naively, if you send + as + you can please everyone, can't you? Both people who have read the docs on the web-service and those who haven't? Or are there real queries in which a user may want to search for a phrase with a literal + in it (and where such a search works via eutils)? From bix at sendu.me.uk Thu Oct 5 10:44:33 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 15:44:33 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> Message-ID: <45251A51.6020802@sendu.me.uk> Chris Fields wrote: > The bug is in the query and not in the code, i.e. is is a > user-generated bug, not an EUtilities bug. And it shouldn't be > unexpected, as NCBI has very specific rules for building queries for > Entrez (just like any other database). So I guess this comes down to something Hilmar mentioned and I never even considered before. You consider your EUtilities stuff as a frontend to entrez, and therefore consider valid queries as queries that are valid for entrez and not eutils? If that's the case, fine. I understand why you don't think this is a bug. Again, something that might warrant a mention in the POD. Currently the naming of the modules and the explicit references to eutils (and me knowing the implementation uses eutils) got me confused. 
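
To make the encoding behaviour discussed in this thread concrete, here is a minimal sketch of building the esearch URL with the URI module's query_form method. It is not the actual Bio::DB::EUtilities code: the helper name build_esearch_url is hypothetical, and the base URL and parameters are simply the ones quoted in the messages above.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper, for illustration only: builds an esearch URL the
# way a URI-based client might. Not Bio::DB::EUtilities' implementation.
sub build_esearch_url {
    my ($term) = @_;
    my $uri = URI->new('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi');
    # query_form() form-encodes each value: a space is sent as '+',
    # and a literal '+' is escaped as %2B.
    $uri->query_form(retmode => 'xml', db => 'gene', term => $term);
    return $uri->as_string;
}

print build_esearch_url('BRCA2+9606[taxid]'), "\n";      # '+' typed into the term
print build_esearch_url('BRCA2 9606[taxid]'), "\n";      # space (implicit AND)
print build_esearch_url('BRCA2 AND 9606[taxid]'), "\n";  # explicit boolean
```

In the first call the server receives a literal '+' inside the term, whereas the second call sends the space that the hand-typed browser URL implied. That difference is the whole disagreement in this thread.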
From cjfields at uiuc.edu Thu Oct 5 10:51:28 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 09:51:28 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <452511C1.5020709@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> Message-ID: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote: >>> This is actually a general question and not limited to >>> EUtilities. As I >>> see it EUtiltiies lets you do queries in Bioperl that you can do >>> on a >>> website. The question is, should a Bioperl module always work with >>> queries that the website it is a front-end to works with? >> >> I think yes, but stick to this definition. >> >> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez >> website it will actually not work. Hence, it should be no surprise >> that >> it doesn't work either using Bio::DB::EUtilities. > > On the contrary, I find it a surprise because EUtilities is an > interface > to NCBI's eutils, not the entrez website. It uses NCBI's CGI interface for eutils, not the SOAP interface. Very different. I have considered using the NCBI SOAP-based interface, but the web services are still somewhat incomplete, unlike the CGI interface. > If I had previously read instructions on using eutils: > http://www.ncbi.nlm.nih.gov/books/bv.fcgi? > rid=coursework.section.constructing-urls > I might (do) expect that I /should/ use + in my term. You are looking at part of the naked URL on that page. Here's what that page says: "When constructing URLs for the eUtils, please use lowercase characters for all parameters except &WebEnv. There is no required order for the URL parameters in an eUtils URL, and null values or inappropriate parameters are ignored. Avoid placing spaces in the URLs, particularly in queries. 
If a space is required, use a plus sign (+) instead of a space:

* Incorrect: &id=352, 25125, 234, ...
* Correct: &id=352,25125,234,...
* Incorrect: &term=biomol mrna[properties] AND mouse[organism]
* Correct: &term=biomol+mrna[properties]+AND+mouse[organism]

Other special characters, such as the # symbol used in referring to a
query key on the History server, should be represented by their URL
encodings (%23 for #)."

I use URI for building the URL with the parameters. URI specifically
encodes all of this for you, so spaces convert to '+' and '+' converts
to %2B.

>> Aside from that, one of the advantages of having the service
>> wrapped in
>> Bioperl is in fact that you can have it accept a wider variety of
>> parameters that the actual service would allow you to have, e.g.,
>> arrays, hashes, or whatever seems appropriate.
>
> I was going to suggest that terms be supplied as an array, leaving
> Bioperl code to decide how to 'AND' all the terms (elements in the
> array) together. It would also further force the user not to think of
> how eutils normally works, but to only consider the Bioperl
> instructions
> on how to form a query. But I'm not sure of the value of all that.

Why do we need to intuit what the user is thinking at any particular
time? How would I know that someone actually wanted to search using the
literal string 'abc+123' as opposed to 'abc 123'?

I see value in your last suggestion but I think a class or set of
classes would be best suited for that:

MySQL Query  | in                         out | MySQL Query
Entrez Query |-----> Generic Query class----->| Entrez Query
SRS Query    |                                | SRS Query
ad infinitum...

The generic query object could then be used in DB searches as an option
besides using a raw string. Though it would get tricky with SQL's
complexity...

Christopher Fields
Postdoctoral Researcher
Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Thu Oct 5 10:54:04 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 10:54:04 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251791.9040409@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <45251791.9040409@sendu.me.uk> Message-ID: <9916EDEE-EA3C-4C55-A004-A46F37B559BF@gmx.net> On Oct 5, 2006, at 10:32 AM, Sendu Bala wrote: >> The examples you give are about wrapping a web-service. Your >> original question was about wrapping a website. > > Right... I don't see that that changes the answer to my question > though does it? > > "The question is, should a Bioperl module always work with > queries that the web-service it is a front-end to works with?" > > For me, the answer is still yes. The answer is still yes. My point was the query that works with a website is not necessarily the query that works with a web-service, even if that web-service also powers the website. > >> As far as I'm concerned, I'll happily count myself among those who >> are lazy and ignorant, don't read NCBI's documentation, don't want >> to know how to URL encode and why this needs to be done, but just >> want it to work. > > That's a reasonable attitude to take. Which comes back to the > question I asked of Chris - naively, if you send + as + you can > please everyone, can't you? Both people who have read the docs on > the web-service and those who haven't? Or are there real queries in > which a user may want to search for a phrase with a literal + in it > (and where such a search works via eutils)? So are you suggesting to URL-encode some characters but not others? This would move you into muddy waters and I'm wondering what the gain is from that, and for whom it is a gain. 
It sounds like it will mostly benefit those who have studied the NCBI documentation and know exactly the URL they want to send and want to ignore the EUtilities POD. My humble guess is the far majority of people will either not read any documentation, or read the module's POD. Maybe a better way to serve both types of people is to accept a parameter -querystring that is expected to include everything from 'term=' onwards (including 'term=' itself) which gives you complete control and freedom if you know what you are doing, and otherwise implement what you suggested before: > I was going to suggest that terms be supplied as an array, leaving > Bioperl code to decide how to 'AND' all the terms (elements in the > array) together. It would also further force the user not to think of > how eutils normally works, but to only consider the Bioperl > instructions > on how to form a query. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Thu Oct 5 11:02:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 16:02:01 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> Message-ID: <45251E69.7040507@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote: > >> On the contrary, I find it a surprise because EUtilities is an interface >> to NCBI's eutils, not the entrez website. > > It uses NCBI's CGI interface for eutils, not the SOAP interface. Very > different. I have considered using the NCBI SOAP-based interface, but > the web services are still somewhat incomplete, unlike the CGI interface. I don't know anything about the SOAP interface. 
I'm talking about the CGI interface that you use. >> If I had previously read instructions on using eutils: >> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls >> >> I might (do) expect that I /should/ use + in my term. > > You are looking at part of the naked URL on that page. Here's what that > page says: I know what it says... > * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] The correct query is the one that has +s in it. > I use URI for building the URL with the parameters. URI specifically > encodes all of this for you, so spaces convert to '+' and '+' converts > to %2B. Well, yes. This causes what I thought of as a bug. It prevents me from submitting a /correct/ eutils term. However it isn't a bug if you explain to users they shouldn't be submitting valid eutils terms, but only valid /entrez/ terms. From cjfields at uiuc.edu Thu Oct 5 11:15:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:15:49 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251A51.6020802@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <45251A51.6020802@sendu.me.uk> Message-ID: On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote: > Chris Fields wrote: >> The bug is in the query and not in the code, i.e. is is a user- >> generated bug, not an EUtilities bug. And it shouldn't be >> unexpected, as NCBI has very specific rules for building queries >> for Entrez (just like any other database). > > So I guess this comes down to something Hilmar mentioned and I > never even considered before. You consider your EUtilities stuff as > a frontend to entrez, and therefore consider valid queries as > queries that are valid for entrez and not eutils? The eutils tools access the same databases as the web page, in the same way, using the same search terms. 
From the EUtilities docs: "The eUtils access the core search and retrieval engine of the Entrez system and, therefore, are only capable of retrieving data that are already in Entrez." > If that's the case, fine. I understand why you don't think this is > a bug. Again, something that might warrant a mention in the POD. > Currently the naming of the modules and the explicit references to > eutils (and me knowing the implementation uses eutils) got me > confused. I'll note that in there is URI encoding in POD, but that should be a no-brainer. I don't think every Bio::DB* class specifies this, mainly because it is taken for granted. Pretty much anything that builds URL strings needs to encode based on the URI standard, and any server that accepts URLs is expected to decode using the same standard. So, again, why does that have to be specifically outlined in POD? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 11:24:39 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:24:39 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251E69.7040507@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> Message-ID: >> I use URI for building the URL with the parameters. URI >> specifically encodes all of this for you, so spaces convert to '+' >> and '+' converts to %2B. > > Well, yes. This causes what I thought of as a bug. It prevents me > from submitting a /correct/ eutils term. However it isn't a bug if > you explain to users they shouldn't be submitting valid eutils > terms, but only valid /entrez/ terms. I can specify in POD that URI encoding is in effect if that placates you, and maybe add a bit about how terms are to be built (based on the website). 
I also noticed that the esearch POD doesn't have a demo in the SYNOPSIS yet (my fault). However, I think this is all a bit silly. This is something most people already realize and take for granted (it's standard for any CGI interface to use URI encoding). Also, most Entrez users do not use a term like 'BRCA2+Human [ORGANISM]'. They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human [ORGANISM]', the latter which is implicit. All of this is on the Entrez website. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From MEC at stowers-institute.org Thu Oct 5 11:12:02 2006 From: MEC at stowers-institute.org (Cook, Malcolm) Date: Thu, 5 Oct 2006 10:12:02 -0500 Subject: [Bioperl-l] using nfreeze instead of freeze in Bio::SeqFeature::Store Message-ID: Lincoln, I committed a change to Bio::SeqFeature::Store to use nfreeze instead of freeze which should allow SeqFeature objects to survive database freeze/thaw cycles across architectures. I hope I was not presumptuous or in error in doing this.... Regards, Malcolm Cook Database Applications Manager - Bioinformatics Stowers Institute for Medical Research - Kansas City, Missouri From bix at sendu.me.uk Thu Oct 5 11:28:55 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 16:28:55 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> <45251A51.6020802@sendu.me.uk> Message-ID: <452524B7.5080003@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> The bug is in the query and not in the code, i.e. is is a >>> user-generated bug, not an EUtilities bug. And it shouldn't be >>> unexpected, as NCBI has very specific rules for building queries for >>> Entrez (just like any other database). >> >> So I guess this comes down to something Hilmar mentioned and I never >> even considered before. 
You consider your EUtilities stuff as a >> frontend to entrez, and therefore consider valid queries as queries >> that are valid for entrez and not eutils? > > The eutils tools access the same databases as the web page, in the same > way, using the same search terms. It doesn't. The eutils interface behaves differently with +s than does the entrez website interface. In eutils + means space, whilst in entrez, + means the plus symbol. >> If that's the case, fine. I understand why you don't think this is a >> bug. Again, something that might warrant a mention in the POD. >> Currently the naming of the modules and the explicit references to >> eutils (and me knowing the implementation uses eutils) got me confused. > > I'll note that in there is URI encoding in POD, but that should be a > no-brainer. Just that it is URI encoded isn't the problem. The problem is the difference in behaviour outlined above. > I don't think every Bio::DB* class specifies this, mainly > because it is taken for granted. Pretty much anything that builds URL > strings needs to encode based on the URI standard, and any server that > accepts URLs is expected to decode using the same standard. > > So, again, why does that have to be specifically outlined in POD? Because they're different. If I construct a valid eutils query it might not work. You ought to explain why. "EUtilities takes any valid entrez query and transforms it into a valid eutils query for submission. 
Do not try to provide a valid eutils query of your own, or the extra transformation will result in no results."

From bix at sendu.me.uk Thu Oct 5 11:30:44 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 16:30:44 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To:
References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk>
Message-ID: <45252524.7030006@sendu.me.uk>

Chris Fields wrote:
>>> I use URI for building the URL with the parameters. URI specifically
>>> encodes all of this for you, so spaces convert to '+' and '+'
>>> converts to %2B.
>>
>> Well, yes. This causes what I thought of as a bug. It prevents me from
>> submitting a /correct/ eutils term. However it isn't a bug if you
>> explain to users they shouldn't be submitting valid eutils terms, but
>> only valid /entrez/ terms.
>
> I can specify in POD that URI encoding is in effect if that placates
> you, and maybe add a bit about how terms are to be built (based on the
> website). I also noticed that the esearch POD doesn't have a demo in
> the SYNOPSIS yet (my fault).
>
> However, I think this is all a bit silly. This is something most people
> already realize and take for granted (it's standard for any CGI
> interface to use URI encoding).
>
> Also, most Entrez users do not use a term like 'BRCA2+Human[ORGANISM]'.
> They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human[ORGANISM]', the
> latter which is implicit. All of this is on the Entrez website.

Exactly. You're assuming an entrez user and expecting an entrez query. I don't think it's silly, given the name of the modules, for the user to assume the code needs an eutils query, which is a different thing with different behaviour /independent/ of URI encoding.
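
The encoding behaviour argued over in this thread (a space becomes '+', a literal '+' becomes %2B) can be sketched in plain Perl. This is only a minimal illustration under stated assumptions: form_encode and form_decode are hypothetical helpers written for this example, not code from Bio::DB::EUtilities or the URI module, and the terms are the ones quoted from the NCBI page. It shows why a term that already uses '+' as a stand-in for a space reaches the server with literal plus signs once it is encoded again.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for this sketch, not a BioPerl or URI call):
# form-style encoding. A space becomes '+'; every character outside the
# unreserved set, including a literal '+', is percent-encoded.
sub form_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~ ])/sprintf('%%%02X', ord $1)/ge;
    $text =~ s/ /+/g;
    return $text;
}

# Hypothetical inverse, roughly what a server does with a query string:
# '+' becomes a space first, then percent escapes are decoded.
sub form_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $text;
}

my $raw_term   = 'BRCA2 AND human[organism]';   # term written with spaces
my $url_style  = 'BRCA2+AND+human[organism]';   # term copied from a eutils URL

print form_encode($raw_term), "\n";    # BRCA2+AND+human%5Borganism%5D
print form_encode($url_style), "\n";   # BRCA2%2BAND%2Bhuman%5Borganism%5D

# After decoding, the second term still contains literal '+' characters,
# so the server does not see the spaces the user intended.
print form_decode(form_encode($url_style)), "\n";   # BRCA2+AND+human[organism]
```

If the module does the encoding, as described above, the term should be passed with plain spaces; a term that has already been URL-encoded by hand ends up encoded twice.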
From cjfields at uiuc.edu Thu Oct 5 11:50:51 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:50:51 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251E69.7040507@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> Message-ID: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> > I know what it says... Ah, that's the Sendu I know and love. > >> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] > > The correct query is the one that has +s in it. Yes, that's because it's a URL, not a raw search term string (it has been URI-encoded so spaces are converted to '+'). If you use that as a direct query in Entrez you will not get the same response. You do get something if you use the new NCBI global query form on the main page, but clicking on the nucleotide or PMC hits reveals that the URL is malformed and no term is present. That is exactly the same response in EUtilities: 0 0 0 Note the QueryTranslation tag is empty. The only noticeable difference is using egquery (which I just fixed in CVS yesterday). The returned XML gives no hits for any database, which is true based on individual esearch queries for those database, and is actually more consistent than the website version. >> I use URI for building the URL with the parameters. URI specifically >> encodes all of this for you, so spaces convert to '+' and '+' >> converts >> to %2B. > > Well, yes. This causes what I thought of as a bug. It prevents me from > submitting a /correct/ eutils term. However it isn't a bug if you > explain to users they shouldn't be submitting valid eutils terms, but > only valid /entrez/ terms. If you mean that most users will actually use a URL-like search term, then I would say you have a point. But that simply isn't the case. If clarifying the docs makes it better, then so be it. 
Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 11:59:53 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:59:53 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45252524.7030006@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> Message-ID: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote: > Chris Fields wrote: >>>> I use URI for building the URL with the parameters. URI >>>> specifically encodes all of this for you, so spaces convert to >>>> '+' and '+' converts to %2B. >>> >>> Well, yes. This causes what I thought of as a bug. It prevents me >>> from submitting a /correct/ eutils term. However it isn't a bug >>> if you explain to users they shouldn't be submitting valid eutils >>> terms, but only valid /entrez/ terms. >> I can specify in POD that URI encoding is in effect if that >> placates you, and maybe add a bit about how terms are to be built >> (based on the website). I also noticed that the esearch POD >> doesn't have a demo in the SYNOPSIS yet (my fault). >> However, I think this is all a bit silly. This is something most >> people already realize and take for granted (it's standard for any >> CGI interface to use URI encoding). >> Also, most Entrez users do not use a term like 'BRCA2+Human >> [ORGANISM]'. They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human >> [ORGANISM]', the latter which is implicit. All of this is on the >> Entrez website. > > Exactly. You're assuming an entrez user and expecting an entrez > query. 
I don't think its silly given the name of the modules for > the user to assume the code needs an eutils query, which is a > different thing with different behaviour /independent/ of URI > encoding. It's a silly distinction. The POD for Bio::DB::EUtilities states: Bio::DB::EUtilities - interface for handling web queries and data retrieval from NCBI's Entrez Utilities. My question is this : why would anyone (particularly the everyday bioperl user) want to use URL-encoded parameters for a query? That seems to be your main argument here. If so, wouldn't I just paste them together then send them off NCBI eutils? Would I devote ~ 10 classes to that? I could do that in a short program using an array, join, and LWP::Simple. The purpose is quite clearly stated, but if you feel that by badgering me to add something to POD I consider common sense, then you're right. You've succeeded. Bravo. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Thu Oct 5 12:02:05 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 17:02:05 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> Message-ID: <45252C7D.3050009@sendu.me.uk> Chris Fields wrote: > >>> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] >> >> The correct query is the one that has +s in it. > > Yes, that's because it's a URL, not a raw search term string (it has > been URI-encoded so spaces are converted to '+'). If you use that as a > direct query in Entrez you will not get the same response. But we're not doing Entrez queries. 
We're using a module called EUtilities to do an eutils query, which involves forming a url in which spaces should be converted to +. That's the source of confusion. Is the user supposed to do this, or is EUtilities?

All you had to do 8 emails ago is tell me that EUtilities is supposed to do that. You /still/ haven't told me that. I give up.

From cjfields at uiuc.edu Thu Oct 5 12:12:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 11:12:11 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45252C7D.3050009@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> <45252C7D.3050009@sendu.me.uk>
Message-ID:

On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>>>> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>>
>>> The correct query is the one that has +s in it.
>> Yes, that's because it's a URL, not a raw search term string (it
>> has been URI-encoded so spaces are converted to '+'). If you use
>> that as a direct query in Entrez you will not get the same response.
>
> But we're not doing Entrez queries. We're using a module called
> EUtilities to do an eutils query, which involves forming a url in
> which spaces should be converted to +. That's the source of
> confusion. Is the user supposed to do this, or is EUtilities?
>
> All you had to do 8 emails ago is tell me that EUtilities is
> supposed to do that. You /still/ haven't told me that. I give up.

It should be apparent from the documentation and the URLs posted in debugging output the first few times you used it. Again, why would I dedicate ~ 10 classes to pasting together URI-encoded strings?

Christopher Fields Postdoctoral Researcher Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Thu Oct 5 12:22:36 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 17:22:36 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> Message-ID: <4525314C.7020205@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote: > >> Exactly. You're assuming an entrez user and expecting an entrez query. >> I don't think its silly given the name of the modules for the user to >> assume the code needs an eutils query, which is a different thing with >> different behaviour /independent/ of URI encoding. > > It's a silly distinction. The POD for Bio::DB::EUtilities states: > > Bio::DB::EUtilities - interface for handling web queries and data > retrieval from NCBI's Entrez Utilities. > > My question is this : why would anyone (particularly the everyday > bioperl user) want to use URL-encoded parameters for a query? Well I'll tell you why I was trying to use URL-encoded parameters, if that helps you any. I read the pod for EUtilities but all the examples have very simple -term s defined with just a single word. So I wonder how I'm supposed to make an 'AND' term. I also have no idea what utilities I'm supposed to use, or what databases etc. I need to get the answer I want. The POD points me here: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html Combined with the EUtilities synopsis I know I'm supposed to start with esearch so I look at: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html And figure out what my terms are supposed to be. 
Then I test some example terms in my web browser using the esearch base url (http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?) to see if they work, and copy/paste the terms into my EUtilities-using perl script, replacing variable terms with perl variables. Then I find that my terms don't work, ask you about it, and you fail to tell me I should be testing my terms at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=gene. If you think I'm stupid, fine, but I'm probably not the only stupid person on the planet. Which is why I suggested a POD addition. You don't have to make any POD change if you don't want to. I simply thought it might help avoid anyone 'badgering' you in the future with a similar problem. From bix at sendu.me.uk Thu Oct 5 12:28:51 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 17:28:51 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> <45252C7D.3050009@sendu.me.uk> Message-ID: <452532C3.9030804@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> >>>>> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] >>>> >>>> The correct query is the one that has +s in it. >>> Yes, that's because it's a URL, not a raw search term string (it has >>> been URI-encoded so spaces are converted to '+'). If you use that as >>> a direct query in Entrez you will not get the same response. >> >> But we're not doing Entrez queries. We're using a module called >> EUtilities to do an eutils query, which involves forming a url in >> which spaces should to be converted to +. That's the source of >> confusion. Is the user supposed to do this, or is EUtilities? 
>> >> All you had to do 8 emails ago is tell me that EUtilities is supposed >> to do that. You /still/ haven't told me that. I give up. > > It should be apparent from the documentation and the URLs posted in > debugging output the first few times you used it. Again, why would I > dedicate ~ 10 classes to pasting together URI-encoded strings? I'm not sure how not doing URI-encoding would suddenly make your classes worthless. I find them to be very useful (even when I didn't know there was any URI-encoding, was incorrectly using +s and it happened to work anyway). From bernd.web at gmail.com Thu Oct 5 10:09:38 2006 From: bernd.web at gmail.com (Bernd Web) Date: Thu, 5 Oct 2006 16:09:38 +0200 Subject: [Bioperl-l] Eutilities Batch Message-ID: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com> Hi, I am using the new EUtilities. It looks great. I was trying to use epost followed by elink but i get an error. The same error is actually given with the example on http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html: Can't call method "get_databases" on an undefined value at EU.pl line 25. For completeness, the code is shown below too. Any suggestions what is going wrong? 
Regards,
Bernd

# chain EUtilities for complex queries

use Bio::DB::EUtilities;

my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                       -db         => 'pubmed',
                                       -term       => 'hutP',
                                       -usehistory => 'y');

$esearch->get_response; # parse the response, fetch a cookie

my $elink = Bio::DB::EUtilities->new(-eutil  => 'elink',
                                     -db     => 'protein,taxonomy',
                                     -dbfrom => 'pubmed',
                                     -cookie => $esearch->next_cookie,
                                     -cmd    => 'neighbor');

# this retrieves the Bio::DB::EUtilities::ElinkData object

my ($linkset) = $elink->next_linkset;
my @ids;

# step through IDs for each linked database in the ElinkData object

for my $db ($linkset->get_databases) {
    @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
    # do something here
}

From cjfields at uiuc.edu Thu Oct 5 13:31:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:31:33 -0500
Subject: [Bioperl-l] Eutilities Batch
In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
Message-ID:

I'll look into it. I'm busy updating the EUtilities tools now.

Chris

On Oct 5, 2006, at 9:09 AM, Bernd Web wrote:

> Hi,
>
> I am using the new EUtilities. It looks great.
> I was trying to use epost followed by elink but i get an error. The
> same error is actually given with the example on
> http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
> Can't call method "get_databases" on an undefined value at EU.pl
> line 25.
>
> For completeness, the code is shown below too.
>
> Any suggestions what is going wrong?
> > Regards, > Bernd > > # chain EUtilities for complex queries > > use Bio::DB::EUtilities; > > my $esearch = Bio::DB::EUtilities->new(-eutil => 'esearch', > -db => 'pubmed', > -term => 'hutP', > -usehistory => 'y'); > > $esearch->get_response; # parse the response, fetch a cookie > > my $elink = Bio::DB::EUtilities->new(-eutil => 'elink', > -db => > 'protein,taxonomy', > -dbfrom => 'pubmed', > -cookie => $esearch- > >next_cookie, > -cmd => 'neighbor'); > > # this retrieves the Bio::DB::EUtilities::ElinkData object > > my ($linkset) = $elink->next_linkset; > my @ids; > > # step through IDs for each linked database in the ElinkData object > > for my $db ($linkset->get_databases) { > @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's > # do something here > } > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From daniel.lang at biologie.uni-freiburg.de Thu Oct 5 13:12:02 2006 From: daniel.lang at biologie.uni-freiburg.de (Daniel Lang) Date: Thu, 05 Oct 2006 19:12:02 +0200 Subject: [Bioperl-l] Bio::DB::SeqFeature Message-ID: <45253CE2.1070208@biologie.uni-freiburg.de> Hi, we are storing Bio::SeqFeature::Gene::GeneStructure objects (with multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db (latest bioperl-live checkout). The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch out of a database. The first observation is that is seems to work (fetched objects behave like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we get these warnings: Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into lib/auto/Storable/_freeze.al) line 287, line 1. Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into lib/auto/Storable/_freeze.al) line 287, line 1. 
(in cleanup) Not a CODE reference at /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1. prepare_cached(SELECT f.id,f.object FROM feature as f WHERE ( f.seqid=? AND f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?) OR (f.tier=? AND f.bin between ? AND ?) OR (f.tier=? AND f.bin between ? AND ?) OR (f.tier=? AND f.bin between ? AND ?) OR (f.tier=? AND f.bin between ? AND ?) OR (f.tier=? AND f.bin between ? AND ?) OR (f.tier=? AND f.bin between ? AND ?)) ) ) statement handle DBI::st=HASH(0x1c317cf0) still Active at /home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 (in cleanup) Not a CODE reference at /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1. Is this something serious? Does this mean that the stored object doesn't have everything it had before freezing? Or are we using Bio::DB::SeqFeature inappropriately? The other question would be, if we can visualize these stored feature objects easily using gbrowse? I didn't find a hint mentioning Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages... Is it working already? Will it? Thanks in advance, Daniel -- Daniel Lang University of Freiburg, Plant Biotechnology Schaenzlestr. 1, D-79104 Freiburg fax: +49 761 203 6945 phone: +49 761 203 6974 homepage: http://www.plant-biotech.net/ e-mail: daniel.lang at biologie.uni-freiburg.de ################################################# My software never has bugs. It just develops random features. 
#################################################

From cjfields at uiuc.edu Thu Oct 5 13:45:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:45:40 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452532C3.9030804@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> <45252C7D.3050009@sendu.me.uk> <452532C3.9030804@sendu.me.uk>
Message-ID: <003DD8C4-6E59-44C2-9A1C-117E036D93BC@uiuc.edu>

On Oct 5, 2006, at 11:28 AM, Sendu Bala wrote:

> I'm not sure how not doing URI-encoding would suddenly make your
> classes worthless. I find them to be very useful (even when I
> didn't know there was any URI-encoding, was incorrectly using +s
> and it happened to work anyway).

That's not my point (and sincerest apologies for the 'badgering' bit). If you made the assumption that all the parameters had to be URI-encoded, why couldn't I do something like:

    my %param = ();   # make up your list of parameters here
    my $eutil = 'esearch';
    my $url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/$eutil.fcgi";
    # join the key value pairs with '=', then join all those with &
    # add to end of url
    # post and retrieve via LWP::Simple

It's more user-friendly to set up the parameters so that you wouldn't have to encode everything yourself, esp. when the most reliable way to encode URI strings is to 'use URI'.

Christopher Fields Postdoctoral Researcher Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 14:11:25 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 13:11:25 -0500 Subject: [Bioperl-l] Eutilities Batch In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com> References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com> Message-ID: <4A340977-C6AD-4728-8947-BF5A8A782807@uiuc.edu> On Oct 5, 2006, at 9:09 AM, Bernd Web wrote: > Hi, > > I am using the new EUtilities. It looks great. > I was trying to use epost followed by elink but i get an error. The > same error is actually given with the example on > http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html: > Can't call method "get_databases" on an undefined value at EU.pl > line 25. > > For completeness, the code is shown below too. > > Any suggestions what is going wrong? > > Regards, > Bernd Grr...that's my error, sorry Bernd. The POD wasn't updated to match the change I made and has a few errors. The elink object, for starters, doesn't fetch the response using get_response(). Also, the ElinkData method has changed slightly but accomplishes the same thing. Odd, since I copied and pasted that from working code... Just a note: these are considered highly experimental at the moment, though they should be ready for general use and toying around. I would like any suggestions on methods and so on you may have (Sendu has made some very helpful ones off-list which I plan on implementing). Feel free to let me know if something doesn't work. Note that, because of their experimental nature, you will want to take note of any methods changes in particular as I try to solidify the API and clean up the POD, so expect some momentary 'outages'. 
I plan on setting up a remedial interface for all the container
objects (like ElinkData) which will help clarify things and solidify
the API in the next few weeks, at least to a point where the class
methods have a consistent naming scheme. I plan on using this as a
backend web agent for a general Entrez interface at some point to get
data into Bio* objects.

In the meantime, try this:

use Bio::DB::EUtilities;

my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                       -db         => 'pubmed',
                                       -term       => 'hutP',
                                       -usehistory => 'y');

$esearch->get_response; # parse the response, fetch a cookie

my $elink = Bio::DB::EUtilities->new(-eutil  => 'elink',
                                     -db     => 'protein,taxonomy',
                                     -dbfrom => 'pubmed',
                                     -cookie => $esearch->next_cookie,
                                     -cmd    => 'neighbor');

$elink->get_response;

# this retrieves the Bio::DB::EUtilities::ElinkData object
my $linkset = $elink->next_linkset;
my @ids;

# step through IDs for each linked database in the ElinkData object
for my $db ($linkset->get_all_linkdbs) {
    @ids = $linkset->get_LinkIds_by_db($db); # returns primary IDs
    print join q(,), @ids; # do something here
}

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From dmessina at wustl.edu Thu Oct 5 14:07:56 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 5 Oct 2006 13:07:56 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
Message-ID: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>

I'm pleased to announce a revised version of the BioPerl Deobfuscator
is now available. Many thanks to Mauricio Cuadra for updating
bioperl.org's installation:

http://bioperl.org/cgi-bin/deob_interface.cgi

I've incorporated many of the suggestions you all sent in after the
first release, and many of the modules that had non-standard
documentation have been updated in the meantime, too, so hopefully
you'll find it much improved. There are still some issues with a few
modules; please report any problems you see.
Also, it's now indexing bioperl-live instead of 1.4, which should
make it a little more useful, too. A complete list of changes is below.

I welcome your bug reports and suggestions for improvements, via
email, this list, Bugzilla, or the Wiki page.

Thanks,
Dave

Changes

0.0.3 Mon Oct 2 20:01:45 CDT 2006

    FIX: change default $deob_detail_path to be a relative URL instead
         of having localhost hardcoded. Thanks to Jason Stajich for
         pointing this out.
    FIX: Bio::Ontology modules are no longer missing their prefix in
         the class list, and their methods are now shown in the lower
         pane as expected. Thanks to Hilmar Lapp for reporting this bug.
    FIX: can now handle (and ignore) VERSION POD section.
    FIX: missing SYNOPSIS section now handled properly. In fact, the
         SYNOPSIS and DESCRIPTION sections can be in reverse order now,
         although for consistency this is not recommended.
    FIX: Bug #2114: "Obfuscator doesn't show "Bio:Matrix:Generic"" has
         been fixed. This bug turned out to afflict multiple modules,
         which weren't getting parsed correctly by deob_index.pl.
    NEW: Table cells have been padded out to get rid of that
         "scrunched" look. Thanks to Sendu Bala for this great
         suggestion.
    NEW: If the 'Returns' subsection of a method's documentation
         contains a POD L<> link, the Deobfuscator assumes this to be a
         package name, and wraps it in an href for display. This
         feature is not robust, but seems to work well enough for now.
    NEW: the list of classes is now sorted alphabetically depth-first,
         so that subclasses appear just after their parent class.
         Thanks to Amir Karger for noticing the strange sorting
         behavior.
    NEW: HTML page title now 'BioPerl Deobfuscator' to distinguish it
         from other Deobfuscators out there. Thanks to Amir Karger for
         suggesting this.
    NEW: 'No match' search string now more prominent. Yep, kudos to
         Amir Karger again -- another great idea!
    NEW: Search box caption now explicitly states that only package
         names can be searched. Big ups to Amir Karger for this
         suggestion. The ability to search method names is planned for
         a future version.
    NEW: added -x option to deob_index.pl. This allows the use of an
         'excluded modules' file. This feature was added to resolve an
         issue with four modules which rely on external modules to
         compile. Class::Inspector, used by the Deobfuscator, needs to
         load a module to traverse its inheritance tree, and modules
         must compile before they can be loaded.
    CHANGE: using short name now when traversing with File::Find to
         help identify excluded modules (deob_index.pl).

From lincoln.stein at gmail.com Thu Oct 5 14:41:08 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:41:08 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To:
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
Message-ID: <6dce9a0b0610051141x6b61407ar1c0a13cf7616b35f@mail.gmail.com>

The non-numeric comparison bug in Bio::DB::SeqFeature is fixed in the
latest CVS. Do I need to do anything special to get the CVS fixes into
the release candidate?

Lincoln

On 10/2/06, Chris Fields wrote:
> > [I won't create a wiki account just to report this.]
> >
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> > not set. Lots of warnings about missing packages and all, but this
> > looks interesting:
> >
> > Argument "+" isn't numeric in numeric lt (<) at
> > Bio/DB/SeqFeature/Segment.pm line 423.
>
> This is verified on Mac OS X.
>
> > Otherwise:
> >
> > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,
> > 99.99% okay.
> >
> > The failed test is:
> >
> > t/ESEfinder..................dubious
> > Test returned status 255 (wstat 65280, 0xff00)
> > DIED. FAILED test 15
>
> What do you get when you run that set of tests using
> 'perl -I. -w t/ESEFinder.t'? The bad status code is odd and could be
> a remote server issue.
>
> Chris
>
> >
> > florin
> >
> > --
> > If we wish to count lines of code, we should not regard them as lines
> > produced but as lines spent.
-- Edsger Dijkstra > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From MEC at stowers-institute.org Thu Oct 5 15:18:08 2006 From: MEC at stowers-institute.org (Cook, Malcolm) Date: Thu, 5 Oct 2006 14:18:08 -0500 Subject: [Bioperl-l] using nfreeze instead of freeze in Bio::SeqFeature::Store Message-ID: Yes, there is overhead (c.f. perldoc Storable) "When writing in network order, all fields are written out as standard lengths, which allows full interworking, but takes longer to read and write)" And, I suppose there is also risk of loosing precision in using network order: You can also store data in network order to allow easy sharing across multiple platforms, or when storing on a socket known to be remotely connected. The routines to call have an initial "n" prefix for *network*, as in "nstore" and "nstore_fd". At retrieval time, your data will be correctly restored so you don't have to know whether you're restoring from native or network ordered data. Double values are stored stringified to ensure portability as well, at the slight risk of loosing some precision in the last decimals. So, I agree, it should be configuration option, perhaps defaulting to using network order. 
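As a side note on the trade-off described above, here is a minimal sketch using only the core Storable module. The hash is a made-up stand-in for a feature object, not the Bio::DB::SeqFeature::Store serializer code. It illustrates that thaw() reads both native-order (freeze) and network-order (nfreeze) images, so switching the writer should not require a change on the reading side:

```perl
use strict;
use warnings;
use Storable qw(freeze nfreeze thaw);

# Hypothetical stand-in for a stored feature; not a real BioPerl object.
my $feature = { seqid => 'chr1', start => 100, end => 200, strand => 1 };

# Native byte order: compact and fast, but tied to this architecture.
my $native = freeze($feature);

# Network byte order: portable across architectures, somewhat slower.
my $network = nfreeze($feature);

# thaw() detects which form it was given, so one reader handles both.
my $from_native  = thaw($native);
my $from_network = thaw($network);

printf "native:  %s:%d-%d\n", @{$from_native}{qw(seqid start end)};
printf "network: %s:%d-%d\n", @{$from_network}{qw(seqid start end)};
```

A configuration switch along the lines proposed in this message would then only need to choose between freeze and nfreeze at store time.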
However, given the factoring of ../Bio/DB/SeqFeature/Store.pm I'm not
sure how to best make it a configuration option since the two provided
serializers don't share a common interface.

Possibly something like:

=head1 Methods for Connecting and Initializing a Database

=head2 new

 Title   : new
 Usage   : $db = Bio::DB::SeqFeature::Store->new(@options)
 Function: connect to a database
 Returns : A descendent of Bio::DB::Seqfeature::Store
 Args    : several - see below
 Status  : public

This class method creates a new database connection. The following
-name=E<gt>$value arguments are accepted:
http://iowg.brcdevel.org/gff3.html#a_fasta

 Name               Value
 ----               -----

 -adaptor           The name of the Adaptor class (default DBI::mysql)

 -serializer        The name of the serializer class (default Storable)

 -network_order     Strive to 'preserve network order' (if the
                    serializer implements it). Currently, only
                    Storable.pm does, and this will cause it to use
                    nfreeze instead of freeze. (default 1)

 -index_subfeatures Whether or not to make subfeatures searchable
                    (default true)

 -cache             Activate LRU caching feature -- size of cache

 -compress          Compresses features before storing them in database
                    using Compress::Zlib

Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri

> -----Original Message-----
> From: Lincoln Stein [mailto:lincoln.stein at gmail.com]
> Sent: Thursday, October 05, 2006 1:43 PM
> To: Cook, Malcolm
> Cc: lstein at cshl.org; bioperl-l
> Subject: Re: using nfreeze instead of freeze in Bio::SeqFeature::Store
>
> I think it's fine unless there is a significant performance hit, in
> which case the change should be made into a configuration option. Do
> you know if there is any overhead on doing this?
> > Lincoln > > On 10/5/06, Cook, Malcolm wrote: > > Lincoln, > > > > I committed a change to Bio::SeqFeature::Store to use > nfreeze instead of > > freeze which should allow SeqFeature objects to survive database > > freeze/thaw cycles across architectures. > > > > I hope I was not presumptuous or in error in doing this.... > > > > Regards, > > > > Malcolm Cook > > Database Applications Manager - Bioinformatics > > Stowers Institute for Medical Research - Kansas City, Missouri > > > > > > > -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > (516) 367-8380 (voice) > (516) 367-8389 (fax) > FOR URGENT MESSAGES & SCHEDULING, > PLEASE CONTACT MY ASSISTANT, > SANDRA MICHELSEN, AT michelse at cshl.edu > From lincoln.stein at gmail.com Thu Oct 5 14:32:40 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Thu, 5 Oct 2006 14:32:40 -0400 Subject: [Bioperl-l] Bio::DB::SeqFeature In-Reply-To: <45253CE2.1070208@biologie.uni-freiburg.de> References: <45253CE2.1070208@biologie.uni-freiburg.de> Message-ID: <6dce9a0b0610051132p7d7fcf84g27578731f9727f3f@mail.gmail.com> Hi Daniel, The warnings you are seeing are occurring because Bio::SeqFeature::Gene::GeneStructure contains a CODE reference. I think it must be registering a cleanup method via its Bio::Root::Root ancestor. When Storable serializes the object, it complains that it can't serialize the CODE reference and instead converts it into the string "CODE(0xXXXXX)". Then, after you thaw the object, Bio::Root::Root is complaining that the CODE reference is invalid because it is a string, not a reference. Yuck. I think, however, that I can fix this by setting some magic variables in Storable version 2.05 that will decompile and compile the CODE references. I will try this and send you a note when the code is in CVS. GBrowse does run off Bio::DB::SeqFeature::Store and is noticeably faster than the original Bio::DB::GFF adaptor. 
Nothing really changes except that you set the db_adaptor option to Bio::DB::SeqFeature::Store. I haven't tried it using Bio::SeqFeature::Gene::GeneStructure, so no guarantees, but I am hopeful that it will work. Lincoln On 10/5/06, Daniel Lang wrote: > Hi, > > we are storing Bio::SeqFeature::Gene::GeneStructure objects (with > multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db > (latest bioperl-live checkout). > > The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch > out of a database. > > The first observation is that is seems to work (fetched objects behave > like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we > get these warnings: > > Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into > lib/auto/Storable/_freeze.al) line 287, line 1. > Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into > lib/auto/Storable/_freeze.al) line 287, line 1. > (in cleanup) Not a CODE reference at > /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1. > prepare_cached(SELECT f.id,f.object > FROM feature as f > WHERE ( f.seqid=? > AND f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?)) > ) > > ) statement handle DBI::st=HASH(0x1c317cf0) still Active at > /home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm > line 1422 > (in cleanup) Not a CODE reference at > /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1. > > Is this something serious? Does this mean that the stored object doesn't > have everything it had before freezing? Or are we using > Bio::DB::SeqFeature inappropriately? > > The other question would be, if we can visualize these stored feature > objects easily using gbrowse? 
I didn't find a hint mentioning > Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages... > Is it working already? Will it? > > Thanks in advance, > Daniel > > -- > > Daniel Lang > University of Freiburg, Plant Biotechnology > Schaenzlestr. 1, D-79104 Freiburg > fax: +49 761 203 6945 > phone: +49 761 203 6974 > homepage: http://www.plant-biotech.net/ > e-mail: daniel.lang at biologie.uni-freiburg.de > > ################################################# > My software never has bugs. > It just develops random features. > ################################################# > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From hlapp at gmx.net Thu Oct 5 16:34:49 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 16:34:49 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4525314C.7020205@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> Message-ID: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote: > If you think I'm stupid, fine, but I'm probably not the only stupid > person on the planet. That's a great suggestion that I hope we can all agree on? 
I'll happily count myself among the stupid ones too so you're not alone, and stupid people and even more so those who are lucky enough not to be stupid have an obligation to document stuff so that even the stupid can understand, no matter how silly the documentation might get. Is that agreeable without causing yet more progressive hair loss? Actually - I'm having second thoughts. Isn't it a distinguishing feature of stupid people that - among other things - they are stupid enough to believe they don't need to read documentation? You admitted publicly that you read documentation - are you just faking the stupid? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Thu Oct 5 17:11:06 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 16:11:06 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> Message-ID: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> On Oct 5, 2006, at 3:34 PM, Hilmar Lapp wrote: > > On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote: > >> If you think I'm stupid, fine, but I'm probably not the only stupid >> person on the planet. > > That's a great suggestion that I hope we can all agree on? 
I'll > happily count myself among the stupid ones too so you're not alone, > and stupid people and even more so those who are lucky enough not > to be stupid have an obligation to document stuff so that even the > stupid can understand, no matter how silly the documentation might > get. > > Is that agreeable without causing yet more progressive hair loss? > > Actually - I'm having second thoughts. Isn't it a distinguishing > feature of stupid people that - among other things - they are > stupid enough to believe they don't need to read documentation? You > admitted publicly that you read documentation - are you just faking > the stupid? > > -hilmar If lack of good documentation == stupid, I know of a few other modules in trouble besides mine. Based on that we're in for a whole lot of stupid! And I feel stupid for my earlier remarks, Sendu, so apologies. And Hilmar, you're too late on the hair loss, at least on my end. I have corrected the EUtilities POD to reflect that all text input needs to be raw as URI encoding is done in the module, which should work (I think). I plan on committing it tonight. It also indicates that EUtilities search queries need to be made as if they are regular Entrez queries. Would that be sufficient? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From pmiguel at purdue.edu Thu Oct 5 16:42:00 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Thu, 05 Oct 2006 16:42:00 -0400 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> Message-ID: <45256E18.3080103@purdue.edu> David Messina wrote: > I'm pleased to announce a revised version of the BioPerl Deobfuscator > is now available. 
Many thanks to Mauricio Cuadra for updating > bioperl.org's installation: > > http://bioperl.org/cgi-bin/deob_interface.cgi > > I've incorporated many of the suggestions you all sent in after the > first release, and many of the modules that had non-standard > documentation have been updated in the meantime, too, so hopefully > you'll find it much improved. There are still some issues with a few > modules; please report any problems you see. Also, it's now indexing > bioperl-live instead of 1.4, which should make it a little more > useful, too. A complete list of changes is below. > > I welcome your bug reports and suggestions for improvements, via > email, this list, Bugzilla, or the Wiki page. > > > Thanks, > Dave > > Here are some comments: Would be good to have the column headings for the methods table in the fixed part of the page, rather than the scroll box. That way you could always see the column headings from anywhere in the list. Second, I've noticed that there are a fair number of methods that have "not documented" for "Returns" and "Usage". But in every case I've checked both of these were documented. For example, consider methods for Bio::Seq::SeqWithQuality. The method "accession_number" is listed as "not documented". But if you click on Bio::Seq:SeqWithQuality link to the documentation, usage is defined as: "$unique_biological_key = $obj->accession_number;" and returns is defined as "A string". Finally, it would be good to have the version of bioperl being deobfuscated on the deob_interface.cgi page. Just as a quick sanity-checking measure. After poking around a bit I found that bioperl-live is being indexed in the wiki. But, I can tell, it is just the sort of thing I'm going to forget and look for every time come back to the page after a few months... Overall very nice, though. Just what is needed when I'm trying to remember "which was the method that returns subseq string and which one returns an object?" 
Phillip SanMiguel Purdue University From bix at sendu.me.uk Thu Oct 5 17:24:34 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 22:24:34 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> Message-ID: <45257812.5050008@sendu.me.uk> Chris Fields wrote: > > I have corrected the EUtilities POD to reflect that all text input needs > to be raw as URI encoding is done in the module, which should work (I > think). I plan on committing it tonight. It also indicates that > EUtilities search queries need to be made as if they are regular Entrez > queries. Would that be sufficient? You may not even need to mention anything about URI encoding, which might frighten some people. Something as simple as: =head1 SYNOPSIS use Bio::DB::EUtilities; my $esearch = Bio::DB::EUtilities->new(-eutil => 'esearch', -db => 'pubmed', -term => 'hutP AND xyz', ... and/or some POD for the new() method: =head2 new Title : new ... Args : -eutil => ... -db => ... -term => string, an entrez-style query =cut would get the point across, I think. BTW, can the term string be supplied anywhere else other than new()? It doesn't matter at all if it can't, I'm just idly wondering if I missed anything. 
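To make the encoding point in this thread concrete, here is a rough sketch that uses the URI module directly rather than the EUtilities classes. It assumes the CPAN URI distribution is installed; the parameter values are illustrative and nothing is sent over the network. It only shows that a raw, Entrez-style term can be handed over unencoded and escaped at the point where the query string is built:

```perl
use strict;
use warnings;
use URI;

# Base URL as quoted earlier in the thread; 'esearch' is the eutil name.
my $eutil = 'esearch';
my $uri   = URI->new("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/$eutil.fcgi");

# Raw, human-readable parameters (illustrative values only).
my @param = (
    db   => 'pubmed',
    term => 'hutP AND xyz',
);

# query_form() escapes keys and values while building the query string,
# so the caller never has to encode the term by hand.
$uri->query_form(@param);

my $url = $uri->as_string;
print "$url\n";
```

This is not how Bio::DB::EUtilities is implemented; it is only meant to show why documenting -term as a plain Entrez-style query is enough for the user.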
From dmessina at wustl.edu Thu Oct 5 17:42:49 2006 From: dmessina at wustl.edu (David Messina) Date: Thu, 5 Oct 2006 16:42:49 -0500 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <45256E18.3080103@purdue.edu> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> Message-ID: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu> Thanks so much, Phillip, for taking the time to check out the new version and send your comments. I really appreciate it! I've added them to the wiki page so I can track them. Best, Dave From cjfields at uiuc.edu Thu Oct 5 17:50:11 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 16:50:11 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45257812.5050008@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> <45257812.5050008@sendu.me.uk> Message-ID: Sendu, I have the parameters all set up as get/sets at this point, but I'm open to suggestions on that. Note in the BEGIN block the heredoc eval {} block. Yes, nasty I know, but I hate AUTOLOAD. It works as a quick way of getting parameter get/sets up-and-running. I plan on making those explicit get/sets as soon as I can then sorting out particular ones to the various eutil modules where they are primarily used. Long story short, every parameter is a get/set at this time (including term()). The common ones needed for most EUtilities are initialized in the parent EUtilities::_initialize(), and eutil- specific parameters are initialized in the individual eutil plugins. 
Each eutil plugin only sets whatever parameters may be needed for operation (though you could circumvent that, since all of them are inherited via EUtilities). We could always simplify it to accept simple key-value pairs, but get/ sets (at least to me) allow more flexibility as long as you remember which parameters are set and to what. Chris On Oct 5, 2006, at 4:24 PM, Sendu Bala wrote: > Chris Fields wrote: >> I have corrected the EUtilities POD to reflect that all text input >> needs to be raw as URI encoding is done in the module, which >> should work (I think). I plan on committing it tonight. It also >> indicates that EUtilities search queries need to be made as if >> they are regular Entrez queries. Would that be sufficient? > > You may not even need to mention anything about URI encoding, which > might frighten some people. Something as simple as: > > =head1 SYNOPSIS > > use Bio::DB::EUtilities; > > my $esearch = Bio::DB::EUtilities->new(-eutil => 'esearch', > -db => 'pubmed', > -term => 'hutP AND > xyz', > ... > > and/or some POD for the new() method: > > =head2 new > > Title : new > ... > Args : -eutil => ... > -db => ... > -term => string, an entrez-style query > > =cut > > would get the point across, I think. > > BTW, can the term string be supplied anywhere else other than new > ()? It doesn't matter at all if it can't, I'm just idly wondering > if I missed anything. Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 17:51:06 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 16:51:06 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45257812.5050008@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> <45257812.5050008@sendu.me.uk> Message-ID: <5B2E844F-7B8B-4F69-9005-138826B835FB@uiuc.edu> > You may not even need to mention anything about URI encoding, which > might frighten some people. Something as simple as: > > =head1 SYNOPSIS > > use Bio::DB::EUtilities; > > my $esearch = Bio::DB::EUtilities->new(-eutil => 'esearch', > -db => 'pubmed', > -term => 'hutP AND > xyz', > ... > > and/or some POD for the new() method: > > =head2 new > > Title : new > ... > Args : -eutil => ... > -db => ... > -term => string, an entrez-style query > > =cut > > would get the point across, I think. Oops, forgot. I'll add this in and update new() when I can. Thanks! Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From arareko at campus.iztacala.unam.mx Thu Oct 5 18:12:49 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 05 Oct 2006 17:12:49 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45256E18.3080103@purdue.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu>
Message-ID: <45258361.8080803@campus.iztacala.unam.mx>

Phillip San Miguel wrote:
> Finally, it would be good to have the version of bioperl being
> deobfuscated on the deob_interface.cgi page. Just as a quick
> sanity-checking measure. After poking around a bit I found that
> bioperl-live is being indexed in the wiki. But, I can tell, it is just
> the sort of thing I'm going to forget and look for every time come back
> to the page after a few months...

Dave,

I think this value can be stored in one of the index files and passed
as an argument to the deob_index.pl script. What do you think?

Mauricio.

--
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From lincoln.stein at gmail.com Thu Oct 5 14:42:41 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:42:41 -0400
Subject: [Bioperl-l] using nfreeze instead of freeze in Bio::SeqFeature::Store
In-Reply-To:
References:
Message-ID: <6dce9a0b0610051142h56479843ofc5429d959cb6e3@mail.gmail.com>

I think it's fine unless there is a significant performance hit, in
which case the change should be made into a configuration option. Do
you know if there is any overhead on doing this?

Lincoln

On 10/5/06, Cook, Malcolm wrote:
> Lincoln,
>
> I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
> freeze which should allow SeqFeature objects to survive database
> freeze/thaw cycles across architectures.
> > I hope I was not presumptuous or in error in doing this.... > > Regards, > > Malcolm Cook > Database Applications Manager - Bioinformatics > Stowers Institute for Medical Research - Kansas City, Missouri > > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From torsten.seemann at infotech.monash.edu.au Fri Oct 6 01:26:10 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Fri, 06 Oct 2006 15:26:10 +1000 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> References: <452344D4.8070908@infotech.monash.edu.au> <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> Message-ID: <4525E8F2.1000704@infotech.monash.edu.au> Hilmar, > I don't think there's a need to deprecate - if the methods just plain > delegate to whatever File:: module is appropriate their > implementation (supposedly) will become very simple and hence won't > pose a maintenance burden anymore. >> I have an uncommitted simplified version of Bio::Root::IO which does >> this, and "all tests pass". The functions currently (silently) >> dispatch >> directly to their native counterparts. >> >> The only tricky function is tempfile() which is *mostly* like >> File::Temp::tempfile(), but does some voodoo of converting >> (TEMPLATE=>'xxx') to the non-hash first parameter of the File:: >> version, >> so I'm hesitant to commit. It may do other magic - Hilmar? > > Not that I would know of. If the tests pass (without having to change > them!) I'd give it a try. Tempfile.t had two tests that failed. It seems that Bio::Root::IO had some magic whereby it would keep a list of all tempfilenames created with UNLINK != 0 and when the Bio::Root::IO object was destroyed (eg. undef $obj) it would MANUALLY unlink each of them. 
This would occur before File::Temp got to unlink them. Not sure why it was written like this (as File::Temp will delete them at the end of the script anyway) but maybe it was legacy for when File::Temp::tempfile WASN'T available. Anyway, I've kept backward compatibility there, although I think eventually it should be removed and Tempfile.t adjusted. Although all tests pass with my new trim Bio/Root/IO.pm I am still concerned about committing as the assumption is that the BioPerl test suite is good enough to handle such a change to an important module, but the reality may be different :-) Let me know if you think I should commit anyway, Your advice is appreciated. -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From dmessina at wustl.edu Fri Oct 6 01:25:56 2006 From: dmessina at wustl.edu (David Messina) Date: Fri, 6 Oct 2006 00:25:56 -0500 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <45258361.8080803@campus.iztacala.unam.mx> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> <45258361.8080803@campus.iztacala.unam.mx> Message-ID: On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote: > I think this value can be stored in one of the index files and > passed as an argument to the deob_index.pl script. What do you think? Yep, I think that works nicely. I added this feature and committed it to CVS. Here's what the new header looks like if you do deob_index.pl -s "bioperl-live": [image: deob_header.jpg] Thanks for the suggestions, guys. Dave [Attachment scrubbed by the list archive: deob_header.jpg, image/jpeg, 25739 bytes] From deep_ans at yahoo.com Fri Oct 6 09:22:49 2006 From: deep_ans at yahoo.com (deepak shingan) Date: Fri, 6 Oct 2006 06:22:49 -0700 (PDT) Subject: [Bioperl-l] Sort blast file result according to evalues Message-ID: <20061006132249.49450.qmail@web51711.mail.yahoo.com> Hi, Is there any way to parse the blast file according to the evalue of each hit? I want the output sorted by hit evalue. I am using SearchIO and have already tried sorting the hits by bits and gaps, but I am not able to sort them by evalue, as evalues are mainly associated with HSPs and each hit may have multiple HSPs. Waiting for help. Thanks, Dun Dansi From hlapp at gmx.net Fri Oct 6 10:03:04 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 6 Oct 2006 10:03:04 -0400 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <4525E8F2.1000704@infotech.monash.edu.au> References: <452344D4.8070908@infotech.monash.edu.au> <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> <4525E8F2.1000704@infotech.monash.edu.au> Message-ID: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net> This is a 1.5, i.e. developers release that's in the works, and also you'd be doing this on the main trunk. If you get the tests to pass there's no reason to hold back. You may be right and in reality it has repercussions somewhere, but those will be the opportunities to improve our test suite. -hilmar On Oct 6, 2006, at 1:26 AM, Torsten Seemann wrote: > Although all tests pass with my new trim Bio/Root/IO.pm I am still > concerned about committing as the assumption is that the BioPerl > test suite is good enough to handle such a change to an important > module, but the reality may be different :-) > > Let me know if you think I should commit anyway, > > Your advice is appreciated.
-- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Oct 6 10:58:09 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 6 Oct 2006 09:58:09 -0500 Subject: [Bioperl-l] Sort blast file result according to evalues In-Reply-To: <20061006132249.49450.qmail@web51711.mail.yahoo.com> References: <20061006132249.49450.qmail@web51711.mail.yahoo.com> Message-ID: The evalue for the hit is retrieved by the BlastHit::significance() method, if I remember correctly. So if $hit is a Bio::Search::Hit::BlastHit object, you use $hit->significance. If you want individual HSP evalues, you would use $hsp->evalue for the individual HSP objects. The hits are normally returned in the order they appear in the alignments and table, which is typically by increasing evalue or decreasing bits (score). So they are already sorted. If you wanted to run a sort yourself you could use a sort block such as '{$a->significance() <=> $b->significance()} @hits', but as pointed out on the wiki it may be safer to run a Schwartzian transform instead: http://www.bioperl.org/wiki/Bioperl_Best_Practices#Sorting Chris On Oct 6, 2006, at 8:22 AM, deepak shingan wrote: > Hi , > Is there any way to parse the blast file according to evalue for > each hit. I want the output sorted according to hit evalue. I am > using SearchIO algorithm and already tried sorting the hits > according to bits, gaps, but I am not able to sort the hits by evalue. > As evalues are mainly associated with hsp and each hit may have > multiple hsps. > > waiting for help. > > Thanks, > Dun Dansi
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Oct 6 11:03:45 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 6 Oct 2006 10:03:45 -0500 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net> References: <452344D4.8070908@infotech.monash.edu.au> <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> <4525E8F2.1000704@infotech.monash.edu.au> <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net> Message-ID: <265AD609-F74E-4545-B3DD-FF94290BE0B4@uiuc.edu> On Oct 6, 2006, at 9:03 AM, Hilmar Lapp wrote: > This is a 1.5, i.e. developers release that's in the works, and also > you'd be doing this on the main trunk. If you get the tests to pass > there's no reason to hold back. > > You may be right and in reality it has repercussions somewhere, but > those will be the opportunities to improve our test suite. > > -hilmar Agreed, though I think Sendu only wants bug fixes for 1.5.2. You could always commit to CVS HEAD and it could be in 1.5.3. Let me rethink that. There were some subtle tempfile/tempdir issues popping up on WinXP where some tempfiles were not being deleted because of permissions issues; I had planned on adding that to Bugzilla today or tomorrow. Maybe changing to File::Temp would fix that, so in essence it would be a bug fix! I'll go ahead and post the bug. Chris >> Although all tests pass with my new trim Bio/Root/IO.pm I am still >> concerned about committing as the assumption is that the BioPerl >> test suite is good enough to handle such a change to an important >> module, but the reality may be different :-) >> >> Let me know if you think I should commit anyway, >> >> Your advice is appreciated.
> > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From pmiguel at purdue.edu Fri Oct 6 11:06:56 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Fri, 06 Oct 2006 11:06:56 -0400 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu> Message-ID: <45267110.7030905@purdue.edu> David Messina wrote: > Thanks so much, Phillip, for taking the time to check out the new > version and send your comments. I really appreciate it! I've added > them to the wiki page so I can track them. > > Best, > Dave > Dave, No problem. I've just added a "keyword" to search BioPerl Deobfuscator to my Firefox browser. That way I can just type "deob qual" in my URL bar in firefox and the browser jumps directly to BioPerl Deobfuscator (like a bookmark) but it pre-submits the search item "qual". I heard about the Firefox "keywords" in a TWiT/FLOSS episode on mozilla. You just go to any search page and right-click in the search box of interest and one of the choices is "Add a Keyword for this Search". Then you just have to fill out "Name" and "Keyword" fields and drop the keyword into whatever folder you like. The "Keyword" then becomes the word to invoke that search with parameters that follow it when it is typed into the URL bar. 
Phillip From arareko at campus.iztacala.unam.mx Fri Oct 6 11:18:02 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Fri, 06 Oct 2006 10:18:02 -0500 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> <45258361.8080803@campus.iztacala.unam.mx> Message-ID: <452673AA.7070305@campus.iztacala.unam.mx> Looks great! I'll update it during the weekend. Mauricio. David Messina wrote: > > On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote: >> I think this value can be stored in one of the index files and passed >> as an argument to the deob_index.pl script. What do you think? > > Yep, I think that works nicely. I added this feature and committed it to > CVS. Here's what the new header looks like if you do deob_index.pl -s > "bioperl-live": > > > Thanks for the suggestions, guys. > > Dave > > > ------------------------------------------------------------------------ > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From bix at sendu.me.uk Fri Oct 6 11:27:14 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 06 Oct 2006 16:27:14 +0100 Subject: [Bioperl-l] Sort blast file result according to evalues In-Reply-To: References: <20061006132249.49450.qmail@web51711.mail.yahoo.com> Message-ID: <452675D2.9090803@sendu.me.uk> Chris Fields wrote: > The evalue for the hit is retrieved by the BlastHit::significance() > method, if I remember correctly. So if $hit is a > Bio::Search::Hit::BlastHit object, you use $hit->significance. If > you want individual HSP evalues, you would use $hsp->evalue for the > individual HSP objects.
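A minimal sketch of ordering hits on that per-hit value is given below. MockHit is a hypothetical stand-in so the example runs without BioPerl installed; with real data the list would hold hit objects taken from a SearchIO result and only the significance() call is assumed to carry over.

```perl
use strict;
use warnings;

package MockHit;   # hypothetical stand-in for a Bio::Search::Hit::BlastHit
sub new          { my ($class, %args) = @_; return bless {%args}, $class }
sub name         { return $_[0]{name} }
sub significance { return $_[0]{significance} }

package main;

my @hits = (
    MockHit->new(name => 'hitA', significance => 1e-5),
    MockHit->new(name => 'hitB', significance => 3e-40),
    MockHit->new(name => 'hitC', significance => 0.002),
);

# Schwartzian transform: call significance() once per hit, sort on the
# cached number in ascending order (best evalue first), then unwrap.
my @sorted =
    map  { $_->[1] }
    sort { $a->[0] <=> $b->[0] }
    map  { [ $_->significance, $_ ] } @hits;

print join(' ', map { $_->name } @sorted), "\n";   # prints "hitB hitA hitC"
```

Caching the value first avoids repeated method calls inside the comparison, which matters mostly for long hit lists.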
> > The output is normally sorted by the order they appear in the > alignments and table, which is typically by increasing evalue or > decreasing bits (score). So they are already sorted. Concur. > If you wanted to run a sort yourself you could use a sort block using > '{$a->significance() <=> $b->significance()} @hits' Actually, it is best to use the sort_hits() method of the result object prior to asking for any hits. (As this allows for potential optimization in the parser.) ->significance is still the thing you need to sort on though. From cjfields at uiuc.edu Fri Oct 6 11:52:57 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 6 Oct 2006 10:52:57 -0500 Subject: [Bioperl-l] Sort blast file result according to evalues In-Reply-To: <452675D2.9090803@sendu.me.uk> References: <20061006132249.49450.qmail@web51711.mail.yahoo.com> <452675D2.9090803@sendu.me.uk> Message-ID: <31A6FC3A-8BEB-42B8-B51D-66E659EF7495@uiuc.edu> On Oct 6, 2006, at 10:27 AM, Sendu Bala wrote: >> If you wanted to run a sort yourself you could use a sort block using >> '{$a->significance() <=> $b->significance()} @hits' > > Actually, it is best to use the sort_hits() method of the result > object > prior to asking for any hits. (As this allows for potential > optimization > in the parser.) Ah, forgot about that one! Chris Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Fri Oct 6 14:36:49 2006 From: jason at bioperl.org (Jason Stajich) Date: Fri, 6 Oct 2006 11:36:49 -0700 Subject: [Bioperl-l] tempfile cleanup In-Reply-To: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu> References: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu> Message-ID: <0FCEC6B2-E190-4800-AAB1-89559C552FA6@bioperl.org> I think the magic trickery in there for cleanup exists because File::Temp only cleans up tempfiles when Perl exits, not when the Root::IO object goes out of scope -- so this can be a problem for people running CGI scripts that stay resident in memory and never have their tempfiles cleaned up. Managing the list ourselves allows us to call _cleanup periodically (perhaps before the start of every Blast run) to ensure that tempfiles are removed. Perhaps newer File::Temp versions can solve this better now, but I believe that was the behavior we were trying to deal with by having the Root::IO object manage the list of to-be-deleted files. This is some hackery that also had to do with not expecting File::Temp to be installed, I believe. -jason From torsten.seemann at infotech.monash.edu.au Mon Oct 9 00:52:29 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Mon, 09 Oct 2006 14:52:29 +1000 Subject: [Bioperl-l] Multiple packages in the one .pm file Message-ID: <4529D58D.1080004@infotech.monash.edu.au> Hi all, The following modules have more than one "package xxxx;" declaration in them. For small, internal classes I guess this is fine, but for others, they should be split up into the filesystem - otherwise they are troublesome to locate and the online documentation doesn't list them! eg.
bioperl-run/Bio/Tools/Run/Analysis/Job.pm is in bioperl-run/Bio/Tools/Run/Analysis.pm Here's the culprits: % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | sed 's/:.*$//' | sort | uniq -d ; done bioperl-live/Bio/AnalysisI.pm bioperl-live/Bio/DB/Fasta.pm bioperl-live/Bio/DB/GFF.pm bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm bioperl-live/Bio/DB/SeqFeature/Store/memory.pm bioperl-live/Bio/SeqIO/interpro.pm bioperl-run/Bio/Tools/Run/Analysis.pm bioperl-run/Bio/Tools/Run/Analysis/soap.pm -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From pmiguel at purdue.edu Mon Oct 9 15:57:12 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Mon, 09 Oct 2006 15:57:12 -0400 Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC? Message-ID: <452AA998.5010104@purdue.edu> I found a bug in Bio::SeqIO::phd and am wondering if the fix will propagate into the next release candidate? The bug is here: http://bugzilla.open-bio.org/show_bug.cgi?id=2120 I also created a patch that fixes it (on my machine, anyway). It is a fairly minor change, so it seems like it would be worth propagating it into the next release candidate. -- Phillip SanMiguel From bix at sendu.me.uk Mon Oct 9 16:57:28 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 09 Oct 2006 21:57:28 +0100 Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC? In-Reply-To: <452AA998.5010104@purdue.edu> References: <452AA998.5010104@purdue.edu> Message-ID: <452AB7B8.4040404@sendu.me.uk> Phillip San Miguel wrote: > I found a bug in Bio::SeqIO::phd and am wondering if the fix will > propagate into the next release candidate? > > The bug is here: > > http://bugzilla.open-bio.org/show_bug.cgi?id=2120 > > I also created a patch that fixes it (on my machine, anyway). 
It is a > fairly minor change, so it seems like it would be worth propagating it > into the next release candidate. If it gets committed to HEAD before I make the next candidate, then yes. I'll do that if no one beats me to it (and if someone does, please add a new test for this). BTW Phillip, thank you for the bug report but in future use the attachment capabilities for files, please don't paste them into the comments box. From bix at sendu.me.uk Mon Oct 9 17:01:56 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 09 Oct 2006 22:01:56 +0100 Subject: [Bioperl-l] Analysis soap problem Message-ID: <452AB8C4.1010704@sendu.me.uk> I thought I'd 'advertise' this bug on the list so more people see it: http://bugzilla.open-bio.org/show_bug.cgi?id=2117 I don't want to make the next 1.5.2 release candidate until its fixed. Does anyone have any idea about it? Even if you can't fix it, just explaining what's (supposed) to be going on would help a lot. Thank you, Sendu. From Kevin.M.Brown at asu.edu Mon Oct 9 18:40:54 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Mon, 9 Oct 2006 15:40:54 -0700 Subject: [Bioperl-l] Analysis soap problem Message-ID: <1A4207F8295607498283FE9E93B775B40219690B@EX02.asurite.ad.asu.edu> If I had to guess from looking at the snippet provided, the variable $seq holds no data so when you try to setup the regex /^$seq$/ you end up with /^$/ (blank line) and the warning. > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Monday, October 09, 2006 2:02 PM > To: bioperl-l List > Subject: [Bioperl-l] Analysis soap problem > > I thought I'd 'advertise' this bug on the list so more people see it: > http://bugzilla.open-bio.org/show_bug.cgi?id=2117 > > I don't want to make the next 1.5.2 release candidate until > its fixed. > Does anyone have any idea about it? 
Even if you can't fix it, just > explaining what's (supposed) to be going on would help a lot. > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Mon Oct 9 22:34:23 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 9 Oct 2006 21:34:23 -0500 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <452AB8C4.1010704@sendu.me.uk> References: <452AB8C4.1010704@sendu.me.uk> Message-ID: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> I have 'fixed' this in CVS. Note the quotes; it depends on what you might consider fixed. Multiple calls to results() were returning empty hash refs, so no data was being returned. For now, I stored the hash reference in a variable then tested each one. All tests now pass, including the 'outseq' one. Maybe it's just me, but shouldn't results() either consistently return the same information, or contain documentation that it doesn't do so? Anyway, I have left the bugzilla report open for now. Chris On Oct 9, 2006, at 4:01 PM, Sendu Bala wrote: > I thought I'd 'advertise' this bug on the list so more people see it: > http://bugzilla.open-bio.org/show_bug.cgi?id=2117 > > I don't want to make the next 1.5.2 release candidate until its fixed. > Does anyone have any idea about it? Even if you can't fix it, just > explaining what's (supposed) to be going on would help a lot. > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bosborne11 at verizon.net Mon Oct 9 22:09:45 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 09 Oct 2006 22:09:45 -0400 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au> Message-ID: Torsten, Fixed interpro.pm, it could have been written more simply (or more like other SeqIO modules). Can't really address the others. Brian O. On 10/9/06 12:52 AM, "Torsten Seemann" wrote: > Hi all, > > The following modules have more than one "package xxxx;" declaration in > them. For small, internal classes I guess this is fine, but for others, > they should be split up into the filesystem - otherwise they are > troublesome to locate and the online documentation doesn't list them! > > eg. > bioperl-run/Bio/Tools/Run/Analysis/Job.pm > is in > bioperl-run/Bio/Tools/Run/Analysis.pm > > Here's the culprits: > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | > sed 's/:.*$//' | sort | uniq -d ; done > > bioperl-live/Bio/AnalysisI.pm > bioperl-live/Bio/DB/Fasta.pm > bioperl-live/Bio/DB/GFF.pm > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > bioperl-live/Bio/SeqIO/interpro.pm > > bioperl-run/Bio/Tools/Run/Analysis.pm > bioperl-run/Bio/Tools/Run/Analysis/soap.pm From bix at sendu.me.uk Tue Oct 10 03:03:20 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 08:03:20 +0100 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> References: <452AB8C4.1010704@sendu.me.uk> <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> Message-ID: <452B45B8.8010401@sendu.me.uk> Chris Fields wrote: > I have 'fixed' this in CVS. Note the quotes; it depends on what you > might consider fixed. 
Multiple calls to results() were returning > empty hash refs, so no data was being returned. For now, I stored > the hash reference in a variable then tested each one. All tests now > pass, including the 'outseq' one. > > Maybe it's just me, but shouldn't results() either consistently > return the same information, or contain documentation that it doesn't > do so? Anyway, I have left the bugzilla report open for now. Judging by the tests there seems a clear expectation that multiple calls to results() should work, and certainly that makes sense and seems natural. So I'd say that results() should be fixed and the test script reverted. From cjfields at uiuc.edu Tue Oct 10 07:42:33 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 06:42:33 -0500 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <452B45B8.8010401@sendu.me.uk> References: <452AB8C4.1010704@sendu.me.uk> <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> <452B45B8.8010401@sendu.me.uk> Message-ID: I agree, though I think Martin Senger should be contacted, at least to get his thoughts. Has anyone tried yet? Chris On Oct 10, 2006, at 2:03 AM, Sendu Bala wrote: > Chris Fields wrote: >> I have 'fixed' this in CVS. Note the quotes; it depends on what you >> might consider fixed. Multiple calls to results() were returning >> empty hash refs, so no data was being returned. For now, I stored >> the hash reference in a variable then tested each one. All tests now >> pass, including the 'outseq' one. >> >> Maybe it's just me, but shouldn't results() either consistently >> return the same information, or contain documentation that it doesn't >> do so? Anyway, I have left the bugzilla report open for now. > > Judging by the tests there seems a clear expectation that multiple > calls > to results() should work, and certainly that makes sense and seems > natural. So I'd say that results() should be fixed and the test script > reverted. 
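The workaround described in this thread (call results() once, keep the reference, test each entry) can be illustrated with a mock. MockJob is entirely hypothetical and only mimics the reported symptom of later calls returning an empty hash ref; it says nothing about the actual cause inside Bio::Tools::Run::Analysis::soap, and the key names are made up.

```perl
use strict;
use warnings;

package MockJob;   # hypothetical; mimics a results() that only delivers once

sub new {
    return bless { data => { report => 'ok', outseq => 'ACGT' } }, shift;
}

sub results {
    my $self = shift;
    my $data = delete $self->{data};   # the first call hands over the data
    return $data || {};                # later calls return an empty hash ref
}

package main;

my $job = MockJob->new;

# Workaround: call results() once, keep the reference, test each key.
my $results = $job->results;
for my $key (qw(report outseq)) {
    print "$key: ", (exists $results->{$key} ? 'present' : 'missing'), "\n";
}

# A second call on the mock comes back empty, which is the symptom reported.
my $again = $job->results;
print "keys on second call: ", scalar(keys %$again), "\n";
```

A results() that returned the same data on every call would make the stored reference unnecessary, which is the behaviour the tests appear to expect.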
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Tue Oct 10 08:14:31 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 13:14:31 +0100 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: References: <452AB8C4.1010704@sendu.me.uk> <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> <452B45B8.8010401@sendu.me.uk> Message-ID: <452B8EA7.1080800@sendu.me.uk> Chris Fields wrote: > I agree, though I think Martin Senger should be contacted, at least to > get his thoughts. Has anyone tried yet? He's CCd on the bug report, but I haven't tried directly, no. Do you want to tackle this (contacting him and/or fixing the bug)? Cheers, Sendu. From cjfields at uiuc.edu Tue Oct 10 09:20:03 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 08:20:03 -0500 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <452B8EA7.1080800@sendu.me.uk> Message-ID: <001801c6ec6e$cc016900$15327e82@pyrimidine> I'll try giving it a closer look, just didn't have much time yesterday. I'll also try contacting Martin. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Tuesday, October 10, 2006 7:15 AM > To: bioperl-l > Subject: Re: [Bioperl-l] Analysis soap problem > > Chris Fields wrote: > > I agree, though I think Martin Senger should be contacted, at least to > > get his thoughts. Has anyone tried yet? > > He's CCd on the bug report, but I haven't tried directly, no. Do you > want to tackle this (contacting him and/or fixing the bug)? > > Cheers, > Sendu. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From pmiguel at purdue.edu Tue Oct 10 10:26:35 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Tue, 10 Oct 2006 10:26:35 -0400 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452AB7B8.4040404@sendu.me.uk> References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk> Message-ID: <452BAD9B.5010903@purdue.edu> Sendu Bala wrote: > > BTW Phillip, thank you for the bug report but in future use the > attachment capabilities for files, please don't paste them into the > comments box. > Sendu, Sounds reasonable to me. I should note, however, that when I entered the bug, I was looking for some method to attach files. There is none on the "Enter Bug: Bioperl" page: http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl Also, "bug writing guidelines" makes no mention of it. I vaguely remembered there being some method to do it--but given the "bug writing guidelines" exhortations to be specific and detailed, I thought I must put the information somewhere. So I put them in the only place offered (on that page)--"Description:". I see that, once submitted, attachments can be added to a bug report. Is that normally how it is done? Doesn't each attachment result in a separate email to the bioperl guts email list? Anyway, I've just added the files to the bug report as attachments, in case someone needs them to construct a test.
-- Phillip From bix at sendu.me.uk Tue Oct 10 11:10:25 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 16:10:25 +0100 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452BAD9B.5010903@purdue.edu> References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk> <452BAD9B.5010903@purdue.edu> Message-ID: <452BB7E1.5020200@sendu.me.uk> Phillip San Miguel wrote: > Sendu Bala wrote: >> BTW Phillip, thank you for the bug report but in future use the >> attachment capabilities for files, please don't paste them into the >> comments box. >> > Sendu, Sounds reasonable to me. I should note, however; when I > entered the bug, I was looking for some method to attach files. There > is none on the "Enter Bug: Bioperl" page: > > http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl > > Also, "bug writing guidelines" makes no mention of it. I vaguely > remembered there being some method to do it--but given the "bug > writing guidelines" exhortations to be specific and detailed, I > thought I must put the information somewhere. So I put them them the > only place offered (on that page)--"Description:" I agree that things could be better here. Who looks after bugzilla, and is this an alterable feature? > I see that, once submitted, attachments can be added to a bug report. > Is that normally how it is done? Yes, AFAIK. > Doesn't each attachment result in a separate email to the bioperl > guts email list? Yes, but that's not a problem. In fact, doing it this way means you don't email everyone subscribed to guts your big files in plain text, but instead they get a small email with a link to the download. > Anyway, I've just added the files to the bug report as attachments, > in case someone needs them to construct a test. Thank you. 
From arareko at campus.iztacala.unam.mx Tue Oct 10 11:14:00 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Tue, 10 Oct 2006 10:14:00 -0500 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452BAD9B.5010903@purdue.edu> References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk> <452BAD9B.5010903@purdue.edu> Message-ID: <452BB8B8.40409@campus.iztacala.unam.mx> Phillip San Miguel wrote: > I see that, once submitted, attachments can be added to a bug report. > Is that normally how it is done? Yes, it's the normal method: create the bug report, then attach files. > Doesn't each attachment result in a separate email to the bioperl > guts email list? Adding a file will generate an informative email per bug change (attaching the file in this case) but won't send the attachment to the list. Regards, Mauricio. -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From cjfields at uiuc.edu Tue Oct 10 11:20:55 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 10:20:55 -0500 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452BAD9B.5010903@purdue.edu> Message-ID: <002801c6ec7f$ae8d85f0$15327e82@pyrimidine> > Also, "bug writing guidelines" makes no mention of it. I vaguely > remembered there being some method to do it--but given the "bug writing > guidelines" exhortations to be specific and detailed, I thought I must > put the information somewhere. So I put them them the only place offered > (on that page)--"Description:" > I see that, once submitted, attachments can be added to a bug > report. Is that normally how it is done? Doesn't each attachment result > in a separate email to the bioperl guts email list? > Anyway, I've just added the files to the bug report as attachments, > in case someone needs them to construct a test.
Phillip, Initial bug reports only require the general description, OS used, bioperl version, etc. That's quite normal. Any relevant attachments are added afterward. We should probably make that clearer upfront on the wiki page; I don't know if anyone can make similar changes to bugzilla. Any bug changes, CVS commits, etc are mailed to bioperl-guts, yes. That isn't an issue though; it keeps the developers updated on the various bugs/commits that are going on and is a pretty common practice. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 10 12:48:22 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 11:48:22 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au> References: <4529D58D.1080004@infotech.monash.edu.au> Message-ID: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> There are a number of other bioperl-run examples (the Bio::Tools::Run::Analysis::soap issue I looked into revealed such). I agree with both points, 1) that it depends on the size of the classes, and 2) from a maintainability standpoint, it can be very frustrating when looking for documentation. Is there really any advantage to doing this? Chris On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote: > Hi all, > > The following modules have more than one "package xxxx;" > declaration in > them. For small, internal classes I guess this is fine, but for > others, > they should be split up into the filesystem - otherwise they are > troublesome to locate and the online documentation doesn't list them! > > eg. 
> bioperl-run/Bio/Tools/Run/Analysis/Job.pm > is in > bioperl-run/Bio/Tools/Run/Analysis.pm > > Here's the culprits: > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | > sed 's/:.*$//' | sort | uniq -d ; done > > bioperl-live/Bio/AnalysisI.pm > bioperl-live/Bio/DB/Fasta.pm > bioperl-live/Bio/DB/GFF.pm > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > bioperl-live/Bio/SeqIO/interpro.pm > > bioperl-run/Bio/Tools/Run/Analysis.pm > bioperl-run/Bio/Tools/Run/Analysis/soap.pm > > -- > Dr Torsten Seemann http://www.vicbioinformatics.com > Victorian Bioinformatics Consortium, Monash University, Australia > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign 
From lzhtom at hotmail.com Tue Oct 10 15:42:48 2006 From: lzhtom at hotmail.com (zhihua li) Date: Tue, 10 Oct 2006 19:42:48 +0000 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? Message-ID: Hi netters. I've installed Bioperl 1.5.1, both core and run modules. But when I tried to use the Pise module, an error occured saying that there's no "new" method in this package. 
My script is: use strict; use warnings; use Bio::Tools::Run::AnalysisFactory::Pise; my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); my $program=$factory->program('mfold'); $program->seq('my_input_file'); my $job = $program->run(); print STDERR $job->contect('mfold.out'); The error message I got is: Can't locate object method "new" via package "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load "Bio::Tools::Run::AnalysisFactor::Pise"?) I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm and it DOES contain a sub new. So what's going on? Anyone could give me a hint? Thanks a lot! From cjfields at uiuc.edu Tue Oct 10 16:27:27 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 15:27:27 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Message-ID: Makes sense to me. I think, as long as they're documented, it shouldn't be a problem. I think the main point is that the class methods for these don't show up using perldoc (something I ran into with Bio::DB::Fasta's inclusion of Bio::PrimarySeq::Fasta), but they do show up when using other documentation. So 'perldoc Bio::DB::Fasta' works, but 'perldoc Bio::PrimarySeq::Fasta' doesn't. So these can be problematic when looking for specific methods. However, I think pod2html handles multiple package declarations in one module, and the PDOC online do as well. Does the Deobfuscator? 
Chris On Oct 10, 2006, at 3:11 PM, Lincoln Stein wrote: > Hi, > > These ones are all mine: > > > bioperl-live/Bio/DB/Fasta.pm > > bioperl-live/Bio/DB/GFF.pm > > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > > In each case, the second modules are teeny tiny ones that implement > iterators which are at most two methods long (typically a new() and > a next()). I prefer not to split them out because they will just > clutter up the file tree with stuff that is already well documented > in the "parent ship" modules. > > Lincoln > > > On 10/10/06, Chris Fields wrote: There are a > number of other bioperl-run examples (the > Bio::Tools::Run::Analysis::soap issue I looked into revealed such). > > I agree with both points, 1) that it depends on the size of the > classes, and 2) from a maintainability standpoint, it can be very > frustrating when looking for documentation. Is there really any > advantage to doing this? > > Chris > > On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote: > > > Hi all, > > > > The following modules have more than one "package xxxx;" > > declaration in > > them. For small, internal classes I guess this is fine, but for > > others, > > they should be split up into the filesystem - otherwise they are > > troublesome to locate and the online documentation doesn't list > them! > > > > eg. 
> > bioperl-run/Bio/Tools/Run/Analysis/Job.pm > > is in > > bioperl-run/Bio/Tools/Run/Analysis.pm > > > > Here's the culprits: > > > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/ > Bio | > > sed 's/:.*$//' | sort | uniq -d ; done > > > > bioperl-live/Bio/AnalysisI.pm > > bioperl-live/Bio/DB/Fasta.pm > > bioperl-live/Bio/DB/GFF.pm > > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > > bioperl-live/Bio/SeqIO/interpro.pm > > > > bioperl-run/Bio/Tools/Run/Analysis.pm > > bioperl-run/Bio/Tools/Run/Analysis/soap.pm > > > > -- > > Dr Torsten Seemann http://www.vicbioinformatics.com > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > (516) 367-8380 (voice) > (516) 367-8389 (fax) > FOR URGENT MESSAGES & SCHEDULING, > PLEASE CONTACT MY ASSISTANT, > SANDRA MICHELSEN, AT michelse at cshl.edu Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 10 16:30:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 15:30:16 -0500 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? 
In-Reply-To: References: Message-ID: <870B7500-AA83-42D7-965B-865B91AA8E7F@uiuc.edu> On Oct 10, 2006, at 2:42 PM, zhihua li wrote: > Hi netters. > > I've installed Bioperl 1.5.1, both core and run modules. But when > I tried to use the Pise module, an error occured saying that > there's no "new" method in this package. > > My script is: > > use strict; > use warnings; > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); > my $program=$factory->program('mfold'); > $program->seq('my_input_file'); > my $job = $program->run(); > print STDERR $job->contect('mfold.out'); > > The error message I got is: > > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) > > I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/ > Pise.pm and it DOES contain a sub new. > > So what's going on? Anyone could give me a hint? > > Thanks a lot! Well, according to your error output you have AnalysisFactory misspelled ('AnalysisFactor'), which should tell you what the problem is. Look for the same thing in your script. Chris > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Tue Oct 10 16:43:06 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 21:43:06 +0100 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? In-Reply-To: References: Message-ID: <452C05DA.5050803@sendu.me.uk> zhihua li wrote: > Hi netters. > > I've installed Bioperl 1.5.1, both core and run modules. But when I > tried to use the Pise module, an error occured saying that there's no > "new" method in this package. 
> > My script is: > > use strict; > use warnings; > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); > my $program=$factory->program('mfold'); > $program->seq('my_input_file'); > my $job = $program->run(); > print STDERR $job->contect('mfold.out'); > > The error message I got is: > > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) > > I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm > and it DOES contain a sub new. > > So what's going on? Anyone could give me a hint? You have a typo. Bio::Tools::Run::AnalysisFactory::Pise, not Bio::Tools::Run::AnalysisFactor::Pise From lincoln.stein at gmail.com Tue Oct 10 16:11:00 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 10 Oct 2006 16:11:00 -0400 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> Message-ID: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Hi, These ones are all mine: > bioperl-live/Bio/DB/Fasta.pm > bioperl-live/Bio/DB/GFF.pm > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm In each case, the second modules are teeny tiny ones that implement iterators which are at most two methods long (typically a new() and a next()). I prefer not to split them out because they will just clutter up the file tree with stuff that is already well documented in the "parent ship" modules. Lincoln On 10/10/06, Chris Fields wrote: > > There are a number of other bioperl-run examples (the > Bio::Tools::Run::Analysis::soap issue I looked into revealed such). 
> > I agree with both points, 1) that it depends on the size of the > classes, and 2) from a maintainability standpoint, it can be very > frustrating when looking for documentation. Is there really any > advantage to doing this? > > Chris > > On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote: > > > Hi all, > > > > The following modules have more than one "package xxxx;" > > declaration in > > them. For small, internal classes I guess this is fine, but for > > others, > > they should be split up into the filesystem - otherwise they are > > troublesome to locate and the online documentation doesn't list them! > > > > eg. > > bioperl-run/Bio/Tools/Run/Analysis/Job.pm > > is in > > bioperl-run/Bio/Tools/Run/Analysis.pm > > > > Here's the culprits: > > > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | > > sed 's/:.*$//' | sort | uniq -d ; done > > > > bioperl-live/Bio/AnalysisI.pm > > bioperl-live/Bio/DB/Fasta.pm > > bioperl-live/Bio/DB/GFF.pm > > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > > bioperl-live/Bio/SeqIO/interpro.pm > > > > bioperl-run/Bio/Tools/Run/Analysis.pm > > bioperl-run/Bio/Tools/Run/Analysis/soap.pm > > > > -- > > Dr Torsten Seemann http://www.vicbioinformatics.com > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From asjo at koldfront.dk Tue Oct 10 16:04:35 2006 From: asjo at koldfront.dk (Adam Sjøgren) Date: Tue, 10 Oct 2006 22:04:35 +0200 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? References: Message-ID: <871wpglyy4.fsf@topper.koldfront.dk> On Tue, 10 Oct 2006 19:42:48 +0000, zhihua wrote: > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); ^ y [...] > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) You missed a 'y' in "Factory". Best wishes, -- "We've reached a special place... Spiritually... Adam Sjøgren ecumenically... grammatically." asjo at koldfront.dk From dmessina at wustl.edu Tue Oct 10 17:08:45 2006 From: dmessina at wustl.edu (David Messina) Date: Tue, 10 Oct 2006 16:08:45 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Message-ID: > However, I think pod2html handles multiple package declarations in > one module, and the PDOC online do as well. Does the Deobfuscator? Nope. From my cursory examination at the time they mostly were, as Lincoln said, short and sweet, so I didn't consider it a big deal. I do think the Deobfuscator should theoretically handle such cases anyway, though. I'll add it as a feature request on the wiki page. Or if you're chomping at the bit for it, I could certainly be beer-suaded to do it sooner rather than later... 
:) Dave From cjfields at uiuc.edu Tue Oct 10 17:33:39 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 16:33:39 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Message-ID: <7F35F565-7D28-4B06-A501-4D4083652C5C@uiuc.edu> Me? I'm a lowly postdoc. Lincoln's got the cash! Chris On Oct 10, 2006, at 4:08 PM, David Messina wrote: >> However, I think pod2html handles multiple package declarations in >> one module, and the PDOC online do as well. Does the Deobfuscator? > > Nope. From my cursory examination at the time they mostly were, as > Lincoln said, short and sweet, so I didn't consider it a big deal. > > I do think the Deobfuscator should theoretically handle such cases > anyway, though. I'll add it as a feature request on the wiki page. > Or if you're chomping at the bit for it, I could certainly be beer- > suaded to do it sooner rather than later... :) > > Dave > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From sdavis2 at mail.nih.gov Wed Oct 11 05:43:35 2006 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Wed, 11 Oct 2006 05:43:35 -0400 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? In-Reply-To: References: Message-ID: <452CBCC7.30108@mail.nih.gov> zhihua li wrote: > Hi netters. > > I've installed Bioperl 1.5.1, both core and run modules. But when I > tried to use the Pise module, an error occured saying that there's no > "new" method in this package. 
> > My script is: > > use strict; > use warnings; > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); > my $program=$factory->program('mfold'); > $program->seq('my_input_file'); > my $job = $program->run(); > print STDERR $job->contect('mfold.out'); > > The error message I got is: > > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) > > I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm > and it DOES contain a sub new. > > So what's going on? Anyone could give me a hint? > > Thanks a lot! The module name is Bio::Tools::Run::AnalysisFactory::Pise. Note that it is not "factor" but "factory". That should probably fix your problem. Sean From jay at jays.net Sat Oct 7 18:34:23 2006 From: jay at jays.net (Jay Hannah) Date: Sat, 07 Oct 2006 17:34:23 -0500 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult Message-ID: <45282B6F.1030308@jays.net> I just updated my bioperl-live this morning, so I think I'm current. :) perldoc Bio::Search::Result::GenericResult ------------ SYNOPSIS # typically one gets Results from a SearchIO stream use Bio::SearchIO; my $io = new Bio::SearchIO(-format => 'blast', -file => 't/data/HUMBETGLOA.tblastx'); while( my $result = $io->next_result) { # process all search results within the input stream while( my $hit = $result->next_hits()) { ------------- Except that "next_hits()" does not exist. Should be "next_hit()". (Should I have posted a patch instead?) Thanks, j From bosborne11 at verizon.net Tue Oct 10 18:42:25 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 10 Oct 2006 18:42:25 -0400 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult In-Reply-To: <45282B6F.1030308@jays.net> Message-ID: j, No need, not for something so simple. Brian O. 
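For reference, the loop from the synopsis Jay quotes can be sketched with the method name corrected. This is only a minimal illustration: the `hit_names` helper is a hypothetical name introduced here, and the report path is whatever BLAST output you have on hand (the synopsis uses t/data/HUMBETGLOA.tblastx from the bioperl-live checkout).

```perl
use strict;
use warnings;
use Bio::SearchIO;

# Hypothetical helper: collect the name of every hit in a BLAST report.
sub hit_names {
    my ($file) = @_;
    my $io = Bio::SearchIO->new(-format => 'blast', -file => $file);
    my @names;
    while ( my $result = $io->next_result ) {
        # next_hit (singular) is the iterator; next_hits() does not exist
        while ( my $hit = $result->next_hit ) {
            push @names, $hit->name;
        }
    }
    return @names;
}

# Run only when a report file is given on the command line.
if (@ARGV) {
    print "$_\n" for hit_names( $ARGV[0] );
}
```

Calling the script with a report path prints one hit name per line; with no argument it does nothing.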
On 10/7/06 6:34 PM, "Jay Hannah" wrote: > Except that "next_hits()" does not exist. Should be "next_hit()". > > (Should I have posted a patch instead?) From zchou at cau.edu.cn Wed Oct 11 02:34:24 2006 From: zchou at cau.edu.cn (zhuocheng Hou) Date: Wed, 11 Oct 2006 14:34:24 +0800 Subject: [Bioperl-l] about retreive alinged sequence Message-ID: <000a01c6ecff$4ea4b2f0$0915020a@zchou> Hello,everyone, I am a new user of Bioperl. I want to align mutiple DNA sequences based on translated proteins by using clustalw. However, I don't know how to retreive aligned sequences out. The codes as follows (from the tutorials of HOWTOPAML): # # These codes run and can find the screen print out of clustalw ....... my $aa_aln = $aln_factory->align(\@prots, at params); # project the protein alignment back to CDS coordinates my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs); my @each = $dna_aln->each_seq(); # The following codes were writed by me. However, it doesn't work. I want to get teh $dna_aln contents and write it to files. my $in = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta'); my $aln=$dna_aln; my $out = Bio::AlignIO->new(-file => ">out.msf" , -format => 'msf'); #print $out $_ while <$in>; while ($aln = $in->next_aln() ) { my $out->write_aln($aln); } Best regards, Zhuocheng CAU From n.haigh at sheffield.ac.uk Wed Oct 11 10:00:33 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 11 Oct 2006 15:00:33 +0100 Subject: [Bioperl-l] about retreive alinged sequence In-Reply-To: <000a01c6ecff$4ea4b2f0$0915020a@zchou> References: <000a01c6ecff$4ea4b2f0$0915020a@zchou> Message-ID: <452CF901.6020409@sheffield.ac.uk> Dear Zhuocheng I'm not familiar with the aa_to_dna_al method but it appears that from your code that it returns an alignment object. Please find comments inserted below - hope they help! Nathan zhuocheng Hou wrote: > Hello,everyone, > > I am a new user of Bioperl. I want to align mutiple DNA sequences based on translated proteins by using clustalw. 
However, I don't know how to retreive aligned sequences out. > > The codes as follows (from the tutorials of HOWTOPAML): > > # > # These codes run and can find the screen print out of clustalw > ....... > my $aa_aln = $aln_factory->align(\@prots, at params); > # project the protein alignment back to CDS coordinates > my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs); > $dna_aln should already be an alignment object (a Bio::SimpleAlign, not a Bio::AlignIO stream), so all you need to do is set up the output stream to write the alignment object, similar to what you wrote below, i.e. my $out = Bio::AlignIO->new(-file => ">out.msf" , -format => 'msf'); Then simply write the alignment ($dna_aln) to the output stream with this (note: no "my" in front of $out here): $out->write_aln($dna_aln); > my @each = $dna_aln->each_seq(); > > # The following codes were writed by me. However, it doesn't work. I want to get teh $dna_aln contents and write it to files. > > > my $in = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta'); > my $aln=$dna_aln; > my $out = Bio::AlignIO->new(-file => ">out.msf" , > -format => 'msf'); > #print $out $_ while <$in>; > while ($aln = $in->next_aln() ) { > my $out->write_aln($aln); > } > > > Best regards, > > Zhuocheng > CAU > > > ------------------------------------------------------------------------ > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From melcher at rescomp.berkeley.edu Wed Oct 11 17:09:17 2006 From: melcher at rescomp.berkeley.edu (Graham Melcher) Date: Wed, 11 Oct 2006 14:09:17 -0700 Subject: [Bioperl-l] Accessing GO through MYSQL? Message-ID: <20061011210917.GA783@rescomp.berkeley.edu> Hey all, Preface:: This is my first post to this list, please redirect if my questions belong elsewhere. I need to lookup GO ontology information given GO:Accessors, and I have a local mysql db that mirrors the GO db from that website. 
I am not sure if the Bio::Ontology::* libraries were designed to be used in a dynamic, load-as-you-need sort of way, and am wondering how other people have gone about solving this problem. Details follow... Right now I'm using Class::DBI to access the Mysql database, then made a new set of subclassed Bio::Ontology::TermI and Bio::Ontology::RelationshipI which use these class::DBI objects to access the relevent information in the database on the fly. Unfortunately, I was getting stuck with the implementation of some of the other Bio::Ontology::*I, especially Ontology. Making all of these subclasses seems infeasible, or at least enough work that it might be available somewhere. Are mysql accessors out there, and I just haven't found them, or is Bio::Ontology possibly not way to go? Alternatively, if I end up having to write this sort of Bio::Ontology - Class::DBI interface, would anyone be interested in it being made generally usable and available? Finally, I just found go-perl, but although I haven't had a lot of time to look into it, it doesn't seem to use mysql either. Thanks! Graham -- Graham Melcher From sdavis2 at mail.nih.gov Thu Oct 12 07:51:14 2006 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Thu, 12 Oct 2006 07:51:14 -0400 Subject: [Bioperl-l] Accessing GO through MYSQL? In-Reply-To: <20061011210917.GA783@rescomp.berkeley.edu> References: <20061011210917.GA783@rescomp.berkeley.edu> Message-ID: <452E2C32.7070502@mail.nih.gov> Graham Melcher wrote: > Finally, I just found go-perl, but although I haven't had a lot of time > to look into it, it doesn't seem to use mysql either. > Yep. Keep going. 
Go-perl and Go-db-perl: http://www.godatabase.org/dev/go-db-perl/doc/go-db-perl-doc.html Sean From hlapp at gmx.net Thu Oct 12 00:44:49 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 12 Oct 2006 00:44:49 -0400 Subject: [Bioperl-l] NESCent Phyloinformatics Hackathon Message-ID: <939B253E-2F87-450A-A277-78B5645D3494@gmx.net> (apologies in advance to those who receive this multiple times) The National Evolutionary Synthesis Center (NESCent) in collaboration with Arlin Stoltzfus (U. Maryland, NIST), Aaron Mackey (GSK), Rutger Vos (UBC), and Mark Holder (FSU) sponsors a Phyloinformatics Hackathon to take place Dec 11-15 in Durham, NC. The (wiki) website with more information and a formal proposal is at https://www.nescent.org/wg_phyloinformatics/ In short, the goal is to leverage the Bio* toolkits to provide the "glue" for evolutionary analyses of various types that depend on automation, interoperability, and data integration. CALL FOR INPUT: The specific objectives are driven by "use cases", that is, specific target problems of interest to evolutionary biologists (click 'Use Cases' at the above website). We invite community input in order to focus efforts on the most urgent or pervasive problems. The wiki for the hackathon allows direct editing of the use cases after registration. You may also upload data files, or add comments to the "Forum" page. Alternatively, send email to hlapp at nescent.org. You may also contact any of the organizers with questions or comments. ATTENDANCE: The hackathon is scheduled for Dec 11-15, 2006 in Durham NC. Space is limited, and attendance is by invitation. If you have not been contacted but desire to attend, please contact Hilmar Lapp (hlapp at nescent.org). 
ORGANIZERS: Hilmar Lapp (NESCent; hlapp at nescent.org) Aaron Mackey (GSK; aaron.j.mackey at gsk.com) Mark Holder (FSU; mholder at scs.fsu.edu) Arlin Stoltzfus (CARB, NIST; arlin.stoltzfus at nist.gov) Todd Vision (NESCent; tjv at bio.unc.edu) Rutger Vos (UBC; rvosa at sfu.ca) From neetisomaiya at gmail.com Thu Oct 12 02:03:20 2006 From: neetisomaiya at gmail.com (neeti somaiya) Date: Thu, 12 Oct 2006 11:33:20 +0530 Subject: [Bioperl-l] need help urgently Message-ID: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> We are using BioPerl to parse a BLAST output file, and then we want to load full alignments into a CLOB column in one of our database tables. We are trying to use sql loader for the same. Anyone has an idea how we can go about it? We have tried loading sequences into CLOB columns using sql loader, and that works fine, but the same syntax when used for loading alignments, is not working. 
-- -Neeti Even my blood says, B positive From sayali_salodkar at persistent.co.in Thu Oct 12 06:16:34 2006 From: sayali_salodkar at persistent.co.in (Sayali) Date: Thu, 12 Oct 2006 15:46:34 +0530 Subject: [Bioperl-l] regarding polyphred output Message-ID: <006e01c6ede7$7f10bef0$00b1580a@persistent.co.in> Hi, I want to parse the output of polyphred http://droog.gs.washington.edu/PolyPhred.html. Is there a parser already available in Bioperl which would help me in doing the same. Thanks, Sayali DISCLAIMER ========== This e-mail may contain privileged and confidential information which is the property of Persistent Systems Pvt. Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Pvt. Ltd. does not accept any liability for virus infected mails. 
From sdavis2 at mail.nih.gov Thu Oct 12 06:40:12 2006 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Thu, 12 Oct 2006 06:40:12 -0400 Subject: [Bioperl-l] need help urgently In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> Message-ID: <200610120640.12250.sdavis2@mail.nih.gov> On Thursday 12 October 2006 02:03, neeti somaiya wrote: > We are using BioPerl to parse a BLAST output file, and then we want to load > full alignments into a CLOB column in one of our database tables. We are > trying to use sql loader for the same. Anyone has an idea how we can go > about it? > We have tried loading sequences into CLOB columns using sql loader, and > that works fine, but the same syntax when used for loading alignments, is > not working. Neeti, You'll need to be a bit more specific about what you are doing. Can you post the code you are using and error messages? Also, what is "sql loader"? 
And what database are you trying to use? Sean From crabtree at tigr.ORG Thu Oct 12 07:28:06 2006 From: crabtree at tigr.ORG (Jonathan Crabtree) Date: Thu, 12 Oct 2006 07:28:06 -0400 Subject: [Bioperl-l] need help urgently In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> Message-ID: <452E26C6.6040800@tigr.org> Hi Neeti- neeti somaiya wrote: > We are using BioPerl to parse a BLAST output file, and then we want to load > full alignments into a CLOB column in one of our database tables. We are > trying to use sql loader for the same. Anyone has an idea how we can go > about it? > This doesn't sound like a BioPerl issue per se, so this list might not be the best venue for your question. Since SQL*Loader is an Oracle utility you may have better luck in a forum frequented by Oracle DBAs and/or general bioinformatics people. (Not that this isn't such a forum, but unless your difficulty is actually being caused by BioPerl, or there's some kind of SQL*Loader wrapper in BioPerl--which I don't think is the case--you run the risk of having people complain that your question doesn't have enough to do with BioPerl.) > We have tried loading sequences into CLOB columns using sql loader, and that > works fine, but the same syntax when used for loading alignments, is not > working. > It's been a while since I've done any work with SQL*Loader, but I'd guess that the reason it works with sequences and not alignments is because there are characters in the alignments (newlines, perhaps?) that SQL*Loader is incorrectly interpreting as either column (field) or row (record) delimiters. You may need to change your flat file encoding to use delimiters other than the defaults (and alter the SQL*Loader control file accordingly.)
As Sean pointed out, however, it's difficult to be much help without seeing an example of a failed input and the corresponding error(s)! One other thing I remember about SQL*Loader (as of Oracle 8-9 or so) is that all the CLOB values had to appear *last* in the SQL*Loader record, at least if you were using variable-length fields. But since you've loaded sequences successfully, I doubt this is the issue. One final thought is that I believe SQL*Loader has an option whereby you can place your LOB values in files external to the main SQL*Loader input file, which sidesteps the field/row delimiter issue completely; you may want to look into this if you're not already loading your Oracle database this way. Jonathan From bix at sendu.me.uk Fri Oct 13 04:56:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 13 Oct 2006 09:56:01 +0100 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au> References: <4521E74E.1040404@infotech.monash.edu.au> Message-ID: <452F54A1.7010908@sendu.me.uk> Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's certainly interface-like, but doesn't follow the normal interface naming convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed WrapperBaseI? Left alone? From cjfields at uiuc.edu Fri Oct 13 08:20:58 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 13 Oct 2006 07:20:58 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <452F54A1.7010908@sendu.me.uk> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> Message-ID: <43CC4E80-8F15-4C83-929D-DDC719360C8F@uiuc.edu> I would say, according to BioPerl convention, it should be renamed WrapperBaseI. It has a few interface-like methods and (importantly) lacks a constructor. Unless someone else out there has other reasoning? Note that this will require lots of bioperl-run changes as well, at least I think it will. 
Chris On Oct 13, 2006, at 3:56 AM, Sendu Bala wrote: > Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's > certainly interface-like, but doesn't follow the normal interface > naming > convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed > WrapperBaseI? Left alone? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From avilella at gmail.com Fri Oct 13 11:26:47 2006 From: avilella at gmail.com (Albert Vilella) Date: Fri, 13 Oct 2006 16:26:47 +0100 Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method Message-ID: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com> Hi all, While using the remove_gaps method in Bio::SimpleAlign I found out that if the alignment is (bad enough for) having no columns without any gap at all, the method will give a: Use of uninitialized value in split at this line in add_seq: map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq); So my idea was to tweak this line to something like: map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || ''); But I am unsure about any other side effects this may have. Anyone? Albert. From cjfields at uiuc.edu Fri Oct 13 11:51:38 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 13 Oct 2006 10:51:38 -0500 Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method In-Reply-To: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com> References: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com> Message-ID: You can check to see if it passes all tests. I'm guessing SimpleAlign.t tests this method out in some way (though it's always safer to check). 
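For anyone following the thread, here is a minimal core-Perl sketch of the guard Albert proposes. It does not touch Bio::SimpleAlign; symbols_for() is a hypothetical stand-in for the map/split line in add_seq, written only to show the behaviour. One side effect that may be worth checking: `$seq->seq || ''` replaces any false value (including the string '0') with '', whereas an explicit defined() test only changes undef.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the map/split line in add_seq; not BioPerl code.
# Returns a hash ref whose keys are the distinct symbols in the string.
sub symbols_for {
    my ($str) = @_;
    $str = '' unless defined $str;    # guard: undef becomes the empty string
    my %symbols = map { ($_ => 1) } split //, $str;
    return \%symbols;
}

my $from_undef = symbols_for(undef);      # no "uninitialized value" warning
my $from_seq   = symbols_for('AC-GT-');

print join(',', sort keys %$from_seq), "\n";    # prints "-,A,C,G,T"
print scalar(keys %$from_undef), "\n";          # prints "0"
```

Whether an empty sequence should reach add_seq at all is a separate question that the test suite is better placed to answer.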
Chris On Oct 13, 2006, at 10:26 AM, Albert Vilella wrote: > Hi all, > > While using the remove_gaps method in Bio::SimpleAlign I found out > that if the alignment is (bad enough for) having no columns without > any gap at all, the method will give a: > > Use of uninitialized value in split at this line in add_seq: > > map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq); > > So my idea was to tweak this line to something like: > > map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || ''); > > But I am unsure about any other side effects this may have. > > Anyone? > > Albert. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jay at jays.net Fri Oct 13 12:09:16 2006 From: jay at jays.net (Jay Hannah) Date: Fri, 13 Oct 2006 11:09:16 -0500 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult In-Reply-To: References: Message-ID: <452FBA2C.7070003@jays.net> Thanks Brian! My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :) /home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v ---------------------------- revision 1.27 date: 2006/10/10 22:41:46; author: bosborne; state: Exp; lines: +4 -4 next_hit, not next_hits ---------------------------- I'm a simple man who takes great satisfaction in the simple things. :) j Brian Osborne wrote: > j, > > No need, not for something so simple. > > Brian O. > > > On 10/7/06 6:34 PM, "Jay Hannah" wrote: >> Except that "next_hits()" does not exist. Should be "next_hit()". >> >> (Should I have posted a patch instead?) > From jay at jays.net Fri Oct 13 12:24:48 2006 From: jay at jays.net (Jay Hannah) Date: Fri, 13 Oct 2006 11:24:48 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? 
Message-ID: <452FBDD0.2070008@jays.net> So I'm doing the following: 1) Using Bio::SeqIO to read in a genbank file and kick out fasta. 2) Reading that fasta file w/ command line formatdb. 3) Using that output for command line blastall. 4) Using Bio::SearchIO to read the blast results. (If there's a better way, do tell. -grin-) This sequence is working great for nucleotide BLASTing, but I'm stuck on step 1 when trying protein BLAST. my $seq_in = Bio::SeqIO->new( -file => " "genbank", -alphabet => "protein" ); my $seq_out_protein = Bio::SeqIO->new( -file => ">out", -format => 'fasta', -alphabet => 'protein' ); while (my $inseq = $seq_in->next_seq) { $inseq->molecule("protein"); $seq_out_protein->write_seq($inseq); } This creates a nucleotide file "out". Setting -alphabet doesn't seem to do anything. Setting molecule("protein") doesn't seem to do anything either. I was expecting that it would just pull all the CDS strings out of the genbank file and dump those into fasta format? Am I missing something obvious? Thanks, j From bosborne11 at verizon.net Fri Oct 13 12:54:02 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 13 Oct 2006 12:54:02 -0400 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <452FBDD0.2070008@jays.net> Message-ID: Jay, You're looking for the "translation" string in the CDS section, yes? You need to delve a bit into features, the CDS is considered to be a feature of the main or parent nucleotide sequence and the translation is part of CDS feature: http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank Brian O. 
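A rough sketch of what Brian describes, assuming the translation is stored as a 'translation' tag on each CDS feature and is read through the usual Bio::SeqFeatureI tag methods (has_tag, get_tag_values). The helper name cds_translations() and the file names in the usage comment are made up for illustration; this is not code from the HOWTO.

```perl
use strict;
use warnings;

# Collect [id, translation] pairs from the CDS features of one sequence
# object. Works on anything offering get_SeqFeatures() and display_id(),
# with features that offer primary_tag(), has_tag() and get_tag_values().
sub cds_translations {
    my ($seq) = @_;
    my @records;
    for my $feat ($seq->get_SeqFeatures) {
        next unless $feat->primary_tag eq 'CDS';
        next unless $feat->has_tag('translation');    # skip CDS without one
        my ($protein) = $feat->get_tag_values('translation');
        # Prefer the protein_id tag; fall back to the parent record's id.
        my ($id) = $feat->has_tag('protein_id')
                 ? $feat->get_tag_values('protein_id')
                 : ($seq->display_id);
        push @records, [ $id, $protein ];
    }
    return @records;
}

# Usage (hypothetical file names): perl cds2fasta.pl input.gbk > proteins.fa
if (@ARGV) {
    require Bio::SeqIO;
    my $in = Bio::SeqIO->new(-file => $ARGV[0], -format => 'genbank');
    while (my $seq = $in->next_seq) {
        printf ">%s\n%s\n", @$_ for cds_translations($seq);
    }
}
```

The output is plain FASTA with one record per CDS, which should be enough for formatdb; a richer header (GI number, product) would need the corresponding tags pulled out the same way.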
On 10/13/06 12:24 PM, "Jay Hannah" wrote: > Am I missing something From bix at sendu.me.uk Fri Oct 13 12:59:46 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 13 Oct 2006 17:59:46 +0100 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult In-Reply-To: <452FBA2C.7070003@jays.net> References: <452FBA2C.7070003@jays.net> Message-ID: <452FC602.3080302@sendu.me.uk> Jay Hannah wrote: > Thanks Brian! > > My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :) > > /home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v > ---------------------------- > revision 1.27 > date: 2006/10/10 22:41:46; author: bosborne; state: Exp; lines: +4 -4 > next_hit, not next_hits > ---------------------------- Congratulations! :D Next it will be two byte corrections and from there, the sky's the limit! :) From hlapp at gmx.net Fri Oct 13 13:28:50 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 13 Oct 2006 13:28:50 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <452F54A1.7010908@sendu.me.uk> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> Message-ID: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> What does the POD (and the code) say about instantiating it? -hilmar On Oct 13, 2006, at 4:56 AM, Sendu Bala wrote: > Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's > certainly interface-like, but doesn't follow the normal interface > naming > convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed > WrapperBaseI? Left alone? 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From jay at jays.net Fri Oct 13 14:56:38 2006 From: jay at jays.net (Jay Hannah) Date: Fri, 13 Oct 2006 13:56:38 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: References: Message-ID: <452FE166.5080405@jays.net> Brian Osborne wrote: > You're looking for the "translation" string in the CDS section, yes? You > need to delve a bit into features, the CDS is considered to be a feature of > the main or parent nucleotide sequence and the translation is part of CDS > feature: > > http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank Yes. Thanks. I "rolled my own" -- I'm now doing this: while (my $inseq = $seq_in->next_seq) { my @features = $inseq->get_SeqFeatures(); foreach my $feat ( @features ) { next unless ($feat->primary_tag eq "CDS"); my @db_xrefs = $feat->annotation->get_Annotations("db_xref"); @db_xrefs = grep { /^GI:/ } @db_xrefs; die "Panic! More than one GI: db_xref?" if (@db_xrefs > 1); die "Panic! No GI: db_xref?" unless (@db_xrefs == 1); my $gi = $db_xrefs[0]; $gi =~ s/^GI://; my @translations = $feat->annotation->get_Annotations("translation"); die "Panic! More than one translation?" if (@translations > 1); my @protein_ids = $feat->annotation->get_Annotations("protein_id"); die "Panic! More than one protein_id?" if (@protein_ids > 1); my @product = $feat->annotation->get_Annotations("product"); die "Panic! More than one product?" if (@product > 1); print ">gi|$gi|gb|$protein_ids[0]|"; print $inseq->id . " $product[0]\n"; print "$translations[0]\n"; } } To generate a homebrew fasta file for a protein BLAST. 
I just thought that -alphabet and molecule() would do that stuff for me? What else would "protein" mean in those? Does anyone use -alphabet and/or molecule()? For what? How? Again, here's what I'm talking about: ========== my $seq_out_protein = Bio::SeqIO->new( -file => ">out", -format => 'fasta', -alphabet => 'protein' # No effect? ); while (my $inseq = $seq_in->next_seq) { $inseq->molecule("protein"); # No effect? ========== Thanks, j From bosborne11 at verizon.net Fri Oct 13 17:20:40 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 13 Oct 2006 17:20:40 -0400 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <452FE166.5080405@jays.net> Message-ID: Jay, Yes, people use the -alphabet parameter. If you set it to something then Bioperl will not try to determine whether the sequence is protein, rna, or dna and this is particularly useful when the sequence contains characters that Bioperl would object to (sequences with distasteful characters can be created by various applications, for example, or you might introduce some weird character for some reason). Setting the -alphabet would also speed up Bioperl a bit, for the same reason. Brian O. On 10/13/06 2:56 PM, "Jay Hannah" wrote: > > I just thought that -alphabet and molecule() would do that stuff for me? What > else would "protein" mean in those? From jay at jays.net Sat Oct 14 11:25:05 2006 From: jay at jays.net (Jay Hannah) Date: Sat, 14 Oct 2006 10:25:05 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: References: Message-ID: <45310151.5050901@jays.net> Brian Osborne wrote: > Yes, people use the -alphabet parameter. 
If you set it to something then > Bioperl will not try to determine whether the sequence is protein, rna, or > dna and this is particularly useful when the sequence contains characters > that Bioperl would object to (sequences with distasteful characters can be > created by various applications, for example, or you might introduce some > weird character for some reason). Setting the -alphabet would also speed up > Bioperl a bit, for the same reason. Huh. That's what I assumed when I stumbled into the -alphabet parameter. So I thought this would read the protein sequences out of my genbank file and write a fasta file for me: my $seq_in = Bio::SeqIO->new( -file => "<$file", -format => "genbank", -alphabet => "protein" # No effect? ); my $seq_out = Bio::SeqIO->new( -file => ">$outfile", -format => "fasta", -alphabet => "protein" # No effect? ); while (my $inseq = $seq_in->next_seq) { $inseq->molecule("protein"); # No effect? $seq_out->write_seq($inseq); } It didn't. Would it be a Good Thing if it did what I was expecting? (Like I said I rolled my own, but I'm always looking for ways to enhance BioPerl that other people might find useful... Someday I will contribute something useful, by golly. -grin-) (Background: I'm doing protein BLASTs from genbank files. To make formatdb happy I have to have fasta files full of the protein sequences.) j From bosborne11 at verizon.net Sat Oct 14 14:40:21 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Sat, 14 Oct 2006 14:40:21 -0400 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <45310151.5050901@jays.net> Message-ID: Jay, What you expected was that setting the -alphabet to "protein" would make Bioperl translate the input nucleotide sequence to output protein. In Bioperl this is accomplished by using the translate() method, no surprise there. 
If you take a look at the documentation on translate() in the online Bioperl Tutorial you'll see that this is a fairly sophisticated method, you can do all sorts of different things with it. So using -alphabet for this purpose won't really work, there are too many different ways to translate. Brian O. On 10/14/06 11:25 AM, "Jay Hannah" wrote: > Would it be a Good Thing if it did what I was expecting? From cjfields at uiuc.edu Sat Oct 14 20:44:04 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 14 Oct 2006 19:44:04 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <45310151.5050901@jays.net> Message-ID: <000601c6eff3$084663c0$15327e82@pyrimidine> ... > Huh. That's what I assumed when I stumbled into the -alphabet parameter. > So I thought this would read the protein sequences out of my genbank file > and write a fasta file for me: You have to think about it this way: the GenBank record you are using is for the nucleotide sequence only, and all other information in that record describes the sequence. Similarly, if you used a 'GenPept' sequence, the focus would be the protein sequence. Both normally contain annotations which describe the sequence globally, such as references, organism info, etc. Both also may contain features (or SeqFeatures), which describe a feature bound to a particular location on the sequence. However, features are not an absolute requirement for a sequence; they're sort of 'window dressing', albeit almost always essential for describing the main sequence. I would do exactly as Brian suggests. See the Feature/Annotation HOWTO for ideas on how to screen out the particular features you want and either grab the 'translation' tag data or get the sequence object from the feature and translate it directly. You should get the same result either way though getting the tag may be faster. ... > It didn't. Would it be a Good Thing if it did what I was expecting? 
(Like > I said I rolled my own, but I'm always looking for ways to enhance BioPerl > that other people might find useful... Someday I will contribute something > useful, by golly. -grin-) > > (Background: I'm doing protein BLASTs from genbank files. To make formatdb > happy I have to have fasta files full of the protein sequences.) > > j You could, theoretically, write up a method to only retrieve features which correspond to coding regions only (CDS). You may want to optionally screen out pseudogenes but that's up to you. Chris From avilella at gmail.com Sun Oct 15 07:08:23 2006 From: avilella at gmail.com (Albert Vilella) Date: Sun, 15 Oct 2006 12:08:23 +0100 Subject: [Bioperl-l] no_residues test in SimpleAlign.t Message-ID: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com> Hi all, Can somebody check the SimpleAlign.t test? perl t/SimpleAlign.t I get a few errors, I am looking at one that deals with no_residues. I don't understand if this is suposed to work: sub no_residues { my $self = shift; my $count = 0; foreach my $seq ($self->each_seq) { my $str = $seq->seq(); $count += ($str =~ s/[^A-Za-z]//g); #is this the same as: # $str =~ s/[^A-Za-z]//g; # $count += length($str); } Cheers, Albert. return $count; } From cjfields at uiuc.edu Sun Oct 15 13:53:50 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 15 Oct 2006 12:53:50 -0500 Subject: [Bioperl-l] no_residues test in SimpleAlign.t In-Reply-To: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com> References: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com> Message-ID: Albert, I get all 75 tests passing. SimpleAlign.t was recently switched over to Test::More, so you should be seeing more explicit test descriptions. It looks like test 27 is no_residues(). Were there any more that failed? I usually run 'perl -I. t/test.t' from the main bioperl directory to check individual tests from the local directory. 
Otherwise you are checking your installed version which may be older (and may not match tests and recent bug fixes). Could that be the problem? Chris On Oct 15, 2006, at 6:08 AM, Albert Vilella wrote: > Hi all, > > Can somebody check the SimpleAlign.t test? > > perl t/SimpleAlign.t > > I get a few errors, I am looking at one that deals with no_residues. I > don't understand if this is suposed to work: > > sub no_residues { > my $self = shift; > my $count = 0; > > foreach my $seq ($self->each_seq) { > my $str = $seq->seq(); > > $count += ($str =~ s/[^A-Za-z]//g); > #is this the same as: > # $str =~ s/[^A-Za-z]//g; > # $count += length($str); > } > > Cheers, > > Albert. > return $count; > } > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From DGroskreutz at twt.com Mon Oct 16 02:00:39 2006 From: DGroskreutz at twt.com (DGroskreutz at twt.com) Date: Mon, 16 Oct 2006 01:00:39 -0500 Subject: [Bioperl-l] CN=Deb Groskreutz/OU=MSN/O=TWT is out of the office. Message-ID: I will be out of the office starting 10/13/2006 and will not return until 10/30/2006. I will be out of the office until October 30, 2006. I will reply to your message at that time. Thanks, Deb NOTICE OF CONFIDENTIALITY: The information contained in this communication, including attachments, is intended for the specific delivery to and use by the individual(s) to whom it is addressed. This email includes confidential information that may be attorney-client privileged. Any review, retransmission, dissemination, or unauthorized use of this communication is strictly prohibited and may be unlawful. 
If you have received this communication in error, please reply to the sender immediately and delete the original communication and any copy of it from your computer system, including all attachments. From bix at sendu.me.uk Mon Oct 16 04:08:34 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 09:08:34 +0100 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> Message-ID: <45333E02.9070808@sendu.me.uk> Hilmar Lapp wrote: > What does the POD (and the code) say about instantiating it? =head1 SYNOPSIS # do not use this object directly, it provides the following methods # for its subclasses ... =head1 DESCRIPTION This is a basic module from which to build executable wrapper modules. It has some basic methods to help when implementing new modules. There is no new() method. From bix at sendu.me.uk Mon Oct 16 09:23:41 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 14:23:41 +0100 Subject: [Bioperl-l] Bio::WebAgent sleep warning Message-ID: <453387DD.3040105@sendu.me.uk> Hi, Does anyone think it's appropriate for Bio::WebAgent to issue warnings every time it sleeps? I'd consider the sleeping part of its normal, expected and desired behaviour so I don't need to be warned about it. Perhaps change the $self->warn to a $self->debug? From cjfields at uiuc.edu Mon Oct 16 10:12:10 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 09:12:10 -0500 Subject: [Bioperl-l] Bio::WebAgent sleep warning In-Reply-To: <453387DD.3040105@sendu.me.uk> Message-ID: <000c01c6f12d$121b5000$15327e82@pyrimidine> > Hi, > > Does anyone think it's appropriate for Bio::WebAgent to issue warnings > every time it sleeps? I'd consider the sleeping part of its normal, > expected and desired behaviour so I don't need to be warned about it. 
> Perhaps change the $self->warn to a $self->debug? That sounds fine. Using debugging output for sleep would be similar behavior to Bio::DB::NCBIHelper and BioDB::GenericWebDBI. You may want to pass it by Heikki (I think that's his module). The only reason I would want to see sleep output, personally, is to make sure it is working properly. Almost looks like that class has the same intent that GenericWebDBI has (even down to using LWP::UserAgent as a superclass). I may look into it to see if I can use this as a superclass for GenericWebDBI. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Mon Oct 16 10:26:21 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Mon, 16 Oct 2006 15:26:21 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig Message-ID: <4533968D.6040009@sheffield.ac.uk> Did anyone reconfigure the bioperl web server (which ever server hosts http://bioperl.org/DIST) by adding the following lines to the httpd.conf file: RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*) http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1 This will be required as a workaround to a bug in ActivePerl 5.8.8.819 which will result in a failed install of Bioperl via PPM. Cheers Nath From n.haigh at sheffield.ac.uk Mon Oct 16 11:30:16 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Mon, 16 Oct 2006 16:30:16 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A257.2000207@campus.iztacala.unam.mx> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> Message-ID: <4533A588.9020505@sheffield.ac.uk> Mauricio Herrera Cuadra wrote: > Done. Could you please check if it works as it should? > > Cheers, > Mauricio. Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got someone to pop it in http://bioperl/DIST Volunteers? 
BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for the PPD? I seem to remember that there was talk about having to maintain a separate Bundle::BioPerl for each release of Bioperl. Any ideas on this front? Nath From arareko at campus.iztacala.unam.mx Mon Oct 16 11:16:39 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 16 Oct 2006 10:16:39 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533968D.6040009@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> Message-ID: <4533A257.2000207@campus.iztacala.unam.mx> Done. Could you please check if it works as it should? Cheers, Mauricio. Nathan Haigh wrote: > Did anyone reconfigure the bioperl web server (which ever server hosts > http://bioperl.org/DIST) by adding the following lines to the httpd.conf > file: > > RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*) > http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1 > > This will be required as a workaround to a bug in ActivePerl 5.8.8.819 > which will result in a failed install of Bioperl via PPM. > > Cheers > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From arareko at campus.iztacala.unam.mx Mon Oct 16 11:33:33 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 16 Oct 2006 10:33:33 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A588.9020505@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> Message-ID: <4533A64D.6040203@campus.iztacala.unam.mx> Nathan Haigh wrote: > Mauricio Herrera Cuadra wrote: >> Done.
Could you please check if it works as it should? >> >> Cheers, >> Mauricio. > Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got > someone to pop it in http://bioperl/DIST > > Volunteers? You can send it to me. -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From akarger at CGR.Harvard.edu Mon Oct 16 11:54:33 2006 From: akarger at CGR.Harvard.edu (Amir Karger) Date: Mon, 16 Oct 2006 11:54:33 -0400 Subject: [Bioperl-l] Bio::Location::Split Message-ID: I recently came across bug 2101, where Bio::Location::Split::to_FTstring gives the incorrect order for multi-sublocation locations on the minus strand. That is, I found it by getting incorrect results, and then found it in Bugzilla and in the September archives. I'm converting CDS files from one format to another. E.g., I read an EMBL file with a chromosome and CDS features, and want to output the location in a FASTA header. If I do something like: while (my $seq = $in->next_seq) { foreach my $feat ($seq->get_SeqFeatures) { print $feat->location->to_FTstring() } } I get the wrong results for multi-exon CDSs on the -1 strand, as described in the bug report. Is there a relatively easy way around this? I assume I can't get at the original string of the location, which in this case is all I need. Can I just flip the order of the exons in certain cases? Chris F, can you tell me the preliminary solution you mentioned? I must say I'm sort of surprised this wasn't found before. It seems like a not-that-rare occurrence. Oh well.
Thanks, - Amir Karger Research Computing Life Sciences Division Harvard University From bix at sendu.me.uk Mon Oct 16 12:14:39 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 17:14:39 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A588.9020505@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> Message-ID: <4533AFEF.8080103@sendu.me.uk> Nathan Haigh wrote: > Mauricio Herrera Cuadra wrote: >> Done. Could you please check if it works as it should? >> >> Cheers, >> Mauricio. > Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got > someone to pop it in http://bioperl/DIST > > Volunteers? I'm sure Mauricio would be happy to do it, but so am I. You may want to hold off a little while until I release rc2, which may be a few hours away. > BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for > the PPD? I seem to remember that there was talk about having to maintain > a separate Bundle::BioPerl for each release of Bioperl. Any ideas on > this front? It depends on what is in the PPD and what kind of auto-dependency features the ActiveState installer has. Given Perl 5.8 and your current PPD, does Bioperl install with the same or fewer number of skips if you also install Bundle::BioPerl first? That is, does Bundle::BioPerl even do anything useful anymore? If not, obviously don't bother making it a pre-req. If it does, my opinion is that you make it a pre-req. If people really don't want to install the optional stuff they can download the .zip file and install manually without even a make. From Kevin.M.Brown at asu.edu Mon Oct 16 12:14:51 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Mon, 16 Oct 2006 09:14:51 -0700 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? Message-ID: <1A4207F8295607498283FE9E93B775B402196FAA@EX02.asurite.ad.asu.edu> > > Yes, people use the -alphabet parameter. 
If you set it to > something then > > Bioperl will not try to determine whether the sequence is > protein, rna, or > > dna and this is particularly useful when the sequence > contains characters > > that Bioperl would object to (sequences with distasteful > characters can be > > created by various applications, for example, or you might > introduce some > > weird character for some reason). Setting the -alphabet > would also speed up > > Bioperl a bit, for the same reason. > > Huh. That's what I assumed when I stumbled into the -alphabet > parameter. So I thought this would read the protein sequences > out of my genbank file and write a fasta file for me: > > my $seq_in = Bio::SeqIO->new( > -file => "<$file", > -format => "genbank", > -alphabet => "protein" # No effect? > ); > my $seq_out = Bio::SeqIO->new( > -file => ">$outfile", > -format => "fasta", > -alphabet => "protein" # No effect? > ); > while (my $inseq = $seq_in->next_seq) { > $inseq->molecule("protein"); # No effect? > $seq_out->write_seq($inseq); > } > > It didn't. Would it be a Good Thing if it did what I was > expecting? (Like I said I rolled my own, but I'm always > looking for ways to enhance BioPerl that other people might > find useful... Someday I will contribute something useful, by > golly. -grin-) > > (Background: I'm doing protein BLASTs from genbank files. To > make formatdb happy I have to have fasta files full of the > protein sequences.) This might work for your needs (CDS to protein FASTA).
my $seq_in = Bio::SeqIO->new(
    -file   => "<$file",
    -format => "genbank",
);
open my $seq_out, '>', $outfile or die "Cannot open $outfile: $!";
while (my $inseq = $seq_in->next_seq) {
    print $seq_out ">" . $inseq->display_id() . "\n";
    # translate() returns a sequence object; seq() gives the residue string
    print $seq_out $inseq->translate()->seq() . "\n";
}
close $seq_out;
From bix at sendu.me.uk Mon Oct 16 11:44:19 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 16:44:19 +0100 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
Message-ID: <4533A8D3.90709@sendu.me.uk> I think Chris recently deprecated this, but should it be? For me, its POD description justifies its existence, and perhaps more importantly, Bio::Index::Blast relies on it. I took a quick peek at the latter and it didn't seem trivial to move it over to Bio::SearchIO instead. Should it be undeprecated? From n.haigh at sheffield.ac.uk Mon Oct 16 12:39:02 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Mon, 16 Oct 2006 17:39:02 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533AFEF.8080103@sendu.me.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> <4533AFEF.8080103@sendu.me.uk> Message-ID: <4533B5A6.1070701@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: >> Mauricio Herrera Cuadra wrote: >>> Done. Could you please check if it works as it should? >>> >>> Cheers, >>> Mauricio. >> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got >> someone to pop it in http://bioperl/DIST >> >> Volunteers? > > I'm sure Mauricio would be happy to do it, but so am I. You may want > to hold off a little while until I release rc2, which may be a few > hours away. Just e-mailed Mauricio links to the files off list, It's not a big job for me to remake the bioperl PPD, so Mauricio it's up to you if you want to wait 18hrs for me to make the PPDs for 1.5.2-rc2. > > >> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for >> the PPD? I seem to remember that there was talk about having to maintain >> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on >> this front? > > It depends on what is in the PPD and what kind of auto-dependency > features the ActiveState installer has. Given Perl 5.8 and your > current PPD, does Bioperl install with the same or fewer number of > skips if you also install Bundle::BioPerl first? That is, does > Bundle::BioPerl even do anything useful anymore? 
If not, obviously > don't bother making it a pre-req. If it does, my opinion is that you > make it a pre-req. If people really don't want to install the optional > stuff they can download the .zip file and install manually without > even a make. As far as the PPDs are concerned - no tests are run during installation. PPM more or less just copies files into the correct place for Perl to find so both approaches result in the same thing. However, I've not tried making a CPAN distribution file for either Bioperl or Bundle::BioPerl - I wouldn't know where to start! Makefile.PL now only documents the prereqs in one place (%packages), and this is used to add the prereqs to the bioperl PPD when issuing "nmake ppd". This way, each release of BioPerl should be up-to-date with prereqs as long as developers add their modules' prereqs to %packages. If we have Bundle::BioPerl, most of those prereqs need to be moved from the Bioperl PPD to the Bundle::BioPerl PPD - a bit of a pain because there are no guidelines as to what should/should not go in Bundle::BioPerl. Therefore, as far as the PPDs are concerned, it is far easier to do away with Bundle::BioPerl. Nath From hlapp at gmx.net Mon Oct 16 13:04:24 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 16 Oct 2006 13:04:24 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <45333E02.9070808@sendu.me.uk> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> <45333E02.9070808@sendu.me.uk> Message-ID: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net> So it looks like an abstract base class, not an interface that defines a contract or API? Should use Root.pm then, would be my vote. -hilmar On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> What does the POD (and the code) say about instantiating it? > > =head1 SYNOPSIS > > # do not use this object directly, it provides the following > methods > # for its subclasses > > ...
> > > =head1 DESCRIPTION > > This is a basic module from which to build executable wrapper modules. > It has some basic methods to help when implementing new modules. > > > There is no new() method. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Mon Oct 16 13:08:28 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 16 Oct 2006 13:08:28 -0400 Subject: [Bioperl-l] Bio::WebAgent sleep warning In-Reply-To: <453387DD.3040105@sendu.me.uk> References: <453387DD.3040105@sendu.me.uk> Message-ID: It depends. What triggers the sleeping? If it's part of every request that it processes then I'd agree. If it is triggered by failure to precede the next try then the failure is probably not expected (though possible), and hence should be reported by warn(). If it is just part of the polling cycle then there should probably be a limit up to which the time waited is considered 'normal' and after which it is considered 'excessive' and hence should be reported through warn(). My $0.02. -hilmar On Oct 16, 2006, at 9:23 AM, Sendu Bala wrote: > Hi, > > Does anyone think it's appropriate for Bio::WebAgent to issue warnings > every time it sleeps? I'd consider the sleeping part of its normal, > expected and desired behaviour so I don't need to be warned about it. > Perhaps change the $self->warn to a $self->debug? 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Mon Oct 16 13:13:53 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 18:13:53 +0100 Subject: [Bioperl-l] Bio::WebAgent sleep warning In-Reply-To: References: <453387DD.3040105@sendu.me.uk> Message-ID: <4533BDD1.8060204@sendu.me.uk> Hilmar Lapp wrote: > It depends. What triggers the sleeping? If it's part of every request > that it processes then I'd agree. If it is triggered by failure to > precede the next try then the failure is probably not expected (though > possible), and hence should be reported by warn(). > > If it is just part of the polling cycle then there should probably be a > limit up to which the time waited is considered 'normal' and after which > it is considered 'excessive' and hence should be reported through warn(). =head2 sleep Title : sleep Usage : $self->sleep Function: sleep for a number of seconds indicated by the delay policy Returns : none Args : none NOTE: This method keeps track of the last time it was called and only imposes a sleep if it was called more recently than the delay_policy() allows. =cut It issues a warning every time it actually sleeps. I find it inappropriate that a method warns me that it did what I asked it to do. 
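The delay behaviour under discussion can be sketched in a few lines of plain Perl. This is only a minimal sketch: the class name My::PoliteAgent, its delay and verbose options, and the time_to_wait helper are all hypothetical and are not taken from Bio::WebAgent. It shows one way to report a routine sleep through a debug-style method, so that nothing is printed unless verbosity is switched on.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a polite web agent; NOT Bio::WebAgent's code.
package My::PoliteAgent;

sub new {
    my ($class, %args) = @_;
    my $self = {
        delay     => defined $args{delay} ? $args{delay} : 3, # seconds between requests
        verbose   => $args{verbose} || 0,
        last_call => undef,   # time of the previous request, if any
        messages  => [],      # debug messages, kept so they can be inspected
    };
    return bless $self, $class;
}

# Record a message only when verbose is on (the role a debug() call plays).
sub debug {
    my ($self, $msg) = @_;
    push @{ $self->{messages} }, $msg if $self->{verbose};
    return;
}

# Seconds still to wait at time $now before the next request (0 if none).
sub time_to_wait {
    my ($self, $now) = @_;
    return 0 unless defined $self->{last_call};
    my $remaining = $self->{delay} - ($now - $self->{last_call});
    return $remaining > 0 ? $remaining : 0;
}

# Sleep only if the previous call was more recent than the delay allows.
sub sleep {
    my ($self) = @_;
    my $wait = $self->time_to_wait(time);
    if ($wait > 0) {
        $self->debug("sleeping for $wait seconds");  # quiet by default
        CORE::sleep($wait);
    }
    $self->{last_call} = time;
    return;
}

package main;

my $agent = My::PoliteAgent->new(delay => 1, verbose => 0);
$agent->sleep;   # first call: nothing to wait for
$agent->sleep;   # second call: waits, but prints no warning
```

Whether an unusually long wait should still be reported through warn(), as suggested earlier in the thread, is a separate policy decision that this sketch leaves out.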
From arareko at campus.iztacala.unam.mx Mon Oct 16 13:14:06 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 16 Oct 2006 12:14:06 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533B5A6.1070701@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> <4533AFEF.8080103@sendu.me.uk> <4533B5A6.1070701@sheffield.ac.uk> Message-ID: <4533BDDE.2040801@campus.iztacala.unam.mx> Nathan Haigh wrote: > Sendu Bala wrote: >> Nathan Haigh wrote: >>> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got >>> someone to pop it in http://bioperl/DIST >>> >>> Volunteers? >> I'm sure Mauricio would be happy to do it, but so am I. You may want >> to hold off a little while until I release rc2, which may be a few >> hours away. > > Just e-mailed Mauricio links to the files off list, It's not a big job > for me to remake the bioperl PPD, so Mauricio it's up to you if you want > to wait 18hrs for me to make the PPDs for 1.5.2-rc2. Too late, I've already placed 1.5.2-rc1 in DIST. hehe :) -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From bix at sendu.me.uk Mon Oct 16 12:32:11 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 17:32:11 +0100 Subject: [Bioperl-l] Swissprot problems Message-ID: <4533B40B.2030908@sendu.me.uk> t/Biofetch.t and t/DB.t are skipping their swissprot database fetches. Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for maintenance but is now back up. However I'm guessing the databases must have changed. I've manually looked for the test case 'YNB3_YEAST' in database 'UniProtKB' and it came back with no result, even though I can find the test case manually at the expasy website. Is this an EBI bug or deliberate change that makes sense to someone?
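The skipping behaviour mentioned in this message can be illustrated with a Test::More SKIP block. The following is a rough sketch rather than the contents of t/Biofetch.t or t/DB.t: fetch_entry is a hypothetical stand-in that simulates the server returning nothing for 'YNB3_YEAST', and the test names are invented. It assumes the convention that remote tests run only when BIOPERLDEBUG is set.

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical stand-in for a remote fetch; it is NOT a BioPerl call.
# It dies for the test ID to simulate the server returning no result.
sub fetch_entry {
    my ($id) = @_;
    die "no result for $id\n" if $id eq 'YNB3_YEAST';
    return { id => $id, length => 100 };
}

ok(1, 'a local test that needs no network');

SKIP: {
    # Remote tests run only when the developer asks for them.
    skip 'remote tests need BIOPERLDEBUG=1', 3 unless $ENV{BIOPERLDEBUG};

    my $entry = eval { fetch_entry('YNB3_YEAST') };

    # Report the failed fetch as a real failure, then skip only the
    # tests that depend on the fetched entry.
    ok(defined $entry, 'remote fetch returned an entry')
        or skip 'no entry to inspect', 2;

    is($entry->{id}, 'YNB3_YEAST', 'entry has the requested ID');
    cmp_ok($entry->{length}, '>', 0, 'entry has a sequence');
}
```

With BIOPERLDEBUG unset, the three remote tests are skipped and the harness reports success. With it set, the failed fetch shows up as one failing test and only the two dependent tests are skipped, so the plan of four tests holds either way.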
From m.weimer at dkfz-heidelberg.de Mon Oct 16 12:43:38 2006 From: m.weimer at dkfz-heidelberg.de (Marc Weimer) Date: Mon, 16 Oct 2006 18:43:38 +0200 Subject: [Bioperl-l] Bio::DB::SwissProt Problem Message-ID: <1161017019.5203.6.camel@localhost> Dear list members, when running ###################################################################### #! /usr/bin/perl -w use strict; use Bio::DB::SwissProt; my $db_obj = new Bio::DB::SwissProt(-verbose => 1); my $seq_obj = $db_obj->get_Seq_by_acc("O02938"); ###################################################################### using Bioperl 1.5.2 I get the following message: ########################################################################################## request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch Content-Length: 49 Content-Type: application/x-www-form-urlencoded format=swissprot&db=UniProtKB&style=raw&id=O02938 ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: acc O02938 does not exist STACK: Error::throw STACK: Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350 STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181 STACK: ./get.test.pl:8 ----------------------------------------------------------- ########################################################################################## But the accession number does exist. Surprisingly, everything worked fine a few days ago. Any ideas of what might have happened? Thanks and best regards, Marc From hlapp at gmx.net Mon Oct 16 13:15:50 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 16 Oct 2006 13:15:50 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533A8D3.90709@sendu.me.uk> References: <4533A8D3.90709@sendu.me.uk> Message-ID: The problem is it is not maintained, and there are outstanding bug reports. If you un-deprecate it, then we need a response to people who come across problems with it when using it.
Either you change the POD to say exactly who and when one should use it (or rather not) and point to the fact that it is unsupported for all other cases. Or what would you suggest? -hilmar On Oct 16, 2006, at 11:44 AM, Sendu Bala wrote: > I think Chris recently deprecated this, but should it be? For me, its > POD description justifies its existence, and perhaps more importantly, > Bio::Index::Blast relies on it. > > I took a quick peek at the latter and it didn't seem trivial to > move it > over to Bio::SearchIO instead. > > Should it be undeprecated? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Oct 16 13:21:46 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:21:46 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533A8D3.90709@sendu.me.uk> Message-ID: <000001c6f147$8efdfd60$15327e82@pyrimidine> Bio::Tools::BPlite was placed on the deprecation list a while back (~ rel 1.5); the other related Bio::Tools::BP* modules were also supposed to be on that list as well. If we want to undeprecate (de-deprecate? reprecate?) BPlite we also would need to do the same for the others. They must be updated to parse current BLAST/PSI-BLAST/bl2seq text output, something that Bio::SearchIO::blast is currently capable of (so the functionality is redundant). And someone needs to take them over. In my opinion it may be more trouble than it's worth as they haven't been touched in a while. Seems if we 'revive' BPlite we're not really moving forward esp. since you have added the PullParser recently and made substantial improvements to SearchIO. 
Maybe Bio::Index::Blast just needs to be deprecated or rewritten to use SearchIO? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Monday, October 16, 2006 10:44 AM > To: bioperl-l > Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? > > I think Chris recently deprecated this, but should it be? For me, its > POD description justifies its existence, and perhaps more importantly, > Bio::Index::Blast relies on it. > > I took a quick peek at the latter and it didn't seem trivial to move it > over to Bio::SearchIO instead. > > Should it be undeprecated? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Mon Oct 16 13:21:58 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 18:21:58 +0100 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: References: <4533A8D3.90709@sendu.me.uk> Message-ID: <4533BFB6.5070504@sendu.me.uk> Hilmar Lapp wrote: > The problem is it is not maintained, and there are outstanding been bug > reports. > > If you un-deprecate it, then we need a response to people who come > across problems with it when using it. Either you change the POD to say > exactly who and when one should use it (or rather not) and point to the > fact that it is unsupported for all other cases. > > Or what would you suggest? I'm not sure. Does Bio::Index::Blast even work correctly? Does it suffer from whatever bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should that be deprecated as well? Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO and then Bio::Tools::BPlite can be deprecated. 
But like I say, it didn't seem trivial (or even appropriate). Ultimately I just wanted to solve the warnings in the test suite. Thoughts, Chris? From cjfields at uiuc.edu Mon Oct 16 13:30:05 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:30:05 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A588.9020505@sheffield.ac.uk> Message-ID: <000101c6f148$b8538b20$15327e82@pyrimidine> > Mauricio Herrera Cuadra wrote: > > Done. Could you please check if it works as it should? > > > > Cheers, > > Mauricio. > Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got > someone to pop it in http://bioperl/DIST > > Volunteers? > > BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for > the PPD? I seem to remember that there was talk about having to maintain > a separate Bundle::BioPerl for each release of Bioperl. Any ideas on > this front? > > Nath Nathan, I think Chris Dagdigian still maintains Bundle::Bioperl on CPAN. That version should be the common basis for prereqs for any Bioperl core installation. It's relatively easy to add/remove modules to the Bundle::Bioperl. Contact Chris D. and let him know if anything needs to be changed. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 13:33:50 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:33:50 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net> Message-ID: <000201c6f149$3ed63490$15327e82@pyrimidine> > So it looks like an abstract base class, not an interface that > defines a contract or API? Should use Root.pm then, would be my vote. > > -hilmar Makes sense to me. Maybe another audit is needed to catch similar instances, or has this been done already? 
Chris > On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote: > > > Hilmar Lapp wrote: > >> What does the POD (and the code) say about instantiating it? > > > > =head1 SYNOPSIS > > > > # do not use this object directly, it provides the following > > methods > > # for its subclasses > > > > ... > > > > > > =head1 DESCRIPTION > > > > This is a basic module from which to build executable wrapper modules. > > It has some basic methods to help when implementing new modules. > > > > > > There is no new() method. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 13:57:35 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:57:35 -0500 Subject: [Bioperl-l] Bio::Location::Split In-Reply-To: Message-ID: <000301c6f14c$8fb0e060$15327e82@pyrimidine> > I recently came across bug 2101, where Bio::Location::Split::to_FTstring > gives the incorrect order for multi-sublocation locations on the minus > strand. That is, I found it by getting incorrect results, and then found > it in Bugzilla and in the September archives. > > I'm converting CDS files from one format to another. E.g., I read an > EMBL file with a chromosome and CDS features, and want to output the > location in a FASTA header. If I do something like: > > foreach (<$in>) { > foreach my $feat ($seq->getSeqFeatures) { > print $feat->location->to_FTstring() > } > } > > I get the wrong results for multi-exon CDSs on the -1 strand, as > described in the bug report. > > Is there a relatively easy way around this? I assume I can't get at the > original string of the location, which in this case is all I need. Can I > just flip the order of the exons in certain cases? Chris F, can you tell > me the preliminary solution you mentioned? > > I must say I'm sort of surprised this wasn't found before. It seems like > a not-that-rare occurrence. Oh well. 
> > Thanks, > > - Amir Karger > Research Computing > Life Sciences Division > Harvard University Could you let me know specifically which EMBL file contains the odd locations? The bug report uses theoretical locations, not actual ones, so it would be nice to have a real-world example to test against. As for the lack of catching this, the particular types of locations that cause the issue are quite rare. Note that there are two bugs for that bug report. The first (and more serious) is still unresolved. The second (where remote locations are treated differently in Location::Split, which caused more problems than it was worth) had a fix committed about a month ago. Any fixes I have made for the first bug invariably break several other methods, which use the current Location::Split object logic for retrieving sequences, building feature strings, etc. Since a new RC is imminent and the bug only affects a small number of locations, I have held off until after a final release is made (the last thing I want to do is fix something that breaks ~6-8 other methods), but I'll try looking at it again this week. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 14:29:02 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 13:29:02 -0500 Subject: [Bioperl-l] Swissprot problems In-Reply-To: <4533B40B.2030908@sendu.me.uk> Message-ID: <000401c6f150$f57dfc30$15327e82@pyrimidine> > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Monday, October 16, 2006 11:32 AM > To: bioperl-l > Subject: [Bioperl-l] Swissprot problems > > t/Biofetch.t and t/DB.t are skipping their swissprot database fetches. > Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for > maintenance but is now back up. However I'm guessing the databases must > have changed. 
I've manually looked for the test case 'YNB3_YEAST' in > database 'UniProtKB' and it came back with no result, even though I can > find the test case manually at the expasy website. > > Is this an EBI bug or deliberate change that makes sense to someone? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l I can confirm that. It's not our end, though. Entering the same data on the DBFetch web page also gets no data. I have emailed EBI about the problem and will let you know if I hear anything; I think it's related to the maintenance issue. Notably, nothing on the web page indicates any database name changes yet. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 14:29:52 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 13:29:52 -0500 Subject: [Bioperl-l] Bio::DB::SwissProt Problem In-Reply-To: <1161017019.5203.6.camel@localhost> Message-ID: <000501c6f151$12918710$15327e82@pyrimidine> We think there is a problem on the SwissProt (DBFetch) server. I have contacted them about the problem and will post something when I hear something back. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Marc Weimer > Sent: Monday, October 16, 2006 11:44 AM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Bio::DB::SwissProt Problem > > Dear list members, > > when running > > ###################################################################### > #! 
/usr/bin/perl -w > > use strict; > use Bio::DB::SwissProt; > > my $db_obj = new Bio::DB::SwissProt(-verbose => 1); > > my $seq_obj = $db_obj->get_Seq_by_acc("O02938"); > ###################################################################### > > using Bioperl 1.5.2 I get the following message: > > ########################################################################## > ################ > > request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch > Content-Length: 49 > Content-Type: application/x-www-form-urlencoded > > format=swissprot&db=UniProtKB&style=raw&id=O02938 > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: acc O02938 does not exist > STACK: Error::throw > STACK: > Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350 > STACK: > Bio::DB::WebDBSeqI::get_Seq_by_acc > /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181 > STACK: ./get.test.pl:8 > ----------------------------------------------------------- > > ########################################################################## > ################ > > But the accession number does exist. Surprisingly, everything worked > fine a few days ago. Any ideas of what might have happened? > > Thanks and best regards, > > Marc > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Mon Oct 16 14:39:28 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 13:39:28 -0500 Subject: [Bioperl-l] SwissProt Down Message-ID: <000601c6f152$6997dbd0$15327e82@pyrimidine> Looks like the swissprot problem stems from maintenance at EBI. From the EBI page http://www.ebi.ac.uk/Information/ (not on the DBFetch page, BTW): Please Note: Monday October 16th 12:00-15:00 - Due to general maintenance, some services from the EBI may be temporarily unavailable. We apologise for any inconvenience. 
At least we know that Test::More skips are working! Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Mon Oct 16 14:51:31 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 19:51:31 +0100 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: References: Message-ID: <4533D4B3.2000809@sendu.me.uk> Brian Osborne wrote: > Sendu, > > I just made a commit that makes Bio::Index::Blast use SearchIO instead of > BPlite. I was concerned about the whole id_parser thing. Did you determine that your change still allows for id_parser to be used and have the intended effect, or that id_parser is in someway meaningless and should be removed as a method? From cjfields at uiuc.edu Mon Oct 16 15:03:33 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 14:03:33 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533BFB6.5070504@sendu.me.uk> Message-ID: <000301c6f155$c7029ff0$15327e82@pyrimidine> > Hilmar Lapp wrote: > > The problem is it is not maintained, and there are outstanding been bug > > reports. > > > > If you un-deprecate it, then we need a response to people who come > > across problems with it when using it. Either you change the POD to say > > exactly who and when one should use it (or rather not) and point to the > > fact that it is unsupported for all other cases. > > > > Or what would you suggest? > > I'm not sure. > > Does Bio::Index::Blast even work correctly? Does it suffer from whatever > bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should > that be deprecated as well? > > Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO > and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't > seem trivial (or even appropriate). > > Ultimately I just wanted to solve the warnings in the test suite. > Thoughts, Chris? 
My opinion is we either have to completely support BPlite (and the others) or drop it altogether. I don't think we can state "use BPlite only with Bio::Index::Blast, use SearchIO everywhere else." That's too inconsistent. It seems simpler to deprecate the various Bio::Tools::BP* classes and either fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working on) or deprecate Bio::Index::Blast as well. The warnings in the test suite belong to BlastIndex.t, correct? I updated using Brian's Bio::Index::Blast fix and it passes now w/o warnings. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From akarger at CGR.Harvard.edu Mon Oct 16 15:00:28 2006 From: akarger at CGR.Harvard.edu (Amir Karger) Date: Mon, 16 Oct 2006 15:00:28 -0400 Subject: [Bioperl-l] Bio::Location::Split Message-ID: > -----Original Message----- > From: Chris Fields [mailto:cjfields at uiuc.edu] > > > > I'm converting CDS files from one format to another. E.g., I read an > > EMBL file with a chromosome and CDS features, and want to output the > > location in a FASTA header. > > > > I get the wrong results for multi-exon CDSs on the -1 strand, as > > described in the bug report. > > > > Could you let me know specifically which EMBL file contains the odd > locations? The bug report uses theoretical locations, not > actual ones, so > it would be nice to have a real-world example to test against. I downloaded Candida glabrata chromosome B from EBI: http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948 testportal>perl location.pl new_glabrata_B.embl > bio testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/' new_glabrata_B.embl > nonbio testportal>wc bio nonbio 217 217 4537 bio 217 217 4549 nonbio 434 434 9086 total testportal>diff bio nonbio 4c4 < complement(join(10632..11157,10347..10372)) --- > join(complement(10632..11157),complement(10347..10372)) Just one example here, but see below.
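The difference in the diff above comes down to segment order. By the definition of a reverse complement, join(complement(A),complement(B)) describes the same sequence as complement(join(B,A)), so the segments have to be listed in the opposite order once the complement is moved outside the join. A minimal stand-alone sketch of that reordering follows; minus_strand_ft_string is a hypothetical helper written for illustration, not a BioPerl method and not the fix for bug 2101, and it assumes every sublocation is on the minus strand with no remote or fuzzy positions.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: build a feature-table string
# for a split location whose segments are all on the minus strand.
# @segments holds [start, end] pairs in the order they appear inside
# join(complement(..),complement(..)), i.e. transcript order.
sub minus_strand_ft_string {
    my (@segments) = @_;
    # Moving complement() outside join() reverses the segment order:
    # join(complement(A),complement(B)) == complement(join(B,A)).
    my $inner = join ',', map { "$_->[0]..$_->[1]" } reverse @segments;
    return "complement(join($inner))";
}

# Segments from the Candida glabrata example, in transcript order.
my @segments = ([10632, 11157], [10347, 10372]);
print minus_strand_ft_string(@segments), "\n";
# prints: complement(join(10347..10372,10632..11157))
```

This only post-processes coordinates that are already known; it does not change how Bio::Location::Split stores or orders its sublocations.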
> As for the lack of catching this, the particular types of > locations that > cause the issue are quite rare. Really? I guess our definitions of rare depend on which sequences we're working with. I'm doing fungal genomes, and here's a grep for a few species' entire genomes: testportal>foreach i ( *.embl ) foreach? echo $i foreach? grep CDS $i | grep join | grep -c complement foreach? end glabrata_orf.embl 29 hansenii_orf.embl 151 lactis_orf.embl 70 lipolytica_orf.embl 337 pombe_orf.embl 1137 You might like to use pombe as a test case, as it has lots of these complement joins, including ones with multiple introns. Anyway, I'd question the "rare" designation. It seems to me like any species that has introns will have situations like this in their CDSs. Not to mention any other sequence that uses Bio::Location::Split. (Since I'm not a Real Biologist, I can't think up more examples here, but I'm sure they exist.) Or are you saying it's rare to use join(complement(C..D), complement(A..B)) instead of complement(join(A..B, C..D))? In that case, I guess I just got really unlucky in that five fungal genomes I was using decided to use the "rare" syntax. > Note that there are two bugs > for that bug > report. The first (and more serious) is still unresolved. The second > (where remote locations are treated differently in > Location::Split, which > caused more problems than it was worth) had a fix committed > about a month > ago. Sadly, it's the first (and in my case, more common (I have no remote locations.)) bug for me. > Any fixes I have made for the first bug invariably break several other > methods, which use the current Location::Split object logic > for retrieving > sequences, building feature strings, etc.
Since a new RC is > imminent and > the bug only affects a small number of locations, I have held > off until > after a final release is made (the last thing I want to do is > fix something > that breaks ~6-8 other methods), but I'll try looking at it > again this week. IMO this is a pretty serious bug (if these kinds of sequences aren't that rare as I've shown above), because you're outputting sequence descriptions that are just plain wrong. Anyone who uses FTLocationFactory to read these output description will have incorrect sequence, incorrect translated proteins, etc. And it's even more serious if other methods are depending on it. I know I can't dictate your time, and should be volunteering to work on fixing it. But if it affects other modules, then I will no doubt break things even more than you have in your attempts. -Amir > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > From bosborne11 at verizon.net Mon Oct 16 14:25:14 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 14:25:14 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533A8D3.90709@sendu.me.uk> Message-ID: Sendu, I just made a commit that makes Bio::Index::Blast use SearchIO instead of BPlite. The BlastIndex.t test is giving a few warnings so I need to take a look at that but all tests are passing. An awful lot of work has gone into the SearchIO system, for more on why its approach is deemed to be superior in the context of Bioperl see the SearchIO HOWTO. One key feature of this upcoming release is an emphasis on removing extraneous modules, I think it's safe to say that BPlite has been considered extraneous for a number of years now. Brian O. On 10/16/06 11:44 AM, "Sendu Bala" wrote: > I think Chris recently deprecated this, but should it be? For me, its > POD description justifies its existence, and perhaps more importantly, > Bio::Index::Blast relies on it. 
> > I took a quick peek at the latter and it didn't seem trivial to move it > over to Bio::SearchIO instead. > > Should it be undeprecated? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bosborne11 at verizon.net Mon Oct 16 14:59:38 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 14:59:38 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533D4B3.2000809@sendu.me.uk> Message-ID: Sendu, OK. I _think_ this change shouldn't affect id_parser() but I will test this in BlastIndex.t. The id_parser() method is relevant to all these Index* modules - don't know how much it's used but it certainly is nice to have it available. Brian O. On 10/16/06 2:51 PM, "Sendu Bala" wrote: > Brian Osborne wrote: >> Sendu, >> >> I just made a commit that makes Bio::Index::Blast use SearchIO instead of >> BPlite. > > I was concerned about the whole id_parser thing. Did you determine that > your change still allows for id_parser to be used and have the intended > effect, or that id_parser is in someway meaningless and should be > removed as a method? From cjfields at uiuc.edu Mon Oct 16 16:51:08 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 15:51:08 -0500 Subject: [Bioperl-l] Bio::Location::Split In-Reply-To: Message-ID: <000001c6f164$d1380190$15327e82@pyrimidine> ... 
> I downloaded candida glabrata chromosome B from EBI: > http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948 > > testportal>perl location.pl new_glabrata_B.embl > bio > testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/' > new_glabrata_B.embl > nonbio > testportal>wc bio nonbio > 217 217 4537 bio > 217 217 4549 nonbio > 434 434 9086 total > testportal>diff bio nonbio > 4c4 > < complement(join(10632..11157,10347..10372)) > --- > > join(complement(10632..11157),complement(10347..10372)) > > Just one example here, but see below. > > > As for the lack of catching this, the particular types of > > locations that > > cause the issue are quite rare. > > Really? I guess our definitions of rare depend on which sequences we're > working with. I'm doing fungal genomes, and here's a grep for a few > species' entire genomes: > > testportal>foreach i ( *.embl ) > foreach? echo $i > foreach? grep CDS $i | grep join | grep -c complement > foreach? end > glabrata_orf.embl > 29 > hansenii_orf.embl > 151 > lactis_orf.embl > 70 > lipolytica_orf.embl > 337 > pombe_orf.embl > 1137 > > You might like to use pombe as a test case, as it has lots of these > complement joins, including ones with multiple introns. I'll use those. I'll see if an analogous GenBank file exists as well. I can probably make a preliminary fix for FT_string() so that it arranges the sublocations correctly, but I think the best way to go is to have FTLocationFactory not modify the various sublocations to start with, which it currently does when it sets strand() (strand() propagates the strand info to sublocations). > Anyway, I'd question the "rare" designation. It seems to me like any > species that has introns will have situations like this in their CDSs. > Not to mention any other sequence that uses Bio::Location::Split. (Since > I'm not a Real Biologist, I can't think up mor examples here, but I'm > sure they exist.) I think that additional tests are definitely needed for pulling out sequences. 
What I mean by 'rare' is that the majority of sequences do not have
problems. Also, this seems to be a 'silent' bug, since the error shows up in
to_FTstring() but the object sublocations seem to be processed correctly when
using the location object directly (such as via SeqFeatureI).

Round-tripping the sequence should pick it up though, since

complement(join(10632..11157,10347..10372))

is not the same as

join(complement(10632..11157),complement(10347..10372))

That is essentially what you are doing, correct? i.e. getting the sequences
using Bioperl, saving them (which passes them through SeqIO), reading them
again (back through SeqIO with the malformed location string).

> Or are you saying it's rare to use join (complement(C..D),
> complement(A..B)) instead of complement(join(A..B, C..D)). In that case,
> I guess I just got really unlucky in that five fungal genomes I was
> using decided to use the "rare" syntax.

Location::Split is supposed to handle all variations, but apparently it
doesn't.

> > Note that there are two bugs
> > for that bug
> > report. The first (and more serious) is still unresolved. The second
> > (where remote locations are treated differently in
> > Location::Split, which
> > caused more problems than it was worth) had a fix committed
> > about a month
> > ago.
>
> Sadly, it's the first (and in my case, more common (I have no remote
> locations.)) bug for me.
>
> > Any fixes I have made for the first bug invariably break several other
> > methods, which use the current Location::Split object logic
> > for retrieving
> > sequences, building feature strings, etc. Since a new RC is
> > imminent and
> > the bug only affects a small number of locations, I have held
> > off until
> > after a final release is made (the last thing I want to do is
> > fix something
> > that breaks ~6-8 other methods), but I'll try looking at it
> > again this week.
> > IMO this is a pretty serious bug (if these kinds of sequences aren't > that rare as I've shown above), because you're outputting sequence > descriptions that are just plain wrong. Anyone who uses > FTLocationFactory to read these output description will have incorrect > sequence, incorrect translated proteins, etc. And it's even more serious > if other methods are depending on it. > > I know I can't dictate your time, and should be volunteering to work on > fixing it. But if it affects other modules, then I will no doubt break > things even more than you have in your attempts. > > -Amir I'll give it a look over the next week. Like I mentioned above, I may be able to fix it in Split::to_FTstring() w/o breaking other tests (in which case I'll commit it for the 1.5.2 release), but it would be a temporary hack until I can work out why other tests are failing. Chris From jason at bioperl.org Mon Oct 16 18:45:21 2006 From: jason at bioperl.org (Jason Stajich) Date: Mon, 16 Oct 2006 15:45:21 -0700 Subject: [Bioperl-l] split location problems Message-ID: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org> The whole point of split locations is to represent genes with introns so that is not the "rare" case. I'm confused where the problem is. The locations that I get out with to_FTstring on the location object are exactly the same as those input. I have processed the genbank fungal genomes into GFF3 and have had no problems so I'm confused where you are breaking down. If I write them out as embl I also get the correct thing. This is using the CVS version of bioperl from the HEAD. I've added code to test this to bug 2101 including a C.glabrata chromsome downloaded from genbank. Perhaps the problem is on the EMBL parsing side, I didn't test that. On the technical side, I still am not sure I fully know where the strand information should be stored - the top level container or the sub-features. 
I'll try and stay up on the discussion if anything has been decided that I
should know about.

-jason

From torsten.seemann at infotech.monash.edu.au Mon Oct 16 18:23:23 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 17 Oct 2006 08:23:23 +1000
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <000201c6f149$3ed63490$15327e82@pyrimidine>
References: <000201c6f149$3ed63490$15327e82@pyrimidine>
Message-ID: <4534065B.9020309@infotech.monash.edu.au>

Chris Fields wrote:
>> So it looks like an abstract base class, not an interface that
>> defines a contract or API? Should use Root.pm then, would be my vote.
>> -hilmar
>
> Makes sense to me. Maybe another audit is needed to catch similar
> instances, or has this been done already?

The purpose of my original (poorly phrased) question was to try and sort
out where Root and RootI were being used the wrong way around.

I'm currently "all-audited out" so I leave this task to another volunteer.

--
Dr Torsten Seemann http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia

From cjfields at uiuc.edu Mon Oct 16 21:07:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 20:07:55 -0500
Subject: [Bioperl-l] split location problems
In-Reply-To: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
Message-ID:

On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote:

> The whole point of split locations is to represent genes with
> introns so that is not the "rare" case.
>
> I'm confused where the problem is. The locations that I get out
> with to_FTstring on the location object are exactly the same as
> those input.

The problem is with a subset of split locations described in the bug report.
The following works:

complement(join(2691..4571,4918..5163))

whereas this:

join(complement(4918..5163),complement(2691..4571))

gives this:

complement(join(4918..5163,2691..4571))

which is not syntactically the same. It should be:

complement(join(2691..4571,4918..5163))

since 'join' implies that the order of the segments to be joined is
important ('order' and 'bond' do not, I guess).

> I have processed the genbank fungal genomes into GFF3 and have had
> no problems so I'm confused where you are breaking down. If I
> write them out as embl I also get the correct thing. This is using
> the CVS version of bioperl from the HEAD.
>
> I've added code to test this to bug 2101 including a C.glabrata
> chromsome downloaded from genbank. Perhaps the problem is on the
> EMBL parsing side, I didn't test that.
>
> On the technical side, I still am not sure I fully know where the
> strand information should be stored - the top level container or
> the sub-features. I'll try and stay up on the discussion if
> anything has been decided that I should know about.
>
> -jason

Split::strand() sets the sublocations as well, which seems to confuse
the situation more but it is consistent with LocationI, as Hilmar
points out. I'm looking into a few solutions now, including a fix in
Split::to_FTstring().

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Mon Oct 16 22:48:14 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Oct 2006 19:48:14 -0700
Subject: [Bioperl-l] split location problems
In-Reply-To:
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
Message-ID: <8273f6c20610161948w201537a5v2fcfa189eb809283@mail.gmail.com>

This probably was exposed by the fact that the Split object used to
explicitly sort the features by start*strand always.
But with remote locations and needing to be able to explicitly set the order (for features that are not required to be 5' -> 3') that code must have been removed. I think there is just one place that must be missing a 'reverse' on the list of sub-locations when the top-level feature is a complement. I'll wait for your fix before wading in - we probably might want to figure out a 'consolidate' method to shrink redundant and equivalent representations to the shortest possible form. Ugh this really starts to resemble trying to write a boolean logic toolkit.... -jason On 10/16/06, Chris Fields wrote: > > > On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote: > > > The whole point of split locations is to represent genes with > > introns so that is not the "rare" case. > > > > I'm confused where the problem is. The locations that I get out > > with to_FTstring on the location object are exactly the same as > > those input. > > The problem is with the a subset of split locations described in the > bug report. The following works: > > complement(join(2691..4571,4918..5163)) > > whereas this: > > join(complement(4918..5163),complement(2691..4571)) > > gives this: > > complement(join(4918..5163,2691..4571)) > > which is not syntactically the same. It should be: > > complement(join(2691..4571,4918..5163)) > > since 'join' implies that the order of the segments to be joined is > important ('order' and 'bond' do not, I guess). > > > I have processed the genbank fungal genomes into GFF3 and have had > > no problems so I'm confused where you are breaking down. If I > > write them out as embl I also get the correct thing. This is using > > the CVS version of bioperl from the HEAD. > > > > I've added code to test this to bug 2101 including a C.glabrata > > chromsome downloaded from genbank. Perhaps the problem is on the > > EMBL parsing side, I didn't test that. 
> > > > On the technical side, I still am not sure I fully know where the > > strand information should be stored - the top level container or > > the sub-features. I'll try and stay up on the discussion if > > anything has been decided that I should know about. > > > > -jason > > Split::strand() sets the sublocations as well, which seems to confuse > the situation more but it is consistent with LocationI, as Hilmar > points out. I'm looking into a few solutions now, including a fix in > Split::to_FTstring(). > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > -- Jason Stajich jason at bioperl.org http://www.duke.edu/~jes12/ From cjfields at uiuc.edu Mon Oct 16 23:34:25 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 22:34:25 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: References: Message-ID: On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote: > Chris and Sendu, > > Sendu was correct in wondering whether id_parser() in Blast.pm > would work > after the module was altered to use SearchIO but what I've found > out from my > local tests is that id_parser() didn't work when BPlite was being used > either. I can continue to work on this but it's safe to say that > removing > BPlite doesn't cause a problem with id_parser, it was already there. > > Brian O. .... It may be one reason (the main reason?) the method wasn't tested. Maybe it should be removed if it can't be easily fixed; I don't think it makes sense keeping it otherwise. Chris Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bosborne11 at verizon.net Mon Oct 16 23:24:59 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 23:24:59 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? 
In-Reply-To: <000301c6f155$c7029ff0$15327e82@pyrimidine> Message-ID: Chris and Sendu, Sendu was correct in wondering whether id_parser() in Blast.pm would work after the module was altered to use SearchIO but what I've found out from my local tests is that id_parser() didn't work when BPlite was being used either. I can continue to work on this but it's safe to say that removing BPlite doesn't cause a problem with id_parser, it was already there. Brian O. On 10/16/06 3:03 PM, "Chris Fields" wrote: >> Hilmar Lapp wrote: >>> The problem is it is not maintained, and there are outstanding been bug >>> reports. >>> >>> If you un-deprecate it, then we need a response to people who come >>> across problems with it when using it. Either you change the POD to say >>> exactly who and when one should use it (or rather not) and point to the >>> fact that it is unsupported for all other cases. >>> >>> Or what would you suggest? >> >> I'm not sure. >> >> Does Bio::Index::Blast even work correctly? Does it suffer from whatever >> bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should >> that be deprecated as well? >> >> Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO >> and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't >> seem trivial (or even appropriate). >> >> Ultimately I just wanted to solve the warnings in the test suite. >> Thoughts, Chris? > > My opinion is we either have to completely support BPlite (and the others) > or drop it altogether. I don't think we can state "use BPLite only with > Bio::Index::Blast, use SearchIO everywhere else." That's too inconsistent. > > > It seems simpler to deprecate the various Bio::Tools::BP* classes and either > fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working > on) or deprecate Bio::Index::Blast as well. > > The warnings in the test suite belong to BlastIndex.t, correct? I updated > using Brian's Bio::Index::blast fix and it passes now w/o warnings. 
> > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bosborne11 at verizon.net Mon Oct 16 23:48:56 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 23:48:56 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: Message-ID: Chris, OK. In fact there's no written guarantee that all Bio::Index* modules have an id_parser() method. It happens that most do, and it's useful. I'll fix the documentation in Bio::Index::Blast and add an enhancement request to Bugzilla, may be able to get around to before 1.5.2 release but no promises. Brian O. On 10/16/06 11:34 PM, "Chris Fields" wrote: > > On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote: > >> Chris and Sendu, >> >> Sendu was correct in wondering whether id_parser() in Blast.pm >> would work >> after the module was altered to use SearchIO but what I've found >> out from my >> local tests is that id_parser() didn't work when BPlite was being used >> either. I can continue to work on this but it's safe to say that >> removing >> BPlite doesn't cause a problem with id_parser, it was already there. >> >> Brian O. > > .... > > It may be one reason (the main reason?) the method wasn't tested. > Maybe it should be removed if it can't be easily fixed; I don't think > it makes sense keeping it otherwise. > > Chris > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Tue Oct 17 02:35:43 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh)
Date: Tue, 17 Oct 2006 07:35:43 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
Message-ID: <453479BF.90408@sheffield.ac.uk>

I'm a bit unclear as to what is happening with these files.

Are these files now superseded by the wikified versions? If so, should
these files now simply contain a link to the wikified versions?
Otherwise things could get in a mess, since I updated the wiki version of
INSTALL.WIN and I see Sendu updated the CVS versions a couple of weeks
ago. Hopefully the differences aren't that big.

Nath

From faruque at ebi.ac.uk Tue Oct 17 04:19:44 2006
From: faruque at ebi.ac.uk (Nadeem Faruque)
Date: Tue, 17 Oct 2006 09:19:44 +0100
Subject: [Bioperl-l] split location problems
Message-ID:

EMBL currently outputs join-complements in the format
    join(complement(30..40),complement(10..20))
instead of the GenBank-preferred
    complement(join(10..20,30..40))

EMBL's format may reflect what happens in the cell a little more closely
than GenBank's, but it is less readable and less concise.

NB I've also seen a couple of people construct these incorrectly, e.g.
    join(complement(10..20),complement(30..40))

I believe we are moving to the complement-join format, but I can't give a
date for the transition.

Having said that, trans-splicing will still give us the joys of complex
locations, e.g.
    join(1..5,complement(join(10..20,30..40)))
    complement(join(30..40,10..20))
The second looks wrong (unless it is a very small circle), but mis-ordered
exons are resolved by the trans-splicing machinery.

Nadeem

--
S.M. Nadeem N.
Faruque
EMBL Nucleotide Database Curation Team
EMBL Outstation Tel: +44 1223 494611 Fax: +44 1223 494472
The European Bioinformatics Institute URL: http://www.ebi.ac.uk/
Email for data submissions: datasubs at ebi.ac.uk
Email for updates: update at ebi.ac.uk
========================================================

From bix at sendu.me.uk Tue Oct 17 04:59:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 09:59:36 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> <45333E02.9070808@sendu.me.uk> <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
Message-ID: <45349B78.8090905@sendu.me.uk>

Hilmar Lapp wrote:
> So it looks like an abstract base class, not an interface that
> defines a contract or API? Should use Root.pm then, would be my vote.

Agreed, that was actually what I did in my local copy when I made a new
inheriting class (so discovering the problem). This change is harmless
to other modules, but does mean they'll have redundant use of
Bio::Root::Root which will want cleaning up at some stage.

From bix at sendu.me.uk Tue Oct 17 06:32:54 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 11:32:54 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
Message-ID: <4534B156.4090501@sendu.me.uk>

Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.

See http://www.bioperl.org/wiki/Release_1.5.2 for instructions on
getting and testing this RC.

Developers:
This should be the last RC before release ~next Monday. Now would be a
good time for last-minute documentation updates and additions.

Users:
Even though 1.5.2 is a 'developer' release, we consider it the most
stable and capable version of Bioperl, and recommend that you use it in
all but the most critical production environments.
Please try it out and let us know of any problems or difficulties you run into. Thank you, Sendu. From cjfields at uiuc.edu Tue Oct 17 07:16:47 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 06:16:47 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <453479BF.90408@sheffield.ac.uk> References: <453479BF.90408@sheffield.ac.uk> Message-ID: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu> The general consensus was to keep text versions available; we could add URL links to the wiki pages for the most up-to-dat version. BTW, I have modified INSTALL already. INSTALL.WIN is next in line (I was waiting for your changes). Chris On Oct 17, 2006, at 1:35 AM, Nathan S. Haigh wrote: > I'm a bit unclear as to what is happening with these files. > > Are these files now superseded by the wikified versions? If so, should > these files now just simply contain a link to the wikified versions - > otherwise things could get in a mess since I updated the wiki > version of > INSTALL.WIN and I see Sendu updated the cvs versions a couple of weeks > ago - hopefully these differences aren't that big. > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Tue Oct 17 07:45:45 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 12:45:45 +0100 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu> References: <453479BF.90408@sheffield.ac.uk> <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu> Message-ID: <4534C269.5050704@sheffield.ac.uk> Chris Fields wrote: > The general consensus was to keep text versions available; we could > add URL links to the wiki pages for the most up-to-dat version. 
BTW,
> I have modified INSTALL already. INSTALL.WIN is next in line (I was
> waiting for your changes).
>

Is it possible to generate these files from the wiki whenever there is a
release? I know edits shouldn't be too severe or too often - but I can
see things getting a little messy/annoying if edits have to be made in 2
places.

Nath

From cjfields at uiuc.edu Tue Oct 17 10:04:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:04:32 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534C269.5050704@sheffield.ac.uk>
Message-ID: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>

There isn't a very easy way since so many links have to be removed/modified.
I have found a few CPAN modules that could help, but for now I just dump the
text output from a text browser (elinks) using the 'printable version' page
and hand-edit, which works very quickly. That works for the time being
until I can find another more automated solution.

Fortunately there have been very few edits to either INSTALL wiki page so
they should remain relatively stable.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
> Sent: Tuesday, October 17, 2006 6:46 AM
> To: Chris Fields
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>
> Chris Fields wrote:
> > The general consensus was to keep text versions available; we could
> > add URL links to the wiki pages for the most up-to-dat version. BTW,
> > I have modified INSTALL already. INSTALL.WIN is next in line (I was
> > waiting for your changes).
> >
> Is it possible to generate these files from the wiki whenever there is a
> release? I now edits shouldn't be too severe or too often - but I can
> see things getting a little messy/annoying if edits have to be made in 2
> places.
> > Nath From cjfields at uiuc.edu Tue Oct 17 10:12:09 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:12:09 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: Message-ID: <000401c6f1f6$424b5580$15327e82@pyrimidine> > Chris, > > OK. In fact there's no written guarantee that all Bio::Index* modules have > an id_parser() method. It happens that most do, and it's useful. I'll fix > the documentation in Bio::Index::Blast and add an enhancement request to > Bugzilla, may be able to get around to before 1.5.2 release but no > promises. > > Brian O. Do the various Bio::Index* modules share a common interface? I wouldn't worry too much about it for this release, unless you really have time. It is still, after all, a developer's release, and you've noted it in Bugzilla. We could try for another dev release in winter (rel 1.5.3, I guess) to get any bug fixes or new modules added. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > On 10/16/06 11:34 PM, "Chris Fields" wrote: > > > > > On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote: > > > >> Chris and Sendu, > >> > >> Sendu was correct in wondering whether id_parser() in Blast.pm > >> would work > >> after the module was altered to use SearchIO but what I've found > >> out from my > >> local tests is that id_parser() didn't work when BPlite was being used > >> either. I can continue to work on this but it's safe to say that > >> removing > >> BPlite doesn't cause a problem with id_parser, it was already there. > >> > >> Brian O. > > > > .... > > > > It may be one reason (the main reason?) the method wasn't tested. > > Maybe it should be removed if it can't be easily fixed; I don't think > > it makes sense keeping it otherwise. > > > > Chris > > > > Christopher Fields > > Postdoctoral Researcher > > Lab of Dr. 
Robert Switzer > > Dept of Biochemistry > > University of Illinois Urbana-Champaign > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Tue Oct 17 10:15:17 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 15:15:17 +0100 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> Message-ID: <4534E575.5050308@sheffield.ac.uk> Chris Fields wrote: > There isn't a very easy way since so many links have to be removed/modified. > I have found a few CPAN modules that could help, but for now I just dump the > text output from a text browser (elinks) using the 'printable version' page > and hand-edit, which works very quickly. That works for the time being > until I can find another more automated solution. > > Fortunately there have been very few edits to either INSTALL wiki page so > they should remain relatively stable. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > So am I correct in saying that the best way is to make all updates to the wikified versions of these files, and then at regular intervals/major releases you (or someone else) will update the CVS version of the files in the way describe above? 
Cheers Nath From bix at sendu.me.uk Tue Oct 17 10:00:39 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 17 Oct 2006 15:00:39 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534E09C.9030707@genomics.dk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> Message-ID: <4534E207.8030508@sendu.me.uk> Niels Larsen wrote: > Greetings, > > I am no perl beginner, but I am a BioPerl beginner. Today I looked > for remote similarity services that can be used from Perl. I found > the EBI SOAP interface where their example script returns > > Can't find method element in the message at > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. What script exactly? There was a problem with the SOAP server that was fixed earlier today. > and the DDBJ service which (from Denmark) returns > > undef What returned undef? Specifics please. > and then the NCBI server accessed through BioPerls RemoteBlast which > seems to spin in a loop that fills TMPDIR with many tempfiles. Will > release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall > is working towards that). What version of Bioperl were you testing with? What did you do to get it to 'spin in a loop'? I can tell you that remote blasting certainly works in Bioperl 1.5.2, but you'll have to give more details on the things you tried and the problems you encountered. You can also answer the questions yourself by trying the release candidate. From B.Beckert at ibmc.u-strasbg.fr Tue Oct 17 09:59:30 2006 From: B.Beckert at ibmc.u-strasbg.fr (Bertrand Beckert) Date: Tue, 17 Oct 2006 15:59:30 +0200 Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast Message-ID: hi, I am running a large number of blasts via a connexion to ncbi blast page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi'). I try to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have some problems. I make a simple example with only one sequence in order to understand how work this module. 
This is my simple input file, a DNA sequence in fasta form:

>test
TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA

I have modified the example available in the Bioperl documentation. It
gives me a RID which contains the results of my blast, but I have a
problem with the "$result=$factory->retrieve_blast($rid)" call in my script.

In the documentation it is written that $result=$factory->retrieve_blast($rid)
returns, when it works, a Bio::Tools::Bplite or Bio::Tools::Blast object.
In my case it returns a Bio::SearchIO::blast... I don't understand why I
don't get the expected type of object (see PART I).

I also tried to resolve the problem by replacing the foreach loop in my
script with a new one in order to explore the blast result page, but it
also doesn't work (see PART II).

Could you help me please?

Thank you
Bertrand Beckert.

PART I:

Here is my script with a little annotation and also the shell window
output:

----------------------------------------------------------------------------------------------------
#!/usr/bin/perl -w

use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;

sub blast
{
    my $prog='blastn';
    my $db='refseq_genomic';
    my $e_val='1e-10';
    my $Input='Seq.fasta';

    my @params = ('-prog' => $prog,
                  '-data' => $db,
                  '-expect' => $e_val,
                  '-readmethod' => 'SearchIO');

    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

    #changes parameters
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
    $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';

    $factory->submit_blast($Input);
    print STDERR "waiting...\n";

    while (my @rids=$factory->each_rid)
    {
        print "my rid: ",@rids,"\n";
        #returns the ID of the submitted blast i.e. RID: 1161079157-766-185099855365.BLASTQ2
        #this page contains the result of my blast...
        foreach my $rid (@rids)
        {
            $result=$factory->retrieve_blast($rid);
            #line in order to understand what type of object is returned by retrieve_blast
            print "rc:", $result,"\n";
        }
    }
}

&blast;
----------------------------------------------------------------------------------------------------

Here you can see the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...
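
For reference, the same RID being printed over and over is what one would expect when a finished RID is never taken off the factory's pending list. Below is a minimal sketch of the polling pattern shown in the RemoteBlast synopsis. It assumes that retrieve_blast() returns 0 while the job is still running, a negative value on failure, and a parser object once the report is ready; older releases may behave differently. The names collect_results and $delay are made up for this illustration and are not part of Bioperl.

```perl
use strict;
use warnings;

# Poll every pending RID until it finishes, then stop asking for it.
# $factory is expected to behave like Bio::Tools::Run::RemoteBlast:
#   each_rid()       -> list of RIDs that are still pending
#   retrieve_blast() -> 0 (not ready), negative (error), or a parser object
#   remove_rid()     -> drop a RID from the pending list
# $delay (seconds to wait after a "not ready" answer) is a hypothetical
# parameter of this sketch; it defaults to 5.
sub collect_results {
    my ($factory, $delay) = @_;
    $delay = 5 unless defined $delay;

    my @results;
    while ( my @rids = $factory->each_rid ) {
        foreach my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( ref $rc ) {
                # Report is ready: read it once, then forget the RID
                while ( my $result = $rc->next_result ) {
                    push @results, $result;
                }
                $factory->remove_rid($rid);
            }
            elsif ( $rc < 0 ) {
                # The request failed; asking again will not help
                $factory->remove_rid($rid);
            }
            else {
                # Still running: wait before polling again
                sleep $delay if $delay;
            }
        }
    }
    return @results;
}
```

With a real factory this would be called as collect_results($factory) after submit_blast(); the outer while loop ends by itself once every RID has been removed.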
PART II:

I also tried to resolve the problem by replacing the foreach loop in my
script with:

----------------------------------------------------------------------------------------------------
foreach my $rid (@rids)
{
    while(1)
    {
        $result=$factory->retrieve_blast($rid)->next_result();
        print "rc:", $result,"\n";
        if ($result)
        {
            print $result->num_hits(),"\n";
        }
    }
}
----------------------------------------------------------------------------------------------------

With this loop I could explore the Blast result page. This is what I
obtain in the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834)
----

--
Berrtrand BECKERT
PhD student
IBMC - UPR 9002 du CNRS - ARN
15, rue Rene Descartes
F-67084 STRASBOURG Cedex
b.beckert at ibmc.u-strasbg.fr

From niels at genomics.dk Tue Oct 17 09:54:36 2006
From: niels at genomics.dk (Niels Larsen)
Date: Tue, 17 Oct 2006 15:54:36 +0200
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4534E09C.9030707@genomics.dk>

Greetings,

I am no perl beginner, but I am a BioPerl beginner. Today I looked for remote similarity services that can be used from Perl.
I found the EBI SOAP interface where their example script returns Can't find method element in the message at /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. and the DDBJ service which (from Denmark) returns undef and then the NCBI server accessed through BioPerls RemoteBlast which seems to spin in a loop that fills TMPDIR with many tempfiles. Will release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall is working towards that). Niels L ------------------------------------------------------------------------ Niels Larsen Danish Genome Institute Gustav Wieds vej 10 C DK-8000 Aarhus C Denmark Electronic mail: niels at genomics.dk Skype: niels_larsen_denmark Telephone: +45-8942-5268 Telefax: +45-8620-1222 ------------------------------------------------------------------------ From cjfields at uiuc.edu Tue Oct 17 10:28:40 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:28:40 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <4534E575.5050308@sheffield.ac.uk> Message-ID: <000501c6f1f8$8b78efe0$15327e82@pyrimidine> ... > So am I correct in saying that the best way is to make all updates to > the wikified versions of these files, and then at regular > intervals/major releases you (or someone else) will update the CVS > version of the files in the way describe above? > > Cheers > Nath Yes. I think the online docs will stay relatively stable. A week or so ago Mauricio and I were discussing moving the dependencies list to it's own CVS document (since they pertain to all Bioperl installations, not just UNIX'y flavors). I haven't done that yet since I was waiting on the INSTALL.WIN changes before I made any more changes. Well, that and I've been really busy doing other things. One way we could make sure that changes to the online docs would match the CVS docs would be to only allow certain wiki users (such as sysadmins) make modifications to those pages. 
That way any changes would have to go through someone who also has CVS access and could make similar changes to the distribution docs. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Tue Oct 17 10:37:38 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 15:37:38 +0100 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <000501c6f1f8$8b78efe0$15327e82@pyrimidine> References: <000501c6f1f8$8b78efe0$15327e82@pyrimidine> Message-ID: <4534EAB2.50609@sheffield.ac.uk> Chris Fields wrote: > ... > >> So am I correct in saying that the best way is to make all updates to >> the wikified versions of these files, and then at regular >> intervals/major releases you (or someone else) will update the CVS >> version of the files in the way describe above? >> >> Cheers >> Nath >> > > Yes. I think the online docs will stay relatively stable. A week or so ago > Mauricio and I were discussing moving the dependencies list to it's own CVS > document (since they pertain to all Bioperl installations, not just UNIX'y > flavors). I haven't done that yet since I was waiting on the INSTALL.WIN > changes before I made any more changes. Well, that and I've been really > busy doing other things. > Sounds good. > One way we could make sure that changes to the online docs would match the > CVS docs would be to only allow certain wiki users (such as sysadmins) make > modifications to those pages. That way any changes would have to go through > someone who also has CVS access and could make similar changes to the > distribution docs. > Ugh, not sure I like the sound of maintaining 2 copies of any files - sounds like a future headache even if they are pretty stable. It also makes it unclear which of the two file should be considered first (i.e. 
is the most up-to-date) on pages such as: http://www.bioperl.org/wiki/Installing_BioPerl It suggests that INSTALL and INSTALL.WIN should be looked at first, but there are online copies of those files available - this should now be the other way around - shouldn't it? I might just be making a mountain out of a molehill, so I'll shut up on this topic and make any future edits to the wiki pages instead. > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > From bosborne11 at verizon.net Tue Oct 17 10:48:54 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 17 Oct 2006 10:48:54 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <000401c6f1f6$424b5580$15327e82@pyrimidine> Message-ID: Chris, The Bio::Index modules either 'use base qw(Bio::Index::Abstract)' or 'use base qw(Bio::Index::AbstractSeq)'. Neither of these modules has an id_parser() method. Brian O. On 10/17/06 10:12 AM, "Chris Fields" wrote: > Do the various Bio::Index* modules share a common interface? From cjfields at uiuc.edu Tue Oct 17 10:45:53 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:45:53 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <4534EAB2.50609@sheffield.ac.uk> Message-ID: <000601c6f1fa$f260b560$15327e82@pyrimidine> ... > > One way we could make sure that changes to the online docs would match > the > > CVS docs would be to only allow certain wiki users (such as sysadmins) > make > > modifications to those pages. That way any changes would have to go > through > > someone who also has CVS access and could make similar changes to the > > distribution docs. > > > Ugh, not sure I like the sound of maintaining 2 copies of any files - > sounds like a future headache even if they are pretty stable. It also > makes it unclear which of the two file should be considered first (i.e. 
> is the most up-to-date) on pages such as: > http://www.bioperl.org/wiki/Installing_BioPerl > > It suggests that INSTALL and INSTALL.WIN should be looked at first, but > there are online copies of those files available - this should now be > the other way around - shouldn't it? I might just be making a mountain > out of a molehill, so I'll shut up on this topic and make any future > edits to the wiki pages instead. Yes that should be the other way around (the wiki would be the most up-to-date), so the CVS docs should point to the wiki, not vice-versa. Getting the docs right is as important as getting the code to work. So I don't consider it a 'mountain-out-of-a-molehill' problem. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 17 11:07:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 10:07:49 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534E207.8030508@sendu.me.uk> Message-ID: <001001c6f1fe$02fd4de0$15327e82@pyrimidine> > Niels Larsen wrote: > > Greetings, > > > > I am no perl beginner, but I am a BioPerl beginner. Today I looked > > for remote similarity services that can be used from Perl. I found > > the EBI SOAP interface where their example script returns > > > > Can't find method element in the message at > > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. > > What script exactly? There was a problem with the SOAP server that was > fixed earlier today. > > > > and the DDBJ service which (from Denmark) returns > > > > undef > > What returned undef? Specifics please. > The first problem, like Sendu mentions, was fixed on the remote server (I get them to pass now). Those were from bioperl-run, though, not the bioperl core distribution. As for DDBJ, do you mean EBI or SwissProt? I ask b/c you mention Denmark. EBI were having server maintenance outages yesterday, which was announced here. 
As Sendu mentions, please be more specific. > > and then the NCBI server accessed through BioPerls RemoteBlast which > > seems to spin in a loop that fills TMPDIR with many tempfiles. Will > > release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall > > is working towards that). > > What version of Bioperl were you testing with? What did you do to get it > to 'spin in a loop'? I can tell you that remote blasting certainly works > in Bioperl 1.5.2, but you'll have to give more details on the things you > tried and the problems you encountered. > > You can also answer the questions yourself by trying the release > candidate. The tempfiles showing up are from the repeated RID requests and are deleted after the BLAST run (at least they should be); this is quite normal. They don't 'spin in a loop' unless the BLAST query is taking a particularly long time, which can happen depending on how the BLAST query is set up, i.e. what type of BLAST program is requested, if comp-based stats are requested, length of query, database requested, etc. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Tue Oct 17 11:14:07 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 17 Oct 2006 16:14:07 +0100 Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast In-Reply-To: References: Message-ID: <4534F33F.3070809@sendu.me.uk> Bertrand Beckert wrote: > hi, > > I am running a large number of blasts via a connexion to ncbi blast > page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi'). > I try to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have > some problems. [snip] > In the documentation it wrote that $result=$factory->retrieve_blast > ($rid) return when it work a Bio::Tools::Bplite or Bio::Tools::Blast > object. In my case it returns a Bio::SearchIO::blast... I don't > understand why I don't have the good type of object return (see PART I). 
I take it you're using some old version of Bioperl where unfortunately the documentation was incorrect. In fact you're supposed to get a Bio::SearchIO object, so it is a good thing that you are. The latest version of Bioperl has (as far as I can see) correct documentation and behaviour. Bio::Tools::Bplite and Bio::Tools::Blast are deprecated. You want Bio::SearchIO::blast. All is well. > I also try to resolve the problem by replace the foreach loop in my > script by a new one in order to explore the blast page result but it > also don't work (see part II). I'm not really sure what problem you might be facing there, but take a look at some up-to-date documentation, using the new example code: http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html From n.haigh at sheffield.ac.uk Tue Oct 17 12:10:15 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 17:10:15 +0100 Subject: [Bioperl-l] [Fwd: Re: Bundle::BioPerl] Message-ID: <45350067.6070604@sheffield.ac.uk> FYI on Bundle::BioPerl Nathan -------- Original Message -------- Subject: Re: Bundle::BioPerl Date: Tue, 17 Oct 2006 11:52:00 -0400 From: Chris Dagdigian To: Nathan S. Haigh References: <45348FB8.4050009 at sheffield.ac.uk> Hi Nathan, I've updated the Bundle and uploaded it to CPAN. I *think* the rationale for keeping it still exists but I'm removed enough from Bioperl now that I'll defer to others on the decision. The basic idea was that BioPerl has a heck of a lot of dependencies that it requires of (other perl modules) in order to get all the functionality out of it. Many of these dependencies may not be present in default Perl installations. Tracking down all of the dependencies and installing them (along with all of the dependencies- of-the-dependencies) by hand is a massive pain. 
The nice thing about the Bundle is that it lists the core module dependencies and it works great with the CPAN.pm module to automate the downloading and installation of everything that BioPerl requires. The CPAN module is smart enough that when processing *our* bundle it will also track down and install anything that our bundle entries themselves list as a dependency. So for unix/Linux systems the Bundle is a great one-liner ("perl - MCPAN -e 'install Bundle::BioPerl'" ) way to auto-install or update the many perl modules that BioPerl makes use of. On the windows side, not sure if it is of any help though. Regards, Chris On Oct 17, 2006, at 4:09 AM, Nathan S. Haigh wrote: > Hi Chris > > I've been working on making a PPD for the upcoming Bioperl 1.5.2 > release. During this time I also updated Bundle::BioPerl to include > up-to-date prereqs. I was wondering if you could update the CPAN > package? The updated BioPerl.pm file is attached. > > There is some talk about why and if we need Bundle::BioPerl > anymore. What was the rationale for having it in the first place, > and does it still hold true now? > > Cheers > Nath > From plu5even at gmail.com Tue Oct 17 12:26:34 2006 From: plu5even at gmail.com (Peter H. Baenziger) Date: Tue, 17 Oct 2006 12:26:34 -0400 Subject: [Bioperl-l] LocatableSeq object vs Sequence Object Message-ID: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com> All, This is my first bioperl script (but not my first Perl script) so please forgive my naivety. I've read through documentation and looked through cookbooks and the like but to no avail. Any advice is appreciated. So...I am working with an alignment object of several sequences. My intentions is to loop through all the sequences of the alignment to find what amino acid they have at a known position in the alignment (not the position in the sequence). 
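
The lookup described above (which residue does each sequence have at a given alignment column) can be sketched on the gapped strings alone. This is only an illustration under stated assumptions: residue_at_column and residues_by_row are hypothetical helper names, and the rows are assumed to be the gapped sequence strings of the alignment, which with Bioperl would typically come from $seq->seq on each object returned by $alignment->each_seq, keyed by $seq->display_id.

```perl
use strict;
use warnings;

# Return the character at a 1-based alignment column of one gapped row,
# or undef when the column lies outside the row.
sub residue_at_column {
    my ($gapped, $column) = @_;
    return undef if $column < 1 || $column > length $gapped;
    return substr($gapped, $column - 1, 1);
}

# Given { row name => gapped string } and a column, return a hash of
# row name => character at that column (a gap character if the row is
# gapped there).
sub residues_by_row {
    my ($rows, $column) = @_;
    my %at;
    for my $name ( keys %$rows ) {
        $at{$name} = residue_at_column( $rows->{$name}, $column );
    }
    return %at;
}
```

The column itself would be the value obtained from column_from_residue_number for the reference sequence. A gap character in the result means that sequence has no residue aligned at that column.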
I was thinking I could use:

    foreach $seq ($alignment->each_seq())

to loop through the sequences and call:

    $seq->location_from_column($pos)

on each of the sequences. However, I don't think I have
"LocatableSequences" (the type of object that has method
"location_from_column") being returned by $alignment->each_seq(). So,
how do I bridge this gap here? Or is there a better way?

My appreciation in advance!

Peter

code:

my $swissObj = $swissdb->get_Seq_by_acc($query); #put several of these in @sequenceObjects
...
my $alignFactory = Bio::Tools::Run::Alignment::Clustalw->new();
my $alignment = $alignFactory->align(\@sequenceObjects);
#print $alignment->overall_percentage_identity(); #works

#now we find the "alignment position" of the mutation we have on the human
#version and get the amino acid at that "alignment position" for all seq
my $humanSequence = $prefix."HUMAN";
my $pos = $alignment->column_from_residue_number($humanSequence, $aa_seqpos); #this is the "alignment position" equivalent to the mutation position

#we'll keep track of what amino acid each species has at the "alignment
#equivalent" location listed as being a mutation on the human version
foreach $seq ($alignment->each_seq()) {
    #print $seq->species() . "\n"; #won't work because $alignment->each_seq() actually returns a locatableSeq object, not a normal sequence object
    $speciesAA{$species} = $seq->location_from_column($pos);
}

--
<<->>
Peter H. Baenziger

From akarger at CGR.Harvard.edu Tue Oct 17 12:53:19 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Tue, 17 Oct 2006 12:53:19 -0400
Subject: [Bioperl-l] split location problems
Message-ID:

> From: Jason Stajich [mailto:jason.stajich at gmail.com]
>
> The whole point of split locations is to represent genes with introns
> so that is not the "rare" case.

Absolutely.

> I have processed the genbank fungal genomes into GFF3 and have had no
> problems so I'm confused where you are breaking down.
If I write > them out as embl I also get the correct thing. This is using > the CVS > version of bioperl from the HEAD. > > I've added code to test this to bug 2101 including a C.glabrata > chromsome downloaded from genbank. Perhaps the problem is on the > EMBL parsing side, I didn't test that. Well, I don't know whether it's EMBL parsing, or a bit further down the pipe, but I downloaded C.glabrata chromosome B for GenBank (NC_005968), and it describes the complement/joins in the way that Bioperl is handling correctly. GenBank: CDS complement(join(10347..10372,10632..11157)) /locus_tag="CAGL0B00242g" EMBL: FT CDS join(complement(10632..11157),complement(10347..10372)) FT /locus_tag="CAGL0B00242g" Here's the diff when I run the location-printing script I posted yesterday: diff biogb bio 1c1,5 < complement(join(10347..10372,10632..11157)) --- > complement(1701..2651) > complement(2635..3345) > complement(3980..4408) > complement(join(10632..11157,10347..10372)) > 10379..10615 209a214,217 > 498198..498890 > 499712..500062 > 499851..500702 > 500579..501364 As you can see, the complement/join CDS is written out in a different order, which is Bad. (I looked at at least one of the other differences: the GB file says it's a "misc feature" and EMBL says it's a CDS. But they don't seem to be relevant here.) -Amir > > On the technical side, I still am not sure I fully know where the > strand information should be stored - the top level container or the > sub-features. I'll try and stay up on the discussion if > anything has > been decided that I should know about. > > -jason > > > > From paul.boutros at utoronto.ca Tue Oct 17 12:57:19 2006 From: paul.boutros at utoronto.ca (Paul Boutros) Date: Tue, 17 Oct 2006 12:57:19 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 Message-ID: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca> Hi, Here's a quick make test on AIX 5.2 with Perl 5.8.8. 
I get two failed tests, the first seems to be just a result of me not having DBD::mysql installed.

Paul

Test Summary
============
Failed Test               Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/BioDBSeqFeature_mysql.t               46   46  1-46
t/SearchIO.t                22  5632  1337 2671  2-1337
2 tests and 106 subtests skipped.
Failed 2/236 test scripts. 1382/11688 subtests failed.
Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = 159.61 CPU)

BioDBSeqFeature_mysql
=====================
pcboutro@ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
1..46
install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at (eval 37) line 3.
Perhaps the DBD::mysql perl module hasn't been fully installed, or perhaps the capitalisation of 'mysql' isn't right.
Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
 at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208

SearchIO
========
pcboutro@ccb690[670] >> perl -w t/SearchIO.t | more
1..1337
ok 1

-------------------- WARNING ---------------------
MSG: XML::SAX::Expat not currently supported; must have local copies of NCBI DTD docs!
--------------------------------------------------- -------------------- WARNING --------------------- MSG: error in parsing a report: 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' does not exist file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd Handler couldn't resolve external entity at line 2, column 82, byte 104 error in processing external entity reference at line 2, column 82, byte 104 at /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line 187 --------------------------------------------------- not ok 2 # Failed test 2 in t/SearchIO.t at line 68 Can't call method "database_name" on an undefined value at t/SearchIO.t line 69. ------------------------------ Message: 10 Date: Tue, 17 Oct 2006 11:32:54 +0100 From: Sendu Bala Subject: [Bioperl-l] Bioperl 1.5.2 RC2 To: bioperl-l at bioperl.org Message-ID: <4534B156.4090501 at sendu.me.uk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. See http://www.bioperl.org/wiki/Release_1.5.2 for instructions on getting and testing this RC. Developers: This should be the last RC before release ~next monday. Now would be a good time for last minute documentaiton updates and additions. Users: Even though 1.5.2 is a 'developer' release, we consider it the most stable and capable version of Bioperl, and recommend that you use it in all but the most critical production environments. Please try it out and let us know of any problems or difficulties you run into. Thank you, Sendu. 
From barry.moore at genetics.utah.edu Tue Oct 17 12:57:48 2006 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Tue, 17 Oct 2006 10:57:48 -0600 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> Message-ID: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix does a reasonable job of textifying html. You get the links as numbered references at the bottom or: lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | perl -ane 's/\[?\[\d+\](edit\])?//g;print' to remove the links all together. Barry P.S. Looks like this: #Creative Commons copyright Installing Bioperl for Unix From BioPerl Jump to: navigation, search Contents * 1 BIOPERL INSTALLATION * 2 SYSTEM REQUIREMENTS * 3 OPTIONAL * 4 ADDITIONAL INSTALLATION INFORMATION * 5 THE BIOPERL BUNDLE * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' * 8 WHERE ARE THE MAN PAGES? * 9 EXTERNAL PROGRAMS + 9.1 Environment Variables * 10 INSTALLING BIOPERL SCRIPTS * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA * 12 INSTALLING BIOPERL MODULES THE HARD WAY * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION * 14 THE TEST SYSTEM * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE + 15.1 CONFIGURING for BSD and Solaris boxes + 15.2 INSTALLATION * 16 DEPENDENCIES AND Bundle::BioPerl BIOPERL INSTALLATION Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP, and on Mac OS X (see the PLATFORMS file for more details). Following are instructions for installing Bioperl for Unix/Linux/Mac OS X; Windows installation instructions can be found here. For installing Bioperl for Mac OS X using Fink, see Getting BioPerl. SYSTEM REQUIREMENTS * Perl 5.005 or later; version 5.6 and greater are recommended. Note that most modules will work with earlier versions of Perl. 
The only ones that will not are Bio::SimpleAlign and the Bio::Index::* modules. If you don't need these modules and you want to install Bioperl using an earlier version of Perl, edit the "require 5.005;" line in Makefile.PL as necessary. * External modules: Bioperl uses functionality provided in other Perl modules. Some of these are included in the standard perl package but some need to be obtained from the CPAN site. The list of external modules is included at the bottom of this document. The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of these external modules easy. Simply install the bundle using your CPAN shell and all necessary modules will be installed. See THE BIOPERL BUNDLE, below. OPTIONAL * ANSI C or GNU C compiler (gcc) for XS extensions (the bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext PACKAGE, below). ADDITIONAL INSTALLATION INFORMATION * Additional information on Bioperl and MAC OS: + OS 9 - http://bioperl.org/Core/mac-bioperl.html + OSX-http://www.tc.umn.edu/~cann0010/ Bioperl_OSX_install.html + OS X - Installing using Fink (in Getting BioPerl) THE BIOPERL BUNDLE You typically need root privileges to install using CPAN. If you don't have these privileges please see INSTALLING BIOPERL IN A PERSONAL MODULE AREA for additional information. Install Bundle::Bioperl using CPAN. One way: >perl -MCPAN -e "install Bundle::BioPerl" Another way: >perl -MCPAN -e shell cpan>install Bundle::BioPerl On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: > There isn't a very easy way since so many links have to be removed/ > modified. > I have found a few CPAN modules that could help, but for now I just > dump the > text output from a text browser (elinks) using the 'printable > version' page > and hand-edit, which works very quickly. That works for the time > being > until I can find another more automated solution. > > Fortunately there have been very few edits to either INSTALL wiki > page so > they should remain relatively stable. 
> > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > >> -----Original Message----- >> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] >> Sent: Tuesday, October 17, 2006 6:46 AM >> To: Chris Fields >> Cc: bioperl-l >> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN >> >> Chris Fields wrote: >>> The general consensus was to keep text versions available; we could >>> add URL links to the wiki pages for the most up-to-dat version. >>> BTW, >>> I have modified INSTALL already. INSTALL.WIN is next in line (I was >>> waiting for your changes). >>> >> Is it possible to generate these files from the wiki whenever >> there is a >> release? I now edits shouldn't be too severe or too often - but I can >> see things getting a little messy/annoying if edits have to be >> made in 2 >> places. >> >> Nath > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From niels at genomics.dk Tue Oct 17 12:58:14 2006 From: niels at genomics.dk (Niels Larsen) Date: Tue, 17 Oct 2006 18:58:14 +0200 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534E207.8030508@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> Message-ID: <45350BA6.3040102@genomics.dk> Ok, here are ways to reproduce; I sure apologize if I made the test scripts wrong. And I suppose EBI/DDBJ's interfaces are not a bioperl issue really. Niels ------------ EBI I invoked the EBI script http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip like this WSWUBlastClient.pl -p blastn -D embl test.fasta where the content of test.fasta is below, and got Can't find method element in the message at /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. >Planctomyces sp. 
282; Genbank Taxonomy ID: 79927
AATGAACGTTGGCGGCATGGATTAGGCATGCAAGTCGAGGGAGAACCCGCAAGGGGACACCGGCG
AACGGGGTAGGAATACATAGGTAACGTACCCTCAGGACGGGGATAGCCAAGGGAAACTTTGGGTA
ATACCCGATGTGATGGCAAGATGTGAATGCTTGTCATCAAAGGTGAGATTCCACCTGAGGAGCGG
CTTATGCATCATTAGCTTGTTGGCGGGGTAACGGCCCACCAAGGCTGCGATGATTAGGGGGTGTG
AGAGCATGGCCCCCACCACTGGCACTGAGACACTGGCCAGACACCTACGGGTGGCTGCAGTCGAG

I tried with this test sequence in fasta format and with just the sequence.

------------ DDBJ

Inspired by this page, http://xml.nig.ac.jp/doc/Blast.txt I made this test script

------ cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

my ( $service, $seqstr, $result );

use SOAP::Lite;
use Data::Dumper;

$service = SOAP::Lite->service('http://xml.nig.ac.jp/wsdl/Blast.wsdl');

$seqstr = "MSSRIARALALVVTLLHLTRLALSTCPAACHCPLEAPKCAPGVGLVRDGCGCCKVCAKQL";
$result = $service->searchSimple( "blastp", "SWISS", $seqstr );

print Dumper( $result );
------ cut --

which for me prints undef.

------------- NCBI/Bioperl

I installed 1.5.2-RC2, looked at the RemoteBlast example in http://www.bioperl.org/wiki/Bptutorial.pl and then put that into this test code, more or less cut/paste,

--- cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

use Bio::Tools::Run::RemoteBlast;
use Data::Dumper;

my ( $remote_blast, $r, $rc, $rid, @rids );

$remote_blast = Bio::Tools::Run::RemoteBlast->new (
    -prog => 'blastn',
    -data => 'ecoli',
    -expect => '1e-10' );

$r = $remote_blast->submit_blast("ecoli.fasta");

while ( @rids = $remote_blast->each_rid )
{
    # print Dumper( \@rids );
    for $rid ( @rids )
    {
        $rc = $remote_blast->retrieve_blast($rid);
        # print Dumper( $rc );
    }

    sleep 10;
}
--- cut --

which saves the same blast report to TMPDIR every 10 seconds. The "ecoli.fasta" file contains this

>test
gggggctctgttggttctcccgcaacgctactctgtttaccaggtcaggtccggaaggaa
gcagccaaggcagatgacgcgtgtgccgggatgtagctggcagggcccccaccc

Maybe I am supposed to add a check for content in $rc and then stop the inner loop?
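
One way to put an upper bound on this kind of polling is a small helper that keeps calling a check until it yields something or a deadline passes. The sketch below is generic Perl and not a Bioperl function: wait_for and its timeout and interval options are hypothetical names chosen for this illustration.

```perl
use strict;
use warnings;

# Call $poll->() repeatedly until it returns a defined value or until
# roughly $opt{timeout} seconds have passed. Returns the value, or
# undef (an empty list in list context) on timeout.
#   timeout  => seconds to keep trying       (default 600)
#   interval => seconds to sleep between tries (default 10)
sub wait_for {
    my ($poll, %opt) = @_;
    my $timeout  = defined $opt{timeout}  ? $opt{timeout}  : 600;
    my $interval = defined $opt{interval} ? $opt{interval} : 10;

    my $deadline = time + $timeout;
    while (1) {
        my $value = $poll->();
        return $value if defined $value;   # finished
        return if time >= $deadline;       # gave up
        sleep $interval if $interval;
    }
}
```

With RemoteBlast the check could be a closure such as sub { my $rc = $remote_blast->retrieve_blast($rid); ref $rc ? $rc : undef }, on the assumption that a reference is only returned once the report is ready. Note that with such a closure a failed request would simply be polled until the timeout, so a real script would probably want to treat a negative return value separately.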
I could figure that out maybe, but I wish there was a function which
simply takes a single sequence + arguments and only returns a list of
matches when done, and does not return until then (or until a
specified timeout).

------------------------------------------------------------------------
Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark
Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark
Telephone: +45-8942-5268
Telefax: +45-8620-1222
------------------------------------------------------------------------

From bertrand.beckert at gmail.com Tue Oct 17 10:52:36 2006
From: bertrand.beckert at gmail.com (bertrand beckert)
Date: Tue, 17 Oct 2006 16:52:36 +0200
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
Message-ID: <500217090610170752q565cfc08t5208e3b64f99ef7f@mail.gmail.com>

hi,

I am running a large number of blasts via a connection to the NCBI
blast page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi'). I am
trying to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have
some problems. I made a simple example with only one sequence in order
to understand how this module works.

This is my simple input file, a DNA sequence in fasta format:

>test
TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA

I have made some modifications to the example available in the bioperl
documentation. It gives me a RID which contains the results of my
blast, but I have a problem with the
"$result=$factory->retrieve_blast($rid)" in my script. The
documentation says that $result=$factory->retrieve_blast($rid)
returns, when it works, a Bio::Tools::Bplite or Bio::Tools::Blast
object. In my case it returns a Bio::SearchIO::blast... I don't
understand why I don't get the right type of object back (see PART I).

I also tried to solve the problem by replacing the foreach loop in my
script with a new one in order to explore the blast result page, but
that doesn't work either (see PART II).

Could you help me please?

Thank you

Bertrand Beckert.

PART I:

Here is my script with a little annotation and also the shell window
output:

------------------------------------------------------------------------
#!/usr/bin/perl -w

use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;

sub blast {
    my $prog='blastn';
    my $db='refseq_genomic';
    my $e_val='1e-10';
    my $Input='Seq.fasta';

    my @params = ('-prog' => $prog,
                  '-data' => $db,
                  '-expect' => $e_val,
                  '-readmethod' => 'SearchIO');
    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

    #changes parameters
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
    $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';

    $factory->submit_blast($Input);
    print STDERR "waiting...\n";
    while (my @rids=$factory->each_rid) {
        print "my rid: ",@rids,"\n";
        #return me the ID of the submitted blast i.e. RID: 1161079157-766-185099855365.BLASTQ2
        #this page contains the result of my blast...

        foreach my $rid (@rids) {
            $result=$factory->retrieve_blast($rid);
            #line in order to understand what type of object is returned by retrieve_blast
            print "rc:", $result,"\n";
        }
    }
}

&blast;
------------------------------------------------------------------------

here you can see the shell window:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...

PART II:

I also tried to solve the problem by replacing the foreach loop in my
script with:

------------------------------------------------------------------------
foreach my $rid (@rids) {
    while(1) {
        $result=$factory->retrieve_blast($rid)->next_result();
        print "rc:", $result,"\n";
        if ($result) {
            print $result->num_hits(),"\n";
        }
------------------------------------------------------------------------

With this loop I could explore the blast result page. This is what I
obtain in the shell window:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834) ---- -- Berrtrand BECKERT PhD student IBMC - UPR 9002 du CNRS - ARN 15, rue Rene Descartes F-67084 STRASBOURG Cedex b.beckert at ibmc.u-strasbg.fr bertrand.beckert at gmail.com From cjfields at uiuc.edu Tue Oct 17 13:50:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 12:50:49 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu> Message-ID: <001201c6f214$c8934440$15327e82@pyrimidine> (Apologies for the top post, but I thought my response might get lost below) I use elinks in a similar fashion. It tends to format the tables a bit better than lynx. Chris > -----Original Message----- > From: Barry Moore [mailto:barry.moore at genetics.utah.edu] > Sent: Tuesday, October 17, 2006 11:58 AM > To: Chris Fields > Cc: 'Nathan S. Haigh'; 'bioperl-l' > Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN > > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix > > does a reasonable job of textifying html. You get the links as > numbered references at the bottom or: > > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | > perl -ane 's/\[?\[\d+\](edit\])?//g;print' > > to remove the links all together. > > Barry > > P.S. Looks like this: > > #Creative Commons copyright > > Installing Bioperl for Unix > > From BioPerl > > Jump to: navigation, search > > Contents > > * 1 BIOPERL INSTALLATION > * 2 SYSTEM REQUIREMENTS > * 3 OPTIONAL > * 4 ADDITIONAL INSTALLATION INFORMATION > * 5 THE BIOPERL BUNDLE > * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN > * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' > * 8 WHERE ARE THE MAN PAGES? 
> * 9 EXTERNAL PROGRAMS > + 9.1 Environment Variables > * 10 INSTALLING BIOPERL SCRIPTS > * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA > * 12 INSTALLING BIOPERL MODULES THE HARD WAY > * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION > * 14 THE TEST SYSTEM > * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE > + 15.1 CONFIGURING for BSD and Solaris boxes > + 15.2 INSTALLATION > * 16 DEPENDENCIES AND Bundle::BioPerl > > > BIOPERL INSTALLATION > > Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP, > and on Mac OS X (see the PLATFORMS file for more details). > Following are > instructions for installing Bioperl for Unix/Linux/Mac OS X; > Windows > installation instructions can be found here. For installing > Bioperl for > Mac OS X using Fink, see Getting BioPerl. > > > SYSTEM REQUIREMENTS > > * Perl 5.005 or later; version 5.6 and greater are recommended. > Note > that most modules will work with earlier versions of Perl. > The only ones > that will not are Bio::SimpleAlign and the Bio::Index::* > modules. If > you don't need these modules and you want to install Bioperl > using an > earlier version of Perl, edit the "require 5.005;" line in > Makefile.PL > as necessary. > > * External modules: Bioperl uses functionality provided in > other Perl > modules. Some of these are included in the standard perl > package but > some need to be obtained from the CPAN site. The list of > external > modules is included at the bottom of this document. > > The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of > these > external modules easy. Simply install the bundle using your CPAN > shell and > all necessary modules will be installed. See THE BIOPERL BUNDLE, > below. > > > OPTIONAL > > * ANSI C or GNU C compiler (gcc) for XS extensions (the > bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext > PACKAGE, below). 
> > > > ADDITIONAL INSTALLATION INFORMATION > > * Additional information on Bioperl and MAC OS: > + OS 9 - http://bioperl.org/Core/mac-bioperl.html > + OSX-http://www.tc.umn.edu/~cann0010/ > Bioperl_OSX_install.html > + OS X - Installing using Fink (in Getting BioPerl) > > > > THE BIOPERL BUNDLE > > You typically need root privileges to install using CPAN. If you > don't > have these privileges please see INSTALLING BIOPERL IN A PERSONAL > MODULE > AREA for additional information. > > Install Bundle::Bioperl using CPAN. One way: > >perl -MCPAN -e "install Bundle::BioPerl" > > Another way: > >perl -MCPAN -e shell > cpan>install Bundle::BioPerl > > > > On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: > > > There isn't a very easy way since so many links have to be removed/ > > modified. > > I have found a few CPAN modules that could help, but for now I just > > dump the > > text output from a text browser (elinks) using the 'printable > > version' page > > and hand-edit, which works very quickly. That works for the time > > being > > until I can find another more automated solution. > > > > Fortunately there have been very few edits to either INSTALL wiki > > page so > > they should remain relatively stable. > > > > Christopher Fields > > Postdoctoral Researcher - Switzer Lab > > Dept. of Biochemistry > > University of Illinois Urbana-Champaign > > > >> -----Original Message----- > >> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] > >> Sent: Tuesday, October 17, 2006 6:46 AM > >> To: Chris Fields > >> Cc: bioperl-l > >> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN > >> > >> Chris Fields wrote: > >>> The general consensus was to keep text versions available; we could > >>> add URL links to the wiki pages for the most up-to-dat version. > >>> BTW, > >>> I have modified INSTALL already. INSTALL.WIN is next in line (I was > >>> waiting for your changes). > >>> > >> Is it possible to generate these files from the wiki whenever > >> there is a > >> release? 
I now edits shouldn't be too severe or too often - but I can > >> see things getting a little messy/annoying if edits have to be > >> made in 2 > >> places. > >> > >> Nath > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Tue Oct 17 13:52:36 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 12:52:36 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca> Message-ID: <001301c6f215$07a9a070$15327e82@pyrimidine> What do you get when you run the SearchIO.t test by itself using 'perl -I. t/SearchIO.t'? It looks like something pretty catastrophic happened. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Paul Boutros > Sent: Tuesday, October 17, 2006 11:57 AM > To: bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 > > Hi, > Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed > tests, the first seems to be just a result of me not having DBD::mysql > installed. > Paul > > Test Summary > ============ > > Failed Test Stat Wstat Total Fail List of Failed > -------------------------------------------------------------------------- > ----- > t/BioDBSeqFeature_mysql.t 46 46 1-46 > t/SearchIO.t 22 5632 1337 2671 2-1337 > 2 tests and 106 subtests skipped. > Failed 2/236 test scripts. 1382/11688 subtests failed. 
> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = > 159.61 CPU) > > BioDBSeqFeature_mysql > ===================== > pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t > 1..46 > install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC > contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t > /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 > /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi > /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at > (eval 37) line 3. > Perhaps the DBD::mysql perl module hasn't been fully installed, > or perhaps the capitalisation of 'mysql' isn't right. > Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. > at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 > > SearchIO > ======== > pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more > 1..1337 > ok 1 > > -------------------- WARNING --------------------- > MSG: XML::SAX::Expat not currently supported; must have local copies > of NCBI DTD docs! > --------------------------------------------------- > > -------------------- WARNING --------------------- > MSG: error in parsing a report: > > 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' > does not exist > file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd > Handler couldn't resolve external entity at line 2, column 82, byte 104 > error in processing external entity reference at line 2, column 82, > byte 104 at > /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line > 187 > > --------------------------------------------------- > not ok 2 > # Failed test 2 in t/SearchIO.t at line 68 > Can't call method "database_name" on an undefined value at > t/SearchIO.t line 69. 
> > ------------------------------ > > Message: 10 > Date: Tue, 17 Oct 2006 11:32:54 +0100 > From: Sendu Bala > Subject: [Bioperl-l] Bioperl 1.5.2 RC2 > To: bioperl-l at bioperl.org > Message-ID: <4534B156.4090501 at sendu.me.uk> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. > See http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > This should be the last RC before release ~next monday. Now would > be a good time for last minute documentaiton updates and additions. > > Users: > Even though 1.5.2 is a 'developer' release, we consider it the most > stable and capable version of Bioperl, and recommend that you use > it in all but the most critical production environments. Please > try it out and let us know of any problems or difficulties you run > into. > > > Thank you, > Sendu. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From paul.boutros at utoronto.ca Tue Oct 17 13:59:33 2006 From: paul.boutros at utoronto.ca (Paul Boutros) Date: Tue, 17 Oct 2006 13:59:33 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine> References: <001301c6f215$07a9a070$15327e82@pyrimidine> Message-ID: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca> Hi Chris, Here it is: pcboutro at ccb690[643] >> perl -I. t/SearchIO.t 1..1337 ok 1 -------------------- WARNING --------------------- MSG: XML::SAX::Expat not currently supported; must have local copies of NCBI DTD docs! 
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: error in parsing a report:

404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
does not exist
file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
Handler couldn't resolve external entity at line 2, column 82, byte 104
error in processing external entity reference at line 2, column 82,
byte 104 at
/db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
187

---------------------------------------------------
not ok 2
# Failed test 2 in t/SearchIO.t at line 68
Can't call method "database_name" on an undefined value at
t/SearchIO.t line 69.


Quoting Chris Fields:

> What do you get when you run the SearchIO.t test by itself using 'perl -I.
> t/SearchIO.t'? It looks like something pretty catastrophic happened.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
> [snip]

From barry.moore at genetics.utah.edu Tue Oct 17 14:07:12 2006
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 17 Oct 2006 12:07:12 -0600
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To:
References:
Message-ID: <588DE26B-8F18-4540-BAEE-2B479CBDE8B3@genetics.utah.edu>

In fact, I think it was you who taught me that trick in the first place.

B

On Oct 17, 2006, at 11:40 AM, Brian Osborne wrote:

> Barry,
>
> I second that. lynx does the best job of converting HTML to text
> I've seen.
>
> Brian O.
>
>
> On 10/17/06 12:57 PM, "Barry Moore" wrote:
>
>> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix
>>
>> does a reasonable job of textifying html. You get the links as
>> numbered references at the bottom or:
>>
>> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |
>> perl -ane 's/\[?\[\d+\](edit\])?//g;print'
>>
>> to remove the links all together.
>>
>> Barry
>>
>> [snip]

From bix at sendu.me.uk Tue Oct 17 14:07:04 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 19:07:04 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
References: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
Message-ID: <45351BC8.9080507@sendu.me.uk>

Paul Boutros wrote:
> Hi,
> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed
> tests, the first seems to be just a result of me not having DBD::mysql
> installed.
[snip]

Thanks for those, very useful. Not something that's come up before
afaik; I'll look into them.
From cjfields at uiuc.edu Tue Oct 17 14:31:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 13:31:51 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca>
Message-ID: <001401c6f21a$836f9fc0$15327e82@pyrimidine>

Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
backend parser. For some reason BLAST XML parsing doesn't work with
that parser (it tries to verify the XML first before parsing, hence the
DTD error). I may try getting this to work again, but so far I haven't
found an easy way to prevent XML verification via XML::SAX::Expat.

There are two options:

1) install XML::SAX::ExpatXS (the better option), which works AND is 4x
   faster than XML::SAX::Expat, or
2) set the default parser in the ParserDetails.ini file of your local
   XML::SAX installation to XML::SAX::PurePerl.

BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
hasn't officially happened yet); the latter hasn't had significant
development in about three years.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: Paul Boutros [mailto:paul.boutros at utoronto.ca]
> Sent: Tuesday, October 17, 2006 1:00 PM
> To: Chris Fields
> Cc: bioperl-l at lists.open-bio.org
> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2
>
> Hi Chris,
>
> Here it is:
> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
> [snip]

From cjfields at uiuc.edu Tue Oct 17 15:05:59 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 14:05:59 -0500
Subject: [Bioperl-l] split location problems
In-Reply-To:
Message-ID: <001b01c6f21f$48640420$15327e82@pyrimidine>

> > From: Jason Stajich [mailto:jason.stajich at gmail.com]
> >
> > The whole point of split locations is to represent genes with introns
> > so that is not the "rare" case.
>
> Absolutely.

Right, but that specific kind of join statement is not commonly used in
GenBank files, which seems to be the format predominantly used (no
offense to EBI). This may explain why we haven't seen this pop up more
often.
I believe what we're seeing is a difference in the way these locations are
described at NCBI vs EBI, which Nadeem Faruque seems to corroborate. He
indicated that EBI may move to using similar GenBank-like location strings.
Regardless, FTLocationFactory and Bio::Location::Split should handle both
forms if they are present, but they only seem to handle the GenBank version
correctly.

> > I've added code to test this to bug 2101 including a C.glabrata
> > chromosome downloaded from genbank. Perhaps the problem is on the
> > EMBL parsing side, I didn't test that.
>
> Well, I don't know whether it's EMBL parsing, or a bit further down the
> pipe, but I downloaded C.glabrata chromosome B for GenBank (NC_005968),
> and it describes the complement/joins in the way that Bioperl is
> handling correctly.
>
> GenBank:
>      CDS             complement(join(10347..10372,10632..11157))
>                      /locus_tag="CAGL0B00242g"
>
> EMBL:
> FT   CDS             join(complement(10632..11157),complement(10347..10372))
> FT                   /locus_tag="CAGL0B00242g"

Yes, something that I found out independently (and corroborated by Nadeem).

> Here's the diff when I run the location-printing script I posted
> yesterday:
>
> diff biogb bio
> 1c1,5
> < complement(join(10347..10372,10632..11157))
> ---
> > complement(1701..2651)
> > complement(2635..3345)
> > complement(3980..4408)
> > complement(join(10632..11157,10347..10372))
> > 10379..10615
> 209a214,217
> > 498198..498890
> > 499712..500062
> > 499851..500702
> > 500579..501364
>
> As you can see, the complement/join CDS is written out in a different
> order, which is Bad.

I think this can be handled directly in to_FTstring(). I'll have to add a
method to get the strand info from the Split object w/o going through
strand(). However, I'm thinking about trying a different tack which is a bit
simpler and, if it proves fruitful, may simplify Split locations somewhat. It
won't be ready for 1.5.2 but maybe the next release.
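
The two location strings in this thread describe the same CDS: moving
complement() outside the join() reverses the order of the sub-locations. A
minimal sketch of that rewrite on plain strings follows. The helper name
embl_to_genbank_style is hypothetical; it is not part of Bioperl and is not
the change being considered for to_FTstring(). It only handles the simple
case where every sub-location is a plain complement(start..end).

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite join(complement(c..d),complement(a..b))
# as complement(join(a..b,c..d)). Any other string is returned unchanged.
sub embl_to_genbank_style {
    my ($loc) = @_;

    # Require a join() whose members are all simple complement(...) ranges.
    return $loc
        unless $loc =~ /^join\((complement\([^()]+\)(?:,complement\([^()]+\))+)\)$/;
    my $inner = $1;

    # Collect the bare ranges, then reverse them: complementing the whole
    # join flips the order in which the pieces are listed.
    my @ranges = $inner =~ /complement\(([^()]+)\)/g;
    return 'complement(join(' . join( ',', reverse @ranges ) . '))';
}

# The CDS from the thread (locus_tag CAGL0B00242g), EMBL style in:
print embl_to_genbank_style(
    'join(complement(10632..11157),complement(10347..10372))'), "\n";
```

For the CDS above this prints the GenBank-style string quoted in the thread,
complement(join(10347..10372,10632..11157)). Mixed-strand joins and fuzzy
locations are deliberately left untouched by this sketch.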
> (I looked at at least one of the other differences: the GB file says
> it's a "misc feature" and EMBL says it's a CDS. But they don't seem to
> be relevant here.)
> -Amir

Probably not but something to keep in mind.

-c

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From er at xs4all.nl Tue Oct 17 15:01:48 2006
From: er at xs4all.nl (Erikjan)
Date: Tue, 17 Oct 2006 21:01:48 +0200 (CEST)
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
Message-ID: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>

Hello,

I noticed a little problem with the Annotation "DBLink" from GenBank entries.

When I run:

perl -MBio::DB::GenBank -e 'my $gi = 56205924;
  my $db = Bio::DB::GenBank->new(-format => "genbank");
  my $seqio = $db->get_Stream_by_id($gi);
  my $seq = $seqio->next_seq;
  my $ac = $seq->annotation();
  my @annotations = $ac->get_Annotations("dblink");
  for (@annotations) { print $_, "\n"; }
  print $INC{ "Bio/Annotation/DBLink.pm" }, "\n"; '

This yields:

GenBank:AL591065.17.17

and the place where the used Bio/Annotation/DBLink.pm resides.

Can others repeat this?

I have dug into the source a little and Bio::Annotation::DBLink seems to
be the place where this happens: it has a concatenation which leads to
that repeated version number.

Is this something that I should fix "client-side", so to speak, or is it
worthwhile to add some logic to that concatenation to prevent this?


Thanks,

Eric

From bosborne11 at verizon.net Tue Oct 17 13:40:54 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 17 Oct 2006 13:40:54 -0400
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>
Message-ID:

Barry,

I second that. lynx does the best job of converting HTML to text I've seen.

Brian O.
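
The link-stripping substitution in Barry's message (quoted below) can also be
wrapped in a small function. This is a minimal sketch that reuses his regex
unchanged; strip_link_markers is a hypothetical name and the sample lines are
made up in the style of "lynx -dump" output, not taken from the wiki page.

```perl
use strict;
use warnings;

# Remove lynx-style link markers from one line of text:
#   "[12]"        numbered reference printed in front of a link
#   "[[12]edit]"  wiki section-edit link
sub strip_link_markers {
    my ($line) = @_;
    $line =~ s/\[?\[\d+\](edit\])?//g;
    return $line;
}

# Made-up sample lines for illustration only.
my @sample = (
    '[[3]edit] SYSTEM REQUIREMENTS',
    'see the [14]PLATFORMS file for more details',
);
print strip_link_markers($_), "\n" for @sample;
```

The text next to each marker is left as it was, so a leading space can remain
where an edit link used to be.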
On 10/17/06 12:57 PM, "Barry Moore" wrote: > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix > > does a reasonable job of textifying html. You get the links as > numbered references at the bottom or: > > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | > perl -ane 's/\[?\[\d+\](edit\])?//g;print' > > to remove the links all together. > > Barry > > P.S. Looks like this: > > #Creative Commons copyright > > Installing Bioperl for Unix > > From BioPerl > > Jump to: navigation, search > > Contents > > * 1 BIOPERL INSTALLATION > * 2 SYSTEM REQUIREMENTS > * 3 OPTIONAL > * 4 ADDITIONAL INSTALLATION INFORMATION > * 5 THE BIOPERL BUNDLE > * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN > * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' > * 8 WHERE ARE THE MAN PAGES? > * 9 EXTERNAL PROGRAMS > + 9.1 Environment Variables > * 10 INSTALLING BIOPERL SCRIPTS > * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA > * 12 INSTALLING BIOPERL MODULES THE HARD WAY > * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION > * 14 THE TEST SYSTEM > * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE > + 15.1 CONFIGURING for BSD and Solaris boxes > + 15.2 INSTALLATION > * 16 DEPENDENCIES AND Bundle::BioPerl > > > BIOPERL INSTALLATION > > Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP, > and on Mac OS X (see the PLATFORMS file for more details). > Following are > instructions for installing Bioperl for Unix/Linux/Mac OS X; > Windows > installation instructions can be found here. For installing > Bioperl for > Mac OS X using Fink, see Getting BioPerl. > > > SYSTEM REQUIREMENTS > > * Perl 5.005 or later; version 5.6 and greater are recommended. > Note > that most modules will work with earlier versions of Perl. > The only ones > that will not are Bio::SimpleAlign and the Bio::Index::* > modules. 
If > you don't need these modules and you want to install Bioperl > using an > earlier version of Perl, edit the "require 5.005;" line in > Makefile.PL > as necessary. > > * External modules: Bioperl uses functionality provided in > other Perl > modules. Some of these are included in the standard perl > package but > some need to be obtained from the CPAN site. The list of > external > modules is included at the bottom of this document. > > The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of > these > external modules easy. Simply install the bundle using your CPAN > shell and > all necessary modules will be installed. See THE BIOPERL BUNDLE, > below. > > > OPTIONAL > > * ANSI C or GNU C compiler (gcc) for XS extensions (the > bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext > PACKAGE, below). > > > > ADDITIONAL INSTALLATION INFORMATION > > * Additional information on Bioperl and MAC OS: > + OS 9 - http://bioperl.org/Core/mac-bioperl.html > + OSX-http://www.tc.umn.edu/~cann0010/ > Bioperl_OSX_install.html > + OS X - Installing using Fink (in Getting BioPerl) > > > > THE BIOPERL BUNDLE > > You typically need root privileges to install using CPAN. If you > don't > have these privileges please see INSTALLING BIOPERL IN A PERSONAL > MODULE > AREA for additional information. > > Install Bundle::Bioperl using CPAN. One way: >> perl -MCPAN -e "install Bundle::BioPerl" > > Another way: >> perl -MCPAN -e shell > cpan>install Bundle::BioPerl > > > > On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: > >> There isn't a very easy way since so many links have to be removed/ >> modified. >> I have found a few CPAN modules that could help, but for now I just >> dump the >> text output from a text browser (elinks) using the 'printable >> version' page >> and hand-edit, which works very quickly. That works for the time >> being >> until I can find another more automated solution. 
>> >> Fortunately there have been very few edits to either INSTALL wiki >> page so >> they should remain relatively stable. >> >> Christopher Fields >> Postdoctoral Researcher - Switzer Lab >> Dept. of Biochemistry >> University of Illinois Urbana-Champaign >> >>> -----Original Message----- >>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] >>> Sent: Tuesday, October 17, 2006 6:46 AM >>> To: Chris Fields >>> Cc: bioperl-l >>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN >>> >>> Chris Fields wrote: >>>> The general consensus was to keep text versions available; we could >>>> add URL links to the wiki pages for the most up-to-dat version. >>>> BTW, >>>> I have modified INSTALL already. INSTALL.WIN is next in line (I was >>>> waiting for your changes). >>>> >>> Is it possible to generate these files from the wiki whenever >>> there is a >>> release? I now edits shouldn't be too severe or too often - but I can >>> see things getting a little messy/annoying if edits have to be >>> made in 2 >>> places. >>> >>> Nath >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Tue Oct 17 16:30:15 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 15:30:15 -0500 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> Message-ID: <0FB91820-B2A1-4F7F-866C-8D4791DD8306@uiuc.edu> I can confirm this using bioperl-live: GenBank:AL591065.17.17 /Users/cjfields/src/bioperl-live/Bio/Annotation/DBLink.pm Could you file a bug report via bugzilla? 
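
The doubled suffix in GenBank:AL591065.17.17 is what one would get if a
version is appended to an identifier that already carries it. Below is a
minimal sketch, on plain strings, of the kind of guard Erikjan asks about.
The function dblink_string and its arguments are hypothetical; this is not
the actual Bio::Annotation::DBLink code, whose internals are not shown in
this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: build "database:id.version", appending the version
# only when the identifier does not already end in ".version".
sub dblink_string {
    my ( $database, $primary_id, $version ) = @_;
    my $id = $primary_id;
    if ( defined $version && length $version && $id !~ /\.\Q$version\E$/ ) {
        $id .= ".$version";
    }
    return "$database:$id";
}

# Identifier that already includes its version, as in the reported case.
print dblink_string( 'GenBank', 'AL591065.17', 17 ), "\n";
# Identifier without a version: the version is appended once.
print dblink_string( 'GenBank', 'AL591065', 17 ), "\n";
```

Both calls give GenBank:AL591065.17. Whether such a check belongs in the
module or in client code is the open question in this thread.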
Chris On Oct 17, 2006, at 2:01 PM, Erikjan wrote: > Hello, > > I noticed a little problem with the Annotation "DBLink" from > GenBank entries > > When I run: > > perl -MBio::DB::GenBank -e 'my $gi = > 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio = > $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > ("dblink"); > for(@annotations) { print $_, "\n";} print $INC{ > "Bio/Annotation/DBLink.pm" }, "\n"; ' > > This yields: > > GenBank:AL591065.17.17 > > and the place where the used Bio/Annotation/DBLink.pm resides. > > Can others repeat this? > > I have dug into the source a little and Bio::Annotation::DBLink > seems to > be the place where this happens: it has a concatenation which leads to > that repeated version number. > > It this something that I should fix "client-side", so to speak, or > is it > worthwhile to add some logic to that concatenation to prevent this? > > > Thanks, > > Eric > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From paul.boutros at utoronto.ca Tue Oct 17 19:49:52 2006 From: paul.boutros at utoronto.ca (Paul Boutros) Date: Tue, 17 Oct 2006 19:49:52 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <001401c6f21a$836f9fc0$15327e82@pyrimidine> References: <001401c6f21a$836f9fc0$15327e82@pyrimidine> Message-ID: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> Hi Chris, Yup, that's it. I installed XML::SAX::ExpatXS (make test output below). Should there be a note somewhere in the INSTALL docs saying basically what you just wrote? Or maybe it's already there somewhere and I missed it. 
Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
if DBD::mysql can be loaded, and if not doesn't run the test. Since
the file is only one line long, here's the modified file rather than a
patch:
################################################################
BEGIN {
    # DBD::mysql is required
    eval {
        require DBD::mysql;
    };
    if ( $@ ) {
        # Load Test::More at run time; a "use" statement here would be
        # executed at compile time, whether or not the eval failed.
        require Test::More;
        Test::More::plan( skip_all => "DBD::mysql is not installed or is installed incorrectly - skipping BioDBSeqFeature_mysql.t" );
        exit(0);
    }
}

system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 -dsn test";
################################################################

And when I run it I get:
t/BioDBSeqFeature_mysql......skipped
        all skipped: DBD::mysql is not installed or is installed incorrectly - skipping BioDBSeqFeature_mysql.t

And for the overall make test:
All tests successful, 3 tests and 106 subtests skipped.
Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys = 164.24 CPU)

Hope this helps,
Paul


Quoting Chris Fields:

> Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
> backend parser. For some reason BLAST XML parsing doesn't work with that
> parser (it tries to verify the XML first before parsing, hence the DTD
> error). I may try getting this to work again, but so far I haven't found an
> easy way to prevent XML verification via XML::SAX::Expat.
>
> There are two options: 1) install XML::SAX::ExpatXS (the better option),
> which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
> parser in the ParserDetails.ini file in your local to use
> XML::SAX::PurePerl.
>
> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
> hasn't officially happened yet); the latter hasn't had significant
> development in about three years.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept.
of Biochemistry > University of Illinois Urbana-Champaign > > >> -----Original Message----- >> From: Paul Boutros [mailto:paul.boutros at utoronto.ca] >> Sent: Tuesday, October 17, 2006 1:00 PM >> To: Chris Fields >> Cc: bioperl-l at lists.open-bio.org >> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2 >> >> Hi Chris, >> >> Here it is: >> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t >> 1..1337 >> ok 1 >> >> -------------------- WARNING --------------------- >> MSG: XML::SAX::Expat not currently supported; must have local copies >> of NCBI DTD docs! >> --------------------------------------------------- >> >> -------------------- WARNING --------------------- >> MSG: error in parsing a report: >> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' >> does not exist >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >> Handler couldn't resolve external entity at line 2, column 82, byte 104 >> error in processing external entity reference at line 2, column 82, >> byte 104 at >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line >> 187 >> >> --------------------------------------------------- >> not ok 2 >> # Failed test 2 in t/SearchIO.t at line 68 >> Can't call method "database_name" on an undefined value at >> t/SearchIO.t line 69. >> >> >> Quoting Chris Fields : >> >> > What do you get when you run the SearchIO.t test by itself using 'perl - >> I. >> > t/SearchIO.t'? It looks like something pretty catastrophic happened. >> > >> > Christopher Fields >> > Postdoctoral Researcher - Switzer Lab >> > Dept. 
of Biochemistry >> > University of Illinois Urbana-Champaign >> > >> > >> >> -----Original Message----- >> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >> >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros >> >> Sent: Tuesday, October 17, 2006 11:57 AM >> >> To: bioperl-l at lists.open-bio.org >> >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 >> >> >> >> Hi, >> >> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed >> >> tests, the first seems to be just a result of me not having DBD::mysql >> >> installed. >> >> Paul >> >> >> >> Test Summary >> >> ============ >> >> >> >> Failed Test Stat Wstat Total Fail List of Failed >> >> ----------------------------------------------------------------------- >> --- >> >> ----- >> >> t/BioDBSeqFeature_mysql.t 46 46 1-46 >> >> t/SearchIO.t 22 5632 1337 2671 2-1337 >> >> 2 tests and 106 subtests skipped. >> >> Failed 2/236 test scripts. 1382/11688 subtests failed. >> >> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = >> >> 159.61 CPU) >> >> >> >> BioDBSeqFeature_mysql >> >> ===================== >> >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t >> >> 1..46 >> >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC >> >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t >> >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 >> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi >> >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at >> >> (eval 37) line 3. >> >> Perhaps the DBD::mysql perl module hasn't been fully installed, >> >> or perhaps the capitalisation of 'mysql' isn't right. >> >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. 
>> >> at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 >> >> >> >> SearchIO >> >> ======== >> >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more >> >> 1..1337 >> >> ok 1 >> >> >> >> -------------------- WARNING --------------------- >> >> MSG: XML::SAX::Expat not currently supported; must have local copies >> >> of NCBI DTD docs! >> >> --------------------------------------------------- >> >> >> >> -------------------- WARNING --------------------- >> >> MSG: error in parsing a report: >> >> >> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' >> >> does not exist >> >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >> >> Handler couldn't resolve external entity at line 2, column 82, byte 104 >> >> error in processing external entity reference at line 2, column 82, >> >> byte 104 at >> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line >> >> 187 >> >> >> >> --------------------------------------------------- >> >> not ok 2 >> >> # Failed test 2 in t/SearchIO.t at line 68 >> >> Can't call method "database_name" on an undefined value at >> >> t/SearchIO.t line 69. >> >> >> >> ------------------------------ >> >> >> >> Message: 10 >> >> Date: Tue, 17 Oct 2006 11:32:54 +0100 >> >> From: Sendu Bala >> >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2 >> >> To: bioperl-l at bioperl.org >> >> Message-ID: <4534B156.4090501 at sendu.me.uk> >> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >> >> >> >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. >> >> See http://www.bioperl.org/wiki/Release_1.5.2 for >> >> instructions on getting and testing this RC. >> >> >> >> Developers: >> >> This should be the last RC before release ~next monday. Now would >> >> be a good time for last minute documentaiton updates and additions. 
>> >> >> >> Users: >> >> Even though 1.5.2 is a 'developer' release, we consider it the most >> >> stable and capable version of Bioperl, and recommend that you use >> >> it in all but the most critical production environments. Please >> >> try it out and let us know of any problems or difficulties you run >> >> into. >> >> >> >> >> >> Thank you, >> >> Sendu. >> >> >> >> >> >> >> >> _______________________________________________ >> >> Bioperl-l mailing list >> >> Bioperl-l at lists.open-bio.org >> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > >> > >> > > > From cjfields at uiuc.edu Tue Oct 17 20:51:35 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 19:51:35 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> References: <001401c6f21a$836f9fc0$15327e82@pyrimidine> <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> Message-ID: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu> On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote: > Hi Chris, > > Yup, that's it. I installed XML::SAX::ExpatXS (make test output > below). Should there be a note somewhere in the INSTALL docs saying > basically what you just wrote? Or maybe it's already there somewhere > and I missed it. The INSTALL docs should have this, yes. I'll double-check though. Pretty much anything that plugs into XML::SAX except XML::SAX::Expat works (XML::LibXML also works, I found). > Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks > if DBD::mysql can be loaded, and if not doesn't run the test. 
Since > the file is only one-line long, here's the modified file rather than a > patch: > ################################################################ > BEGIN { > # DBD::mysql is required > eval { > require DBD::mysql; > }; > if ( $@ ) { > use Test::More skip_all => "DBD::mysql is not > installed or is installed incorrectly - skipping BioDBSeqFeature > _mysql.t"; > exit(0); > } > } > > system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 > -dsn test"; > ################################################################ > > And when I run it I get: > t/BioDBSeqFeature_mysql......skipped > all skipped: DBD::mysql is not installed or is installed > incorrectly - skipping BioDBSeqFeature_mysql.t > > And for the overall make test: > All tests successful, 3 tests and 106 subtests skipped. > Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys = > 164.24 CPU) It should check this when using 'perl Makefile.PL', since the tests are only set up if MySQL is present (so you would assume that it checks for DBD::mysql). I'll look into it. Chris > Hope this helps, > Paul > > > Quoting Chris Fields : > >> Your local copy of XML::SAX has XML::SAX::Expat set as the default >> SAX >> backend parser. For some reason BLAST XML parsing doesn't work >> with that >> parser (it tries to verify the XML first before parsing, hence the >> DTD >> error). I may try getting this to work again, but so far I >> haven't found an >> easy way to prevent XML verification via XML::SAX::Expat. >> >> There are two options: 1) install XML::SAX::ExpatXS (the better >> option), >> which works AND is 4x faster than XML::SAX::Expat, or 2) set the >> default >> parser in the PareserDetails.ini file in your local to use >> XML::SAX::PurePerl. >> >> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it >> just >> hasn't officially happened yet); the latter hasn't had significant >> development in about three years. 
>> >> Christopher Fields >> Postdoctoral Researcher - Switzer Lab >> Dept. of Biochemistry >> University of Illinois Urbana-Champaign >> >> >>> -----Original Message----- >>> From: Paul Boutros [mailto:paul.boutros at utoronto.ca] >>> Sent: Tuesday, October 17, 2006 1:00 PM >>> To: Chris Fields >>> Cc: bioperl-l at lists.open-bio.org >>> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2 >>> >>> Hi Chris, >>> >>> Here it is: >>> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t >>> 1..1337 >>> ok 1 >>> >>> -------------------- WARNING --------------------- >>> MSG: XML::SAX::Expat not currently supported; must have local copies >>> of NCBI DTD docs! >>> --------------------------------------------------- >>> >>> -------------------- WARNING --------------------- >>> MSG: error in parsing a report: >>> >>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' >>> does not exist >>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >>> Handler couldn't resolve external entity at line 2, column 82, >>> byte 104 >>> error in processing external entity reference at line 2, column 82, >>> byte 104 at >>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm >>> line >>> 187 >>> >>> --------------------------------------------------- >>> not ok 2 >>> # Failed test 2 in t/SearchIO.t at line 68 >>> Can't call method "database_name" on an undefined value at >>> t/SearchIO.t line 69. >>> >>> >>> Quoting Chris Fields : >>> >>>> What do you get when you run the SearchIO.t test by itself using >>>> 'perl - >>> I. >>>> t/SearchIO.t'? It looks like something pretty catastrophic >>>> happened. >>>> >>>> Christopher Fields >>>> Postdoctoral Researcher - Switzer Lab >>>> Dept. 
of Biochemistry >>>> University of Illinois Urbana-Champaign >>>> >>>> >>>>> -----Original Message----- >>>>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >>>>> bounces at lists.open-bio.org] On Behalf Of Paul Boutros >>>>> Sent: Tuesday, October 17, 2006 11:57 AM >>>>> To: bioperl-l at lists.open-bio.org >>>>> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 >>>>> >>>>> Hi, >>>>> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two >>>>> failed >>>>> tests, the first seems to be just a result of me not having >>>>> DBD::mysql >>>>> installed. >>>>> Paul >>>>> >>>>> Test Summary >>>>> ============ >>>>> >>>>> Failed Test Stat Wstat Total Fail List of Failed >>>>> ------------------------------------------------------------------ >>>>> ----- >>> --- >>>>> ----- >>>>> t/BioDBSeqFeature_mysql.t 46 46 1-46 >>>>> t/SearchIO.t 22 5632 1337 2671 2-1337 >>>>> 2 tests and 106 subtests skipped. >>>>> Failed 2/236 test scripts. 1382/11688 subtests failed. >>>>> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 >>>>> csys = >>>>> 159.61 CPU) >>>>> >>>>> BioDBSeqFeature_mysql >>>>> ===================== >>>>> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t >>>>> 1..46 >>>>> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC >>>>> (@INC >>>>> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t >>>>> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 >>>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi >>>>> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/ >>>>> site_perl) at >>>>> (eval 37) line 3. >>>>> Perhaps the DBD::mysql perl module hasn't been fully installed, >>>>> or perhaps the capitalisation of 'mysql' isn't right. >>>>> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. 
>>>>> at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 >>>>> >>>>> SearchIO >>>>> ======== >>>>> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more >>>>> 1..1337 >>>>> ok 1 >>>>> >>>>> -------------------- WARNING --------------------- >>>>> MSG: XML::SAX::Expat not currently supported; must have local >>>>> copies >>>>> of NCBI DTD docs! >>>>> --------------------------------------------------- >>>>> >>>>> -------------------- WARNING --------------------- >>>>> MSG: error in parsing a report: >>>>> >>>>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/ >>>>> NCBI_BlastOutput.dtd' >>>>> does not exist >>>>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >>>>> Handler couldn't resolve external entity at line 2, column 82, >>>>> byte 104 >>>>> error in processing external entity reference at line 2, column >>>>> 82, >>>>> byte 104 at >>>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/ >>>>> Parser.pm line >>>>> 187 >>>>> >>>>> --------------------------------------------------- >>>>> not ok 2 >>>>> # Failed test 2 in t/SearchIO.t at line 68 >>>>> Can't call method "database_name" on an undefined value at >>>>> t/SearchIO.t line 69. >>>>> >>>>> ------------------------------ >>>>> >>>>> Message: 10 >>>>> Date: Tue, 17 Oct 2006 11:32:54 +0100 >>>>> From: Sendu Bala >>>>> Subject: [Bioperl-l] Bioperl 1.5.2 RC2 >>>>> To: bioperl-l at bioperl.org >>>>> Message-ID: <4534B156.4090501 at sendu.me.uk> >>>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >>>>> >>>>> Bioperl 1.5.2 Release Candidate 2 is ready and available for >>>>> testing. >>>>> See http://www.bioperl.org/wiki/Release_1.5.2 for >>>>> instructions on getting and testing this RC. >>>>> >>>>> Developers: >>>>> This should be the last RC before release ~next monday. Now >>>>> would >>>>> be a good time for last minute documentaiton updates and >>>>> additions. 
>>>>> >>>>> Users: >>>>> Even though 1.5.2 is a 'developer' release, we consider it >>>>> the most >>>>> stable and capable version of Bioperl, and recommend that >>>>> you use >>>>> it in all but the most critical production environments. >>>>> Please >>>>> try it out and let us know of any problems or difficulties >>>>> you run >>>>> into. >>>>> >>>>> >>>>> Thank you, >>>>> Sendu. >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> >>> >> >> >> > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Wed Oct 18 02:52:05 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 07:52:05 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534B156.4090501@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> Message-ID: <4535CF15.4090502@sendu.me.uk> Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. > See http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > This should be the last RC before release ~next monday. Now would > be a good time for last minute documentaiton updates and additions. Given the few issues that have come up, it would be prudent to have another RC, so expect one around the time the 'Needs investigation' issues on the release page have been solved. If you think there are more things that need investigation, please add them, but note the bias toward things that affect the successful completion of the test suite as opposed to general bugs which should go to Bugzilla as normal. 
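
On the DBD::mysql guard discussed earlier in this thread: a
"use Test::More skip_all => ..." statement is executed at compile time
wherever it appears, so a skip that depends on a condition has to be issued
at run time instead. Below is a minimal sketch of such a run-time check using
only core Perl. The helper names module_available and missing_modules are
hypothetical, not existing Bioperl functions, and the sketch only prints what
it would do rather than ending the script.

```perl
use strict;
use warnings;

# True if the named module can be loaded at run time.
sub module_available {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;    # DBD::mysql -> DBD/mysql.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Return the subset of the given modules that cannot be loaded.
sub missing_modules {
    return grep { !module_available($_) } @_;
}

my @missing = missing_modules('DBD::mysql');
if (@missing) {
    # In a real test script this is where one would call, at run time:
    #   require Test::More;
    #   Test::More::plan( skip_all => "missing: @missing" );
    print "would skip: missing @missing\n";
}
else {
    print "DBD::mysql is available; the mysql-backed tests could run\n";
}
```

Listing several modules in one call also gives a simple way to report which
absent modules are responsible for a skip.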
From bix at sendu.me.uk Wed Oct 18 04:55:21 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 09:55:21 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <45350BA6.3040102@genomics.dk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> Message-ID: <4535EBF9.1090706@sendu.me.uk> Niels Larsen wrote: > ------------ EBI > > I invoked the EBI script > > http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip > > like this > > WSWUBlastClient.pl -p blastn -D embl test.fasta > > where the content of test.fasta is below, and got > > Can't find method element in the message at > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. As you admit, this is not a Bioperl issue. I would suggest you contact EBI support. In the mean time/alternatively I'd suggest investigating the Bioperl interface to the SOAP server, which is part of the Bioperl-run package. http://doc.bioperl.org/releases/bioperl-current/bioperl-run/Bio/Tools/Run/Analysis.html > ------------ DDBJ > > Inspired by this page, > > http://xml.nig.ac.jp/doc/Blast.txt > > I made this test script [snip] > which for me prints undef. Again, not something I can really help you with. You'll need to triple-check your code and then seek support from the providers of that SOAP service. > ------------- NCBI/Bioperl > > I installed 1.5.2-RC2, looked at the RemoteBlast example in > > http://www.bioperl.org/wiki/Bptutorial.pl > > and then put that into this test code, more or less cut/paste, [snip] > Maybe I am supposed to add a check for content in $rc and then stop > the inner loop? Yes, the wiki page example isn't really adequate. I'll update it. 
For a better code example see the RemoteBlast documentation:
http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html

> I could figure that out maybe, but I wish there was a
> function which simply takes a single sequence + arguments and only
> returns a list of matches when done, and does not return until then
> (or until a specified timeout).

Yes, I hardly find dealing with RIDs that pleasant. You might like to
add a feature request to Bugzilla.

From n.haigh at sheffield.ac.uk Wed Oct 18 05:58:00 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 10:58:00 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
Message-ID: <4535FAA8.2050506@sheffield.ac.uk>

I get all tests passing except for BioDBSeqFeature_mysql which fails all
tests (1-46).

During perl Makefile.PL I get:
"I see you have Berkeleydb installed. I will create the DBD tests for
Bio::DB::SeqFeature::Store..."

I notice under the "needs investigation" there is mention about tests
being generated even if DBD::mysql isn't installed. I assume this is the
problem? If this is the problem should DBD::mysql be added to the
dependencies in Makefile.PL?

Is there an easy way to find out what tests are being skipped due to
absent modules?

Cheers
Nath

From n.haigh at sheffield.ac.uk Wed Oct 18 07:34:21 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 12:34:21 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4535EBF9.1090706@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk>
Message-ID: <4536113D.1080307@sheffield.ac.uk>

I've just added test results for 1.5.2 RC2 to the wiki.

There are lots of fails for packages other than bioperl-live. I'm not
sure exactly how the test fails/skips are/should be handled since my
setups are as follows.
Clean WinXP Pro: This is a clean install of WinXP Pro SP2 with no major software installed, other than ActivePerl 5.8.8.819 and a few tools for archive extracting, anti virus etc. Therefore, I'm unsure how tests in bioperl-network and bioperl-db should return. For example, I have made no effort to setup biosql-schema but I thought that maybe there would be a test that would detect this, and fail, then skip over other tests gracefully - like the bioperl-run tests when a piece of software is not installed??? Debian Linux: This is a Bio-Linux machine with quite a lot of bioinformatics software installed in the Path. So most of the tests in bioperl-run should probably have passed. The same goes for bioperl-network and bioperl-db as with my Windows setup. If my thoughts are totally wrong - let me know! Nath From bix at sendu.me.uk Wed Oct 18 08:03:11 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 13:03:11 +0100 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <4535FAA8.2050506@sheffield.ac.uk> References: <4535FAA8.2050506@sheffield.ac.uk> Message-ID: <453617FF.9080508@sendu.me.uk> Nathan Haigh wrote: > I get all tests passing except for BioDBSeqFeature_mysql which fails all > tests (1-46). > > During perl Makefile.PL I get: > "I see you have Berkeleydb installed. I will create the DBD tests for > Bio::DB::SeqFeature::Store..." > > I notice under the "needs investigation" there is mention about tests > been generated even if DBD::mysql isn't installed. I assume this is the > problem? Probably. I'm looking into it. Not sure why it wasn't causing a problem before now. > If this is the problem should DBD::mysql be added to the > dependencies in Makefile.PL? No. You can use the modules in question without mysql (presumably; ie. you have a different sql setup), so it makes no sense to warn people they don't have a module they absolutely do not need. > Is there an easy way to find out what tests are being skipped due to > absent modules? 
Ideally, when the skip occurs the test script will issue a message. I think that happens in most, if not all cases. From bix at sendu.me.uk Wed Oct 18 09:02:50 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 14:02:50 +0100 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <453617FF.9080508@sendu.me.uk> References: <4535FAA8.2050506@sheffield.ac.uk> <453617FF.9080508@sendu.me.uk> Message-ID: <453625FA.6090907@sendu.me.uk> Sendu Bala wrote: > Nathan Haigh wrote: ? >> I notice under the "needs investigation" there is mention about tests >> been generated even if DBD::mysql isn't installed. I assume this is the >> problem? > > Probably. I'm looking into it. Not sure why it wasn't causing a problem > before now. > > > If this is the problem should DBD::mysql be added to the > > dependencies in Makefile.PL? > > No. You can use the modules in question without mysql (presumably; ie. > you have a different sql setup), so it makes no sense to warn people > they don't have a module they absolutely do not need. Oops. It /is/ in the pre-reqs in Makefile.PL. Maybe DBD::mysql is the only supported driver? From bix at sendu.me.uk Wed Oct 18 09:16:24 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 14:16:24 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu> References: <001401c6f21a$836f9fc0$15327e82@pyrimidine> <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu> Message-ID: <45362928.8070104@sendu.me.uk> Chris Fields wrote: > On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote: > >> Hi Chris, >> >> Yup, that's it. I installed XML::SAX::ExpatXS (make test output >> below). Should there be a note somewhere in the INSTALL docs saying >> basically what you just wrote? Or maybe it's already there somewhere >> and I missed it. > > The INSTALL docs should have this, yes. I'll double-check though. 
>
> Pretty much anything that plugs into XML::SAX except XML::SAX::Expat
> works (XML::LibXML also works, I found).
>
>> Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
>> if DBD::mysql can be loaded,
[snip]
> It should check this when using 'perl Makefile.PL', since the tests
> are only set up if MySQL is present (so you would assume that it
> checks for DBD::mysql). I'll look into it.

This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in
my t directory when I packed it up for release.

I'm tweaking Makefile.PL right now in any case; there are a few errors
and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean.

From cjfields at uiuc.edu Wed Oct 18 09:55:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 08:55:37 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ
Message-ID: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>

Ding dong the witch is dead!

As announced previously, from the latest GenBank release (156.0):

-----------------------------------------------

1.3.8 Feature location syntax X.Y no longer supported

The Feature Table has supported feature locations of the form 'X.Y',
to represent a base position which is greater or equal to X, and less
than or equal to Y. For example:

    misc_feature    1.10..20
    misc_feature    join(100..150,200.210..250)

In the first example, the misc_feature starts somewhere between bases
1 and 10 (inclusive), and ends at basepair 20. In the second, the 51
bases from 100..150 are joined together with a second basepair interval,
which could be anywhere from 200..250 to 210..250 .

Although this syntax seems like a reasonable way to capture an
uncertain interval, it is used for features on a vanishingly small number
of sequence records, most database submission mechanisms don't support it,
and the meaning of its use in a join() context is not entirely clear.

As of October 2006, this type of location is no longer supported.
Those records with features which utilize X.Y locations will be reviewed and converted to a non-uncertain format. ----------------------------------------------- EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. Not sure about UniProt/SwissProt. I guess we're keeping this in for backwards compatibility only, but how do we handle any bugs that pop up related to this? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Oct 18 10:10:07 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 09:10:07 -0500 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <453617FF.9080508@sendu.me.uk> Message-ID: <001f01c6f2bf$20737270$15327e82@pyrimidine> > Nathan Haigh wrote: > > I get all tests passing except for BioDBSeqFeature_mysql which fails all > > tests (1-46). > > > > During perl Makefile.PL I get: > > "I see you have Berkeleydb installed. I will create the DBD tests for > > Bio::DB::SeqFeature::Store..." > > > > I notice under the "needs investigation" there is mention about tests > > been generated even if DBD::mysql isn't installed. I assume this is the > > problem? > > Probably. I'm looking into it. Not sure why it wasn't causing a problem > before now. Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP because 'perl Makefile.PL' doesn't detect my MySQL installation, so the MySQL-based tests don't run even though I have DBD::mysql installed. I thought this might just be a WinXP issue, but apparently not. If I can get to it I'll run a few checks. > > If this is the problem should DBD::mysql be added to the > > dependencies in Makefile.PL? > > No. You can use the modules in question without mysql (presumably; ie. > you have a different sql setup), so it makes no sense to warn people > they don't have a module they absolutely do not need. 
Agreed, though I don't know if other relational DB's are supported like PostgreSQL. > > Is there an easy way to find out what tests are being skipped due to > > absent modules? > > Ideally, when the skip occurs the test script will issue a message. I > think that happens in most, if not all cases. Yes, though we may run into the same issue we had with XEMBL tests not reporting the reasons it skipped. Each test suite should run an eval{} to check the required modules, then only skip blocks of tests that rely on those modules. I think we have caught most of those, but who knows w/o doing a complete test suite audit? Our eventual complete switchover to Test::More should hopefully clean these up. I don't consider it a pressing issue for this release, though Sendu may feel differently. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Oct 18 10:12:52 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 09:12:52 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <45362928.8070104@sendu.me.uk> Message-ID: <002001c6f2bf$807849c0$15327e82@pyrimidine> ... > This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in > my t directory when I packed it up for release. > > I'm tweaking Makefile.PL right now in any case; there are a few errors > and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean. Okay, makes sense now. No big deal, it's still an RC (a developer's RC at that!). Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Oct 18 10:17:35 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 18 Oct 2006 15:17:35 +0100 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <001f01c6f2bf$20737270$15327e82@pyrimidine> References: <001f01c6f2bf$20737270$15327e82@pyrimidine> Message-ID: <4536377F.6000408@sheffield.ac.uk> Chris Fields wrote: >> Nathan Haigh wrote: >> >>> I get all tests passing except for BioDBSeqFeature_mysql which fails all >>> tests (1-46). >>> >>> During perl Makefile.PL I get: >>> "I see you have Berkeleydb installed. I will create the DBD tests for >>> Bio::DB::SeqFeature::Store..." >>> >>> I notice under the "needs investigation" there is mention about tests >>> been generated even if DBD::mysql isn't installed. I assume this is the >>> problem? >>> >> Probably. I'm looking into it. Not sure why it wasn't causing a problem >> before now. >> > > Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP > because 'perl Makefile.PL' doesn't detect my MySQL installation, so the > MySQL-based tests don't run even though I have DBD::mysql installed. I > thought this might just be a WinXP issue, but apparently not. If I can get > to it I'll run a few checks. > > This was on WinXP. >> > If this is the problem should DBD::mysql be added to the >> > dependencies in Makefile.PL? >> >> No. You can use the modules in question without mysql (presumably; ie. >> you have a different sql setup), so it makes no sense to warn people >> they don't have a module they absolutely do not need. >> > > Agreed, though I don't know if other relational DB's are supported like > PostgreSQL. > > >>> Is there an easy way to find out what tests are being skipped due to >>> absent modules? >>> >> Ideally, when the skip occurs the test script will issue a message. I >> think that happens in most, if not all cases. 
>> > > Yes, though we may run into the same issue we had with XEMBL tests not > reporting the reasons it skipped. Each test suite should run an eval{} to > check the required modules, then only skip blocks of tests that rely on > those modules. I think we have caught most of those, but who knows w/o > doing a complete test suite audit? > > Our eventual complete switchover to Test::More should hopefully clean these > up. I don't consider it a pressing issue for this release, though Sendu may > feel differently. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > From hlapp at gmx.net Wed Oct 18 10:36:31 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 18 Oct 2006 10:36:31 -0400 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine> References: <001e01c6f2bd$1769aca0$15327e82@pyrimidine> Message-ID: On Oct 18, 2006, at 9:55 AM, Chris Fields wrote: > how do we handle any bugs that pop up related to this? By an evil grin, followed by deflecting the blame to NCBI, followed by another evil grin. -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Oct 18 10:43:31 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 09:43:31 -0500 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: Message-ID: <002401c6f2c3$c83c7e30$15327e82@pyrimidine> > On Oct 18, 2006, at 9:55 AM, Chris Fields wrote: > > > how do we handle any bugs that pop up related to this? > > By an evil grin, followed by deflecting the blame to NCBI, followed > by another evil grin. 
> -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== Sounds good to me! One less thing to worry about. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Oct 18 10:45:57 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 18 Oct 2006 15:45:57 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4536113D.1080307@sheffield.ac.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> Message-ID: <45363E25.8010806@sheffield.ac.uk> Nathan Haigh wrote: > I've just added test results for 1.5.2 RC2 to the wiki. > > There are lots of fails for packages other than bioperl-live. I'm not > sure excatly how the test fails/skipps are/should be handled since my > setups are as follows. > > Clean WinXP Pro: > This is a clean install of WinXP Pro SP2 with no major software > installed, other than ActivePerl 5.8.8.819 and a few tools for archive > extracting, anti virus etc. Therefore, I'm unsure how tests in > bioperl-network and bioperl-db should return. For example, I have made > no effort to setup biosql-schema but I thought that maybe there would be > a test that would detect this, and fail, then skip over other tests > gracefully - like the bioperl-run tests when a piece of software is not > installed??? > > Debian Linux: > This is a Bio-Linux machine with quite a lot of bioinformatics software > installed in the Path. So most of the tests in bioperl-run should > probably have passed. The same goes for bioperl-network and bioperl-db > as with my Windows setup. > > If my thoughts are totally wrong - let me know! 
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

Just looking into the failed Linux tests. Several of the tests result in
errors like:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Alignment::Exonerate::AUTOLOAD /home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:126
STACK: Bio::Tools::Run::Alignment::Exonerate::new /home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:154
STACK: t/Exonerate.t:32
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: 'arguments' !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Hmmer::AUTOLOAD Bio/Tools/Run/Hmmer.pm:172
STACK: Bio::Tools::Run::Hmmer::_run Bio/Tools/Run/Hmmer.pm:253
STACK: Bio::Tools::Run::Hmmer::run Bio/Tools/Run/Hmmer.pm:228
STACK: t/Hmmer.t:54
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Phrap::AUTOLOAD Bio/Tools/Run/Phrap.pm:137
STACK: Bio::Tools::Run::Phrap::new Bio/Tools/Run/Phrap.pm:165
STACK: t/Phrap.t:34
-----------------------------------------------------------

Any ideas??
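
For readers unfamiliar with where such a message can come from: every trace above passes through an AUTOLOAD frame, and Perl only calls AUTOLOAD when no class in the object's inheritance chain defines the requested method. The sketch below shows the general pattern of an AUTOLOAD accessor that checks names against a whitelist. It is an illustration only, not the bioperl-run source; the package My::Wrapper and the allowed field names are invented.

```perl
package My::Wrapper;    # hypothetical class, for illustration only
use strict;
use warnings;

our $AUTOLOAD;
my %OK_FIELD = map { $_ => 1 } qw(PROGRAM QUIET);    # invented whitelist

sub new {
    my ( $class, %args ) = @_;
    my $self = bless {}, $class;
    # Route every constructor argument through the accessor check below.
    while ( my ( $name, $value ) = each %args ) {
        $name =~ s/^-//;
        $self->$name($value);
    }
    return $self;
}

sub AUTOLOAD {
    my $self = shift;
    my $attr = $AUTOLOAD;
    $attr =~ s/.*:://;               # strip the package name
    return if $attr eq 'DESTROY';    # object destruction is not a parameter
    $attr = uc $attr;
    die "Unallowed parameter: $attr !\n" unless $OK_FIELD{$attr};
    $self->{$attr} = shift if @_;
    return $self->{$attr};
}

package main;

my $wrapper = My::Wrapper->new( -program => 'phrap' );
print "program: ", $wrapper->program, "\n";

# Nothing defines arguments(), so the call lands in AUTOLOAD and is rejected.
eval { $wrapper->arguments('-trim') };
print "caught: $@" if $@;
```

In a real wrapper the method would normally be supplied by a parent class, so a rejection like this points at a method that was expected to exist but could not be found at run time.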
Nath From hlapp at gmx.net Wed Oct 18 10:51:36 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 18 Oct 2006 10:51:36 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4536113D.1080307@sheffield.ac.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> Message-ID: On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote: > For example, I have made > no effort to setup biosql-schema but I thought that maybe there > would be > a test that would detect this I'm afraid there isn't. Bioperl-db is meaningless without biosql-schema. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bosborne11 at verizon.net Wed Oct 18 10:43:06 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 18 Oct 2006 10:43:06 -0400 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine> Message-ID: Chris, I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all of the more recent examples in t/LocationFactory.t come from there. Brian O. On 10/18/06 9:55 AM, "Chris Fields" wrote: > EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. > Not sure about UniProt/SwissProt. From cjfields at uiuc.edu Wed Oct 18 11:00:30 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 10:00:30 -0500 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: Message-ID: <002501c6f2c6$27625540$15327e82@pyrimidine> Do they still use the X.Y notations? Those are the most troublesome. I guess we still don't support the ones containing '?'. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: Brian Osborne [mailto:bosborne11 at verizon.net] > Sent: Wednesday, October 18, 2006 9:43 AM > To: Chris Fields; bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in > GenBank/EMBL/DDBJ > > Chris, > > I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all > of the more recent examples in t/LocationFactory.t come from there. > > Brian O. > > > On 10/18/06 9:55 AM, "Chris Fields" wrote: > > > EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. > > Not sure about UniProt/SwissProt. From Kevin.M.Brown at asu.edu Wed Oct 18 11:16:50 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Wed, 18 Oct 2006 08:16:50 -0700 Subject: [Bioperl-l] Blast information Message-ID: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu> I just recently upgraded to 1.5.1 on WinXP to bring this version closer to live to parse some locally created blast files. I'm trying to find the method that returns the values that are underneath the Identities and Positives information as I'm trying to replicate the output of an old blast parser we have here written in RealBasic which is showing its age. Once I have it replicating the old output I then intend to add more features in terms of filtering returned hits (like not returning self->self hits or a->b so don't show b->a). Example: I'm looking for the methods that will return 117 from identities and 117 from positives. I can't just use num_identical/percent_identity as that isn't 100% accurate. 
>BurkM_2016 Length = 241 Score = 43.2 bits (88), Expect = 7e-005 Identities = 26/117 (22%), Positives = 51/117 (43%) Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL 357 Q F F + A+ ++ + + + L +R GL + P E + A+L Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL 170 Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 Thanks, Kevin From cjfields at uiuc.edu Wed Oct 18 11:25:59 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 10:25:59 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4536113D.1080307@sheffield.ac.uk> Message-ID: <002601c6f2c9$b6d04a90$15327e82@pyrimidine> > I've just added test results for 1.5.2 RC2 to the wiki. > > There are lots of fails for packages other than bioperl-live. I'm not > sure excatly how the test fails/skipps are/should be handled since my > setups are as follows. > > Clean WinXP Pro: > This is a clean install of WinXP Pro SP2 with no major software > installed, other than ActivePerl 5.8.8.819 and a few tools for archive > extracting, anti virus etc. Therefore, I'm unsure how tests in > bioperl-network and bioperl-db should return. For example, I have made > no effort to setup biosql-schema but I thought that maybe there would be > a test that would detect this, and fail, then skip over other tests > gracefully - like the bioperl-run tests when a piece of software is not > installed??? > > Debian Linux: > This is a Bio-Linux machine with quite a lot of bioinformatics software > installed in the Path. So most of the tests in bioperl-run should > probably have passed. The same goes for bioperl-network and bioperl-db > as with my Windows setup. > > If my thoughts are totally wrong - let me know! 
> Nath

The bioperl-db tests rely on a local BioSQL database and on having a
properly set up configuration file (these are detailed in the bioperl-db
INSTALL doc). Furthermore, there are serious problems with bioperl-db and
WinXP (see Bug 1938 in bugzilla). There is a workaround, but it isn't
perfect by any means.

http://bugzilla.open-bio.org/show_bug.cgi?id=1938

Many of the bioperl-run tests rely on env. variables being set properly, so
maybe that's why they failed. These should all be detailed in the INSTALL
file (but maybe they aren't?).

I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac OS
X yet but intended on doing this within the week. The INSTALL file details
the requirements for the packages (Graph 0.80 is the only one for
bioperl-network, for instance, and there isn't a PPM for that version
available yet).

It would be nice to skip the tests based on absence of the particular
modules or installed programs, and I think the final goal is to attempt
to do this. However, all of the bioperl-related distributions have
their own documentation, which outlines their installation, requirements,
and use. At least we can point to that, which works for now. We could
always start up a wiki page for the various bioperl distributions to
monitor problems or issues with each based on OS, proposed
enhancements/ideas, etc.

Also, most (if not all, including core) have been primarily tested on some
*nix-related system, which means that they may not work on Win32 systems.
Though the Windows support is light-years ahead of what it used to be circa
rel 0.7, I don't think it is foolproof yet, as witnessed by the bioperl-db
bug. Frankly, we need more WinXP users for those packages willing to test
them out and offer suggestions.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry
University of Illinois Urbana-Champaign

From bosborne11 at verizon.net Wed Oct 18 11:13:51 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 11:13:51 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ
In-Reply-To: <002501c6f2c6$27625540$15327e82@pyrimidine>
Message-ID:

Chris,

No, I don't think they use the form X.Y. See below, from
t/LocationFactory.t, we do support most of the forms using ?. Supposedly
these tests accommodate all of the possible fuzzy locations encountered in
Swissprot, I wrote these a year or so ago.

Brian O.

    # UNCERTAIN locations and positions (Swissprot)
    "?2465..2774" => [$fuzzy_impl,
        2465, 2465, "UNCERTAIN", 2774, 2774, "EXACT", "EXACT", 1, 1],
    "22..?64" => [$fuzzy_impl,
        22, 22, "EXACT", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
    "?22..?64" => [$fuzzy_impl,
        22, 22, "UNCERTAIN", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
    "?..>393" => [$fuzzy_impl,
        undef, undef, "UNCERTAIN", 393, undef, "AFTER", "UNCERTAIN", 1, 1],
    "<1..?" => [$fuzzy_impl,
        undef, 1, "BEFORE", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
    "?..536" => [$fuzzy_impl,
        undef, undef, "UNCERTAIN", 536, 536, "EXACT", "UNCERTAIN", 1, 1],
    "1..?" => [$fuzzy_impl,
        1, 1, "EXACT", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
    "?..?" => [$fuzzy_impl,
        undef, undef, "UNCERTAIN", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
    # Not working yet:
    #"12..?1" => [$fuzzy_impl,
    #    1, 1, "UNCERTAIN", 12, 12, "EXACT", "EXACT", 1, 1]

On 10/18/06 11:00 AM, "Chris Fields" wrote:

> Do they still use the X.Y notations? Those are the most troublesome. I
> guess we still don't support the ones containing '?'.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept.
of Biochemistry > University of Illinois Urbana-Champaign > > >> -----Original Message----- >> From: Brian Osborne [mailto:bosborne11 at verizon.net] >> Sent: Wednesday, October 18, 2006 9:43 AM >> To: Chris Fields; bioperl-l at lists.open-bio.org >> Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in >> GenBank/EMBL/DDBJ >> >> Chris, >> >> I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all >> of the more recent examples in t/LocationFactory.t come from there. >> >> Brian O. >> >> >> On 10/18/06 9:55 AM, "Chris Fields" wrote: >> >>> EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. >>> Not sure about UniProt/SwissProt. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Wed Oct 18 12:56:07 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 11:56:07 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <002601c6f2c9$b6d04a90$15327e82@pyrimidine> Message-ID: <000401c6f2d6$5144e2f0$15327e82@pyrimidine> ... > I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac > OS All, > X yet but intended on doing this within the week. The INSTALL file > details > the requirements for the packages (Graph 0.80 is the only one for > bioperl-network, for instance, and there isn't a PPM for that version > available yet). ... As a followup in this, I tried bioperl-network and had similar failed tests with Graph 0.79 (the only PPM available from ActiveState). However, the INSTALL docs state that Graph 0.80 is needed, and the test run gave several warnings about not having Graph 0.80 installed. I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and everything passed. Maybe we need to have a Graph PPM available for those who want bioperl-network? 
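
A version check like the one that produced those warnings could be turned into a proper skip in a test script along the following lines. This is only a sketch using Test::More: have_module is a made-up helper rather than anything in Bioperl, and the single test inside the SKIP block is a placeholder for real bioperl-network tests.

```perl
use strict;
use warnings;
use Test::More tests => 1;

# Hypothetical helper: true if $module loads and, when a minimum version is
# given, is at least that version.
sub have_module {
    my ( $module, $min_version ) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval {
        require $file;
        $module->VERSION($min_version) if defined $min_version;    # dies if too old
        1;
    } ? 1 : 0;
}

SKIP: {
    # Skip only the tests that need Graph, and say why.
    skip 'Graph 0.80 or later is not installed', 1
        unless have_module( 'Graph', '0.80' );

    my $graph = Graph->new;
    isa_ok( $graph, 'Graph' );
}
```

Because the reason is passed to skip(), it shows up in verbose test output, which would also make it easier to see which tests were skipped for want of a module.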
As for bioperl-run, all tests passed from a new CVS checkout even though I have none of the programs installed, so they seem to skip properly. The test run also printed warnings when a program wasn't available or installed. Chris From bosborne11 at verizon.net Wed Oct 18 13:10:34 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 18 Oct 2006 13:10:34 -0400 Subject: [Bioperl-l] Blast information In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu> Message-ID: Kevin, Are you looking for hsp_length()? See the SearchIO HOWTO for a list of methods: http://www.bioperl.org/wiki/HOWTO:SearchIO Brian O. On 10/18/06 11:16 AM, "Kevin Brown" wrote: > I just recently upgraded to 1.5.1 on WinXP to bring this version closer > to live to parse some locally created blast files. I'm trying to find > the method that returns the values that are underneath the Identities > and Positives information as I'm trying to replicate the output of an > old blast parser we have here written in RealBasic which is showing its > age. Once I have it replicating the old output I then intend to add > more features in terms of filtering returned hits (like not returning > self->self hits or a->b so don't show b->a). > > Example: > I'm looking for the methods that will return 117 from identities and 117 > from positives. I can't just use num_identical/percent_identity as that > isn't 100% accurate. 
> >> BurkM_2016 > Length = 241 > > Score = 43.2 bits (88), Expect = 7e-005 > Identities = 26/117 (22%), Positives = 51/117 (43%) > > Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL > 357 > Q F F + A+ ++ + + + L +R GL + P E + A+L > Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL > 170 > > Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 > A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L > Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 > > Thanks, > Kevin > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From Kevin.M.Brown at asu.edu Wed Oct 18 17:25:48 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Wed, 18 Oct 2006 14:25:48 -0700 Subject: [Bioperl-l] Blast information Message-ID: <1A4207F8295607498283FE9E93B775B4022A71C3@EX02.asurite.ad.asu.edu> Yes, that does indeed look like what I was after. > -----Original Message----- > From: Brian Osborne [mailto:bosborne11 at verizon.net] > Sent: Wednesday, October 18, 2006 10:11 AM > To: Kevin Brown; bioperl-l > Subject: Re: [Bioperl-l] Blast information > > Kevin, > > Are you looking for hsp_length()? See the SearchIO HOWTO for a list of > methods: > > http://www.bioperl.org/wiki/HOWTO:SearchIO > > > Brian O. > > > On 10/18/06 11:16 AM, "Kevin Brown" wrote: > > > I just recently upgraded to 1.5.1 on WinXP to bring this > version closer > > to live to parse some locally created blast files. I'm > trying to find > > the method that returns the values that are underneath the > Identities > > and Positives information as I'm trying to replicate the > output of an > > old blast parser we have here written in RealBasic which is > showing its > > age. 
Once I have it replicating the old output I then intend to add > > more features in terms of filtering returned hits (like not > returning > > self->self hits or a->b so don't show b->a). > > > > Example: > > I'm looking for the methods that will return 117 from > identities and 117 > > from positives. I can't just use > num_identical/percent_identity as that > > isn't 100% accurate. > > > >> BurkM_2016 > > Length = 241 > > > > Score = 43.2 bits (88), Expect = 7e-005 > > Identities = 26/117 (22%), Positives = 51/117 (43%) > > > > Query: 298 > QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL > > 357 > > Q F F + A+ ++ + + + L +R GL + > P E + A+L > > Sbjct: 111 > QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL > > 170 > > > > Query: 358 > MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 > > A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L > > Sbjct: 171 > KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 > > > > Thanks, > > Kevin > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > From n.appleby at uq.edu.au Wed Oct 18 17:58:06 2006 From: n.appleby at uq.edu.au (Nikki Appleby) Date: Thu, 19 Oct 2006 07:58:06 +1000 Subject: [Bioperl-l] CONTIG dealing Message-ID: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD> I have just entered the wonderful new world of BioPerl, so the answer to my question may be obvious to any of the gurus reading this. I need to collect sequence features and ontology annotations. Here goes. I am retrieving sequences from SwissProt via Bio::DB::SwissProt and get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into an RDBMS format that I am happy with I can get at the xref ids. In this case, they are AP003451; BAB86144.1; -; Genomic_DNA. AP008207; BAF07116.1; -; Genomic_DNA. AB103395; BAC81207.1; -; mRNA. 
I can happily go off and fetch those from Bio::DB::GenBank (first column), and Bio::DB::GenPept (second). All good, except... AP008207 is a contig. I don't want to get all of the features for the entire thing, just the single contig that actually matches the original sequence. It takes a couple of hours to get at it and then it gives me way too much. I will come across this problem with other sequences. How do I (a) find out if it is a contig without downloading it in it's entirety and (b) extract the list of sequences that are about to be contigged together. I have searched the web for answers, including this list, but see nothing. Help! Nikki Appleby. From bosborne11 at verizon.net Wed Oct 18 20:54:04 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 18 Oct 2006 20:54:04 -0400 Subject: [Bioperl-l] LocatableSeq object vs Sequence Object In-Reply-To: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com> Message-ID: Peter, I'm not understanding your question, partly because your letter and your code are saying different things. You say you want to call location_from_column() but your code shows you calling species(). What happens when you call location_from_column? Do you see errors? Brian O. On 10/17/06 12:26 PM, "Peter H. Baenziger" wrote: > I was thinking I could use: > foreach $seq ($alignment->each_seq()) > to loop through the sequences and call: > $seq->location_from_column($pos) > on each of the sequences. From cjfields at uiuc.edu Wed Oct 18 22:46:14 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 21:46:14 -0500 Subject: [Bioperl-l] CONTIG dealing In-Reply-To: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD> References: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD> Message-ID: On Oct 18, 2006, at 4:58 PM, Nikki Appleby wrote: > > I have just entered the wonderful new world of BioPerl, so the > answer to my > question may be obvious to any of the gurus reading this. 
>
> I need to collect sequence features and ontology annotations. Here goes.
>
> I am retrieving sequences from SwissProt via Bio::DB::SwissProt and
> get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into an RDBMS
> format that I am happy with I can get at the xref ids. In this case, they
> are
>
> AP003451; BAB86144.1; -; Genomic_DNA.
> AP008207; BAF07116.1; -; Genomic_DNA.
> AB103395; BAC81207.1; -; mRNA.
>
> I can happily go off and fetch those from Bio::DB::GenBank (first column),
> and Bio::DB::GenPept (second). All good, except...
>
> AP008207 is a contig. I don't want to get all of the features for the entire
> thing, just the single contig that actually matches the original sequence.
> It takes a couple of hours to get at it and then it gives me way too much.
>
> I will come across this problem with other sequences. How do I (a) find out
> if it is a contig without downloading it in its entirety and (b) extract
> the list of sequences that are about to be contigged together.
>
> I have searched the web for answers, including this list, but see nothing.
> Help!
>
> Nikki Appleby.

The default retrieval format for GenBank is 'gbwithparts' (which gets the
full sequence at all times). You can set this to 'gb' using request_format()
to retrieve the sequence file with the contig information instead of the
sequence, if the record contains such information (otherwise it just
retrieves the sequence anyway).

However, I have noticed that this particular file does not represent a true
contig record but is the entire chromosome sequence. The contig information
is in the comments section, probably because the record was converted over.
You could just download the sequence record and run a regexp to grab the
comments section, then parse out the contigs (a pain) if you really want
that. Or you could try to find the equivalent GenBank record, such as the
ones derived from the WGS records.
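As a rough sketch of the request_format() suggestion above: the snippet below asks Bio::DB::GenBank for the 'gb' format instead of the default 'gbwithparts'. It assumes BioPerl is installed and that NCBI is reachable; the accession is the contig from the original question, and the printed summary is only illustrative, not part of the original advice.

```perl
use strict;
use warnings;
use Bio::DB::GenBank;

# Requires BioPerl and network access to NCBI.
my $gb = Bio::DB::GenBank->new();

# 'gb' requests the record as stored (contig information where present)
# rather than the default 'gbwithparts', which always pulls the full sequence.
$gb->request_format('gb');

# AP008207 is the accession from the question above.
my $seq = $gb->get_Seq_by_acc('AP008207');

# Illustrative summary only: count the top-level features that were parsed.
my @features = $seq->get_SeqFeatures;
printf "%s: %d top-level features\n", $seq->accession_number, scalar @features;
```

Whether this saves time for a given record depends on how that record is stored at NCBI, as the message above points out for this particular chromosome entry.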
I did notice the list of dbxrefs in your swissprot record indicate three EMBL sequences. If the order is consistent for the SwissProt entries you want, they probably represent: The contig (what you want): AP003451; BAB86144.1; -; Genomic_DNA. The supercontig (chromosome) : AP008207; BAF07116.1; -; Genomic_DNA. The cDNA : AB103395; BAC81207.1; -; mRNA. I checked the first one (AP003451), which seems to confirm this. Since the chromosome supercontig is built from the smaller sequence contigs you could just grab the first EMBL dbxref instead of all of them. It parses much faster than the chromosome file. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Wed Oct 18 11:47:14 2006 From: jason at bioperl.org (Jason Stajich) Date: Wed, 18 Oct 2006 08:47:14 -0700 Subject: [Bioperl-l] Blast information In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu> References: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu> Message-ID: <6B7D24F3-69F1-498D-AB53-B4CEB14E4F3D@bioperl.org> I think this will work for you. The seq_inds method parses the middle homology sequence and classifies each alignment column and returns a list of the columns meeting the criteria. You can interrogate query or hit in this case since you are requiring it to be identical my $identicalbases = scalar $hsp->seq_inds('query', 'identical'); my $conservedbases = scalar $hsp->seq_inds('query','conserved'); Conserved returns those identical or conserved, if you want just those with conservative replacements use 'conserved-not-identical' See http://bioperl.org/wiki/HOWTO:SearchIO#Table_of_Methods for more info. -jason On Oct 18, 2006, at 8:16 AM, Kevin Brown wrote: > I just recently upgraded to 1.5.1 on WinXP to bring this version > closer > to live to parse some locally created blast files. 
I'm trying to find > the method that returns the values that are underneath the Identities > and Positives information as I'm trying to replicate the output of an > old blast parser we have here written in RealBasic which is showing > its > age. Once I have it replicating the old output I then intend to add > more features in terms of filtering returned hits (like not returning > self->self hits or a->b so don't show b->a). > > Example: > I'm looking for the methods that will return 117 from identities > and 117 > from positives. I can't just use num_identical/percent_identity as > that > isn't 100% accurate. > >> BurkM_2016 > Length = 241 > > Score = 43.2 bits (88), Expect = 7e-005 > Identities = 26/117 (22%), Positives = 51/117 (43%) > > Query: 298 > QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL > 357 > Q F F + A+ ++ + + + L +R GL + P E + > A+L > Sbjct: 111 > QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL > 170 > > Query: 358 > MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 > A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L > Sbjct: 171 > KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 > > Thanks, > Kevin > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From jason at bioperl.org Thu Oct 19 01:00:28 2006 From: jason at bioperl.org (Jason Stajich) Date: Wed, 18 Oct 2006 22:00:28 -0700 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> Message-ID: 
<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> So I'm unsure what we should do here. We can certainly fix the problem which you report which is relying on the "" method -- if you were to do instead: print $_->database, ":", $_->primary_id, "\n"; you'll get the right answer. We at a minimum just fix the auto- string converting method to do The Right Thing. But I am not sure if we should keep the version out of the primary_id field. This will require some rejiggering in several modules when it comes to printing DBlinks and I don't want to do this before the release. I also am not sure if there was an explicit reason why someone did put the version information in the primary_id. (I hope it wasn't me because I don't think I'm going to remember why). Does anyone else have a strong feeling? -jason On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > Hello, > > I noticed a little problem with the Annotation "DBLink" from > GenBank entries > > When I run: > > perl -MBio::DB::GenBank -e 'my $gi = > 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio = > $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > ("dblink"); > for(@annotations) { print $_, "\n";} print $INC{ > "Bio/Annotation/DBLink.pm" }, "\n"; ' > > This yields: > > GenBank:AL591065.17.17 > > and the place where the used Bio/Annotation/DBLink.pm resides. > > Can others repeat this? > > I have dug into the source a little and Bio::Annotation::DBLink > seems to > be the place where this happens: it has a concatenation which leads to > that repeated version number. > > It this something that I should fix "client-side", so to speak, or > is it > worthwhile to add some logic to that concatenation to prevent this? 
> > > Thanks, > > Eric > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From n.haigh at sheffield.ac.uk Thu Oct 19 02:41:02 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 19 Oct 2006 07:41:02 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <000401c6f2d6$5144e2f0$15327e82@pyrimidine> References: <000401c6f2d6$5144e2f0$15327e82@pyrimidine> Message-ID: <45371DFE.6050306@sheffield.ac.uk> > As a followup in this, I tried bioperl-network and had similar failed tests > with Graph 0.79 (the only PPM available from ActiveState). However, the > INSTALL docs state that Graph 0.80 is needed, and the test run gave several > warnings about not having Graph 0.80 installed. > > I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and > everything passed. Maybe we need to have a Graph PPM available for those > who want bioperl-network? > > As for bioperl-run, all tests passed from a new CVS checkout even though I > have none of the programs installed, so they seem to skip properly. The > test run also printed warnings when a program wasn't available or installed. > > > Chris > > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make modifications to integrate them into the package.xml file for PPM4 clients. Nath From n.haigh at sheffield.ac.uk Thu Oct 19 06:40:21 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Thu, 19 Oct 2006 11:40:21 +0100 Subject: [Bioperl-l] bioperl-run t/EMBOSS.t Message-ID: <45375615.1020603@sheffield.ac.uk> Should line 25 read: require Bio::Factory::EMBOSS instead of: require Bio::EMBOSS::Factory; Nath From hlapp at gmx.net Thu Oct 19 09:56:05 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 19 Oct 2006 09:56:05 -0400 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> Message-ID: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> Here is the overload code: use overload '""' => sub { (($_[0]->database ? $_[0]->database . ':' : '' ) . ($_[0]->primary_id ? $_[0]->primary_id : '') . ($_[0]->version ? '.' . $_[0]->version : '')) || '' }; Except that the last '||' is redundant and unnecessary (it either does nothing or replaces an empty string with an empty string), I don't see the potential for duplicating the version number here - unless primary_id() did that, which I don't see it doing. So, to me this seems to come from a parsing error in the beginning, rather than an erroneous mangling of version into primary_id later. Is someone in the position to confirm this? -hilmar On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > So I'm unsure what we should do here. > > We can certainly fix the problem which you report which is relying on > the "" method -- if you were to do instead: > print $_->database, ":", $_->primary_id, "\n"; > > you'll get the right answer. We at a minimum just fix the auto- > string converting method to do The Right Thing. > > But I am not sure if we should keep the version out of the primary_id > field. This will require some rejiggering in several modules when it > comes to printing DBlinks and I don't want to do this before the > release. 
I also am not sure if there was an explicit reason why > someone did put the version information in the primary_id. (I hope it > wasn't me because I don't think I'm going to remember why). > > Does anyone else have a strong feeling? > > -jason > On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > >> Hello, >> >> I noticed a little problem with the Annotation "DBLink" from >> GenBank entries >> >> When I run: >> >> perl -MBio::DB::GenBank -e 'my $gi = >> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio = >> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >> ("dblink"); >> for(@annotations) { print $_, "\n";} print $INC{ >> "Bio/Annotation/DBLink.pm" }, "\n"; ' >> >> This yields: >> >> GenBank:AL591065.17.17 >> >> and the place where the used Bio/Annotation/DBLink.pm resides. >> >> Can others repeat this? >> >> I have dug into the source a little and Bio::Annotation::DBLink >> seems to >> be the place where this happens: it has a concatenation which >> leads to >> that repeated version number. >> >> It this something that I should fix "client-side", so to speak, or >> is it >> worthwhile to add some logic to that concatenation to prevent this? 
>> >> >> Thanks, >> >> Eric >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich, PhD > Miller Research Fellow > University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From dmessina at wustl.edu Thu Oct 19 09:55:31 2006 From: dmessina at wustl.edu (David Messina) Date: Thu, 19 Oct 2006 08:55:31 -0500 Subject: [Bioperl-l] missing documentation (request for help) Message-ID: <69453D5F-7794-4DC7-BAE1-A8B2191752E6@wustl.edu> Hi all, There are a few modules missing a one-line description, and by one- line description, I'm referring to the part that comes after the module name in the POD. e.g. in =head1 NAME Bio::SearchIO - Driver for parsing Sequence Database Searches (BLAST, FASTA, ...) =head1 SYNOPSIS [etc...] "Driver for parsing Sequence Database Searches (BLAST, FASTA, ...)" is the one-line description (even though it falls onto two lines) :). I fixed the modules that I knew something about, but there are some I haven't used. Perhaps the author, or someone else familiar with these modules, could fill in an appropriate short description? 
Here is the list of affected modules: Bio::DB::Expression Bio::Expression::Contact Bio::Expression::DataSet Bio::Expression::Platform Bio::Expression::Sample Bio::Search::Processor Bio::DB::EUtilities::ElinkData Bio::DB::GFF::Adaptor::memory::feature_serializer Bio::DB::SeqFeature::Store::DBI::Iterator Bio::Expression::FeatureGroup::FeatureGroupMas50 Bio::Expression::FeatureSet::FeatureSetMas50 Bio::Matrix::PSM::PsmHeaderI Bio::OntologyIO::Handlers::BaseSAXHandler Some of these are missing other POD parts as well -- please add those too if you can. Thanks, Dave From mckays at cshl.edu Thu Oct 19 09:51:18 2006 From: mckays at cshl.edu (Sheldon McKay) Date: Thu, 19 Oct 2006 09:51:18 -0400 Subject: [Bioperl-l] chromosome ideograms Message-ID: <6b0de00426b3c04b0d0d7641bc8e14e3@cshl.edu> Hi, Sorry for the late reply. I have been working on a karyotype drawing tool as part of the Generic Genome Browser that may be useful. In addition to drawing features next to chromosome ideograms, it also supports making chromosome 'bands' from any kind of scored features to create a sort of heat map on the chromosome itself. I have a demo running at http://mckay.cshl.edu/cgi-bin/gbrowse_karyotype and the source is available from the GMOD CVS HEAD http://www.gmod.org/cvs Sheldon -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Sheldon McKay, PhD Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 From n.haigh at sheffield.ac.uk Thu Oct 19 11:37:31 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 19 Oct 2006 15:37:31 +0000 Subject: [Bioperl-l] bioperl-run t/EMBOSS.t In-Reply-To: <45375615.1020603@sheffield.ac.uk> References: <45375615.1020603@sheffield.ac.uk> Message-ID: <45379BBB.1040400@sheffield.ac.uk> Thanks for committing that change Brian. 
Now that the tests proceed past this point, I get the following error:

------------- EXCEPTION: Bio::Root::NotImplemented -------------
MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not
implemented by package Bio::Tools::Run::EMBOSSApplication.
This is not your fault - author of Bio::Tools::Run::EMBOSSApplication
should be blamed!

STACK: Error::throw
STACK: Bio::Root::Root::throw /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350
STACK: Bio::Root::RootI::throw_not_implemented /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522
STACK: Bio::Tools::Run::WrapperBase::program_dir /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346
STACK: Bio::Tools::Run::WrapperBase::program_path /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327
STACK: Bio::Tools::Run::WrapperBase::executable /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297
STACK: t/EMBOSS.t:58
----------------------------------------------------------------

From N.Haigh at sheffield.ac.uk Thu Oct 19 11:03:00 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 16:03:00 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45379BBB.1040400@sheffield.ac.uk>
References: <45375615.1020603@sheffield.ac.uk> <45379BBB.1040400@sheffield.ac.uk>
Message-ID: <1161270180.453793a432e4f@webmail.shef.ac.uk>

I thought I'd have my first proper try at writing some tests. I was
wondering if there is a template test file that I should use/study in order
to be consistent with other tests.

Failing that, is there a good test-writing style I should follow in one of
the other test files?

Thanks
Nathan

From bosborne11 at verizon.net Thu Oct 19 11:06:08 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 19 Oct 2006 11:06:08 -0400
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
In-Reply-To: <45379BBB.1040400@sheffield.ac.uk>
Message-ID:

Nathan,

Yes, I see.
Those EMBOSS programs work a bit differently from the typical app run by bioperl-run, there's no need for WrapperBase methods like program_dir(), executable(), it seems. Well, I can try and take a look at this tonight but there's probably someone better suited to this than me, I've spent very little time with bioperl-run. Volunteer? Brian O. On 10/19/06 11:37 AM, "Nathan S. Haigh" wrote: > Thanks for committing that change Brian. Now the tests proceed from this > point, I get the following error: > > ------------- EXCEPTION: Bio::Root::NotImplemented ------------- > MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not > implemented by package Bio::Tools::Run::EMBOSSApplication. > This is not your fault - author of Bio::Tools::Run::EMBOSSApplication > should be blamed! > > STACK: Error::throw > STACK: Bio::Root::Root::throw > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350 > STACK: Bio::Root::RootI::throw_not_implemented > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522 > STACK: Bio::Tools::Run::WrapperBase::program_dir > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346 > STACK: Bio::Tools::Run::WrapperBase::program_path > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327 > STACK: Bio::Tools::Run::WrapperBase::executable > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297 > STACK: t/EMBOSS.t:58 > ---------------------------------------------------------------- > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From niels at genomics.dk Thu Oct 19 11:16:37 2006 From: niels at genomics.dk (Niels Larsen) Date: Thu, 19 Oct 2006 17:16:37 +0200 Subject: [Bioperl-l] From EBI support re WU-Blast SOAP service In-Reply-To: <4535EBF9.1090706@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> 
<45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk>
Message-ID: <453796D5.2070808@genomics.dk>

Sendu Bala wrote:
>> I invoked the EBI script
>>
>> http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip
>>
>> like this
>>
>> WSWUBlastClient.pl -p blastn -D embl test.fasta
>>
>> where the content of test.fasta is below, and got
>>
>> Can't find method element in the message at
>> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.
>
> As you admit, this is not a Bioperl issue. I would suggest you contact
> EBI support.
>

To use EBI's WU-blast SOAP interface from perl, EBI support says one must
use SOAP::Lite v 0.60 (no later version) and include
'--email you.example.com' on the command line. This is evident neither from
their web pages nor from the script usage statement, but they promised to
fix that.

------------------------------------------------------------------------
Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark
Telephone: +45-8942-5268
Telefax: +45-8620-1222
------------------------------------------------------------------------

From cjfields at uiuc.edu Thu Oct 19 11:31:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 10:31:45 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45371DFE.6050306@sheffield.ac.uk>
Message-ID: <001501c6f393$b66bd4a0$15327e82@pyrimidine>

> > As a followup in this, I tried bioperl-network and had similar failed tests
> > with Graph 0.79 (the only PPM available from ActiveState). However, the
> > INSTALL docs state that Graph 0.80 is needed, and the test run gave several
> > warnings about not having Graph 0.80 installed.
> >
> > I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and
> > everything passed. Maybe we need to have a Graph PPM available for those
> > who want bioperl-network?
> > > > As for bioperl-run, all tests passed from a new CVS checkout even though > I > > have none of the programs installed, so they seem to skip properly. The > > test run also printed warnings when a program wasn't available or > installed. > > > > > > Chris > > > > > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make > modifications to integrate them into the package.xml file for PPM4 > clients. > > Nath Will do. Should these be forwarded to Mauricio? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From N.Haigh at sheffield.ac.uk Thu Oct 19 11:38:05 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 16:38:05 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <001501c6f393$b66bd4a0$15327e82@pyrimidine> References: <001501c6f393$b66bd4a0$15327e82@pyrimidine> Message-ID: <1161272285.45379bdd1aea4@webmail.shef.ac.uk> > > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make > > modifications to integrate them into the package.xml file for PPM4 > > clients. > > > > Nath > > Will do. Should these be forwarded to Mauricio? > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > If you don't have access to the web, you can send them to me - I now have an account on that server. Cheers Nath From cjfields at uiuc.edu Thu Oct 19 11:45:00 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 10:45:00 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk> Message-ID: <001601c6f395$8a752ed0$15327e82@pyrimidine> > I thought I'd have my first proper try at writing some tests. I was > wondering if there is a template test file that I should use/study in > order to be > consistent with other tests. > > Failing that - Is there a good test writing style I should follow in one > of the other test files? 
>
> Thanks
> Nathan

I would start with the Test::Simple and Test::More perldoc; they're pretty
self-explanatory. You can also look at the various test suites that use
Test::More for pointers. By far, most tests will use is(). You can use SKIP
blocks to skip tests that have a requirement, or skip all tests if they all
require something. Pretty flexible.

We should probably get a wiki page for the developers underway, maybe a
HOWTO on writing tests. These should at least focus on BioPerl, OOP, remote
DB tests, etc.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From cjfields at uiuc.edu Thu Oct 19 12:23:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 11:23:40 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
Message-ID: <001b01c6f39a$f0288ba0$15327e82@pyrimidine>

> Here is the overload code:
>
> use overload '""' => sub {
>     (($_[0]->database ? $_[0]->database . ':' : '' )
>     . ($_[0]->primary_id ? $_[0]->primary_id : '')
>     . ($_[0]->version ? '.' . $_[0]->version : ''))
>     || '' };
>
> Except that the last '||' is redundant and unnecessary (it either
> does nothing or replaces an empty string with an empty string), I
> don't see the potential for duplicating the version number here -
> unless primary_id() did that, which I don't see it doing.
>
> So, to me this seems to come from a parsing error in the beginning,
> rather than an erroneous mangling of version into primary_id later.
>
> Is someone in the position to confirm this?
>
> -hilmar

I have attached a script to the bug report on bugzilla, as well as the test
output sequence and the actual GenBank record. There are a number of
problems:

1) primary_id() is assigned both the id and version.
2) version() is still assigned the version.

Together these explain the repeated version when the object is printed
directly using the overload (it concatenates the two).
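The two points above can be reproduced without BioPerl. The sketch below uses a hypothetical ToyDBLink class (a stand-in, not the real Bio::Annotation::DBLink) whose stringification follows the same logic as the quoted overload. The field values are taken from the GenBank:AL591065.17.17 output reported earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Annotation::DBLink, reduced to three fields.
package ToyDBLink;

# Same concatenation logic as the quoted overload: database, id, then version.
use overload '""' => sub {
    my $self = shift;
    return ( $self->{database}   ? $self->{database} . ':' : '' )
         . ( $self->{primary_id} ? $self->{primary_id}     : '' )
         . ( $self->{version}    ? '.' . $self->{version}  : '' );
};

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

package main;

# Case described in points 1) and 2): the version sits in both fields.
my $as_parsed = ToyDBLink->new(
    database   => 'GenBank',
    primary_id => 'AL591065.17',
    version    => 17,
);

# One possible arrangement: keep the version out of primary_id.
my $separated = ToyDBLink->new(
    database   => 'GenBank',
    primary_id => 'AL591065',
    version    => 17,
);

print "$as_parsed\n";    # GenBank:AL591065.17.17
print "$separated\n";    # GenBank:AL591065.17
```

This only illustrates why the string form repeats the version; it says nothing about which of the two places should be changed in BioPerl itself.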
However, there are a few more issues. The ID is printed normally (accession.version), but the source DB is not present when SeqIO handles the sequence. I have attached the output and the original GenBank record to the bug report. I can look into it but it won't be today; got my hands full with enzyme assays. Chris From N.Haigh at sheffield.ac.uk Thu Oct 19 12:50:57 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 17:50:57 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine> References: <001601c6f395$8a752ed0$15327e82@pyrimidine> Message-ID: <1161276657.4537acf1edc80@webmail.shef.ac.uk> Quoting Chris Fields : > > I thought I'd have my first proper try at writing some tests. I was > > wondering if there is a template test file that I should use/study in > > order to be > > consistent with other tests. > > > > Failing that - Is there a good test writing style I should follow in one > > of the other test files? > > > > Thanks > > Nathan > > I would start with the Test::Simple and Test::More perldoc; they're pretty > self-explanatory. You can look at the various test suites using Test::More > as well for pointers. By far, most tests will use is(). You can use SKIP > blocks to skip tests that have a requirement, or skip all tests if they all > require something. Pretty flexible. > > We should probably get a wiki page for the developers underway, maybe a > HOWTO on writing tests. At least have these focus on BioPerl, OOP, remote > DB tests, etc. > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > > Just working through some test things now, I thought I'd start on the bioperl-run stuff as I thought it might be a bit more straight forward, i'm familiar with some of them and they seem to get neglected. I'm heavily commenting my tests with the thought of starting a wiki guide to testing Bioperl modules. 
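For anyone following the test-writing part of this thread, here is a minimal sketch of a Test::More file that uses is() and a SKIP block, as suggested earlier. The $have_program flag is a hypothetical stand-in for a real check (for example, whether an external program is installed); it is not taken from any existing BioPerl test.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical stand-in for a real availability check.
my $have_program = 0;

# A plain comparison test that always runs.
is( uc('acgt'), 'ACGT', 'is() compares got and expected values' );

SKIP: {
    # Skip the next two tests, and say why, when the requirement is missing.
    skip 'external program not available', 2 unless $have_program;

    ok( 1, 'would run the program here' );
    ok( 1, 'would check its output here' );
}
```

With the flag set to 0 the two tests inside the block are reported as skipped, and the plan of three tests is still satisfied.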
See how far I get!

Nath

From hlapp at gmx.net Thu Oct 19 13:11:27 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 19 Oct 2006 13:11:27 -0400
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
Message-ID: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>

Actually you did that Jason: http://tinyurl.com/ye2edk

Apparently the motivation was to "parse swissprot fields in genpept file
(dbsource)"?

It clearly looks wrong to add the version. You've probably had a reason why
you did this at the time but if we (you :) can't recover that I guess it's
best to just fix it to do the right thing (in both places obviously).

-hilmar

On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote:

> Well there is explicit addition of the version to the primary id so
> it isn't so much a parsing error as a deliberate decision to append
> it.
> see Bio::SeqIO::genbank
>
> to make the dblink
>     $annotation->add_Annotation
>         ('dblink',
>          Bio::Annotation::DBLink->new
>              (-primary_id => $id . "." . $version,
>               -version    => $version,
>               -database   => $db,
>               -tagname    => 'dblink'));
>
> and the code to print the dblink back out in the writer already
> assumes the version number is appended...
>
>     foreach my $ref ( $seq->annotation->get_Annotations('dblink') ) {
>         # if ($ref->comment eq 'DBSOURCE') {
>         $self->_print('DBSOURCE accession ',
>                       $ref->primary_id, "\n");
>         # }
>     }
>
> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:
>
>> Here is the overload code:
>>
>> use overload '""' => sub {
>>     (($_[0]->database ? $_[0]->database . ':' : '' )
>>      . ($_[0]->primary_id ? $_[0]->primary_id : '')
>>      . ($_[0]->version ? '.' . $_[0]->version : ''))
>>     || '' };
>>
>> Except that the last '||' is redundant and unnecessary (it either
>> does nothing or replaces an empty string with an empty string), I
>> don't see the potential for duplicating the version number here -
>> unless primary_id() did that, which I don't see it doing.
>>
>> So, to me this seems to come from a parsing error in the
>> beginning, rather than an erroneous mangling of version into
>> primary_id later.
>>
>> Is someone in the position to confirm this?
>>
>> -hilmar
>>
>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>>
>>> So I'm unsure what we should do here.
>>>
>>> We can certainly fix the problem which you report which is
>>> relying on
>>> the "" method -- if you were to do instead:
>>> print $_->database, ":", $_->primary_id, "\n";
>>>
>>> you'll get the right answer. We at a minimum just fix the auto-
>>> string converting method to do The Right Thing.
>>>
>>> But I am not sure if we should keep the version out of the
>>> primary_id
>>> field. This will require some rejiggering in several modules
>>> when it
>>> comes to printing DBlinks and I don't want to do this before the
>>> release. I also am not sure if there was an explicit reason why
>>> someone did put the version information in the primary_id. (I
>>> hope it
>>> wasn't me because I don't think I'm going to remember why).
>>>
>>> Does anyone else have a strong feeling?
>>>
>>> -jason
>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>>
>>>> Hello,
>>>>
>>>> I noticed a little problem with the Annotation "DBLink" from
>>>> GenBank entries
>>>>
>>>> When I run:
>>>>
>>>> perl -MBio::DB::GenBank -e 'my $gi =
>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my
>>>> $seqio =
>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>>>> ("dblink");
>>>> for(@annotations) { print $_, "\n";} print $INC{
>>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>>>
>>>> This yields:
>>>>
>>>> GenBank:AL591065.17.17
>>>>
>>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>>
>>>> Can others repeat this?
>>>>
>>>> I have dug into the source a little and Bio::Annotation::DBLink
>>>> seems to
>>>> be the place where this happens: it has a concatenation which
>>>> leads to
>>>> that repeated version number.
>>>>
>>>> It this something that I should fix "client-side", so to speak, or
>>>> is it
>>>> worthwhile to add some logic to that concatenation to prevent this?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Eric
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>> --
>>> Jason Stajich, PhD
>>> Miller Research Fellow
>>> University of California
>>> Dept of Plant and Microbial Biology
>>> 321 Koshland Hall #3102
>>> Berkeley, CA 94720-3102
>>> lab: 510.642.8441
>>> http://pmb.berkeley.edu/~taylor/people/js.html
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>> --
>> ===========================================================
>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
>> ===========================================================
>>
>
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
>

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From N.Haigh at sheffield.ac.uk Thu Oct 19 13:17:33 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 18:17:33 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine>
References: <001601c6f395$8a752ed0$15327e82@pyrimidine>
Message-ID: <1161278253.4537b32dd3d15@webmail.shef.ac.uk>

Quoting Chris Fields :

> > I thought I'd have my first proper try at writing some tests. I was
> > wondering if there is a template test file that I should use/study in
> > order to be
> > consistent with other tests.
> >
> > Failing that - Is there a good test writing style I should follow in one
> > of the other test files?
> >
> > Thanks
> > Nathan
>
> I would start with the Test::Simple and Test::More perldoc; they're pretty
> self-explanatory. You can look at the various test suites using Test::More
> as well for pointers. By far, most tests will use is(). You can use SKIP
> blocks to skip tests that have a requirement, or skip all tests if they all
> require something. Pretty flexible.
>
> We should probably get a wiki page for the developers underway, maybe a
> HOWTO on writing tests. At least have these focus on BioPerl, OOP, remote
> DB tests, etc.
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>

Just wrote a partial and small test script for t/Amap.t in bioperl-run. When
I run "perl -I. t/Amap.t" I get the following output:

1..10
ok 1 - use Bio::Tools::Run::Alignment::Amap;
ok 2 - use Bio::AlignIO;
ok 3 - use Bio::SeqIO;
ok 4 - use Bio::Root::IO;
ok 5 - All the required modules are present
ok 6 - new() returned something
ok 7 - and its the right class
not ok 8 - executable() got the correct filename
# Failed test 'executable() got the correct filename'
# in t/Amap.t at line 90.
# got: undef
# expected: 'filename'
ok 9 # skip Got incorrect filename for executable
ok 10 # skip Got incorrect filename for executable
# Looks like you failed 1 test of 10.

So far this looks good (well, in that it's passing and failing the tests I
expected). However, when I run "make test" the output is unexpected and I
don't know why. It seems to die and produce the results of the testing
before the rest of the test suite is run:

t/Amap....................NOK 8
# Failed test 'executable() got the correct filename'
# in t/Amap.t at line 90.
# got: undef
# expected: 'filename'
# Looks like you failed 1 test of 10.
t/Amap....................dubious
        Test returned status 1 (wstat 256, 0x100)
DIED. FAILED test 8
        Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay, 70.00%)
t/Analysis_soap...........ok 7/17make: *** wait: No child processes. Stop.

Is there something I'm missing?? If it's something less obvious, let me know
and I'll post the whole test file.

Nath

From cjfields at uiuc.edu Thu Oct 19 13:26:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 12:26:45 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161278253.4537b32dd3d15@webmail.shef.ac.uk>
Message-ID: <002001c6f3a3$c00b9080$15327e82@pyrimidine>

...
> Just wrote a partial and small test script for t/Amap.t in bioperl-run.
> When I run "perl -I. t/Amap.t" I get the following output:
> 1..10
> ok 1 - use Bio::Tools::Run::Alignment::Amap;
> ok 2 - use Bio::AlignIO;
> ok 3 - use Bio::SeqIO;
> ok 4 - use Bio::Root::IO;
> ok 5 - All the required modules are present
> ok 6 - new() returned something
> ok 7 - and its the right class
> not ok 8 - executable() got the correct filename
> # Failed test 'executable() got the correct filename'
> # in t/Amap.t at line 90.
> # got: undef
> # expected: 'filename'
> ok 9 # skip Got incorrect filename for executable
> ok 10 # skip Got incorrect filename for executable
> # Looks like you failed 1 test of 10.
>
>
> So far this looks good (well, in that it's passing and failing the tests
> I expected). However, when I run "make test" the output is unexpected and
> I don't know why. It seems to die and produce the results of the testing
> before the rest of the test suite is run:
> t/Amap....................NOK 8
> # Failed test 'executable() got the correct filename'
> # in t/Amap.t at line 90.
> # got: undef
> # expected: 'filename'
> # Looks like you failed 1 test of 10.
> t/Amap....................dubious
> Test returned status 1 (wstat 256, 0x100)
> DIED. FAILED test 8
> Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay,
> 70.00%)
> t/Analysis_soap...........ok 7/17make: *** wait: No child processes.
> Stop.
>
> Is there something I'm missing?? If it's something less obvious, let me
> know and I'll post the whole test file.
> Nath

Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
the problem. The only issue I can think of is that Test::More TODO blocks
require a newer version of Test::Harness (which most users have anyway).
Are you using a TODO block?

You can send me Amap.t and I'll give it a try, but I can't promise I'll get
to it immediately (busy day).

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From N.Haigh at sheffield.ac.uk Thu Oct 19 13:38:25 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 18:38:25 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161279505.4537b811e143f@webmail.shef.ac.uk>

> Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
> the problem. The only issue I can think of is that Test::More TODO blocks
> require a newer version of Test::Harness (which most users have anyway).
> Are you using a TODO block?
>
> You can send me Amap.t and I'll give it a try, but I can't promise I'll get
> to it immediately (busy day).
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>

No TODO blocks. I must have done something wrong - it's the first time I've
seen this - but then again, I don't look that closely at the output of
"make test" unless something shows as a fail. Anyway, below is the short bit
of code.

Thanks
Nath

use strict;
use Bio::Root::IO; # can't test for this, might be needed to get Test::More

BEGIN {
    # Things to do ASAP once the script is run
    # even before anything else in the file is parsed
    use vars qw($NUMTESTS $DEBUG $error);
    $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;

    # Use installed Test module, otherwise fall back
    # to copy of Test.pm located in the t dir
    eval { require Test::More; };
    if ( $@ ) {
        use lib Bio::Root::IO->catfile('t','lib');
    }

    # Currently no errors
    $error = 0;

    # Setup the number of tests to be run
    # what about using:
    # use Test::More 'no_plan';
    use Test::More;
    $NUMTESTS = 10;
    plan tests => $NUMTESTS;

    # Use modules that are needed in this test that are from
    # any of the Bioperl packages: Bioperl-core, Bioperl-run ... etc
    # use_ok('');
    use_ok('Bio::Tools::Run::Alignment::Amap');
    use_ok('Bio::AlignIO');
    use_ok('Bio::SeqIO');
    use_ok('Bio::Root::IO');
}

# Multiple END blocks are run in reverse order of their definition
# Last In, First Out (LIFO)
END {
    # Things to do right at the very end, just
    # when the interpreter finishes/exits
    # E.g. deleting intermediate files produced during the test
    foreach my $file ( qw(cysprot.dnd cysprot1a.dnd) ) {
        unlink $file;
        # check it was deleted
    }
    #unlink qw(cysprot.dnd cysprot1a.dnd)
}

END {
    # Not sure what this is doing?
    #for ( $Test::ntest..$NUMTESTS ) {
    #    skip("Amap program not found. Skipping.\n",1);
    #}
}

# if we got to here, that's OK!
# is this really needed?
ok( 1, 'All the required modules are present');

# setup input files etc
my $inputfilename = Bio::Root::IO->catfile("t","data","cysprot.fa");

# setup output files etc
# none in this test

# setup global objects that are to be used in more than one test
# Also test they were initialised correctly
my @params = ();
my $aln;
my $factory = Bio::Tools::Run::Alignment::Amap->new(@params);
ok( defined $factory, 'new() returned something' );
ok( $factory->isa('Bio::Tools::Run::Alignment::Amap'), ' and its the right class' );

# Now onto the nitty gritty tests of the modules methods
my $executable_file = $factory->executable();
#is( $factory->executable(), 'filename', 'executable() got the correct filename' );

# block of tests to skip if you know the tests will fail
# under some condition. E.g.:
#   Need network access,
#   Won't work on particular OS,
#   Can't find the executable
# Do not just skip tests that seem to fail for an unknown reason
SKIP: {
    # condition used to skip this block of tests
    #skip($why, $how_many_in_block);
    skip("Got incorrect filename for executable", 2)
        unless is($factory->executable(), 'filename', 'executable() got the correct filename');
    ok( -e $executable_file, 'Found executable' );
    ok( $factory->version >= 2.0, 'Code tested on Amap versions >= 2.0' );
}

From jason at bioperl.org Thu Oct 19 13:44:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 10:44:51 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org> <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID:

Yikes - I was worried that it might have been me.....

Okay I'll look into fixing it -- ChrisF - check in with me before diving in,
in case I've gotten it done and I expect your enzyme assays might take up
the time.

-jason

On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote:

> Actually you did that Jason: http://tinyurl.com/ye2edk
>
> Apparently the motivation was to "parse swissprot fields in genpept
> file (dbsource)"?
>
> It clearly looks wrong to add the version. You've probably had a
> reason why you did this at the time but if we (you :) can't recover
> that I guess it's best to just fix it to do the right thing (in
> both places obviously).
>
> -hilmar

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html

From cjfields at uiuc.edu
Thu Oct 19 14:03:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:03:52 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID: <000001c6f3a8$f0a46a00$15327e82@pyrimidine>

It also seems that the DBSOURCE line isn't parsed correctly: by default its
contents get stuffed into a GenBank dblink (the dbsource in the test case is
EMBL, not GenBank).

http://bugzilla.open-bio.org/show_bug.cgi?id=2124

It looks like NCBI may now be using:

DBSOURCE embl accession Z49548.1

instead of the old version:

DBSOURCE embl locus SCYJR048W, accession Z49548.1

I don't recall NCBI mentioning changes regarding DBSOURCE in any of the
recent release notes.

Chris

> Actually you did that Jason: http://tinyurl.com/ye2edk
>
> Apparently the motivation was to "parse swissprot fields in genpept
> file (dbsource)"?
>
> It clearly looks wrong to add the version. You've probably had a
> reason why you did this at the time but if we (you :) can't recover
> that I guess it's best to just fix it to do the right thing (in both
> places obviously).
>
> -hilmar

From N.Haigh at sheffield.ac.uk Thu Oct 19 14:06:11 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:06:11 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281171.4537be93b63c9@webmail.shef.ac.uk>

> Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
> the problem. The only issue I can think of is that Test::More TODO blocks
> require a newer version of Test::Harness (which most users have anyway).
> Are you using a TODO block?
>
> You can send me Amap.t and I'll give it a try, but I can't promise I'll get
> to it immediately (busy day).
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>

Never mind about this - it's working as expected! I got confused because a
previous run threw errors but wasn't included in the final table of failed
tests - working now.

Nath

From N.Haigh at sheffield.ac.uk Thu Oct 19 14:14:54 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:14:54 +0100
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>

I have a few questions about how bioperl-run modules work.

1) How does a module define the name of the executable it uses?
2) Is there a way to test what this is?
3) Does $factory->executable return this, or does it only return the name if
it successfully found it?

Thanks
Nath

From cjfields at uiuc.edu Thu Oct 19 14:15:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:15:08 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To:
Message-ID: <000001c6f3aa$82845ba0$15327e82@pyrimidine>

Go for it. I haven't got the time to spare at the moment, sucky protein
assays....

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Jason Stajich
> Sent: Thursday, October 19, 2006 12:45 PM
> To: Hilmar Lapp
> Cc: bioperl-l at lists.open-bio.org; Erikjan
> Subject: Re: [Bioperl-l] Annotation-DBLink- version numbers repeating
>
> Yikes - I was worried that it might have been me.....
>
> Okay I'll look into fixing it -- ChrisF - check in with me before
> diving in, in case I've gotten it done and I expect your enzyme
> assays might take up the time.
>
> -jason

From cjfields at uiuc.edu Thu Oct 19 14:35:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:35:08 -0500
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
Message-ID: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>

I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase
but I'm not sure. I haven't used them very much myself but plan on making
wrappers at some point soon for some programs I use.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: Nathan Haigh [mailto:N.Haigh at sheffield.ac.uk]
> Sent: Thursday, October 19, 2006 1:15 PM
> To: Chris Fields
> Cc: 'bioperl-l'
> Subject: bioperl-run executable
>
> I have a few questions about how bioperl-run modules work.
>
> 1) How do modules define what the name of the executable is that it uses?
> 2) Is there a way to test what this is?
> 3) Does $factory->executable return this or does it only return the name
> if it successfully found it?
>
> Thanks
> Nath

From N.Haigh at sheffield.ac.uk Thu Oct 19 14:47:01 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:47:01 +0100
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
Message-ID: <1161283620.4537c82501c43@webmail.shef.ac.uk>

Quoting Chris Fields :

> I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase
> but I'm not sure. I haven't used them very much myself but plan on making
> wrappers at some point soon for some programs I use.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>

On closer inspection of a couple of other modules (Clustalw.pm and
TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME and have a
sub (program_name) that simply returns this value.
I'd like to see the program_name become a getter/setter so users can change
the default and have the string stored in the factory object.

Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core not
bioperl-run? I suppose not since bioperl-core is a prereq for bioperl-run
but wouldn't it make sense to go in bioperl-run?

Nath

From cjfields at uiuc.edu Thu Oct 19 15:07:05 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 14:07:05 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To:
Message-ID: <000701c6f3b1$c5914230$15327e82@pyrimidine>

Jason, Hilmar,

How about changing the default parsed dblink in SeqIO::genbank (line 520) to

    if( $dbsource =~ /^(\S*?)\s*accession\s+(\S+)\.(\d+)/ ) {
        my ($db,$id,$version) = ($1,$2,$3);
        $annotation->add_Annotation
            ('dblink',
             Bio::Annotation::DBLink->new
             (-primary_id => $id,
              -version    => $version,
              -database   => $db || 'GenBank',
              -tagname    => 'dblink'));
    }

It passes tests and catches the optional database ('embl' for the bugzilla
report). The output sequence still doesn't print the DB if it isn't GenBank
via write_seq(), but that shouldn't be too hard to fix (famous last words).

Okay, okay, back to the assays...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Jason Stajich
> Sent: Thursday, October 19, 2006 12:45 PM
> To: Hilmar Lapp
> Cc: bioperl-l at lists.open-bio.org; Erikjan
> Subject: Re: [Bioperl-l] Annotation-DBLink- version numbers repeating
>
> Yikes - I was worried that it might have been me.....
>
> Okay I'll look into fixing it -- ChrisF - check in with me before
> diving in, in case I've gotten it done and I expect your enzyme
> assays might take up the time.
> > -jason > On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote: > > > Actually you did that Jason: http://tinyurl.com/ye2edk > > > > Apparently the motivation was to "parse swissprot fields in genpept > > file (dbsource)"? > > > > It clearly looks wrong to add the version. You've probably had a > > reason why you did this at the time but if we (you :) can't recover > > that I guess it's best to just fix it to do the right thing (in > > both places obviously). > > > > -hilmar > > > > On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote: > > > >> Well there is explicit addition of the version to the primary id > >> so it isn't so much a parsing error as a deliberate decision to > >> append it. > >> see Bio::SeqIO::genbank > >> > >> to make the dblink > >> $annotation- > >> >add_Annotation > >> ('dblink', > >> > >> Bio::Annotation::DBLink->new > >> (-primary_id > >> => $id . "." . $version, > >> -version => > >> $version, > >> -database => > >> $db, > >> -tagname => > >> 'dblink')); > >> > >> and the code to print the dblink back out in the writer already > >> assumes the version number is appended... > >> > >> foreach my $ref ( $seq->annotation->get_Annotations > >> ('dblink') ) { > >> # if ($ref->comment eq 'DBSOURCE') { > >> $self->_print('DBSOURCE accession ', > >> $ref->primary_id, "\n"); > >> # } > >> } > >> > >> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > >> > >>> Here is the overload code: > >>> > >>> use overload '""' => sub { > >>> (($_[0]->database ? $_[0]->database . ':' : '' ) > >>> . ($_[0]->primary_id ? $_[0]->primary_id : '') > >>> . ($_[0]->version ? '.' . $_[0]->version : '')) > >>> || '' }; > >>> > >>> Except that the last '||' is redundant and unnecessary (it either > >>> does nothing or replaces an empty string with an empty string), I > >>> don't see the potential for duplicating the version number here - > >>> unless primary_id() did that, which I don't see it doing. 
> >>> > >>> So, to me this seems to come from a parsing error in the > >>> beginning, rather than an erroneous mangling of version into > >>> primary_id later. > >>> > >>> Is someone in the position to confirm this? > >>> > >>> -hilmar > >>> > >>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > >>> > >>>> So I'm unsure what we should do here. > >>>> > >>>> We can certainly fix the problem which you report which is > >>>> relying on > >>>> the "" method -- if you were to do instead: > >>>> print $_->database, ":", $_->primary_id, "\n"; > >>>> > >>>> you'll get the right answer. We at a minimum just fix the auto- > >>>> string converting method to do The Right Thing. > >>>> > >>>> But I am not sure if we should keep the version out of the > >>>> primary_id > >>>> field. This will require some rejiggering in several modules > >>>> when it > >>>> comes to printing DBlinks and I don't want to do this before the > >>>> release. I also am not sure if there was an explicit reason why > >>>> someone did put the version information in the primary_id. (I > >>>> hope it > >>>> wasn't me because I don't think I'm going to remember why). > >>>> > >>>> Does anyone else have a strong feeling? > >>>> > >>>> -jason > >>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > >>>> > >>>>> Hello, > >>>>> > >>>>> I noticed a little problem with the Annotation "DBLink" from > >>>>> GenBank entries > >>>>> > >>>>> When I run: > >>>>> > >>>>> perl -MBio::DB::GenBank -e 'my $gi = > >>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my > >>>>> $seqio = > >>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > >>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > >>>>> ("dblink"); > >>>>> for(@annotations) { print $_, "\n";} print $INC{ > >>>>> "Bio/Annotation/DBLink.pm" }, "\n"; ' > >>>>> > >>>>> This yields: > >>>>> > >>>>> GenBank:AL591065.17.17 > >>>>> > >>>>> and the place where the used Bio/Annotation/DBLink.pm resides. 
> >>>>> > >>>>> Can others repeat this? > >>>>> > >>>>> I have dug into the source a little and Bio::Annotation::DBLink > >>>>> seems to > >>>>> be the place where this happens: it has a concatenation which > >>>>> leads to > >>>>> that repeated version number. > >>>>> > >>>>> It this something that I should fix "client-side", so to speak, or > >>>>> is it > >>>>> worthwhile to add some logic to that concatenation to prevent > >>>>> this? > >>>>> > >>>>> > >>>>> Thanks, > >>>>> > >>>>> Eric > >>>>> > >>>>> > >>>>> > >>>>> _______________________________________________ > >>>>> Bioperl-l mailing list > >>>>> Bioperl-l at lists.open-bio.org > >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>>> > >>>> -- > >>>> Jason Stajich, PhD > >>>> Miller Research Fellow > >>>> University of California > >>>> Dept of Plant and Microbial Biology > >>>> 321 Koshland Hall #3102 > >>>> Berkeley, CA 94720-3102 > >>>> lab: 510.642.8441 > >>>> http://pmb.berkeley.edu/~taylor/people/js.html > >>>> > >>>> > >>>> _______________________________________________ > >>>> Bioperl-l mailing list > >>>> Bioperl-l at lists.open-bio.org > >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>>> > >>> > >>> -- > >>> =========================================================== > >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > >>> =========================================================== > >>> > >>> > >>> > >>> > >>> > >> > >> -- > >> Jason Stajich, PhD > >> Miller Research Fellow > >> University of California > >> Dept of Plant and Microbial Biology > >> 321 Koshland Hall #3102 > >> Berkeley, CA 94720-3102 > >> lab: 510.642.8441 > >> http://pmb.berkeley.edu/~taylor/people/js.html > >> > >> > > > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > -- > Jason Stajich, PhD > Miller Research Fellow > 
University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jason at bioperl.org Thu Oct 19 14:48:28 2006 From: jason at bioperl.org (Jason Stajich) Date: Thu, 19 Oct 2006 11:48:28 -0700 Subject: [Bioperl-l] bioperl-run executable In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk> References: <002001c6f3a3$c00b9080$15327e82@pyrimidine> <1161281694.4537c09ebd0f8@webmail.shef.ac.uk> Message-ID: <67650240-D61B-4842-AE7C-75F15F608F6F@bioperl.org> program_name() Should return the name of the program executable() Is a function that you don't have to mess with that tries to find the executable named program_name() based on your PATH. -jason On Oct 19, 2006, at 11:14 AM, Nathan Haigh wrote: > I have a few questions about How bioperl-run modules. > > 1) How do modules define what the name of the executable is that it > uses? > 2) Is there a way to test what this is? > 3) Does $factory->executable return this or does it only return the > name if it successfully found it? 
> > Thanks > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From jason at bioperl.org Thu Oct 19 17:06:43 2006 From: jason at bioperl.org (Jason Stajich) Date: Thu, 19 Oct 2006 14:06:43 -0700 Subject: [Bioperl-l] bioperl-run executable In-Reply-To: <1161283620.4537c82501c43@webmail.shef.ac.uk> References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine> <1161283620.4537c82501c43@webmail.shef.ac.uk> Message-ID: It can be reset now but of course this not a very nice way of doing it: $Bio::Tools::Run::Alignment::Clustalw::PROGRAM_NAME = 'clustalw_smp'; I am not sure if there are pros and cons to making it a getter- setter, but if you want to run with it, please do. The whole run system has been hard to keep people adhering to a standard (and the standard has changed a bit) so some auditing is warranted. -jason On Oct 19, 2006, at 11:47 AM, Nathan Haigh wrote: > Quoting Chris Fields : > >> I think a lot of the bioperl-run modules use >> Bio::Tools::Run::WrapperBase >> but I'm not sure. I haven't used them very much myself but plan >> on making >> wrappers at some point soon for some programs I use. >> >> Christopher Fields >> Postdoctoral Researcher - Switzer Lab >> Dept. of Biochemistry >> University of Illinois Urbana-Champaign >> > > On closer inspection of a couple of other modules (Clustalw.pm and > TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME > and have a sub > (program_name) that simply returns this value. I'd like to see the > program_name become a getter/setter so users can change the default > and have the > string stored in the factory object. 
> > Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core > not bioperl-run? I suppose not since bioperl-core is a prerep for > bioperl-run but > wouldn't it make sence to go in bioperl-run? > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From torsten.seemann at infotech.monash.edu.au Thu Oct 19 19:24:03 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Fri, 20 Oct 2006 09:24:03 +1000 Subject: [Bioperl-l] test::more template In-Reply-To: <1161279505.4537b811e143f@webmail.shef.ac.uk> References: <002001c6f3a3$c00b9080$15327e82@pyrimidine> <1161279505.4537b811e143f@webmail.shef.ac.uk> Message-ID: <45380913.3070506@infotech.monash.edu.au> Nathan, > use strict; > use Bio::Root::IO; # cant test for this, might be needed to get Test::More use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway, and File::Spec is "guaranteed" to be installed with Perl 5.6+. > use lib Bio::Root::IO->catfile('t','lib'); Simpler as: use lib 't/lib'; I understand the 'lib.pm' accepts Unix style directories REGARDLESS of native platform. -- Torsten Seemann Victorian Bioinformatics Consortium, Monash University, Australia From prabubio at gmail.com Thu Oct 19 20:11:36 2006 From: prabubio at gmail.com (Prabu Raja) Date: 20 Oct 2006 00:11:36 -0000 Subject: [Bioperl-l] Prabu Raja sent you this link Message-ID: <20061020001136.86586.qmail@x05.namesdatabase.com> Remember your link from Prabu Raja: http://namesdatabase.com/2m.pl?k2=40892642637&s=nsender2 1 -> Use Prabu Raja's link by clicking above. 2 -> Enter your info for a membership connected to Prabu. 
3 -> Share links with other friends, family and co-workers. 4 -> Use the members-only people search tools. Prabu selected you for this on 09-02-2004 22:52 ET. prabubio at gmail.com (Prabu Raja) initiated this to bioperl-l at lists.open-bio.org at 10-05-2006 02:30 on namesdatabase.com from the IP address 59.92.88.99. If you do not know a Prabu Raja, use http://namesdatabase.com/u.pl?bb2=40892642637&s=nsender2 to halt more reminders about this. For reference, the address of The Names Database is 1253 N. Research Way, Suite Q-2500, Orem, UT 84097. From cjfields at uiuc.edu Thu Oct 19 20:29:11 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 19:29:11 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <45380913.3070506@infotech.monash.edu.au> Message-ID: <000f01c6f3de$c3d91170$15327e82@pyrimidine> > Nathan, > > > use strict; > > use Bio::Root::IO; # cant test for this, might be needed to get > Test::More > > use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway, > and File::Spec is "guaranteed" to be installed with Perl 5.6+. > > > use lib Bio::Root::IO->catfile('t','lib'); > > Simpler as: > use lib 't/lib'; > I understand the 'lib.pm' accepts Unix style directories REGARDLESS of > native > platform. > > -- > Torsten Seemann > Victorian Bioinformatics Consortium, Monash University, Australia That is true, at least for WinXP (not sure about older Windows versions out there). I was using 'Root::IO->catfile' but found 'use lib 't/lib' works. I may have a few of the 'catfile' versions floating around out there, which may be where that originated. Note that if you plan on using Test::More with the bioperl-run test suite, you should add it to the bioperl-run CVS distribution directory in 't/lib'. Most people will have it installed, but you never know. 
Chris From cjfields at uiuc.edu Thu Oct 19 20:33:22 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 19:33:22 -0500 Subject: [Bioperl-l] Prabu Raja sent you this link In-Reply-To: <20061020001136.86586.qmail@x05.namesdatabase.com> Message-ID: <001001c6f3df$598a24c0$15327e82@pyrimidine> That Prabu Raja sure gets around... Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Prabu Raja > Sent: Thursday, October 19, 2006 7:12 PM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Prabu Raja sent you this link > > Remember your link from Prabu Raja: > > http://namesdatabase.com/2m.pl?k2=40892642637&s=nsender2 > > > 1 -> Use Prabu Raja's link by clicking above. > > 2 -> Enter your info for a membership connected to Prabu. > > 3 -> Share links with other friends, family and co-workers. > > 4 -> Use the members-only people search tools. > > Prabu selected you for this on 09-02-2004 22:52 ET. > > > prabubio at gmail.com (Prabu Raja) initiated this to bioperl-l at lists.open- > bio.org > at 10-05-2006 02:30 on namesdatabase.com from the IP address 59.92.88.99. > If you do not know a Prabu Raja, use > http://namesdatabase.com/u.pl?bb2=40892642637&s=nsender2 to halt more > reminders about this. > For reference, the address of The Names Database is 1253 N. Research Way, > Suite Q-2500, Orem, UT 84097. 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From keithplayer at hotmail.com Thu Oct 19 22:13:52 2006
From: keithplayer at hotmail.com (Keith Player)
Date: Fri, 20 Oct 2006 02:13:52 +0000 (UTC)
Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning
Message-ID:

I know that there may be some changes resulting from new GFF3
implementations, but thought I would see if the following is useful anyway.

I implemented the R-tree binning schema as used by
Bio::DB::GFF::Util::Binning and as mentioned in this article:

I tested the following query on a normal table (no binning), but it assumes
that you know the longest range in the table. For example, take a table of
human genes, where the longest gene we know of is around 2.4Mb:

SELECT COUNT(*) as count
FROM groups g
WHERE g.start > max(0, [start] - 2.4Mb)
AND g.start < [end]
AND g.end > [start]
AND g.chromosome = '1'

where [start] and [end] define the region of interest. So for 100Mb:101Mb

SELECT COUNT(*) as count
FROM groups g
WHERE g.start > 97600000
AND g.start < 101000000
AND g.end > 100000000
AND g.chromosome = '1'

This query outperforms the R-Tree implementation on all tests that I have
performed (for lengths of 200bp to 10Mb across a whole chromosome).

Could this be of some practical use?
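
[Editorial sketch, not part of the original post.] The bounded-start query
above can be assembled in Perl roughly as follows. This is a minimal sketch
under stated assumptions: the table and column names (groups, start, end,
chromosome) are taken from the post, while the function name overlap_query
and its argument names are hypothetical. It only builds the SQL string and
the bind values; no database is contacted.

```perl
use strict;
use warnings;

# Hypothetical helper: build the bounded-start overlap query from the post.
# Table/column names follow the post; everything else is an assumption.
sub overlap_query {
    my ($chrom, $start, $end, $max_len) = @_;

    # No feature that starts before this point can reach $start,
    # because no feature in the table is longer than $max_len.
    my $lower = $start - $max_len;
    $lower = 0 if $lower < 0;

    my $sql = 'SELECT COUNT(*) AS count FROM groups g '
            . 'WHERE g.start > ? AND g.start < ? AND g.end > ? '
            . 'AND g.chromosome = ?';

    # Bind values in placeholder order: lower bound, region end,
    # region start, chromosome.
    return ($sql, $lower, $end, $start, $chrom);
}

# The 100Mb:101Mb example from the post, with a 2.4Mb longest feature.
my ($sql, @bind) = overlap_query('1', 100_000_000, 101_000_000, 2_400_000);
print "$sql\n";
print join(', ', @bind), "\n";
```

With a DBI handle, the returned list could be passed on as
$dbh->selectrow_array($sql, undef, @bind). Depending on the database,
column names such as "end" may need quoting, which this sketch does not
attempt.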
From jason at bioperl.org Thu Oct 19 11:50:49 2006 From: jason at bioperl.org (Jason Stajich) Date: Thu, 19 Oct 2006 08:50:49 -0700 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> Message-ID: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org> Well there is explicit addition of the version to the primary id so it isn't so much a parsing error as a deliberate decision to append it. see Bio::SeqIO::genbank to make the dblink $annotation- >add_Annotation ('dblink', Bio::Annotation::DBLink->new (-primary_id => $id . "." . $version, -version => $version, -database => $db, -tagname => 'dblink')); and the code to print the dblink back out in the writer already assumes the version number is appended... foreach my $ref ( $seq->annotation->get_Annotations ('dblink') ) { # if ($ref->comment eq 'DBSOURCE') { $self->_print('DBSOURCE accession ', $ref->primary_id, "\n"); # } } On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > Here is the overload code: > > use overload '""' => sub { > (($_[0]->database ? $_[0]->database . ':' : '' ) > . ($_[0]->primary_id ? $_[0]->primary_id : '') > . ($_[0]->version ? '.' . $_[0]->version : '')) > || '' }; > > Except that the last '||' is redundant and unnecessary (it either > does nothing or replaces an empty string with an empty string), I > don't see the potential for duplicating the version number here - > unless primary_id() did that, which I don't see it doing. > > So, to me this seems to come from a parsing error in the beginning, > rather than an erroneous mangling of version into primary_id later. > > Is someone in the position to confirm this? > > -hilmar > > On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > >> So I'm unsure what we should do here. 
>> >> We can certainly fix the problem which you report which is relying on >> the "" method -- if you were to do instead: >> print $_->database, ":", $_->primary_id, "\n"; >> >> you'll get the right answer. We at a minimum just fix the auto- >> string converting method to do The Right Thing. >> >> But I am not sure if we should keep the version out of the primary_id >> field. This will require some rejiggering in several modules when it >> comes to printing DBlinks and I don't want to do this before the >> release. I also am not sure if there was an explicit reason why >> someone did put the version information in the primary_id. (I hope it >> wasn't me because I don't think I'm going to remember why). >> >> Does anyone else have a strong feeling? >> >> -jason >> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: >> >>> Hello, >>> >>> I noticed a little problem with the Annotation "DBLink" from >>> GenBank entries >>> >>> When I run: >>> >>> perl -MBio::DB::GenBank -e 'my $gi = >>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my >>> $seqio = >>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >>> ("dblink"); >>> for(@annotations) { print $_, "\n";} print $INC{ >>> "Bio/Annotation/DBLink.pm" }, "\n"; ' >>> >>> This yields: >>> >>> GenBank:AL591065.17.17 >>> >>> and the place where the used Bio/Annotation/DBLink.pm resides. >>> >>> Can others repeat this? >>> >>> I have dug into the source a little and Bio::Annotation::DBLink >>> seems to >>> be the place where this happens: it has a concatenation which >>> leads to >>> that repeated version number. >>> >>> It this something that I should fix "client-side", so to speak, or >>> is it >>> worthwhile to add some logic to that concatenation to prevent this? 
>>> >>> >>> Thanks, >>> >>> Eric >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> Jason Stajich, PhD >> Miller Research Fellow >> University of California >> Dept of Plant and Microbial Biology >> 321 Koshland Hall #3102 >> Berkeley, CA 94720-3102 >> lab: 510.642.8441 >> http://pmb.berkeley.edu/~taylor/people/js.html >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From n.haigh at sheffield.ac.uk Fri Oct 20 04:35:03 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 20 Oct 2006 08:35:03 +0000 Subject: [Bioperl-l] test::more template In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> Message-ID: <45388A37.7040505@sheffield.ac.uk> Chris Fields wrote: >> Nathan, >> >> >>> use strict; >>> use Bio::Root::IO; # cant test for this, might be needed to get >>> >> Test::More >> >> use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway, >> and File::Spec is "guaranteed" to be installed with Perl 5.6+. >> >> >>> use lib Bio::Root::IO->catfile('t','lib'); >>> >> Simpler as: >> use lib 't/lib'; >> I understand the 'lib.pm' accepts Unix style directories REGARDLESS of >> native >> platform. 
>> >> -- >> Torsten Seemann >> Victorian Bioinformatics Consortium, Monash University, Australia >> > > That is true, at least for WinXP (not sure about older Windows versions out > there). I was using 'Root::IO->catfile' but found 'use lib 't/lib' works. > I may have a few of the 'catfile' versions floating around out there, which > may be where that originated. > > Note that if you plan on using Test::More with the bioperl-run test suite, > you should add it to the bioperl-run CVS distribution directory in 't/lib'. > Most people will have it installed, but you never know. > > Chris > > > What is the reason for including Test::More in 't/lib' rather than having it as a prereq? -- > A: Yes. >> Q: Are you sure? >> >>> A: Because it reverses the logical flow of conversation. >>> >>>> Q: Why is top posting frowned upon? >>>> Get Thunderbird From n.haigh at sheffield.ac.uk Fri Oct 20 05:27:19 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Fri, 20 Oct 2006 10:27:19 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> Message-ID: <45389677.1000709@sheffield.ac.uk> Is it really necessary to specify the number of tests that are to be conducted in advance? It seems a bit annoying to have to count the number of tests in the script or to run the test just to see how many tests were done, we could just use: use Test::More 'no_plan'; And then it's up to Test::More to keep a track of how many tests it's run. The only thing then to worry about is how many tests are in a SKIP block if the skip criteria are met. This is unless there is a good reason to use it that I am unaware of. 
Thanks Nath From bix at sendu.me.uk Fri Oct 20 06:01:09 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 11:01:09 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45389677.1000709@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45389677.1000709@sheffield.ac.uk> Message-ID: <45389E65.6080908@sendu.me.uk> Nathan Haigh wrote: > Is it really necessary to specify the number of tests that are to be > conducted in advance? It seems a bit annoying to have to count the > number of tests in the script or to run the test just to see how many > tests were done, we could just use: > use Test::More 'no_plan'; It's very important to have a plan. That way you know all the tests actually ran and weren't skipped (either due to an actual SKIP block or an if block that returned false due to a bug, or a for/foreach/while that didn't loop enough times due to a bug, or any number of other reasons). From bix at sendu.me.uk Fri Oct 20 06:04:48 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 11:04:48 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45388A37.7040505@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45388A37.7040505@sheffield.ac.uk> Message-ID: <45389F40.5060601@sendu.me.uk> Nathan S. Haigh wrote: > Chris Fields wrote: > >> Note that if you plan on using Test::More with the bioperl-run test suite, >> you should add it to the bioperl-run CVS distribution directory in 't/lib'. >> Most people will have it installed, but you never know. > > What is the reason for including Test::More in 't/lib' rather than > having it as a prereq? Because we want to ensure that the test suite runs and tells you real problems (if any) about the code (Bioperl) that it is testing, not problems about actually running the tests (which are NOT required for using Bioperl, so cannot be considered 'pre-requisites'). 
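
[Editorial sketch, not part of the original thread.] The two points made
above, a fixed test plan and a Test::More copy bundled under t/lib, can be
combined in a test script along these lines. This is only an illustration:
the environment variable name and the test descriptions are hypothetical,
and the ok(1, ...) calls stand in for real checks.

```perl
use strict;
use warnings;
use lib 't/lib';              # picks up a bundled Test::More if t/lib has one
use Test::More tests => 3;    # explicit plan: running fewer tests is an error

# Hypothetical switch standing in for "is the external program usable?"
my $have_program = $ENV{HYPOTHETICAL_PROGRAM_AVAILABLE};

ok(1, 'check that always runs');

SKIP: {
    # The count given to skip() must match the number of tests in the
    # block, so that the plan of 3 still adds up when they are skipped.
    skip('external program not available', 2) unless $have_program;
    ok(1, 'first test needing the program');
    ok(1, 'second test needing the program');
}
```

With 'no_plan' a loop that silently runs too few iterations would still
report success; with the plan above, the harness flags the mismatch.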
From n.haigh at sheffield.ac.uk Fri Oct 20 06:54:30 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Fri, 20 Oct 2006 11:54:30 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45389E65.6080908@sendu.me.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk> Message-ID: <4538AAE6.5070600@sheffield.ac.uk> If there are known bugs in a particular version of software, what is the best approach for dealing with tests that would fail due to this bug? Simply skip those tests that would be affected by the bug, or to fail if the affected version is detected and report the reason so the user is informed? Or simply bump the minimum version to one above the affected versions? For example, t/Clustalw has a test for at least version 1.8. It then has some profile alignment tests that are only run if version > 1.82 is installed. It states that versions 1.81 and 1.82 are affected by a profile alignment bug - which i assume would make the tests fail. Cheers Nath From bix at sendu.me.uk Fri Oct 20 07:06:07 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 12:06:07 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <4538AAE6.5070600@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk> <4538AAE6.5070600@sheffield.ac.uk> Message-ID: <4538AD9F.8040003@sendu.me.uk> Nathan Haigh wrote: > If there are known bugs in a particular version of software, what is the > best approach for dealing with tests that would fail due to this bug? > Simply skip those tests that would be affected by the bug, or to fail if > the affected version is detected and report the reason so the user is > informed? Or simply bump the minimum version to one above the affected > versions? > > For example, t/Clustalw has a test for at least version 1.8. 
It then has > some profile alignment tests that are only run if version > 1.82 is > installed. It states that versions 1.81 and 1.82 are affected by a > profile alignment bug - which i assume would make the tests fail. Specific cases like this, I'd discuss on the list/ with the author of the module in question. Maybe there is some great need to allow usage with <1.81? My view, based purely on what you've said above, bump the pre-requisite to a version that works. From cjfields at uiuc.edu Fri Oct 20 08:36:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 07:36:37 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <45388A37.7040505@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45388A37.7040505@sheffield.ac.uk> Message-ID: <80A2D210-B0DB-4CD2-9B56-A38097F4F63F@uiuc.edu> >> ,,, >> > What is the reason for including Test::More in 't/lib' rather than > having it as a prereq? We could do that. Many CPAN modules include it in 't/lib' b/c it is only needed for testing purposes. Chris > > -- >> A: Yes. >>> Q: Are you sure? >>> >>>> A: Because it reverses the logical flow of conversation. >>>> >>>>> Q: Why is top posting frowned upon? >>>>> > Get Thunderbird Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Oct 20 10:44:29 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 15:44:29 +0100 Subject: [Bioperl-l] Updated Makefile.PL Message-ID: <4538E0CD.1030908@sendu.me.uk> Hi, I've just committed an updated Makefile.PL to HEAD for bioperl-live. Could some people test it on multiple platforms and confirm it is ok (try out the different possible options as well)? (NB. 
in the below, 'pre-reqs' are things the makefile considers optional dependencies) Note that some pre-reqs have been removed: # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end up requiring it but only after the user makes an explicit choice by typing 'DBD::mysql' in their own code to supply as an option to Bioperl code) # File::Temp (standard in 5.6.1) This pre-req was wrong: # Data::Stag::Writer and has been replaced with: Data::Stag::XMLWriter Also, I note that very many Bioperl modules need IO::String, including Bio::SeqIO, so I'm not sure to what extent we can pretend it is an optional module. I didn't make any change though. I don't know if these changes affect the Windows ppm Nathan, or anything else (Bundle?)? The INSTALL docs need updating with these new and improved pre-reqs (note that some pre-reqs had wrong/not enough Bioperl modules listed as needing them); does someone want to correct the wiki (based on the new Makefile.PL) and then Chris can re-create the text version? From hlapp at gmx.net Fri Oct 20 11:03:34 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 20 Oct 2006 11:03:34 -0400 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <4538E0CD.1030908@sendu.me.uk> References: <4538E0CD.1030908@sendu.me.uk> Message-ID: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote: > Also, I note that very many Bioperl modules need IO::String, including > Bio::SeqIO, so I'm not sure to what extent we can pretend it is an > optional module. I didn't make any change though. I agree. There's really not that many terribly useful things you can do with Bioperl w/o having IO::String installed, which is in stark contrast to many other dependencies. I don't have a problem with making it (and a few others used all over the place) required, to better contrast them with the dependencies that are really optional (and not needed for 90% of users). 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Oct 20 11:18:32 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 10:18:32 -0500 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <4538E0CD.1030908@sendu.me.uk> Message-ID: <001501c6f45b$019103c0$15327e82@pyrimidine> > Hi, > I've just committed an updated Makefile.PL to HEAD for bioperl-live. > Could some people test it on multiple platforms and confirm it is ok > (try out the different possible options as well)? > > (NB. in the below, 'pre-reqs' are things the makefile considers optional > dependencies) > > Note that some pre-reqs have been removed: > # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end > up requiring it but only after the user makes an explicit choice by > typing 'DBD::mysql' in their own code to supply as an option to Bioperl > code) > # File::Temp (standard in 5.6.1) I'll try it out on WinXP and Mac OS X. BTW, do any of Lincoln's Bio::DB* use DBD::mySQL? Bio::DB::GFF comes to mind. I don't think it should be an absolute requirement, though. If we plan on removing those, then we should also remove them from Bundle::Bioperl (if they are present). > This pre-req was wrong: > # Data::Stag::Writer > and has been replaced with: > Data::Stag::XMLWriter > > > Also, I note that very many Bioperl modules need IO::String, including > Bio::SeqIO, so I'm not sure to what extent we can pretend it is an > optional module. I didn't make any change though. Do they all require IO::String or is it an option? There are a few instances (WebDBSeqI-implementing, for instance) where this is presented as an option for most OS's (along with the default, pipeline, and tempfile). However, it is currently used by default with Windows due to lack of pipe/fork support at the time. 
BTW, the latter may now work with WinXP ActivePerl. ActiveState has been working on WinXP fork() emulation for a while, but I think it is still somewhat experimental. > I don't know if these changes affect the Windows ppm Nathan, or anything > else (Bundle?)? > > The INSTALL docs need updating with these new and improved pre-reqs > (note that some pre-reqs had wrong/not enough Bioperl modules listed as > needing them); does someone want to correct the wiki (based on the new > Makefile.PL) and then Chris can re-create the text version? It's easier to just modify the text version based on what is changed in the wiki, at least for the time being. The text dumping from elinks/lynx isn't foolproof re: tables and such, which is one reason I think we should move the prereqs to a separate file as it's easier to maintain long-term (this seems to be where most changes occur anyway). Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Oct 20 11:23:38 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 16:23:38 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk> References: <45375615.1020603@sheffield.ac.uk> <45379BBB.1040400@sheffield.ac.uk> <1161270180.453793a432e4f@webmail.shef.ac.uk> Message-ID: <4538E9FA.60701@sendu.me.uk> Nathan Haigh wrote: > I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be > consistent with other tests. > > Failing that - Is there a good test writing style I should follow in one of the other test files?
I originally based mine on one of Chris's EUtilities tests, but now refer to t/ESEfinder.t since it is small and demonstrates all the major tricky things you might have to do - skip remote tests if no BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests under some condition, fall back to t/lib for Test::More if necessary. (Though I just spotted an oops in the latter...) From cjfields at uiuc.edu Fri Oct 20 11:38:02 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 10:38:02 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <4538E9FA.60701@sendu.me.uk> Message-ID: <001601c6f45d$bb824350$15327e82@pyrimidine> > Nathan Haigh wrote: > > I thought I'd have my first proper try at writing some tests. I was > wondering if there is a template test file that I should use/study in > order to be > > consistent with other tests. > > > > Failing that - Is there a good test writing style I should follow in one > of the other test files? > > I originally based mine on one of Chris's EUtilities tests, but now > refer to t/ESEfinder.t since it is small and demonstrates all the major > tricky things you might have to do - skip remote tests if no > BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests > under some condition, fall back to t/lib for Test::More if necessary. > > (Though I just spotted an oops in the latter...) I agree. The EUtilities tests are quite long. I plan on eventually cutting out some of them. Making them somewhat less prone to changes in returned XML data has also been a pain, as demonstrated by some of the tests from MAIN now failing... d'oh! Christopher Fields Postdoctoral Researcher - Switzer Lab Dept.
of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Oct 20 11:39:32 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 16:39:32 +0100 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <001501c6f45b$019103c0$15327e82@pyrimidine> References: <001501c6f45b$019103c0$15327e82@pyrimidine> Message-ID: <4538EDB4.3030500@sendu.me.uk> Chris Fields wrote: > BTW, do any of Lincoln's Bio::DB* > use DBD::mySQL? Bio::DB::GFF comes to mind. No, just a require on a user-passed variable as I described. >> Also, I note that very many Bioperl modules need IO::String, including >> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an >> optional module. I didn't make any change though. > > Do they all require IO::String or is it an option? Oops, I take that back. Bio::SeqIO doesn't use IO::String. That's what you get for relying on grep output... It's still many modules that use it, but I suppose you could do useful things without. So actually, let's keep it optional. From cjfields at uiuc.edu Fri Oct 20 16:32:32 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 15:32:32 -0500 Subject: [Bioperl-l] Error retrieving sequence from BioSQL Message-ID: <000001c6f486$df508930$15327e82@pyrimidine> Seth, Did you work out the problem here? There was a recent CVS update to OBDA tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests apparently left data from tests in the database, which caused problems with repeated test runs. Chris > > -----Original Message----- > > From: bioperl-l-bounces lists.open-bio.org [mailto:bioperl-l- > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > Sent: Saturday, September 30, 2006 6:35 PM > > To: Hilmar Lapp > > Cc: Chris Fields; Bioperl List > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > Here're complete test details: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > ... 
> > > FAILED tests 10-12 > > Failed 3/12 tests, 75.00% okay > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > -------------------------------------------------------------------------- > > ----- > > t\02species.t 65 2 3.08% 63 65 > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > t\16obda.t 12 3 25.00% 10-12 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- Best Regards, Seth Johnson Senior Bioinformatics Associate Ph: (202) 470-0900 Fx: (775) 251-0358 Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From olenka.m at gmail.com Fri Oct 20 17:47:15 2006 From: olenka.m at gmail.com (Olena Morozova) Date: Fri, 20 Oct 2006 14:47:15 -0700 Subject: [Bioperl-l] GO annotations Message-ID: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com> Dear all, Does anyone know an easy way to get GO-BP annotations for ensembl genes? Thank you very much for your help, Olena
From sdavis2 at mail.nih.gov Sat Oct 21 11:05:26 2006 From: sdavis2 at mail.nih.gov (Davis, Sean (NIH/NCI) [E]) Date: Sat, 21 Oct 2006 11:05:26 -0400 Subject: [Bioperl-l] GO annotations References: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com> Message-ID: <014DBF86B19310419F0DF8910FC56457240CE3@nihcesmlbx10.nih.gov> You can use the ensembl perl API, or (more simply) use the Ensembl MART interface: http://www.ensembl.org/Multi/martview Sean -----Original Message----- From: Olena Morozova [mailto:olenka.m at gmail.com] Sent: Fri 10/20/2006 5:47 PM To: bioperl-l Subject: [Bioperl-l] GO annotations Dear all, Does anyone know an easy way to get GO-BP annotations for ensembl genes? Thank you very much for your help, Olena _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Sun Oct 22 06:34:51 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Sun, 22 Oct 2006 10:34:51 +0000 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> References: <4538E0CD.1030908@sendu.me.uk> <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> Message-ID: <453B494B.7040702@sheffield.ac.uk> Hilmar Lapp wrote: > On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote: > > >> Also, I note that very many Bioperl modules need IO::String, including >> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an >> optional module. I didn't make any change though. >> > > I agree. There's really not that many terribly useful things you can > do with Bioperl w/o having IO::String installed, which is in stark > contrast to many other dependencies. > > I don't have a problem with making it (and a few others used all over > the place) required, to better contrast them with the dependencies > that are really optional (and not needed for 90% of users).
> > -hilmar > > Is it possible to make a distinction in Makefile.PL between those modules that are an absolute must for Bioperl-core and those which are optional and should go into Bundle::BioPerl? Once I'm sure what should be "option" I'll do the Bundle::BioPerl package and PPD's. Cheers Nath From vitacolonna at appliedgenomics.org Sun Oct 22 09:04:48 2006 From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna) Date: Sun, 22 Oct 2006 15:04:48 +0200 Subject: [Bioperl-l] Submission proposal: ABIF module Message-ID: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> Hi everybody, I would like to submit to CPAN a module for reading and parsing the ABIF files (with .ab1 suffix) produced by Applied Biosystems sequencers. The need for such a module arose in our lab because the existing ABI module we found on CPAN had too limited functionality. As an example, our module allows us to easily produce analysis reports similar to the ones generated by the Sequencing Analysis software. May I call the module Bio::ABIF? Or should I follow other conventions? Nicola From cjfields at uiuc.edu Sun Oct 22 09:54:51 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 22 Oct 2006 08:54:51 -0500 Subject: [Bioperl-l] Submission proposal: ABIF module In-Reply-To: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> Message-ID: On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote: > Hi everybody, > I would like to submit to CPAN a module for reading and parsing the > ABIF files (with .ab1 suffix) produced by Applied Biosystems > sequencers. The need for such a module arose in our lab because the > existing ABI module we found on CPAN had too limited functionality. > As an example, our module allows us to easily produce analysis > reports similar to the ones generated by the Sequencing Analysis > software. > > May I call the module Bio::ABIF? Or should I follow other conventions?
> > Nicola It depends. Does it interact with bioperl in any way? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Oct 22 09:57:18 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 22 Oct 2006 08:57:18 -0500 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <453B494B.7040702@sheffield.ac.uk> References: <4538E0CD.1030908@sendu.me.uk> <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> <453B494B.7040702@sheffield.ac.uk> Message-ID: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu> On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote: > Is it possible to make a distinction in Makefile.PL between those > modules that are an absolute must for Bioperl-core and those which are > optional and should go into Bundle::BioPerl? > > Once I'm sure what should be "option" I'll do the Bundle::BioPerl > package and PPD's. > > Cheers > Nath We probably should steer this way eventually. Do you aim on placing prereqs required for bioperl core in the bioperl PPD and the 'optional' ones with the bundle? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From vitacolonna at appliedgenomics.org Sun Oct 22 10:16:26 2006 From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna) Date: Sun, 22 Oct 2006 16:16:26 +0200 Subject: [Bioperl-l] Submission proposal: ABIF module In-Reply-To: References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> Message-ID: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org> On 22/ott/06, at 15:54, Chris Fields wrote: > > On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote: > >> Hi everybody, >> I would like to submit to CPAN a module for reading and parsing the >> ABIF files (with .ab1 suffix) [...] >> May I call the module Bio::ABIF? Or should I follow other >> conventions? > > It depends. Does it interact with bioperl in any way? No. 
Can you suggest a suitable pattern for the name? Nicola From cjfields at uiuc.edu Sun Oct 22 10:55:46 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 22 Oct 2006 09:55:46 -0500 Subject: [Bioperl-l] Submission proposal: ABIF module In-Reply-To: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org> References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org> Message-ID: On Oct 22, 2006, at 9:16 AM, Nicola Vitacolonna wrote: > On 22/ott/06, at 15:54, Chris Fields wrote: > >> >> On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote: >> >>> Hi everybody, >>> I would like to submit to CPAN a module for reading and parsing the >>> ABIF files (with .ab1 suffix) [...] >>> May I call the module Bio::ABIF? Or should I follow other >>> conventions? >> >> It depends. Does it interact with bioperl in any way? > > No. Can you suggest a suitable pattern for the name? > > Nicola I don't think it will be a problem to name it Bio::ABIF; there is already a Bio::ASN1::EntrezGene, and Rutger Vos's Bio::Phylo modules (the latter doesn't require BioPerl either). That said, if you plan on contributing more CPAN modules with similar functionality (such as parsing other trace files), you might want to consider using a namespace that isn't limiting but doesn't conflict with Bioperl core (like Bio::Trace or similar, then name your module Bio::Trace::ABIF). You can use search.cpan.org to check namespaces for conflicts. Just as a note: we have bioperl-ext, which also parses ABI and other trace file formats. It's a bit old now and needs updating, but is supposed to be quite fast (it uses the Staden io_lib C library via PerlXS). -c Christopher Fields Postdoctoral Researcher Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From arareko at campus.iztacala.unam.mx Sun Oct 22 13:26:37 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Sun, 22 Oct 2006 12:26:37 -0500 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <4538E0CD.1030908@sendu.me.uk> References: <4538E0CD.1030908@sendu.me.uk> Message-ID: <453BA9CD.4060107@campus.iztacala.unam.mx> Works fine on FreeBSD. Mauricio. Sendu Bala wrote: > Hi, > I've just committed an updated Makefile.PL to HEAD for bioperl-live. > Could some people test it on multiple platforms and confirm it is ok > (try out the different possible options as well)? > > (NB. in the below, 'pre-reqs' are things the makefile considers optional > dependencies) > > Note that some pre-reqs have been removed: > # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end > up requiring it but only after the user makes an explicit choice by > typing 'DBD::mysql' in their own code to supply as an option to Bioperl > code) > # File::Temp (standard in 5.6.1) > > > This pre-req was wrong: > # Data::Stag::Writer > and has been replaced with: > Data::Stag::XMLWriter > > > Also, I note that very many Bioperl modules need IO::String, including > Bio::SeqIO, so I'm not sure to what extent we can pretend it is an > optional module. I didn't make any change though. > > > I don't know if these changes affect the Windows ppm Nathan, or anything > else (Bundle?)? > > The INSTALL docs need updating with these new and improved pre-reqs > (note that some pre-reqs had wrong/not enough Bioperl modules listed as > needing them); does someone want to correct the wiki (based on the new > Makefile.PL) and then Chris can re-create the text version? 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From n.haigh at sheffield.ac.uk Sun Oct 22 15:37:07 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Sun, 22 Oct 2006 20:37:07 +0100 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu> References: <4538E0CD.1030908@sendu.me.uk> <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> <453B494B.7040702@sheffield.ac.uk> <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu> Message-ID: <453BC863.4090803@sheffield.ac.uk> Chris Fields wrote: > > On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote: > >> Is it possible to make a distinction in Makefile.PL between those >> modules that are an absolute must for Bioperl-core and those which are >> optional and should go into Bundle::BioPerl? >> >> Once I'm sure what should be "option" I'll do the Bundle::BioPerl >> package and PPD's. >> >> Cheers >> Nath > > We probably should steer this way eventually. Do you aim on placing > prereqs required for bioperl core in the bioperl PPD and the > 'optional' ones with the bundle? > That's correct. However, PPM will always try to update packages to the latest available. Therefore, if at some point in the future, a dependency is removed, and thus removed from Bundle::BioPerl, a situation may arise where an older version of BioPerl is running with a recent version of Bundle::BioPerl and could have missing dependencies - not ideal but it is how things currently stand. The process of making the Bundle::BioPerl PPD would be simplified if these "optional" dependencies are separated from the "core" dependencies.
If one of the following solutions is possible (I'm not sure if they are), it would be very useful: 1) Maintain 2 hashes in Makefile.PL that contain the "core" and "optional" dependencies. I'm unsure of the way dependencies are ordered during a "make ppd", but it may be possible to pass hash references of both to PREREQ_PM in WriteMakefile and have the "optional" dependencies grouped separately from "core" dependencies in the ppd file - thus making it easy to strip them out into a Bundle::BioPerl ppd. 2) Again, maintain 2 hashes in Makefile.PL that contain the "core" and "optional" dependencies. Have some Makefile setup that allows the generation of a Bundle::BioPerl ppd separately from the main Bioperl ppd. Like I said, these are just some thoughts and I'm not sure if they are even viable options. Nath From chhalling at alumni.ls.berkeley.edu Sun Oct 22 19:45:33 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Sun, 22 Oct 2006 19:45:33 -0400 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl Message-ID: <453C029D.1070708@alumni.ls.berkeley.edu> I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 that prevent these modules from being installed: Data::Stag::Writer (listed as Data::Stag::writer) HTTP::Request::Common (listed as HTTP::Request::Common-) Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel) -- Conrad Halling chhalling at alumni.ls.berkeley.edu From cjfields at uiuc.edu Sun Oct 22 22:24:07 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 22 Oct 2006 21:24:07 -0500 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> Message-ID: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> Thanks for letting us know! Did PPM4 throw errors or just silently pass them over?
Chris On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote: > I have found three misspellings in Bundle::BioPerl 2.1.6 of 17- > Oct-2006 > that prevent these modules from being installed: > > Data::Stag::Writer (listed as Data::Stag::writer) > HTTP::Request::Common (listed as HTTP::Request::Common-) > Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel) > > -- > Conrad Halling > chhalling at alumni.ls.berkeley.edu > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Mon Oct 23 02:45:29 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 06:45:29 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> Message-ID: <453C6509.90005@sheffield.ac.uk> Chris Fields wrote: > Thanks for letting us know! Did PPM4 throw errors or just silently > pass them over? > > Chris > > On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote: > > I believe he is talking about the bundle on cpan and not the ppd. I will get this updated as soon as possible. Sendu/Chris - can you confirm to me which Bioperl modules are essential to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any reason for not putting *all* dependencies into the bundle? 
Nath From bix at sendu.me.uk Mon Oct 23 02:43:36 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 07:43:36 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> Message-ID: <453C6498.5@sendu.me.uk> Conrad Halling wrote: > I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 > that prevent these modules from being installed: > > Data::Stag::Writer (listed as Data::Stag::writer) This should be Data::Stag::XMLWriter > HTTP::Request::Common (listed as HTTP::Request::Common-) > Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel) From bix at sendu.me.uk Mon Oct 23 02:52:47 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 07:52:47 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C6509.90005@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> Message-ID: <453C66BF.1060008@sendu.me.uk> Nathan S. Haigh wrote: > Sendu/Chris - can you confirm to me which Bioperl modules are essential > to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any > reason for not putting *all* dependencies into the bundle? AFAIK, there are no essential external dependencies. Everything in %packages in Makefile.PL, for example, is optional. We had the discussion about making all the easy-to-install ones a forced requirement anyway (so that most things work out of the box), but perhaps we'll hold off on making such a change until after 1.5.2. From jyotikshah at gmail.com Mon Oct 23 03:10:43 2006 From: jyotikshah at gmail.com (Jyoti Shah) Date: Mon, 23 Oct 2006 00:10:43 -0700 Subject: [Bioperl-l] short motif searches Message-ID: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com> Hi, I am interested in searching motifs as small as 6 or 7 nucleotides in genomic databases. 
I need exact matches. Is there any bioperl module available which can help me do this? I tried WU BLAST with word size one, but I am getting warning messages such as "WARNING: the maximum achievable score of 7 in context 0 (frame +1) is less than the ungapped cutoff score S2 (=13). Exit code 0...". Any suggestions? Thanks in advance, Jyoti From bix at sendu.me.uk Mon Oct 23 03:55:40 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 08:55:40 +0100 Subject: [Bioperl-l] short motif searches In-Reply-To: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com> References: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com> Message-ID: <453C757C.1010408@sendu.me.uk> Jyoti Shah wrote: > Hi, > > I am interested in searching motifs as small as 6 or 7 nucleotides in > genomic databases. I need exact matches. Is there any bioperl module > available which can help me do this? At 6 or 7bp long doing a simple exact match I should point out you're going to get very many hits; are you sure this is an appropriate thing to do for your purposes? Assuming yes, you can use Bio::SeqIO, Bio::Index or Bio::DB:: to get your genomic sequences of interest, then simply use a normal perl regexp on the resulting $seq->seq strings. If your motifs are anything like transcription factor binding sites, and you have more information than just a single sequence string for the motif, investigate Bio::Matrix::PSM. From bix at sendu.me.uk Mon Oct 23 04:29:52 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 09:29:52 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C7648.8030004@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> Message-ID: <453C7D80.80207@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: >> Nathan S. 
Haigh wrote: >>> Sendu/Chris - can you confirm to me which Bioperl modules are essential >>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any >>> reason for not putting *all* dependencies into the bundle? >> AFAIK, there are no essential external dependencies. Everything in >> %packages in Makefile.PL, for example, is optional. >> >> We had the discussion about making all the easy-to-install ones a >> forced requirement anyway (so that most things work out of the box), >> but perhaps we'll hold off on making such a change until after 1.5.2. > > How are they forced? They're not. Right now they're optional. I'm suggesting we might change that in the future. If you're asking how we /would/ force them, probably by adding PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs successfully (or should!) without its optional dependencies given in PREREQ_PM because make test succeeds (because tests skip ok when the optional dependency isn't there). I don't really know how CPAN discovers dependencies and auto-installs them before a dependent module though. Anyone care to explain? From n.haigh at sheffield.ac.uk Mon Oct 23 06:09:12 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 10:09:12 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C7D80.80207@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> Message-ID: <453C94C8.5040900@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> Sendu Bala wrote: >>> Nathan S. Haigh wrote: >>>> Sendu/Chris - can you confirm to me which Bioperl modules are >>>> essential >>>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any >>>> reason for not putting *all* dependencies into the bundle? 
>>> AFAIK, there are no essential external dependencies. Everything in >>> %packages in Makefile.PL, for example, is optional. >>> >>> We had the discussion about making all the easy-to-install ones a >>> forced requirement anyway (so that most things work out of the box), >>> but perhaps we'll hold off on making such a change until after 1.5.2. > > >> How are they forced? > > They're not. Right now they're optional. I'm suggesting we might > change that in the future. > If you're asking how we /would/ force them, probably by adding > PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs > successfully (or should!) without its optional dependencies given in > PREREQ_PM because make test succeeds (because tests skip ok when the > optional dependency isn't there). > > I don't really know how CPAN discovers dependencies and auto-installs > them before a dependent module though. Anyone care to explain? I thought so! I misunderstood something earlier which confused me. Just to clarify for my own sanity's sake: 1) Currently all dependencies are optional. 2) All dependencies are in %packages. 3) All of these are passed to PREREQ_PM. As far as CPAN discovering dependencies, here is a snip from the CPAN FAQ: --snip-- I installed a Bundle and had a couple of fails. When I retried, everything resolved nicely. Can this be fixed to work on first try? The reason for this is that CPAN does not know the dependencies of all modules when it starts out. To decide about the additional items to install, it just uses data found in the META.yml file or the generated Makefile. An undetected missing piece breaks the process. But it may well be that your Bundle installs some prerequisite later than some depending item and thus your second try is able to resolve everything. Please note, CPAN.pm does not know the dependency tree in advance and cannot sort the queue of things to install in a topologically correct order.
It resolves perfectly well IF all modules declare the prerequisites correctly with the PREREQ_PM attribute to MakeMaker or the |requires| stanza of Module::Build. For bundles which fail and you need to install often, it is recommended to sort the Bundle definition file manually. --snip-- Therefore, recent modifications to Makefile.PL should result in a fully operational Bioperl installation, if installed via CPAN. Note, though, that only Bioperl 1.4 is currently available via CPAN. It is possible to upload a developer release to CPAN which can only be downloaded via CPAN if specifically asked for, which would be good for 1.5.x: --snip-- How do I install a "DEVELOPER RELEASE" of a module? By default, CPAN will install the latest non-developer release of a module. If you want to install a dev release, you have to specify the partial path starting with the author id to the tarball you wish to install, like so: cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz Note that you can use the |ls| command to get this path listed. --snip-- HTH Nath From bix at sendu.me.uk Mon Oct 23 05:41:52 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 10:41:52 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C94C8.5040900@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> Message-ID: <453C8E60.7000105@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: > >> I don't really know how CPAN discovers dependencies and auto-installs >> them before a dependent module though. Anyone care to explain? > > I thought so! I misunderstood something earlier which confused me. Just > to clarify for my own sanity's sake: > > 1) Currently all dependencies are optional.
> 2) All dependencies are in %packages > 3) all these are passed to PREREQ_PM All correct. > As far as CPAN discovering dependencies, here is a snip from the CPAN FAQ's: > --snip-- > > I installed a Bundle and had a couple of fails. When I retried, > everything resolved nicely. Can this be fixed to work on first try? > > The reason for this is that CPAN does not know the dependencies of > all modules when it starts out. To decide about the additional items > to install, it just uses data found in the META.yml file or the > generated Makefile. An undetected missing piece breaks the process. > But it may well be that your Bundle installs some prerequisite later > than some depending item and thus your second try is able to resolve > everything. Please note, CPAN.pm does not know the dependency tree > in advance and cannot sort the queue of things to install in a > topologically correct order. It resolves perfectly well IF all > modules declare the prerequisites correctly with the PREREQ_PM > attribute to MakeMaker or the |requires| stanza of Module::Build. > For bundles which fail and you need to install often, it is > recommended to sort the Bundle definition file manually. > > --snip-- > > Therefore, recent modifications to Makefile.PL should result in a fully > operational Bioperl installation, if installed via CPAN. Right, thanks for that. > Although only Bioperl 1.4 is available via CPAN currently. It is possible to upload a > developer release to CPAN which can only be ownloaded via CPAN if > specifically asked for - would be good for 1.5.x.: > --snip-- > > How do I install a "DEVELOPER RELEASE" of a module? > > By default, CPAN will install the latest non-developer release of a > module. If you want to install a dev release, you have to specify > the partial path starting with the author id to the tarball you wish > to install, like so: > > cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz > > Note that you can use the |ls| command to get this path listed. 
> > --snip-- That's the user point of view - how does the developer actually tell CPAN that something is a developer release so that normal users don't automatically install it? From bix at sendu.me.uk Mon Oct 23 05:59:52 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 10:59:52 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C8E60.7000105@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> Message-ID: <453C9298.9000900@sendu.me.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> As far as CPAN discovering dependencies, here is a snip from the CPAN >> FAQ's: >> --snip-- >> >> I installed a Bundle and had a couple of fails. When I retried, >> everything resolved nicely. Can this be fixed to work on first try? >> >> The reason for this is that CPAN does not know the dependencies of >> all modules when it starts out. To decide about the additional items >> to install, it just uses data found in the META.yml file or the >> generated Makefile. An undetected missing piece breaks the process. >> But it may well be that your Bundle installs some prerequisite later >> than some depending item and thus your second try is able to resolve >> everything. Please note, CPAN.pm does not know the dependency tree >> in advance and cannot sort the queue of things to install in a >> topologically correct order. It resolves perfectly well IF all >> modules declare the prerequisites correctly with the PREREQ_PM >> attribute to MakeMaker or the |requires| stanza of Module::Build. >> For bundles which fail and you need to install often, it is >> recommended to sort the Bundle definition file manually. 
>> >> --snip-- >> >> Therefore, recent modifications to Makefile.PL should result in a fully >> operational Bioperl installation, if installed via CPAN. > > Right, thanks for that. Oh, so this effectively means that our 'optional' dependencies are installed for CPAN users, which matches up to my 'force the optional ones anyway' desire, leaving Bundle::BioPerl without any use. Makefile.PL could be altered again to remove from PREREQ_PM those modules the user didn't already have installed, thus CPAN would only install Bioperl itself and nothing optional. The user could then install Bundle::BioPerl if they wanted a quick way of getting all the optional stuff to work. I'm happy either way; what do other people think? From n.haigh at sheffield.ac.uk Mon Oct 23 07:22:17 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 11:22:17 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C9298.9000900@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> Message-ID: <453CA5E9.1060406@sheffield.ac.uk> Sendu Bala wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> As far as CPAN discovering dependencies, here is a snip from the >>> CPAN FAQ's: >>> --snip-- >>> >>> I installed a Bundle and had a couple of fails. When I retried, >>> everything resolved nicely. Can this be fixed to work on first try? >>> >>> The reason for this is that CPAN does not know the dependencies of >>> all modules when it starts out. To decide about the additional >>> items >>> to install, it just uses data found in the META.yml file or the >>> generated Makefile. An undetected missing piece breaks the process. 
>>> But it may well be that your Bundle installs some prerequisite >>> later >>> than some depending item and thus your second try is able to >>> resolve >>> everything. Please note, CPAN.pm does not know the dependency tree >>> in advance and cannot sort the queue of things to install in a >>> topologically correct order. It resolves perfectly well IF all >>> modules declare the prerequisites correctly with the PREREQ_PM >>> attribute to MakeMaker or the |requires| stanza of Module::Build. >>> For bundles which fail and you need to install often, it is >>> recommended to sort the Bundle definition file manually. >>> >>> --snip-- >>> >>> Therefore, recent modifications to Makefile.PL should result in a fully >>> operational Bioperl installation, if installed via CPAN. >> >> Right, thanks for that. > > Oh, so this effectively means that our 'optional' dependencies are > installed for CPAN users, which matches up to my 'force the optional > ones anyway' desire, leaving Bundle::BioPerl without any use. > > Makefile.PL could be altered again to remove from PREREQ_PM those > modules the user didn't already have installed, thus CPAN would only > install Bioperl itself and nothing optional. The user could then > install Bundle::BioPerl if they wanted a quick way of getting all the > optional stuff to work. > > I'm happy either way; what do other people think? From my point of view, removing them from PREREQ_PM makes building Bundle::BioPerl a bit of a pain :o( I prefer the way it is currently set up - most people have fast internet connections and GB of hard drive space. Other than the reason "why install something I won't ever need" I don't see much point maintaining Bundle::BioPerl and having "optional" dependencies. I think if there are any modules which are not going to be used by the majority of users, then this could be used as the rationale for moving them out of bioperl-core into another package?
Nath From n.haigh at sheffield.ac.uk Mon Oct 23 07:38:05 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 11:38:05 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C8E60.7000105@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> Message-ID: <453CA99D.9060009@sheffield.ac.uk> >> Although only Bioperl 1.4 is available via CPAN currently. It is >> possible to upload a >> developer release to CPAN which can only be downloaded via CPAN if >> specifically asked for - would be good for 1.5.x.: >> --snip-- >> >> How do I install a "DEVELOPER RELEASE" of a module? >> >> By default, CPAN will install the latest non-developer release of a >> module. If you want to install a dev release, you have to specify >> the partial path starting with the author id to the tarball you wish >> to install, like so: >> >> cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz >> >> Note that you can use the |ls| command to get this path listed. >> >> --snip-- > > That's the user point of view - how does the developer actually tell > CPAN that something is a developer release so that normal users don't > automatically install it? I found this: http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt It says that $VERSION should simply be changed from a naked number into a single-quoted number, and that this should be recognized by the CPAN indexer.
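A minimal sketch of why the quoting would matter, assuming the marker that release tooling looks for is the underscore (as in the Module-Build-0.27_07 tarball name quoted above). In a bare numeric literal Perl treats '_' as a digit separator, so the underscore is already gone by the time anything inspects $VERSION; a quoted string keeps it. The helper looks_like_dev_release() is hypothetical and only illustrates the check; it is not part of CPAN.pm or the indexer.

```perl
use strict;
use warnings;

# Bare numeric literal: '_' is only a digit separator, so this is 1.5002.
my $bare = 1.50_02;

# Quoted string: the underscore survives and can be detected later.
my $quoted = '1.50_02';

# Hypothetical helper (an assumption for illustration): treat any version
# string containing an underscore as a developer release.
sub looks_like_dev_release {
    my ($version) = @_;
    return $version =~ /_/ ? 1 : 0;
}

printf "bare:   %-8s developer release? %s\n",
    $bare, looks_like_dev_release($bare) ? 'yes' : 'no';
printf "quoted: %-8s developer release? %s\n",
    $quoted, looks_like_dev_release($quoted) ? 'yes' : 'no';
```

Whether the indexer keys on $VERSION, on the tarball name, or on both is not settled by anything in this thread, so that part would need checking before a release.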
Nath From bix at sendu.me.uk Mon Oct 23 06:47:38 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 11:47:38 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> Message-ID: <453C9DCA.4020802@sendu.me.uk> Hilmar Lapp wrote: > On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote: > >> For example, I have made no effort to setup biosql-schema but I >> thought that maybe there would be a test that would detect this > > I'm afraid there isn't. Bioperl-db is meaningless without > biosql-schema. Can you suggest a way we might detect if biosql-schema has been installed prior to running the test suite, so we can give some meaningful error message? From bix at sendu.me.uk Mon Oct 23 08:43:30 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 13:43:30 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> Message-ID: <453CB8F2.7070703@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: > >> Makefile.PL could be altered again to remove from PREREQ_PM those >> modules the user didn't already have installed, thus CPAN would only >> install Bioperl itself and nothing optional. The user could then >> install Bundle::BioPerl if they wanted a quick way of getting all the >> optional stuff to work. >> >> I'm happy either way; what do other people think? 
> > From my point of view, removing them from PREREQ_PM means building the > Bundle::BioPerl a bit of a pain :o( Can I ask how you're generating Bundle::BioPerl? That is, how did the typos get in there? Is there a way to certainly avoid typos in the future? From n.haigh at sheffield.ac.uk Mon Oct 23 09:46:17 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 13:46:17 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CB8F2.7070703@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> <453CB8F2.7070703@sendu.me.uk> Message-ID: <453CC7A9.6090609@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> Sendu Bala wrote: >> >>> Makefile.PL could be altered again to remove from PREREQ_PM those >>> modules the user didn't already have installed, thus CPAN would only >>> install Bioperl itself and nothing optional. The user could then >>> install Bundle::BioPerl if they wanted a quick way of getting all the >>> optional stuff to work. >>> >>> I'm happy either way; what do other people think? > > >> From my point of view, removing them from PREREQ_PM means building the >> Bundle::BioPerl a bit of a pain :o( > > Can I ask how you're generating Bundle::BioPerl? That is, how did the > typos get in there? Is there a way to certainly avoid typos in the > future? I just modified the list by hand a while back :o( - I'm sure there must be a better way. 
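One possible "better way", sketched under assumptions: generate the Bundle's CONTENTS section from the same table Makefile.PL uses, so module names are typed only once. The %packages layout below (module => 'min_version/reason') is a hypothetical stand-in, since the real hash is not shown in this thread, and bundle_contents() is an illustrative helper rather than existing Bioperl code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for %packages in Makefile.PL.
# Assumed layout: module name => 'minimum_version/short reason'.
my %packages = (
    'IO::String'  => '0/handling strings as filehandles',
    'XML::Parser' => '2.30/parsing XML',
);

# Build the text of a Bundle's CONTENTS section: one module per entry,
# optionally followed by a minimum version and a " - comment".
sub bundle_contents {
    my (%pkgs) = @_;
    my @lines = ("=head1 CONTENTS", "");
    for my $module (sort keys %pkgs) {
        my ($version, $reason) = split m{/}, $pkgs{$module}, 2;
        my $line = $module;
        # Leave the version out when no particular minimum is required.
        $line .= " $version" if $version && $version > 0;
        $line .= " - $reason" if defined $reason && length $reason;
        push @lines, $line, "";
    }
    return join "\n", @lines;
}

print bundle_contents(%packages), "\n";
```

The output could be pasted (or written by a small release script) into the Bundle's .pm file, which would leave only the table in Makefile.PL to proofread.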
From bix at sendu.me.uk Mon Oct 23 08:58:13 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 13:58:13 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CC7A9.6090609@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> <453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk> Message-ID: <453CBC65.2020202@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> Sendu Bala wrote: >>> >>>> Makefile.PL could be altered again to remove from PREREQ_PM those >>>> modules the user didn't already have installed, thus CPAN would only >>>> install Bioperl itself and nothing optional. The user could then >>>> install Bundle::BioPerl if they wanted a quick way of getting all the >>>> optional stuff to work. >>>> >>>> I'm happy either way; what do other people think? >>> >>> From my point of view, removing them from PREREQ_PM means building the >>> Bundle::BioPerl a bit of a pain :o( >> >> Can I ask how you're generating Bundle::BioPerl? That is, how did the >> typos get in there? Is there a way to certainly avoid typos in the >> future? > > I just modified the list by hand a while back :o( - I'm sure there must > be a better way. I'm not sure I understand why removing things from PREREQ_PM would be a problem for you then; the %packages hash would remain unchanged (ie. have everything) so you have something to refer to when manually editing the Bundle. http://www.cpan.org/misc/cpan-faq.html#How_make_bundle might be helpful? I didn't really pay too much attention to the advice - does it offer a typo-avoiding solution? 
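A rough sketch of that idea, assuming a simple module => minimum-version table: %packages stays complete, and only the modules that are already installed are copied into the hash that would be handed to WriteMakefile() as PREREQ_PM. The helper installed_only() and the sample table are hypothetical, and 'Bio::Hypothetical::NotInstalled' is a deliberately made-up name used to show the filtering.

```perl
use strict;
use warnings;

# Hypothetical optional-dependency table (module => minimum version).
my %packages = (
    'File::Spec'                      => 0,   # core module, normally present
    'Bio::Hypothetical::NotInstalled' => 0,   # made-up name, should be absent
);

# Return only those entries whose module can be loaded right now.
sub installed_only {
    my (%pkgs) = @_;
    my %prereq;
    for my $module (keys %pkgs) {
        # Turn Foo::Bar into Foo/Bar.pm so require can take it as a path.
        (my $file = "$module.pm") =~ s{::}{/}g;
        $prereq{$module} = $pkgs{$module} if eval { require $file; 1 };
    }
    return %prereq;
}

my %prereq_pm = installed_only(%packages);
print "Would pass to PREREQ_PM: ", join(', ', sort keys %prereq_pm), "\n";

# In Makefile.PL this would then look something like:
#   WriteMakefile( ..., PREREQ_PM => \%prereq_pm );
# while %packages itself is left untouched for building the Bundle.
```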
From n.haigh at sheffield.ac.uk Mon Oct 23 10:04:12 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 14:04:12 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CBC65.2020202@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> <453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk> <453CBC65.2020202@sendu.me.uk> Message-ID: <453CCBDC.6030904@sheffield.ac.uk> > I'm not sure I understand why removing things from PREREQ_PM would be > a problem for you then; the %packages hash would remain unchanged (ie. > have everything) so you have something to refer to when manually > editing the Bundle. > > http://www.cpan.org/misc/cpan-faq.html#How_make_bundle > might be helpful? I didn't really pay too much attention to the advice > - does it offer a typo-avoiding solution? It's helpful in producing the Bundle PPD as all the XML tags are present in the Bioperl PPD and they simply need to be copied over to a Bundle-BioPerl PPD file. Looks like manual editing of the relevant file is required for making a CPAN bundle. Unfortunately - no typo-avoiding solution. 
:o( From dhoworth at mrc-lmb.cam.ac.uk Mon Oct 23 08:46:29 2006 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Mon, 23 Oct 2006 13:46:29 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CA99D.9060009@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> Message-ID: <453CB9A5.2020409@mrc-lmb.cam.ac.uk> >> That's the user point of view - how does the developer actually tell >> CPAN that something is a developer release so that normal users don't >> automatically install it? > > I found this: > http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt > > Is says that $VERSION should simply be changed from a naked number into > a single quoted number and this should be recognized by the CPAN indexer. Cheers, Dave From hlapp at gmx.net Mon Oct 23 09:40:29 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 23 Oct 2006 09:40:29 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <453C9DCA.4020802@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> <453C9DCA.4020802@sendu.me.uk> Message-ID: <5C22B9C8-CEF0-457B-8565-793D56389A86@gmx.net> You would need a lot of information to make that determination (host, port, db driver, db name, user, password; i.e., the entire connection information, and there is no 'standard'). You might just ask a simple question in Makefile.PL as to whether biosql is installed or not, similar to the DB::GFF tests. 
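A minimal sketch of such a question, using the prompt() function that ExtUtils::MakeMaker exports. The wording of the question and the helper name want_biosql_tests() are assumptions for illustration; this is not the existing bioperl-db Makefile.PL. Defaulting to 'n' means an unattended CPAN install simply skips the database tests instead of failing.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker qw(prompt);

# Hypothetical helper: ask once whether a BioSQL database is set up.
# Returns 1 if the answer starts with 'y', otherwise 0.
sub want_biosql_tests {
    my $answer = prompt(
        'Is the biosql-schema already installed in a database you can reach? y/n',
        'n',
    );
    return $answer =~ /^y/i ? 1 : 0;
}

# Demo: with PERL_MM_USE_DEFAULT set, prompt() returns the default
# without waiting for input, which is how unattended installs behave.
{
    local $ENV{PERL_MM_USE_DEFAULT} = 1;
    print want_biosql_tests()
        ? "Database tests will be run.\n"
        : "Database tests will be skipped; install biosql-schema to enable them.\n";
}
```

If the answer is yes, follow-up prompts for driver, host, database name and user could be added in the same style, since none of those can be guessed.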
-hilmar On Oct 23, 2006, at 6:47 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote: >> >>> For example, I have made no effort to setup biosql-schema but I >>> thought that maybe there would be a test that would detect this >> >> I'm afraid there isn't. Bioperl-db is meaningless without >> biosql-schema. > > Can you suggest a way we might detect if biosql-schema has been > installed prior to running the test suite, so we can give some > meaningful error message? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Mon Oct 23 09:59:23 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 14:59:23 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CB9A5.2020409@mrc-lmb.cam.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> Message-ID: <453CCABB.2060308@sendu.me.uk> Dave Howorth wrote: >>> That's the user point of view - how does the developer actually tell >>> CPAN that something is a developer release so that normal users don't >>> automatically install it? >> I found this: >> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >> >> Is says that $VERSION should simply be changed from a naked number into >> a single quoted number and this should be recognized by the CPAN indexer. > > Thanks for that. 
I guess from that the 1.5.2 version number should be: $VERSION = 1.05_02 And 1.6 would be $VERSION = 1.06 But will this cause a problem wrt 1.4? 1.4 has: $VERSION = 1.4; Is 1.4 lower than 1.06? Should we keep to a single digit version, so 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them version fifty and version sixty? 1.50_02, 1.60? From cjfields at uiuc.edu Mon Oct 23 10:12:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 09:12:16 -0500 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C9298.9000900@sendu.me.uk> Message-ID: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> ... > > Right, thanks for that. > > Oh, so this effectively means that our 'optional' dependencies are > installed for CPAN users, which matches up to my 'force the optional > ones anyway' desire, leaving Bundle::BioPerl without any use. > > Makefile.PL could be altered again to remove from PREREQ_PM those > modules the user didn't already have installed, thus CPAN would only > install Bioperl itself and nothing optional. The user could then install > Bundle::BioPerl if they wanted a quick way of getting all the optional > stuff to work. > > I'm happy either way; what do other people think? I think that we should have it so Bioperl installs as-is (no additional reqs) and have Bundle::BioPerl used as a convenient way to install all optional modules for full functionality. The catch is to make sure that any optional installations do not crash tests during a CPAN bioperl installation, otherwise they aren't considered optional by CPAN, and the install won't work without forcing it. Frankly, most users will find themselves wanting to install the Bundle anyway to get full functionality, so we could always 'strongly recommend' preceding the bioperl installation with a Bundle::Bioperl CPAN installation to avoid problems, at least for this release. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 23 10:23:04 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 09:23:04 -0500 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk> Message-ID: <002101c6f6ae$c14d7860$15327e82@pyrimidine> ... > >> Right, thanks for that. > > > > Oh, so this effectively means that our 'optional' dependencies are > > installed for CPAN users, which matches up to my 'force the optional > > ones anyway' desire, leaving Bundle::BioPerl without any use. > > > > Makefile.PL could be altered again to remove from PREREQ_PM those > > modules the user didn't already have installed, thus CPAN would only > > install Bioperl itself and nothing optional. The user could then > > install Bundle::BioPerl if they wanted a quick way of getting all the > > optional stuff to work. > > > > I'm happy either way; what do other people think? > >From my point of view, removing them from PREREQ_PM means building the > Bundle::BioPerl a bit of a pain :o( > > I prefer the way it is currently set up - most people have fast internet > connections and GB of harddrive space. Other than the reason "why > install something I won't ever need" I don't see much point maintaining > Bundle::BioPerl and having "optional" dependencies. I think if there are > any modules which are not going to be used by the majority of users, > then this could be used as the rationale for removing them from > bioperl-core into another package? > > Nath I think you'll likely find it much easier to maintain a Bundle package long-term and indicate that it should be installed along with bioperl, than to have users complain about a particular Bioperl module failing b/c a particular dependency wasn't installed. 
If we have the Bundle around in CPAN and in PPM for Win32 users, and indicate in the INSTALL docs and the wiki our preference that it be installed prior to or along with a Bioperl installation for beginners, we can mitigate most of those problems. Nip it in the bud, to quote a Mr. Barney Fife. My 2c Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 23 10:29:33 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 09:29:33 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CCABB.2060308@sendu.me.uk> Message-ID: <002201c6f6af$a91e4200$15327e82@pyrimidine> > Dave Howorth wrote: > >>> That's the user point of view - how does the developer actually tell > >>> CPAN that something is a developer release so that normal users don't > >>> automatically install it? > >> I found this: > >> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt > >> > >> Is says that $VERSION should simply be changed from a naked number into > >> a single quoted number and this should be recognized by the CPAN > indexer. > > > > 5.8.8/pod/perlmodstyle.pod#Version_numbering> > > Thanks for that. > > I guess from that the 1.5.2 version number should be: > > $VERSION = 1.05_02 > > And 1.6 would be > > $VERSION = 1.06 > > But will this cause a problem wrt 1.4? 1.4 has: > > $VERSION = 1.4; > > Is 1.4 lower than 1.06? Should we keep to a single digit version, so > 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them > version fifty and version sixty? 1.50_02, 1.60? Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax? It would be much simpler to use that. Simon Cozens wrote about this a while back: http://www.perl.com/pub/a/2000/04/whatsnew.html ... Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Mon Oct 23 10:41:24 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 15:41:24 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <002201c6f6af$a91e4200$15327e82@pyrimidine> References: <002201c6f6af$a91e4200$15327e82@pyrimidine> Message-ID: <453CD494.8070905@sendu.me.uk> Chris Fields wrote: >> Dave Howorth wrote: >>>>> That's the user point of view - how does the developer actually tell >>>>> CPAN that something is a developer release so that normal users don't >>>>> automatically install it? >>>> I found this: >>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >>>> >>>> Is says that $VERSION should simply be changed from a naked number into >>>> a single quoted number and this should be recognized by the CPAN >> indexer. >>> > 5.8.8/pod/perlmodstyle.pod#Version_numbering> >> >> Thanks for that. >> >> I guess from that the 1.5.2 version number should be: >> >> $VERSION = 1.05_02 >> >> And 1.6 would be >> >> $VERSION = 1.06 >> >> But will this cause a problem wrt 1.4? 1.4 has: >> >> $VERSION = 1.4; >> >> Is 1.4 lower than 1.06? Should we keep to a single digit version, so >> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them >> version fifty and version sixty? 1.50_02, 1.60? > > Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax? It would be > much simpler to use that. That does not present us with a way to have 1.5.2 marked as a developer release in CPAN. Also, see the discussion here: http://perldoc.perl.org/functions/require.html Since we require 5.6.1 the backwards-compatible issues maybe don't apply to us, but do these ideas work with modules, or just Perl itself? Is CPAN et al. happy with this form of versioning? /Something/ needs to be done about Bioperl versioning, because the current 1.4 or 1.5 is completely inadequate. 
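To make the "is 1.4 lower than 1.06?" question concrete, here is a small sketch that compares each candidate from this thread with the existing 1.4. It assumes versions are compared as ordinary decimal numbers once any underscore is removed; the CPAN indexer may apply further rules, and is_newer() is a hypothetical helper, not a toolchain function.

```perl
use strict;
use warnings;

# True if $new compares greater than $old as a plain decimal number.
# Underscores (developer-release markers) are stripped before comparing.
sub is_newer {
    my ($new, $old) = @_;
    tr/_//d for $new, $old;
    return $new > $old;
}

# Candidate version strings proposed in the thread.
for my $candidate ('1.06', '1.05_02', '1.6', '1.5_02', '1.60', '1.50_02') {
    printf "%-8s %s 1.4\n", $candidate,
        is_newer($candidate, '1.4') ? 'sorts after' : 'sorts BEFORE';
}
```

Under that assumption the zero-padded forms (1.06, 1.05_02) come out older than 1.4, while 1.5_02, 1.6, 1.50_02 and 1.60 all come out newer.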
From bix at sendu.me.uk Mon Oct 23 10:51:25 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 15:51:25 +0100 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> Message-ID: <453CD6ED.5050507@sendu.me.uk> Chris Fields wrote: [option 1] >> Oh, so this effectively means that our 'optional' dependencies are >> installed for CPAN users, which matches up to my 'force the >> optional ones anyway' desire, leaving Bundle::BioPerl without any >> use. [option 2] >> Makefile.PL could be altered again to remove from PREREQ_PM those >> modules the user didn't already have installed, thus CPAN would >> only install Bioperl itself and nothing optional. The user could >> then install Bundle::BioPerl if they wanted a quick way of getting >> all the optional stuff to work. >> >> I'm happy either way; what do other people think? > > I think that we should have it so Bioperl installs as-is (no > additional reqs) and have Bundle::BioPerl used as a convenient way to > install all optional modules for full functionality. Note we're specifically considering a CPAN install here. If you download the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is still needed as a convenience if you want to install the optional external dependencies. > The catch is to make sure that any optional installations do not > crash tests during a CPAN bioperl installation, otherwise they aren't > considered optional by CPAN, and the install won't work without > forcing it. I'm pretty sure this isn't a problem, though it would be nice if someone could test it on a clean system: does 'make test' pass all ok with none of the optional modules installed? Anyway, to reiterate the question: Do we care if CPAN users get all the optional external dependencies installed for them automatically, or do we want to force them to install Bundle? 
The current situation is: CPAN users will get all optional external dependencies without using Bundle::BioPerl. Manual installers of bioperl (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to get full functionality. From n.haigh at sheffield.ac.uk Mon Oct 23 12:30:34 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 16:30:34 +0000 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CCABB.2060308@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> Message-ID: <453CEE2A.8000002@sheffield.ac.uk> Sendu Bala wrote: > Dave Howorth wrote: > >>>> That's the user point of view - how does the developer actually tell >>>> CPAN that something is a developer release so that normal users don't >>>> automatically install it? >>>> >>> I found this: >>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >>> >>> Is says that $VERSION should simply be changed from a naked number into >>> a single quoted number and this should be recognized by the CPAN indexer. >>> >> >> > > Thanks for that. > > I guess from that the 1.5.2 version number should be: > > $VERSION = 1.05_02 > > And 1.6 would be > > $VERSION = 1.06 > > But will this cause a problem wrt 1.4? 1.4 has: > > $VERSION = 1.4; > > Is 1.4 lower than 1.06? Should we keep to a single digit version, so > 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them > version fifty and version sixty? 1.50_02, 1.60? 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > I believe the link to the documentation above describes a common CPAN versioning scheme as follows: 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32 Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would be better as 1.52. Then to indicate that the 1.5 series is a developer release, you append the underscore and at least 2 digits. Thus resulting in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be 1.52_01. The only thing i'm unsure about would be when does the _01 get incremented? I suspect we would probably not increment this number since each release would be an increment of the minor release number e.g. 1.52_01, 1.53_01, 1.54_01 etc. Although I'm still not sure how this versioning would affect bioperl 1.4 since 1.4 uses a non-standard versioning scheme :o( As I understand it, the versioning of the Perl releases uses the x.y.z scheme. But apparently CPAN modules should use the above versioning scheme. Nath From cjfields at uiuc.edu Mon Oct 23 11:36:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 10:36:37 -0500 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <453CD6ED.5050507@sendu.me.uk> Message-ID: <000c01c6f6b9$0781af40$15327e82@pyrimidine> ... > > Note we're specifically considering a CPAN install here. If you download > the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is > still needed as a convenience if you want to install the optional > external dependencies. > Agreed. I don't think the Bundle is dispensable. For instance, it's very easy for us to just state to beginners to install Bundle::Bioperl before installing bioperl itself, as opposed to having them inundate the mail list with requests on why x.pl script didn't work, which could be simply from lack of the required module. 
> I'm pretty sure this isn't a problem, though it would be nice if someone > could test it on a clean system: does 'make test' pass all ok with none > of the optional modules installed? So far on WinXP everything passes; I ran a clean perl installation a while ago using nmake and tests passed. > Anyway, to reiterate the question: Do we care if CPAN users get all the > optional external dependencies installed for them automatically, or do > we want to force them to install Bundle? > > The current situation is: CPAN users will get all optional external > dependencies without using Bundle::BioPerl. Manual installers of bioperl > (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to > get full functionality. I don't think forcing is necessary, so a CPAN installation shouldn't force someone to install optional modules. Graph.pm, for instance has a few optional modules, and the tests which use those get skipped and pass so the installation proceeds w/o problems. We could do the same (any tests using those optional modules display the reason why they are skipped). I would strongly state in the INSTALL and INSTALL.WIN docs that (new) users should install Bundle::Bioperl before installing Bioperl core for full functionality. If you are an advanced user and know your way around CPAN/Perl, then you can install the various independent requirements depending on your particular requirements. Chris From n.haigh at sheffield.ac.uk Mon Oct 23 12:38:00 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Mon, 23 Oct 2006 16:38:00 +0000 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <453CD6ED.5050507@sendu.me.uk> References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> <453CD6ED.5050507@sendu.me.uk> Message-ID: <453CEFE8.4000704@sheffield.ac.uk> Sendu Bala wrote: > Chris Fields wrote: > > [option 1] > >>> Oh, so this effectively means that our 'optional' dependencies are >>> installed for CPAN users, which matches up to my 'force the >>> optional ones anyway' desire, leaving Bundle::BioPerl without any >>> use. >>> > > [option 2] > >>> Makefile.PL could be altered again to remove from PREREQ_PM those >>> modules the user didn't already have installed, thus CPAN would >>> only install Bioperl itself and nothing optional. The user could >>> then install Bundle::BioPerl if they wanted a quick way of getting >>> all the optional stuff to work. >>> >>> I'm happy either way; what do other people think? >>> >> I think that we should have it so Bioperl installs as-is (no >> additional reqs) and have Bundle::BioPerl used as a convenient way to >> install all optional modules for full functionality. >> > > Note we're specifically considering a CPAN install here. If you download > the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is > still needed as a convenience if you want to install the optional > external dependencies. > > > >> The catch is to make sure that any optional installations do not >> crash tests during a CPAN bioperl installation, otherwise they aren't >> considered optional by CPAN, and the install won't work without >> forcing it. >> > > I'm pretty sure this isn't a problem, though it would be nice if someone > could test it on a clean system: does 'make test' pass all ok with none > of the optional modules installed? > > I could definitely do this on WinXP and *possibly* on a Linux system. 
> Anyway, to reiterate the question: Do we care if CPAN users get all the
> optional external dependencies installed for them automatically, or do
> we want to force them to install Bundle?
>
>

I'd prefer any dependencies, whether they are seen as vital to the main
functionality of Bioperl or not, to be actually specified in PREREQ_PM
(as they currently are). A dependency is a dependency - is it not?

If a distinction is to be made based on whether the requiring module is
simply adding additional functionality to Bioperl-core, then shouldn't
it be moved out of core and into another package, as with the run
modules, if we are to have "optional" dependencies?

my 2p
Nath

> The current situation is: CPAN users will get all optional external
> dependencies without using Bundle::BioPerl. Manual installers of bioperl
> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
> get full functionality.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From cjfields at uiuc.edu Mon Oct 23 11:39:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 10:39:09 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CD494.8070905@sendu.me.uk>
Message-ID: <000d01c6f6b9$62033d80$15327e82@pyrimidine>

...
> That does not present us with a way to have 1.5.2 marked as a developer
> release in CPAN.
>
> Also, see the discussion here:
> http://perldoc.perl.org/functions/require.html
>
> Since we require 5.6.1 the backwards-compatible issues maybe don't apply
> to us, but do these ideas work with modules, or just Perl itself? Is
> CPAN et al. happy with this form of versioning?
>
> /Something/ needs to be done about Bioperl versioning, because the
> current 1.4 or 1.5 is completely inadequate.

I think using 'require Foo x.y.z' is applicable to modules as well.
There is something in Programming Perl about this, I just don't have it
on hand...
Not sure about CPAN, so we need to look into it. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Mon Oct 23 11:42:15 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 16:42:15 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CEE2A.8000002@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> Message-ID: <453CE2D7.5080608@sendu.me.uk> Nathan S. Haigh wrote: > I believe the link to the documentation above describes a common CPAN > versioning scheme as follows: > > 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32 > > Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would > be better as 1.52. Then to indicate that the 1.5 series is a developer > release, you append the underscore and at least 2 digits. Thus resulting > in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be > 1.52_01. The only thing i'm unsure about would be when does the _01 get > incremented? I suspect we would probably not increment this number since > each release would be an increment of the minor release number e.g. > 1.52_01, 1.53_01, 1.54_01 etc. > > Although I'm still not sure how this versioning would affect bioperl 1.4 > since 1.4 uses a non-standard versioning scheme :o( Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be treated higher than 1.4? Anyway, we can cross that bridge when we get there, but this seems appropriate now. Cheers, Sendu. 
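The ordering question above ("Surely 1.60 will be treated higher than 1.4?") can be checked with a minimal sketch. It assumes the versions are compared as plain decimal numbers once the developer-release underscore is dropped, which is the reading behind the "1.4 > 1.06" concern raised elsewhere in this thread. The helper names numeric_version and is_dev_release are hypothetical and exist only for this illustration; they are not part of Bioperl or of any CPAN tool.

```perl
use strict;
use warnings;

# Turn a decimal-style version string into a number, dropping the
# developer-release underscore first ('1.52_01' becomes 1.5201).
sub numeric_version {
    my ($v) = @_;
    (my $n = $v) =~ tr/_//d;
    return $n + 0;
}

# By convention, an underscore in the version marks a developer release.
sub is_dev_release {
    my ($v) = @_;
    return $v =~ /_/ ? 1 : 0;
}

# Compare each candidate against the already-released 1.4.
for my $candidate ('1.06', '1.60', '1.52_01') {
    my $cmp = numeric_version($candidate) <=> numeric_version('1.4');
    printf "%-8s is %s than 1.4%s\n",
        $candidate,
        ($cmp > 0 ? 'higher' : $cmp < 0 ? 'LOWER' : 'no different'),
        (is_dev_release($candidate) ? ' (developer release)' : '');
}
```

Under that assumption, 1.06 sorts below 1.4, while both 1.60 and 1.52_01 sort above it, so the two-digit minor scheme avoids the problem that the 1.06 proposal would have had.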
From bix at sendu.me.uk Mon Oct 23 11:59:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 16:59:01 +0100
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
References: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
Message-ID: <453CE6C5.6000108@sendu.me.uk>

Chris Fields wrote:
> ...
>> The current situation is: CPAN users will get all optional external
>> dependencies without using Bundle::BioPerl. Manual installers of bioperl
>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
>> get full functionality.
>
> I don't think forcing is necessary, so a CPAN installation shouldn't force
> someone to install optional modules. Graph.pm, for instance has a few
> optional modules, and the tests which use those get skipped and pass so the
> installation proceeds w/o problems. We could do the same (any tests using
> those optional modules display the reason why they are skipped).

I should clarify and say that that's what happens in Bioperl as well.
The 'forcing' that I talk about is simply what I assume will happen if
the user has CPAN set to automatically install dependencies. The user
could say 'no' to every question regarding the installation of
dependencies that CPAN discovers and Bioperl would still install fine.

So really the difference between the current situation and, say, the
situation when 1.5.1 was released, is that the CPAN user doesn't have to
use Bundle::BioPerl for full functionality anymore, but can still choose
not to install all the optional external modules.

The difference is the possible default behaviour. Those users that
auto-install dependencies get all the optional ones, whereas in the past
they would not have. I have to point out the benefit of this behaviour:
those people that don't care and just want it to work are more likely to
get an installation that does just work. People who know what they're
doing can still do what they want.
Before we decide what to do I guess we need hard confirmation of how
CPAN will actually behave with the current Makefile.PL. Any ideas how we
can find out?

It would also be good to have more opinions to break the current tie
(Nathan is for keeping PREREQ_PM populated, Chris is for having it
empty, I can go either way)...

From dhoworth at mrc-lmb.cam.ac.uk Mon Oct 23 11:55:42 2006
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Mon, 23 Oct 2006 16:55:42 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CD494.8070905@sendu.me.uk>
References: <002201c6f6af$a91e4200$15327e82@pyrimidine> <453CD494.8070905@sendu.me.uk>
Message-ID: <453CE5FE.9070001@mrc-lmb.cam.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>>> Dave Howorth wrote:
>>>>>> That's the user point of view - how does the developer actually tell
>>>>>> CPAN that something is a developer release so that normal users don't
>>>>>> automatically install it?
>>>>> I found this:
>>>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>>>>
>>>>> It says that $VERSION should simply be changed from a naked number into
>>>>> a single quoted number and this should be recognized by the CPAN
>>> indexer.
>>>>
>> 5.8.8/pod/perlmodstyle.pod#Version_numbering>
>>>
>>> Thanks for that.
>>>
>>> I guess from that the 1.5.2 version number should be:
>>>
>>> $VERSION = 1.05_02

I believe so - the underscore is key. Look at your favourite CPAN
modules and see what they do.

>>> And 1.6 would be
>>>
>>> $VERSION = 1.06
>>>
>>> But will this cause a problem wrt 1.4? 1.4 has:

I think it will cause a problem, yes. 1.4 > 1.06

As a workaround, you could remove 1.4 from CPAN and require everybody
who installs from CPAN to uninstall it before installing 1.06.

>>> $VERSION = 1.4;
>>>
>>> Is 1.4 lower than 1.06? Should we keep to a single digit version, so
>>> 1.5_02 and 1.6? Does this really not work with CPAN?

I think that would work but see at the end.
>> Should we call them
>>> version fifty and version sixty? 1.50_02, 1.60?

Then you can count 1.50_02, 1.50_03, 1.52, 1.53_01 ... if you wish.

>> Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax? It would be
>> much simpler to use that.
>
> That does not present us with a way to have 1.5.2 marked as a developer
> release in CPAN.
>
> Also, see the discussion here:
> http://perldoc.perl.org/functions/require.html
>
> Since we require 5.6.1 the backwards-compatible issues maybe don't apply
> to us, but do these ideas work with modules, or just Perl itself? Is
> CPAN et al. happy with this form of versioning?

I'm not an expert :(

It's my understanding that there is an awful lot of flexibility in Perl
module version numbering (as you might expect :) However, I believe
there are some gotchas. So I would recommend (a) finding an expert and
(b) trying an experiment!

> /Something/ needs to be done about Bioperl versioning, because the
> current 1.4 or 1.5 is completely inadequate.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

From n.haigh at sheffield.ac.uk Mon Oct 23 13:37:13 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 17:37:13 +0000
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CE6C5.6000108@sendu.me.uk>
References: <000c01c6f6b9$0781af40$15327e82@pyrimidine> <453CE6C5.6000108@sendu.me.uk>
Message-ID: <453CFDC9.8030107@sheffield.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>
>> ...
>>
>>> The current situation is: CPAN users will get all optional external
>>> dependencies without using Bundle::BioPerl. Manual installers of bioperl
>>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
>>> get full functionality.
>>>
>> I don't think forcing is necessary, so a CPAN installation shouldn't force
>> someone to install optional modules.
Graph.pm, for instance has a few
>> optional modules, and the tests which use those get skipped and pass so the
>> installation proceeds w/o problems. We could do the same (any tests using
>> those optional modules display the reason why they are skipped).
>>
>
> I should clarify and say that that's what happens in Bioperl as well.
> The 'forcing' that I talk about is simply what I assume will happen if
> the user has CPAN set to automatically install dependencies. The user
> could say 'no' to every question regarding the installation of
> dependencies that CPAN discovers and Bioperl would still install fine.
>
> So really the difference between the current situation and, say, the
> situation when 1.5.1 was released, is that the CPAN user doesn't have to
> use Bundle::BioPerl for full functionality anymore, but can still choose
> not to install all the optional external modules.
>
>
--snip--

Obviously, we could maintain a Bundle::BioPerl which includes all
dependencies required for a fully functional Bioperl. I think the whole
idea for a Bundle is to provide a common environment for a particular
package. If, for example, someone chooses not to install the dependencies
through CPAN (in the current setup), they can easily go back and install
Bundle::BioPerl and it would retrieve any missing dependencies for a
fully functional Bioperl-core.

Nath

From n.haigh at sheffield.ac.uk Mon Oct 23 14:06:16 2006
From: n.haigh at sheffield.ac.uk (Nathan S.
Haigh) Date: Mon, 23 Oct 2006 18:06:16 +0000 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CE2D7.5080608@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> Message-ID: <453D0498.8050206@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: > >> I believe the link to the documentation above describes a common CPAN >> versioning scheme as follows: >> >> 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32 >> >> Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would >> be better as 1.52. Then to indicate that the 1.5 series is a developer >> release, you append the underscore and at least 2 digits. Thus resulting >> in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be >> 1.52_01. The only thing i'm unsure about would be when does the _01 get >> incremented? I suspect we would probably not increment this number since >> each release would be an increment of the minor release number e.g. >> 1.52_01, 1.53_01, 1.54_01 etc. >> >> Although I'm still not sure how this versioning would affect bioperl 1.4 >> since 1.4 uses a non-standard versioning scheme :o( >> > > Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be > treated higher than 1.4? Anyway, we can cross that bridge when we get > there, but this seems appropriate now. > > > Cheers, > Sendu. 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

Just tried the suggested:

perl -MExtUtils::MakeMaker -le 'print MM->parse_version(shift)' bioperl-1-5-2/Bio/Root/Version.pm

To see how it parses the various different version schemes - here are
the results:

1.5     -> 1.5
1.4     -> 1.4
1.60    -> 1.60
1.05_01 -> 1.0501
1.5_01  -> 1.501
1.50_01 -> 1.5001

Nath

From cjfields at uiuc.edu Mon Oct 23 13:15:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:15:44 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CE6C5.6000108@sendu.me.uk>
Message-ID: <002701c6f6c6$e2622c40$15327e82@pyrimidine>

...
> I should clarify and say that that's what happens in Bioperl as well.
> The 'forcing' that I talk about is simply what I assume will happen if
> the user has CPAN set to automatically install dependencies. The user
> could say 'no' to every question regarding the installation of
> dependencies that CPAN discovers and Bioperl would still install fine.
>
> So really the difference between the current situation and, say, the
> situation when 1.5.1 was released, is that the CPAN user doesn't have to
> use Bundle::BioPerl for full functionality anymore, but can still choose
> not to install all the optional external modules.
>
> The difference is the possible default behaviour. Those users that
> auto-install dependencies get all the optional ones, whereas in the past
> they would not have. I have to point out the benefit of this behaviour:
> those people that don't care and just want it to work are more likely to
> get an installation that does just work. People who know what they're
> doing can still do what they want.

OK with me. Any way we go about it, we have to assume that anyone who
set CPAN to automatically install dependencies would want this behavior.
> Before we decide what to do I guess we need hard confirmation of how
> CPAN will actually behave with the current Makefile.PL. Any ideas how we
> can find out?
>
> It would also be good to have more opinions to break the current tie
> (Nathan is for keeping PREREQ_PM populated, Chris is for having it
> empty, I can go either way)...

Frankly I'm for whatever is easiest for the end-user. I think we should
continue maintaining Bundle::BioPerl b/c of its convenience (easier for
us to say 'install Bundle::BioPerl' as opposed to 'install modules a b c
d e f g...'). I should note that Chris D. maintains Bundle::BioPerl via
CPAN and can easily add/remove modules as needed, so all that would be
necessary prior to a release is to make sure the various modules present
in the Bundle are up-to-date.

The only difficulty would be updating the bundle PPM version for Win32;
I agree with Nathan that it would be nice if it were easier to maintain.
The PPD file generated using 'nmake ppd' needs modifications, likely b/c
these are still generated as PPM3-compatible vs PPM4-compatible.

I also think the idea of having the developer releases available via
CPAN is a good one, as long as they are marked as such (which you are
taking care of with versioning changes). It makes them a little more
official, even if they are interim developer releases.

Chris

From cjfields at uiuc.edu Mon Oct 23 13:19:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:19:08 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CFDC9.8030107@sheffield.ac.uk>
Message-ID: <002801c6f6c7$5a58ed60$15327e82@pyrimidine>

...
> > So really the difference between the current situation and, say, the
> > situation when 1.5.1 was released, is that the CPAN user doesn't have to
> > use Bundle::BioPerl for full functionality anymore, but can still choose
> > not to install all the optional external modules.
> >
> >
> --snip--
>
> Obviously, we could maintain a Bundle::BioPerl which includes all
> dependencies required for a fully functional Bioperl. I think the whole
> idea for a Bundle is to provide a common environment for a particular
> package. If, for example, someone chooses not to install the dependencies
> through CPAN (in the current setup), they can easily go back and install
> Bundle::BioPerl and it would retrieve any missing dependencies for a
> fully functional Bioperl-core.
>
> Nath

Succinctly put; I would've spent five paragraphs describing that! Too
much coffee (from lab meetings...)

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From cjfields at uiuc.edu Mon Oct 23 13:26:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:26:57 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To:
Message-ID: <002c01c6f6c8$7163dd20$15327e82@pyrimidine>

Seth,

Did you try this with a clean, taxonomy-installed database? There may be
some junk left over from the previous test runs. I'm looking into it
this week; it may not make the developer release but we'll try to get it
in.

BTW, the 03simpleseq.t test failures have to do with a call to gzip.
I'll look into a workaround for that. Sendu has posted a Bio::Root::Root
fix which does get rid of some bugs but introduces others.

One alternative which I found works is cygwin, but there's a catch:
DBD-mysql is hard to install. If it isn't one thing it's another...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign _____ From: Seth Johnson [mailto:johnson.biotech at gmail.com] Sent: Monday, October 23, 2006 11:37 AM To: Chris Fields Cc: bioperl-l Subject: Re: Error retrieving sequence from BioSQL Chris, There's definite improvement: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Failed Test Stat Wstat Total Fail Failed List of Failed ---------------------------------------------------------------------------- --- t/02species.t 65 2 3.08% 63 65 t/03simpleseq.t 1 256 59 106 179.66% 7-59 t/04swiss.t 52 14 26.92% 25 27-34 38-42 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There's some weirdness going on during the 'swiss.t' test. It almost seems to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): ================================ not ok 25 # Test 25 got: '10097078' (t/04swiss.t at line 79) # Expected: '91309150' ok 26 not ok 27 # Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t at line 85) # Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' not ok 28 # Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' (t/04swiss.t at line 86) # Expected: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' not ok 29 # Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' (t/04swiss.t at line 87) # Expected: 'Cell 66 (2), 383-394 (1991)' not ok 30 # Test 30 got: (t/04swiss.t at line 88) # Expected: '91309150' not ok 31 # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t at line 85 fail #2) # Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis, J.E. and Leffers,H.' 
not ok 32 # Test 32 got: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' (t/04swiss.t at line 86 fail #2) # Expected: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' not ok 33 # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail #2) # Expected: 'Gene 134 (2), 283-287 (1993)' not ok 34 # Test 34 got: (t/04swiss.t at line 88 fail #2) # Expected: '94085792' ok 35 ok 36 ok 37 not ok 38 # Test 38 got: (t/04swiss.t at line 88 fail #3) # Expected: '94253723' not ok 39 # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4) # Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.' not ok 40 # Test 40 got: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' (t/04swiss.t at line 86 fail #4) # Expected: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' not ok 41 # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail #4) # Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' not ok 42 # Test 42 got: (t/04swiss.t at line 88 fail #4) # Expected: '99199225' ============================== On 10/20/06, Chris Fields < cjfields at uiuc.edu> wrote: Seth, Did you work out the problem here? There was a recent CVS update to OBDA tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests apparently left data from tests in the database, which caused problems with repeated test runs. 
Chris > > -----Original Message----- > > From: bioperl-l-bounces lists.open-bio.org [mailto: bioperl-l- > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > Sent: Saturday, September 30, 2006 6:35 PM > > To: Hilmar Lapp > > Cc: Chris Fields; Bioperl List > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > Here're complete test details: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > ... > > > FAILED tests 10-12 > > Failed 3/12 tests, 75.00% okay > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > -------------------------------------------------------------------------- > > ----- > > t\02species.t 65 2 3.08% 63 65 > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > t\16obda.t 12 3 25.00% 10-12 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From johnson.biotech at gmail.com Mon Oct 23 12:36:36 2006 From: johnson.biotech at gmail.com (Seth Johnson) Date: Mon, 23 Oct 2006 12:36:36 -0400 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: <000001c6f486$df508930$15327e82@pyrimidine> References: <000001c6f486$df508930$15327e82@pyrimidine> Message-ID: Chris, There's definite improvement: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Failed Test Stat Wstat Total Fail Failed List of Failed ------------------------------------------------------------------------------- t/02species.t 65 2 3.08% 63 65 t/03simpleseq.t 1 256 59 106 179.66% 7-59 t/04swiss.t 52 14 26.92% 25 27-34 38-42 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There's some weirdness going on during the 'swiss.t' test. 
It almost seems to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): ================================ not ok 25 # Test 25 got: '10097078' (t/04swiss.t at line 79) # Expected: '91309150' ok 26 not ok 27 # Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t at line 85) # Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' not ok 28 # Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' (t/04swiss.t at line 86) # Expected: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' not ok 29 # Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' (t/04swiss.t at line 87) # Expected: 'Cell 66 (2), 383-394 (1991)' not ok 30 # Test 30 got: (t/04swiss.t at line 88) # Expected: '91309150' not ok 31 # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t at line 85 fail #2) # Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' not ok 32 # Test 32 got: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' (t/04swiss.t at line 86 fail #2) # Expected: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' not ok 33 # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail #2) # Expected: 'Gene 134 (2), 283-287 (1993)' not ok 34 # Test 34 got: (t/04swiss.t at line 88 fail #2) # Expected: '94085792' ok 35 ok 36 ok 37 not ok 38 # Test 38 got: (t/04swiss.t at line 88 fail #3) # Expected: '94253723' not ok 39 # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4) # Expected: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' 
not ok 40 # Test 40 got: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' (t/04swiss.t at line 86 fail #4) # Expected: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' not ok 41 # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail #4) # Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' not ok 42 # Test 42 got: (t/04swiss.t at line 88 fail #4) # Expected: '99199225' ============================== On 10/20/06, Chris Fields wrote: > > > > Seth, > > Did you work out the problem here? There was a recent CVS update to OBDA > tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests > apparently left data from tests in the database, which caused problems > with > repeated test runs. > > Chris > > > > -----Original Message----- > > > From: bioperl-l-bounces lists.open-bio.org [mailto: bioperl-l- > > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > > Sent: Saturday, September 30, 2006 6:35 PM > > > To: Hilmar Lapp > > > Cc: Chris Fields; Bioperl List > > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > > > Here're complete test details: > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > ... 
> > > > > FAILED tests 10-12 > > > Failed 3/12 tests, 75.00% okay > > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > > > > -------------------------------------------------------------------------- > > > ----- > > > t\02species.t 65 2 3.08% 63 65 > > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > > t\16obda.t 12 3 25.00% 10-12 > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > From n.haigh at sheffield.ac.uk Mon Oct 23 16:08:00 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 20:08:00 +0000 Subject: [Bioperl-l] CPAN testing Service Message-ID: <453D2120.9010301@sheffield.ac.uk> We should also check the CPAN testing service (CPANTS) to see how "good" our package is for CPAN and try to increase the Kwalitee score. There only appears to be details for bioperl-1.2.3 for some reason: http://cpants.perl.org/dist/bioperl Nath From pabloivan at gmail.com Sun Oct 22 15:54:35 2006 From: pabloivan at gmail.com (Pablo Ivan) Date: Sun, 22 Oct 2006 16:54:35 -0300 Subject: [Bioperl-l] Bioperl installation under Windows Message-ID: Hello, I have been trying to install Bioperl 1.4 on a Windows XP system, but I didn't get too far; my perl installation was made using ActiveState 5.8.8build 816. I then tried the ppm method of searching for bioperl in the repositories and installing the core package 1.4. It says that the installation was made successfully, but the /Bio folder doesn't show up in /lib, and it's like nothing new was installed at all. I was wondering if using that version of ActiveState could be causing it, but the uninstall option for it isn't showing in Add/Remove, and I'm afraid just deleting the folders and installing version 5.6 of AS could somehow damage and make things worse. 
Or should I just forget about it and try using Cygwin?

Thank you,

Pablo.

From cjfields at uiuc.edu Mon Oct 23 17:34:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 16:34:47 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To:
Message-ID: <000401c6f6eb$111df040$15327e82@pyrimidine>

Don't know what that particular error is, but it looks ActivePerl-related (PPM generates HTML from the blib directory). You may need to run 'nmake clean' in between test cycles to get rid of the old blib and other files.

The carryover issue from old test runs was a definite problem. Brian fixed that in the bioperl-db CVS recently. Also, I tried Sendu's fixes from CVS head to Bio::Root::Root and they seem to fix the problems with Bio::Root::Root. The issue came down to a use of indirect syntax (a bad perl practice). There are other errors popping up related to Bio::Species, but these seem fixable at least.

I committed a few changes to bioperl-db CVS to fix the 03simpleseq.t test failures caused by a lack of gzip on WinXP (I didn't see them b/c I had a copy of GNU gzip in my path). These should pass w/o problems now on WinXP.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

_____

From: Seth Johnson [mailto:johnson.biotech at gmail.com]
Sent: Monday, October 23, 2006 4:22 PM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

Chris,

I have not cleaned my test database yet. I'll purge it and redo the tests. This error keeps popping up in unexpected places while running nmake during installation:

"Undefined subroutine &main::UpdateHTML_blib called at -e line 1.
NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code '0xff'"

Is there a way around it??

Seth

On 10/23/06, Chris Fields wrote:

Seth,

Did you try this with a clean, taxonomy-installed database? There may be some junk left over from the previous test runs.
I'm looking into it this week; it may not make the developer release but we'll try to get it in. BTW, the 03simpleseq.t test failures have to do with a call to gzip. I'll look into a workaround for that.

Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but introduces others. One alternative which I found works is cygwin, but there's a catch: DBD-mysql is hard to install. If it isn't one thing it's another...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

_____

From: Seth Johnson [mailto:johnson.biotech at gmail.com]
Sent: Monday, October 23, 2006 11:37 AM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

Chris,

There's definite improvement:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed Test      Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/02species.t                     65    2   3.08%  63 65
t/03simpleseq.t     1   256    59  106 179.66%  7-59
t/04swiss.t                    52   14  26.92%  25 27-34 38-42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There's some weirdness going on during the 'swiss.t' test. It almost seems to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31):
================================
not ok 25
# Test 25 got: '10097078' (t/04swiss.t at line 79)
# Expected: '91309150'
ok 26
not ok 27
# Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t at line 85)
# Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
not ok 28
# Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' (t/04swiss.t at line 86)
# Expected: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators'
not ok 29
# Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A.
96 (7), 3572-3577 (1999)' (t/04swiss.t at line 87) # Expected: 'Cell 66 (2), 383-394 (1991)' not ok 30 # Test 30 got: (t/04swiss.t at line 88) # Expected: '91309150' not ok 31 # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t at line 85 fail #2) # Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis, J.E. and Leffers,H.' not ok 32 # Test 32 got: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' (t/04swiss.t at line 86 fail #2) # Expected: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' not ok 33 # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail #2) # Expected: 'Gene 134 (2), 283-287 (1993)' not ok 34 # Test 34 got: (t/04swiss.t at line 88 fail #2) # Expected: '94085792' ok 35 ok 36 ok 37 not ok 38 # Test 38 got: (t/04swiss.t at line 88 fail #3) # Expected: '94253723' not ok 39 # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4) # Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.' not ok 40 # Test 40 got: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' (t/04swiss.t at line 86 fail #4) # Expected: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' not ok 41 # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail #4) # Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' not ok 42 # Test 42 got: (t/04swiss.t at line 88 fail #4) # Expected: '99199225' ============================== On 10/20/06, Chris Fields < cjfields at uiuc.edu > wrote: Seth, Did you work out the problem here? There was a recent CVS update to OBDA tests (16obda.t) that fixed similar problems on Mac OS X. 
Old OBDA tests apparently left data from tests in the database, which caused problems with repeated test runs. Chris > > -----Original Message----- > > From: bioperl-l-bounces lists.open-bio.org [mailto: bioperl-l- > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > Sent: Saturday, September 30, 2006 6:35 PM > > To: Hilmar Lapp > > Cc: Chris Fields; Bioperl List > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > Here're complete test details: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > ... > > > FAILED tests 10-12 > > Failed 3/12 tests, 75.00% okay > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > -------------------------------------------------------------------------- > > ----- > > t\02species.t 65 2 3.08% 63 65 > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > t\16obda.t 12 3 25.00% 10-12 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- Best Regards, Seth Johnson Senior Bioinformatics Associate Ph: (202) 470-0900 Fx: (775) 251-0358 From cjfields at uiuc.edu Mon Oct 23 17:53:27 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 16:53:27 -0500 Subject: [Bioperl-l] Bioperl installation under Windows In-Reply-To: References: Message-ID: <9994CFF6-FCA1-4C7F-9A33-31765C6AE255@uiuc.edu> It won't install in Perl\lib, but in Perl\site\lib. Check there. We are working intently on the next developer release for BioPerl and plan on having several PPMs available, but we only are supporting ActivePerl 5.8.8.819. I would suggest that you upgrade your ActivePerl installation to that if possible since PPM has undergone major changes (they use PPM4 now, which has a GUI by default). Most repositories are now moving over to using PPM4 so you'll likely be seeing less PPM3-compatible packages being made. 
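
To see where (or whether) the Bio modules actually landed, one option is to walk Perl's module search path. This is only a rough sketch: the sub name find_module_dirs is made up for illustration, and Bio::Seq is just an example module to look for. On ActivePerl the hit would be expected under Perl\site\lib rather than Perl\lib.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Return every directory on @INC that contains the file for $module.
# The sub name is hypothetical; it is not part of BioPerl or PPM.
sub find_module_dirs {
    my ($module) = @_;
    my @parts = split /::/, $module;    # Bio::Seq -> ('Bio', 'Seq')
    $parts[-1] .= '.pm';                # last element becomes Seq.pm
    return grep {
        my $path = File::Spec->catfile($_, @parts);
        -e $path;
    } grep { !ref($_) } @INC;           # skip code-ref hooks on @INC
}

# Module to look for: first command-line argument, else Bio::Seq.
my $module = shift(@ARGV) || 'Bio::Seq';
my @dirs   = find_module_dirs($module);

if (@dirs) {
    print "$module found under: $_\n" for @dirs;
}
else {
    print "$module not found; directories searched:\n";
    print "  $_\n" for grep { !ref($_) } @INC;
}
```

Running 'perl -V' prints the same @INC list at the end of its output, which is a quick way to confirm that site\lib is on the search path at all.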
Chris On Oct 22, 2006, at 2:54 PM, Pablo Ivan wrote: > Hello, > > I have been trying to install Bioperl 1.4 on a Windows XP system, > but I > didn't get too far; my perl installation was made using ActiveState > 5.8.8build 816. I then tried the ppm method of searching for bioperl > in the > repositories and installing the core package 1.4. It says that the > installation was made successfully, but the /Bio folder doesn't > show up in > /lib, and it's like nothing new was installed at all. I was > wondering if > using that version of ActiveState could be causing it, but the > uninstall > option for it isn't showing in Add/Remove, and I'm afraid just > deleting the > folders and installing version 5.6 of AS could somehow damage and make > things worse. Or should I just forget about it and try using Cygwin? > > Thank you, > > Pablo. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From johnson.biotech at gmail.com Mon Oct 23 17:22:13 2006 From: johnson.biotech at gmail.com (Seth Johnson) Date: Mon, 23 Oct 2006 17:22:13 -0400 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: <002c01c6f6c8$7163dd20$15327e82@pyrimidine> References: <002c01c6f6c8$7163dd20$15327e82@pyrimidine> Message-ID: Chris, I have not cleaned my test database yet. I'll purge it and redo the tests. This error keeps popping up in unexpected places while running nmake during installation: "Undefined subroutine &main::UpdateHTML_blib called at -e line 1. NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code '0xff'" Is there a way around it?? Seth On 10/23/06, Chris Fields wrote: > > Seth, > > Did you try this with a clean, taxonomy-installed database? There may be > some junk left over tfrom the previous test runs. 
> > I'm looking into it this week; it may not make the developer release but > we'll try to get it in. BTW, the 02sinmpleseq.t test failures have to do > with a call to gzip. I'll look into a workaround for that. > > Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but > introduces others. One alternative which I found works is cygwin, but > there's a catch: DBD-mysql is hard to install. If it isn't one thing it's > another... > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > ------------------------------ > > *From:* Seth Johnson [mailto:johnson.biotech at gmail.com] > *Sent:* Monday, October 23, 2006 11:37 AM > *To:* Chris Fields > *Cc:* bioperl-l > *Subject:* Re: Error retrieving sequence from BioSQL > > > > Chris, > > There's definite improvement: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Failed Test Stat Wstat Total Fail Failed List of Failed > ------------------------------------------------------------------------------- > > t/02species.t 65 2 3.08% 63 65 > t/03simpleseq.t 1 256 59 106 179.66% 7-59 > t/04swiss.t 52 14 26.92% 25 27-34 38-42 > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > There's some weirdness going on during the 'swiss.t' test. It almost > seems to me that expectations of some tests are swapped (27 & 39, 28 & 40, > 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): > ================================ > not ok 25 > # Test 25 got: '10097078' (t/04swiss.t at line 79) > # Expected: '91309150' > ok 26 > not ok 27 > # Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t > at line 85) > # Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' 
> not ok 28 > # Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic > mitochondrial matrix protein' (t/04swiss.t at line 86) > # Expected: 'Functional expression of cloned human splicing factor SF2: > homology to RNA-binding proteins, U1 70K, and Drosophila splicing > regulators' > not ok 29 > # Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' > (t/04swiss.t at line 87) > # Expected: 'Cell 66 (2), 383-394 (1991)' > not ok 30 > # Test 30 got: (t/04swiss.t at line 88) > # Expected: '91309150' > not ok 31 > # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' > (t/04swiss.t at line 85 fail #2) > # Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., > Celis, J.E. and Leffers,H.' > not ok 32 > # Test 32 got: 'Functional expression of cloned human splicing factor SF2: > homology to RNA-binding proteins, U1 70K, and Drosophila splicing > regulators' (t/04swiss.t at line 86 fail #2) > # Expected: 'Cloning and expression of a cDNA covering the complete > coding region of the P32 subunit of human pre-mRNA splicing factor SF2' > not ok 33 > # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail > #2) > # Expected: 'Gene 134 (2), 283-287 (1993)' > not ok 34 > # Test 34 got: (t/04swiss.t at line 88 fail #2) > # Expected: '94085792' > ok 35 > ok 36 > ok 37 > not ok 38 > # Test 38 got: (t/04swiss.t at line 88 fail #3) > # Expected: '94253723' > not ok 39 > # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., > Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4) > # Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.' 
> not ok 40 > # Test 40 got: 'Cloning and expression of a cDNA covering the complete > coding region of the P32 subunit of human pre-mRNA splicing factor SF2' > (t/04swiss.t at line 86 fail #4) > # Expected: 'Crystal structure of human p32, a doughnut-shaped acidic > mitochondrial matrix protein' > not ok 41 > # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail > #4) > # Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' > not ok 42 > # Test 42 got: (t/04swiss.t at line 88 fail #4) > # Expected: '99199225' > ============================== > > On 10/20/06, *Chris Fields* < cjfields at uiuc.edu> wrote: > > > > Seth, > > Did you work out the problem here? There was a recent CVS update to OBDA > tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests > apparently left data from tests in the database, which caused problems > with > repeated test runs. > > Chris > > > > -----Original Message----- > > > From: bioperl-l-bounces lists.open-bio.org [mailto: bioperl-l- > > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > > Sent: Saturday, September 30, 2006 6:35 PM > > > To: Hilmar Lapp > > > Cc: Chris Fields; Bioperl List > > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > > > Here're complete test details: > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > ... 
> > > > > FAILED tests 10-12 > > > Failed 3/12 tests, 75.00% okay > > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > > > > -------------------------------------------------------------------------- > > > ----- > > > t\02species.t 65 2 3.08% 63 65 > > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > > t\16obda.t 12 3 25.00% 10-12 > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > -- Best Regards, Seth Johnson Senior Bioinformatics Associate Ph: (202) 470-0900 Fx: (775) 251-0358 From chhalling at alumni.ls.berkeley.edu Mon Oct 23 21:02:24 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Mon, 23 Oct 2006 21:02:24 -0400 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C6509.90005@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> Message-ID: <453D6620.5020401@alumni.ls.berkeley.edu> Sorry, I should know better about giving all the details. This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a fresh compile) with Mac OS X 10.4.8. -- Conrad Nathan S. Haigh wrote: > Chris Fields wrote: > >> Thanks for letting us know! Did PPM4 throw errors or just silently >> pass them over? >> >> Chris >> >> On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote: >> >> >> > I believe he is talking about the bundle on cpan and not the ppd. I will > get this updated as soon as possible. > > Sendu/Chris - can you confirm to me which Bioperl modules are essential > to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any > reason for not putting *all* dependencies into the bundle? 
> > Nath > > > > > > -- Conrad Halling chhalling at alumni.ls.berkeley.edu From n.haigh at sheffield.ac.uk Tue Oct 24 03:05:53 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 24 Oct 2006 08:05:53 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453D6620.5020401@alumni.ls.berkeley.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453D6620.5020401@alumni.ls.berkeley.edu> Message-ID: <453DBB51.6010505@sheffield.ac.uk> Conrad Halling wrote: > Sorry, I should know better about giving all the details. > > This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a > fresh compile) with Mac OS X 10.4.8. > > -- Conrad > > My apologies Conrad, this was my bad! Are you in need of the corrections being made swiftly or can you wait until the Bioperl 1.5.2 release when I'll ensure the Bundle is updated correctly for that release? Cheers Nath From n.haigh at sheffield.ac.uk Tue Oct 24 05:57:25 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Tue, 24 Oct 2006 10:57:25 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CE2D7.5080608@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> Message-ID: <453DE385.8010700@sheffield.ac.uk> --snip-- > Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be > treated higher than 1.4? Anyway, we can cross that bridge when we get > there, but this seems appropriate now. > > > Cheers, > Sendu. 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

I've just been having a think about this versioning. Does it work well, and is it intuitive, for versioning both the official 1.5.2 developer release and the 1.6 stable release?

I'd like to put forward the following versioning scheme for consideration (most is the same as what it is now, but with some clarification - hopefully):

major-version . minor-version sub-version _ developer-release-version RC-version

The sub-version represents bug-fixes and possibly some minor feature enhancements with no API changes.
The minor-version represents some significant feature enhancements/API changes/bug fixes.
The major-version represents significant rewrites of Bioperl.

For an RC of a developer release the version would have _0x (where x = the RC number)
For a non-RC of a developer release the version would have _10
For an RC of a stable release the version would have _0x (where x = the RC number)
For a non-RC of a stable release the version would not have the underscore suffix

Therefore I would see the following $VERSION being applied:
1.5.2 RC1 = 1.52_01
1.5.2 RC2 = 1.52_02
1.5.2 RC3 = 1.52_03
1.5.2 = 1.52_10
1.6 RC1 = 1.60_01
1.6 RC2 = 1.60_02
1.6 = 1.60
1.6.1 RC1 = 1.61_01
1.6.1 = 1.61

This should satisfy the requirement of CPAN for having underscores in versions to indicate a developer release, which here is a Bioperl release with an odd minor version number or any RC, whether it be of a developer release or a stable release. This should mean that we could have the RCs on CPAN, but by default CPAN would only install the latest "non developer release" (i.e. the last package without an underscore in the version).
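
As a rough sketch of the scheme above, the mapping from release label to $VERSION string could be written like this. The sub name and its arguments are made up for illustration, and it assumes the minor and sub-version are single digits, as in all the examples listed.

```perl
use strict;
use warnings;

# Build a $VERSION string following the proposed scheme.
# Hypothetical helper: the name and argument order are illustrative only.
#   $rc is the release-candidate number, or undef for a final release.
sub bioperl_version_string {
    my ($major, $minor, $sub, $rc) = @_;
    my $base = sprintf '%d.%d%d', $major, $minor, $sub;

    return sprintf('%s_%02d', $base, $rc) if defined $rc;  # any RC: _0x
    return $base . '_10' if $minor % 2;   # odd minor: developer release
    return $base;                         # even minor: stable, no suffix
}

# Print a few of the releases from the list above.
for my $release ([1, 5, 2, 1], [1, 5, 2, undef],
                 [1, 6, 0, 2], [1, 6, 0, undef], [1, 6, 1, undef]) {
    my ($major, $minor, $sub, $rc) = @$release;
    my $label = "$major.$minor.$sub" . (defined $rc ? " RC$rc" : '');
    printf "%-10s => %s\n", $label, bioperl_version_string(@$release);
}
```

Keeping the result as a string matters here: a quoted '1.52_10' keeps its underscore for CPAN to see, whereas an unquoted numeric literal would lose it.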
If we are going ahead with the new $VERSION scheme (as it currently is in HEAD), we should, for the sake of clarity, try to talk about Bioperl 1.52 instead of Bioperl 1.5.2 and make an effort to sync the documentation with regards to this. Nath From bix at sendu.me.uk Tue Oct 24 06:19:05 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 11:19:05 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453DE385.8010700@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> Message-ID: <453DE899.4030603@sendu.me.uk> Nathan Haigh wrote: > > Therefore I would see the following $VERSION being applied: > 1.5.2 RC1 = 1.52_01 > 1.5.2 RC2 = 1.52_02 > 1.5.2 RC3 = 1.52_03 > 1.5.2 = 1.52_10 > 1.6 RC1 = 1.60_01 > 1.6 RC2 = 1.60_02 > 1.6 = 1.60 > 1.6.1 RC1 = 1.61_01 > 1.6.1 = 1.61 > > This should satisfy the requirement of CPAN for having underscores in > versions to indicate a developer release, which here is a Bioperl > release with an odd minor version number or any RC whether it be of a > developer release or a stable release. This should mean that we could > have the RC's on CPAN, but by default, CPAN would only install the > latest "non developer release" (i.e. the last package without an > underscore in the version). That all sounds good to me, except I worry about potential confusion if people look manually at the things available in CPAN, see 1.60_02 and think it is more recent than 1.60 and try to install it manually. 
Since
$VERSION = 1.52_10;
is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, the final release version should be
$VERSION = 1.6010.

> If we are going ahead with the new $VERSION scheme (as it currently is
> in HEAD), we should, for the sake of clarity, try to talk about Bioperl
> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
> documentation with regards to this.

I might disagree with this though. I think perl people, and perhaps unix people in general, should be used to seeing version numbers like '1.5.2' but then getting '1.52' from the code, since the latter allows simple numerical comparisons while the former does not. The former is easier to read and understand. This is just how Perl itself behaves.

Most users who wouldn't expect such behaviour aren't going to be checking the version number programmatically anyway.

BTW, do we have someone with a CPAN account, or should I get one?

From n.haigh at sheffield.ac.uk Tue Oct 24 07:37:12 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 12:37:12 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DE899.4030603@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk>
Message-ID: <453DFAE8.5050602@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>
>> Therefore I would see the following $VERSION being applied:
>> 1.5.2 RC1 = 1.52_01
>> 1.5.2 RC2 = 1.52_02
>> 1.5.2 RC3 = 1.52_03
>> 1.5.2 = 1.52_10
>> 1.6 RC1 = 1.60_01
>> 1.6 RC2 = 1.60_02
>> 1.6 = 1.60
>> 1.6.1 RC1 =
1.61_01
>> 1.6.1 = 1.61
>>
>> This should satisfy the requirement of CPAN for having underscores in
>> versions to indicate a developer release, which here is a Bioperl
>> release with an odd minor version number or any RC whether it be of a
>> developer release or a stable release. This should mean that we could
>> have the RC's on CPAN, but by default, CPAN would only install the
>> latest "non developer release" (i.e. the last package without an
>> underscore in the version).
>>
>
> That all sounds good to me, except I worry about potential confusion if
> people look manually at the things available in CPAN, see 1.60_02 and
> think it is more recent than 1.60 and try to install it manually.
>
>

I'm not sure this would be a problem. As far as I understand, CPAN treats packages with underscores in $VERSION as something distinctly different from the other releases (i.e. as developer releases). If you look at such a page, it is clearly evident that it is a developer release. For example, if you search on CPAN for the latest version of the CPAN module it shows 1.8802. If you go to that page:
http://search.cpan.org/~andk/CPAN-1.8802/
there is also a link for the latest developer release, released 1 day after 1.8802 with a version of 1.88_57 (which would convert to 1.8857). This too appears to be later than 1.8802, but since it is dealt with as a developer release it doesn't seem to matter - CPAN will only deal with the stable (non-developer) releases by default, while the underscore versions remain a convenient way to get hold of developer releases. Although I'm thinking CPAN uses some hocus pocus with release dates too.

> Since
> $VERSION = 1.52_10;
> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release,
> final release version should be
> $VERSION = 1.6010.
>
>
>

Because they are dealt with separately, I don't think this is an issue (see above).
>> If we are going ahead with the new $VERSION scheme (as it currently is >> in HEAD), we should, for the sake of clarity, try to talk about Bioperl >> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the >> documentation with regards to this. >> > > I might disagree with this though. I think perl people, and perhaps unix > people in general, should be used to version numbers like '1.5.2', but > then getting '1.52' from the code since such a number allows simple > numerical comparisons while the former does not. The former is easier to > read and understand. This is just how Perl itself behaves. > > Most users who wouldn't expect such a behaviour aren't going to be > checking the version number programatically anyway. > > > BTW. do we have someone with a CPAN account, or should I get one? > It says Ewan Birney is the author of Bioperl - I assume it must be possible to have multiple people have the permissions to update a single package. Nath From chhalling at alumni.ls.berkeley.edu Tue Oct 24 07:15:12 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Tue, 24 Oct 2006 07:15:12 -0400 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453DBB51.6010505@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453D6620.5020401@alumni.ls.berkeley.edu> <453DBB51.6010505@sheffield.ac.uk> Message-ID: <453DF5C0.3040104@alumni.ls.berkeley.edu> Nathan S. Haigh wrote: > Conrad Halling wrote: >> Sorry, I should know better about giving all the details. >> >> This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 >> (a fresh compile) with Mac OS X 10.4.8. >> >> -- Conrad > My apologies Conrad, this was my bad! Are you in need of the > corrections being made swiftly or can you wait until the Bioperl 1.5.2 > release when I'll ensure the Bundle is updated correctly for that > release? > > Cheers > Nath No, I'm fine. 
I used the cpan utility to load the three modules manually. -- Conrad -- Conrad Halling chhalling at alumni.ls.berkeley.edu From bix at sendu.me.uk Tue Oct 24 08:16:54 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 13:16:54 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453DFAE8.5050602@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> Message-ID: <453E0436.3050903@sendu.me.uk> Nathan Haigh wrote: > Sendu Bala wrote: > >> That all sounds good to me, except I worry about potential confusion if >> people look manually at the things available in CPAN, see 1.60_02 and >> think it is more recent than 1.60 and try to install it manually. > > I not sure if this would be a problem. As far as I understand, CPAN > treats these packages with underscores in $VERSION as something > distinctly different to the others releases (i.e. developer releases). > If you look at such a page, it is clearly evident that it is a > developers release. For example, if you search on CPAN for the latest > version of the CPAN module is shows 1.8802. if you go to that page: > http://search.cpan.org/~andk/CPAN-1.8802/ > There is also a link for the latest developer release, released 1 day > after 1.8802 with a version of 1.88_57 (which would convert to 1.8857). [snip] >> Since >> $VERSION = 1.52_10; >> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, >> final release version should be >> $VERSION = 1.6010. 
> > Because they are dealt with separately, I don't think this is an issue > (see above). If you don't notice the dates, or are doing numerical version number comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may not be automatic, but you can still choose to download the developer releases. Which means if we say to someone 'use Bioperl 1.6 or better' they may choose to get the latest version and think it is 1.6002 when in fact 1.60 was the more recent version. 1.6010 solves the problem, is consistent with your 1.50_10 suggestion, and doesn't cause any problems as far as I can see. >>> If we are going ahead with the new $VERSION scheme (as it currently is >>> in HEAD), we should, for the sake of clarity, try to talk about Bioperl >>> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the >>> documentation with regards to this. >>> >> I might disagree with this though. I think perl people, and perhaps unix >> people in general, should be used to version numbers like '1.5.2', but >> then getting '1.52' from the code since such a number allows simple >> numerical comparisons while the former does not. The former is easier to >> read and understand. This is just how Perl itself behaves. >> >> Most users who wouldn't expect such a behaviour aren't going to be >> checking the version number programmatically anyway. >> >> >> BTW. do we have someone with a CPAN account, or should I get one? >> > > It says Ewan Birney is the author of Bioperl - I assume it must be > possible to have multiple people have the permissions to update a single > package. How did you get Bundle::BioPerl updated? Did you just ask Chris Dagdigian to do it for you? Or do you have access to his account? I'll ask Ewan about it.
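The ordering concern in the message above can be checked directly in Perl. The following is a minimal sketch, not Bioperl code: numify_version is a hypothetical helper that mimics the "$VERSION = eval $VERSION" idiom discussed in this thread, and the version strings are the ones used as examples above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): numify a version string the
# same way the "$VERSION = eval $VERSION" idiom does, so that '1.60_02'
# becomes the plain number 1.6002.
sub numify_version {
    my ($version_string) = @_;
    my $number = eval $version_string;
    die "cannot numify '$version_string': $@" if $@;
    return $number;
}

my $release_plain  = numify_version('1.60');     # final release numbered 1.60
my $rc2            = numify_version('1.60_02');  # second release candidate
my $release_padded = numify_version('1.6010');   # final release numbered 1.6010

# A plain 1.60 release compares lower than its own release candidate ...
print "RC2 outranks a final release numbered 1.60\n"
    if $rc2 > $release_plain;

# ... whereas a final release numbered 1.6010 compares higher than RC2.
print "a final release numbered 1.6010 outranks RC2\n"
    if $release_padded > $rc2;
```

Both conditions hold, which is the argument for numbering the final release 1.6010 rather than 1.60 if release candidates are also uploaded.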
From n.haigh at sheffield.ac.uk Tue Oct 24 08:21:56 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Tue, 24 Oct 2006 13:21:56 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0436.3050903@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> <453E0436.3050903@sendu.me.uk> Message-ID: <453E0564.9030302@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: >> Sendu Bala wrote: >> >>> That all sounds good to me, except I worry about potential confusion >>> if people look manually at the things available in CPAN, see 1.60_02 >>> and think it is more recent than 1.60 and try to install it manually. >> >> I not sure if this would be a problem. As far as I understand, CPAN >> treats these packages with underscores in $VERSION as something >> distinctly different to the others releases (i.e. developer releases). >> If you look at such a page, it is clearly evident that it is a >> developers release. For example, if you search on CPAN for the latest >> version of the CPAN module is shows 1.8802. if you go to that page: >> http://search.cpan.org/~andk/CPAN-1.8802/ >> There is also a link for the latest developer release, released 1 day >> after 1.8802 with a version of 1.88_57 (which would convert to 1.8857). > > [snip] > >>> Since >>> $VERSION = 1.52_10; >>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before >>> release, final release version should be >>> $VERSION = 1.6010. 
>> >> Because they are dealt with separately, I don't think this is an issue >> (see above). > > If you don't notice the dates, or are doing numerical version number > comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may > not be automatic, but you can still chose to download the developer > releases. Which means if we say to someone 'use Bioperl 1.6 or better' > they may choose to get the latest version and think it is 1.6002 when > infact 1.60 was the more recent version. 1.6010 solves the problem, is > consistent with your 1.50_10 suggestion, and doesn't cause any > problems as far as I can see. > > I see - you mean for a non-RC release append 10 to the version number and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to the version. --snip-- > > How did you get Bundle::BioPerl updated? Did you just ask Chris > Dagdigian to do it for you? Or do you have access to his account? I'll > ask Ewan about it. I just asked Chris D. to do it for me :o) Nath From bix at sendu.me.uk Tue Oct 24 09:01:22 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 14:01:22 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0564.9030302@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> <453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk> Message-ID: <453E0EA2.6050306@sendu.me.uk> Nathan Haigh wrote: > I see - you mean for a non-RC release append 10 to the version number > and if it's a 
developer release i.e. 1.5, 1.7 series, you append _10 to > the version. Precisely. 1.5.2 RC3 will have in Bio::Root::Version : $VERSION = 1.52_03; $VERSION = eval $VERSION; # $VERSION is 1.5203 1.5.2 final release would have: $VERSION = 1.52_10; $VERSION = eval $VERSION; # $VERSION is 1.5210 1.6.0 RC1 would have: $VERSION = 1.60_01; $VERSION = eval $VERSION; # $VERSION is 1.6001 1.6.0 final release would have: $VERSION = 1.6010; Nice thing about putting RCs up on CPAN is that I suppose we'd see the test results from cpantesters. The more test results the better :) From n.haigh at sheffield.ac.uk Tue Oct 24 09:05:54 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Tue, 24 Oct 2006 14:05:54 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0EA2.6050306@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> <453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk> <453E0EA2.6050306@sendu.me.uk> Message-ID: <453E0FB2.4080002@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: >> I see - you mean for a non-RC release append 10 to the version number >> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to >> the version. > > Precisely. 
> > 1.5.2 RC3 will have in Bio::Root::Version : > > $VERSION = 1.52_03; > $VERSION = eval $VERSION; # $VERSION is 1.5203 > > 1.5.2 final release would have: > > $VERSION = 1.52_10; > $VERSION = eval $VERSION; # $VERSION is 1.5210 > > 1.6.0 RC1 would have: > > $VERSION = 1.60_01; > $VERSION = eval $VERSION; # $VERSION is 1.6001 > > 1.6.0 final release would have: > > $VERSION = 1.6010; > > > Nice thing about putting RCs up on CPAN is that I suppose we'd see the > test results from cpantesters. The more test results the better :) Did you see the cpants site I sent earlier: http://cpants.perl.org/dist/bioperl But I'm not sure why 1.4 didn't make it in there instead of 1.2.3 From bix at sendu.me.uk Tue Oct 24 09:14:08 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 14:14:08 +0100 Subject: [Bioperl-l] CPAN testing Service In-Reply-To: <453D2120.9010301@sheffield.ac.uk> References: <453D2120.9010301@sheffield.ac.uk> Message-ID: <453E11A0.20304@sendu.me.uk> Nathan S. Haigh wrote: > We should also check the CPAN testing service (CPANTS) to see how "good" > our package is for CPAN and try to increase the Kwalitee score. There > only appears to be details for bioperl-1.2.3 for some reason: > http://cpants.perl.org/dist/bioperl Yes, but I think it will be pretty similar score this time round. We'll resolve the remaining issues for 1.6. From cjfields at uiuc.edu Tue Oct 24 10:24:44 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 24 Oct 2006 09:24:44 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0436.3050903@sendu.me.uk> Message-ID: <000501c6f778$279cee10$15327e82@pyrimidine> ... > >> Since > >> $VERSION = 1.52_10; > >> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, > >> final release version should be > >> $VERSION = 1.6010. > > > > Because they are dealt with separately, I don't think this is an issue > > (see above). 
> > If you don't notice the dates, or are doing numerical version number > comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may > not be automatic, but you can still choose to download the developer > releases. Which means if we say to someone 'use Bioperl 1.6 or better' > they may choose to get the latest version and think it is 1.6002 when > in fact 1.60 was the more recent version. 1.6010 solves the problem, is > consistent with your 1.50_10 suggestion, and doesn't cause any problems > as far as I can see. CPAN looks like it can handle 'x.y.z', at least for Pugs: http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/ From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': our $VERSION = 6.002013; That's also a very perlish way to do it. And there are no developer versions of Pugs, since it is always under active development. We could try something like: our $VERSION = 1.005002_01; just to tag it as a developer release or release candidate, if that's what you want; I'm neutral on that point. I don't think it's necessary to post every RC to CPAN, though, unless you feel very strongly about it. It just seems like more hassle than it's worth, esp. since you've been releasing about one per week leading up to a final 1.5.2 (due soon). > >> I might disagree with this though. I think perl people, and perhaps > unix > >> people in general, should be used to version numbers like '1.5.2', but > >> then getting '1.52' from the code since such a number allows simple > >> numerical comparisons while the former does not. The former is easier > to > >> read and understand. This is just how Perl itself behaves. > >> > >> Most users who wouldn't expect such a behaviour aren't going to be > >> checking the version number programmatically anyway. > >> > >> > >> BTW. do we have someone with a CPAN account, or should I get one?
> >> > > > > It says Ewan Birney is the author of Bioperl - I assume it must be > > possible to have multiple people have the permissions to update a single > > package. As a quick response to the above, I would read 'rel. 1.5.2' as the second patched release of the second revision (here in a developer cycle) of the first major release. I would read 'rel 1.52' as the 52nd release of the major release (just can't quite make it to version 2, I guess). I don't think we can use the latter as it is just too confusing, especially since we've adopted the 'major.minor.patch' versioning quite early on. As for CPAN, I believe there is usually a person or group responsible for maintaining each distribution. As Ewan seems to be the point man, you'll have to ask him. I suppose it is possible to add more if needed > How did you get Bundle::BioPerl updated? Did you just ask Chris > Dagdigian to do it for you? Or do you have access to his account? I'll > ask Ewan about it. When I inquired about XML::Simple, I emailed Chris D. via his contact information from CPAN. He let me know that adding it would be pretty easy, so all you need to do is let him know about any errors/additions/deletions. I think his wiki page also has some contact info. Which reminds me, if anyone contacts him, could you make sure that XML::Simple is added? I can't remember if it has been. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 24 10:29:11 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 24 Oct 2006 09:29:11 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0FB2.4080002@sheffield.ac.uk> Message-ID: <000601c6f778$c639f0e0$15327e82@pyrimidine> > Sendu Bala wrote: > > Nathan Haigh wrote: > >> I see - you mean for a non-RC release append 10 to the version number > >> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to > >> the version. > > > > Precisely. 
> > > > 1.5.2 RC3 will have in Bio::Root::Version : > > > > $VERSION = 1.52_03; > > $VERSION = eval $VERSION; # $VERSION is 1.5203 > > > > 1.5.2 final release would have: > > > > $VERSION = 1.52_10; > > $VERSION = eval $VERSION; # $VERSION is 1.5210 > > > > 1.6.0 RC1 would have: > > > > $VERSION = 1.60_01; > > $VERSION = eval $VERSION; # $VERSION is 1.6001 > > > > 1.6.0 final release would have: > > > > $VERSION = 1.6010; > > > > > > Nice thing about putting RCs up on CPAN is that I suppose we'd see the > > test results from cpantesters. The more test results the better :) > Did you see the cpants site I sent earlier: > http://cpants.perl.org/dist/bioperl > > But I'm not sure why 1.4 didn't make it in there instead of 1.2.3 Yes, odd. Another thing to note is that CPAN also list two bugs related to bioperl 1.4. We may need to have some way of either redirecting users from there to bugzilla, or routinely checking the CPAN site. Otherwise we'll miss those. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From JK at novozymes.com Tue Oct 24 10:45:26 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 16:45:26 +0200 Subject: [Bioperl-l] Keeping references around in the objects? Message-ID: <934F95E71B6C9347A873C42AE3C196191299E011@NZT0004E.dknz.nzcorp.net> Hi All. When getting a Bio::Seq object back from a feature it would be really nice to have access to the old objects through the new object as: $featseq->feature()->parent_seq(); Would it be possible to keep the references around for (as an example) to be able to access the global information through the particular feature. Most of the annotation in the general header of a EMBL/Genbank-record also applies to the specific features. Jesper From JK at novozymes.com Tue Oct 24 10:28:22 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 16:28:22 +0200 Subject: [Bioperl-l] Subclassing Bio::Seq ? 
Extending Bio::Perl Message-ID: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> Hi. We're trying to "extend" bioperl in our own setup. We have some functions that we'd like to "always" have available on a Bio::Seq-object. As an example, I'd like to have the sequence-digest available on ->digest that just returns a hex-encoded message-digest of the sequence in the object. This is really convenient when trying to figure out whether we've got some computations stored in the cache for this particular sequence. Another example is that we have some fields we want to be mandatory in the objects, thus adding additional checks in the constructor is necessary. Our approach has been to "subclass" Bio::Seq in a new object: (Nz::Seq) and add the functionality there. This generally works fine (->translate() calls ->can_call_new() and instantiates the correct subclassed object). But the logic fails when the ->seq of a feature just instantiates a Bio::PrimarySeq without trying to get the subclassed object. So the question basically is: What is the preferred way of extending/subclassing Bioperl objects with our own methods? Jesper From bix at sendu.me.uk Tue Oct 24 11:26:19 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 16:26:19 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <000501c6f778$279cee10$15327e82@pyrimidine> References: <000501c6f778$279cee10$15327e82@pyrimidine> Message-ID: <453E309B.9090007@sendu.me.uk> Chris Fields wrote: > ... >>>> Since >>>> $VERSION = 1.52_10; >>>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, >>>> final release version should be >>>> $VERSION = 1.6010. >>> Because they are dealt with separately, I don't think this is an issue >>> (see above). >> If you don't notice the dates, or are doing numerical version number >> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may >> not be automatic, but you can still choose to download the developer >> releases.
Which means if we say to someone 'use Bioperl 1.6 or better' >> they may choose to get the latest version and think it is 1.6002 when >> infact 1.60 was the more recent version. 1.6010 solves the problem, is >> consistent with your 1.50_10 suggestion, and doesn't cause any problems >> as far as I can see. > > CPAN looks like it can handle 'x.y.z', at least for Pugs: > > http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/ 'handle'? I think it shows up as '6.2.13' simply because it was uploaded with the filename Perl6-Pugs-6.2.13.tar.gz As you point out, the code has the kind of $VERSION number we've been suggesting in this thread: > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': > > our $VERSION = 6.002013; > > That's also a very perlish-way to do it. And there are no developer > versions of Pugs, since it is always under active development. We could try > something like: > > our $VERSION = 1.005002_01; Yes, this was already like one of my suggestions (1.0502_01), but I brought up the concern that 1.05 might be < 1.4. So then we have a question: do we try and fumble a 1.4 compatible number by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no room for RC numbering, or 1.006000010 (1.6.0.10) - the first final release following some 1.006000_001 (1.6.0.01 == rc1) RCs? > just to tag it as a developer release or release candidate, if that's what > you want; I'm neutral to that point. I don't think it's necessary to post > every RC to CPAN, though, unless you feel very strongly about it. It just > seems like more hassle than it's worth, esp. since you've been releasing > about one per week leading up to a final 1.5.2 (due soon). I don't think it would be a hassle; on the contrary it would be very useful to know the CPAN distribution actually works. I'm very happy with the idea that a release candidate gets fully tested... 
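As an illustration of the three-digits-per-component numbering discussed above (the form seen in the Pugs source, where 6.2.13 is written as 6.002013), here is a small sketch. dotted_to_decimal is a hypothetical helper, not part of Bioperl or any CPAN tooling; it only shows why a number such as 1.006000 compares lower than the existing 1.4.

```perl
use strict;
use warnings;

# Hypothetical helper: encode 'major.minor.patch' as a decimal string with
# three digits per component (6.2.13 -> 6.002013). Missing components are
# treated as 0.
sub dotted_to_decimal {
    my ($dotted) = @_;
    my ($major, $minor, $patch) = split /\./, $dotted;
    return sprintf '%d.%03d%03d', $major, $minor || 0, $patch || 0;
}

my $pugs    = dotted_to_decimal('6.2.13');  # '6.002013'
my $bioperl = dotted_to_decimal('1.5.2');   # '1.005002'
my $next    = dotted_to_decimal('1.6.0');   # '1.006000'

print "$pugs $bioperl $next\n";

# The compatibility concern: compared as plain numbers, the new-style
# 1.006000 sorts below the old-style 1.4 that is already on CPAN.
print "1.6.0 would look older than 1.4\n" if $next < 1.4;
```

This is why the thread weighs a clean break (new-style numbers only) against a number such as 1.60_10 that stays above 1.4.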
From bix at sendu.me.uk Tue Oct 24 11:39:16 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 16:39:16 +0100 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> Message-ID: <453E33A4.5060004@sendu.me.uk> JK (Jesper Agerbo Krogh) wrote: > Hi. > > We're trying to "extend" bioperl in our own setup. We have some funtions > that we'd like to "allways" have available on a Bio::Seq-object. [snip] > So the question basically is: > What is the preferred way of extending/subclassing Bio-perl -objects > with our own methods? http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit From hlapp at gmx.net Tue Oct 24 12:24:09 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 24 Oct 2006 12:24:09 -0400 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> Message-ID: I think you've generally taken the right path, but see below. First off, object factories are used extensively already but not yet in each and every place where Bioperl creates an object internally. Achieving your goal may entail fixes to Bioperl to use a factory instead of a hard-coded module name. Also be on the lookout for factory() or seq_factory() methods for classes whose work entails creating sequence objects and that already give you control over the type to be created. The problem that hits you here though isn't one of determining the type of the object to be created, because the respective method doesn't create a sequence object. It only returns the sequence object that the feature has a reference to. 
The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your extension of the latter is that the Perl garbage collector can't deal with circular references. The way we've circumvented the problem with sequence objects (which hold references to their feature objects) and feature objects (which need to hold a reference to their sequence object) is to make Bio::Seq a wrapper around Bio::PrimarySeq (i.e., Bio::Seq implements Bio::PrimarySeqI by delegating all the Bio::PrimarySeqI methods to an instance of Bio::PrimarySeq, and then adds implementations of the Bio::SeqI methods), and then make feature objects only hold a reference to the 'base' Bio::PrimarySeq instance. This works because Bio::PrimarySeq doesn't hold features, only Bio::SeqI objects do. Having said all that, note that if all you want to do is define computations on Bio::Seq objects, as opposed to storing values for additional attributes, the best design approach is not to extend the class but to create a class with those computations as static methods (which would accept the seq object on which to compute as an argument; e.g., print $seqComputations->message_digest($seq)). -hilmar On Oct 24, 2006, at 10:28 AM, JK ((Jesper Agerbo Krogh)) wrote: > Hi. > > We're trying to "extend" bioperl in our own setup. We have some > functions > > that we'd like to "always" have available on a Bio::Seq-object. As an > example, > I'd like to have the sequence-digest available on ->digest that just > returns > A hex-encoded message-digest of the sequence in the object. This is > really comfortable > when trying to figure out whether we've got some computations stored in > the cache > for this particular sequence. > > Another example is that we have some fields we want to be mandatory in > the objects, > thus adding additional checks in the constructor is necessary. > > Our approach has been to "subclass" Bio::Seq in a new object: > (Nz::Seq) > and add > the functionality there.
This generally works fine (->translate() > calls > ->can_call_new() > and instantiates the correct subclassed object. > > But the logic fails when the ->seq of a feature just instantiates a > Bio::PrimarySeq > without trying to get the subclassed object. > > So the question basically is: > What is the preferred way of extending/subclassing Bio-perl -objects > with > our own methods? > > Jesper > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 24 12:45:25 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 24 Oct 2006 11:45:25 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E309B.9090007@sendu.me.uk> Message-ID: <000001c6f78b$d1c65a30$15327e82@pyrimidine> ... > > 'handle'? I think it shows up as '6.2.13' simply because it was uploaded > with the filename Perl6-Pugs-6.2.13.tar.gz Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is '6.002013'. So maybe we should follow a similar convention. Seems easier and less confusing to me, at least. > As you point out, the code has the kind of $VERSION number we've been > suggesting in this thread: > > > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': > > > > our $VERSION = 6.002013; > > > > That's also a very perlish-way to do it. And there are no developer > > versions of Pugs, since it is always under active development. We could > try > > something like: > > > > our $VERSION = 1.005002_01; > > Yes, this was already like one of my suggestions (1.0502_01), but I > brought up the concern that 1.05 might be < 1.4. 
> > So then we have a question: do we try and fumble a 1.4 compatible number > by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if > it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no > room for RC numbering, or 1.006000010 (1.6.0.10) - the first final > release following some 1.006000_001 (1.6.0.01 == rc1) RCs? I would go for the clean break if it follows perl/CPAN convention. '1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing. If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6 RC1, 1.6 RC2 etc then that would be consistent and perl-compatible. BTW, the reason I looked at Pugs was to see what some of the Perl6 developers were using. Who knows; they'll probably change it! ... > I don't think it would be a hassle; on the contrary it would be very > useful to know the CPAN distribution actually works. I'm very happy with > the idea that a release candidate gets fully tested... So you obviously feel strongly about it! ;> I don't have a problem as long as we stick with doing this from now on (i.e. have a consistent versioning scheme, release policy, CPAN release policy, etc). Would be nice for Jason/Brian/Hilmar to chime in as to the reasoning behind the older versioning scheme. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From JK at novozymes.com Tue Oct 24 13:59:10 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 19:59:10 +0200 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n et> Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net> > > I think you've generally taken the right path, but see below. > > First off, object factories are used extensively already but not yet > in each and every place where Bioperl creates an object internally. 
> Achieving your goal may entail fixes to Bioperl to use a factory > instead of a hard-coded module name. Also be on the lookout for > factory() or seq_factory() methods for classes whose work entails > creating sequence objects and that already give you control over the > type to be created. Can you elaborate/describe this a bit more? > The problem that hits you here though isn't one of determining the > type of the object to be created, because the respective method > doesn't create a sequence object. It only returns the sequence object > that the feature has a reference to. This was what Data::Dumper told me, but another thing I'd like to change is to get a RichSeq object returned every time from Bio::Seq, adding in the stuff that always seems appropriate. > The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your > extension of the latter is that the Perl garbage collector can't deal > with circular references. Doesn't Scalar::Util::weaken solve that? > Having said all that, note that if all what you want to do is > defining computations on Bio::Seq objects, as opposed to storing > values for additional attributes, the best design approach is not to > extend the class but to create a class with those computations as > static methods (which would accept the seq object on which to compute > as an argument; e.g., print $seqComputations->message_digest($seq)). I could, but there is some functionality that, by design, I'd like to have available on every sequence in the system. Otherwise I would end up coding the functionality for getting the message_digest in every place where I needed the value (which would be quite often in this application), whereas by design it belongs in the Bio::Seq code. Jesper From JK at novozymes.com Tue Oct 24 13:59:19 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 19:59:19 +0200 Subject: [Bioperl-l] Subclassing Bio::Seq ?
Extending Bio::Perl References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> <453E33A4.5060004@sendu.me.uk> Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FD@NZT0004E.dknz.nzcorp.net> > JK (Jesper Agerbo Krogh) wrote: > > Hi. > > > > We're trying to "extend" bioperl in our own setup. We have some functions > > that we'd like to "always" have available on a Bio::Seq-object. > [snip] > > So the question basically is: > > What is the preferred way of extending/subclassing Bioperl objects > > with our own methods? > > http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit That is definitely a way of extending Bioperl, thanks. Jesper From hlapp at gmx.net Tue Oct 24 14:57:02 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 24 Oct 2006 14:57:02 -0400 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net> References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net> Message-ID: On Oct 24, 2006, at 1:59 PM, JK ((Jesper Agerbo Krogh)) wrote: >> >> I think you've generally taken the right path, but see below. >> >> First off, object factories are used extensively already but not yet >> in each and every place where Bioperl creates an object internally. >> Achieving your goal may entail fixes to Bioperl to use a factory >> instead of a hard-coded module name. Also be on the lookout for >> factory() or seq_factory() methods for classes whose work entails >> creating sequence objects and that already give you control over the >> type to be created. > > Can you elaborate/describe this a bit more? See for example the POD of Bio::SeqIO (sorry, the method is called sequence_factory()). > >> The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your >> extension of the latter is that the Perl garbage collector can't deal >> with circular references.
> > Doesn't Scalar::Util::weaken solve that? You're welcome to test and try. It should be a simple change in Bio::Seq::add_SeqFeature(). You will see that it is this method and not the feature object that makes sure the wrapped primarySeq gets passed as sequence reference. Just change that to creating a new reference to the sequence object and make it a weak reference before passing it to the feature object. (The feature object has no requirement (or knowledge) that the referenced sequence object is a PrimarySeq.) > >> Having said all that, note that if all what you want to do is >> defining computations on Bio::Seq objects, as opposed to storing >> values for additional attributes, the best design approach is not to >> extend the class but to create a class with those computations as >> static methods (which would accept the seq object on which to compute >> as an argument; e.g., print $seqComputations->message_digest($seq)). > > I could but there are some functionality that I'd by design would > like to > have available on every sequence in the system. This way I would > end up > coding the functionality for getting the message_digest every place > that > I needed to get the value (which would be quite often in this > application), > whereas it by design belongs into the Bio::Seq-stuff. I'm not following you why this would make any difference (it would be $seq->message_digest() compared to $seqCompute->message_digest ($seq)), unless what you are saying is that you would like to cache the result of the computation. 
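The static-method design suggested in this exchange could look roughly like the sketch below. Everything in it is an assumption made for illustration: Demo::Seq is a stand-in for Bio::Seq so the example runs without Bioperl installed, Demo::SeqCompute is a hypothetical utility class, and MD5 is an arbitrary choice of digest since the thread does not name one.

```perl
use strict;
use warnings;

{
    # Stand-in for Bio::Seq (assumption): only provides the seq() accessor
    # that the computation below relies on.
    package Demo::Seq;

    sub new {
        my ($class, %args) = @_;
        return bless { seq => $args{-seq} }, $class;
    }

    sub seq { return $_[0]->{seq} }
}

{
    # Hypothetical utility class holding computations as class methods,
    # instead of adding them to a Bio::Seq subclass.
    package Demo::SeqCompute;

    use Digest::MD5 qw(md5_hex);

    # Return a hex-encoded digest of the sequence string of any object
    # that implements seq().
    sub message_digest {
        my ($class, $seq_obj) = @_;
        return md5_hex( $seq_obj->seq );
    }
}

my $seq = Demo::Seq->new( -seq => 'ATGGCC' );
print Demo::SeqCompute->message_digest($seq), "\n";
```

The caller writes Demo::SeqCompute->message_digest($seq) rather than $seq->message_digest(). Because the method only needs seq(), it should apply equally to a subclassed sequence object and to the plain sequence object returned by a feature.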
	-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From bix at sendu.me.uk Wed Oct 25 06:36:27 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 25 Oct 2006 11:36:27 +0100
Subject: [Bioperl-l] Lagan environment variable
Message-ID: <453F3E2B.2040309@sendu.me.uk>

Notification to say I'm changing the environment variable that
Bio::Tools::Run::Alignment::Lagan expects to define the location of the
lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the
default variable that the lagan installation and scripts themselves look
for.

I hope this isn't too much of a burden, but it seems like the sensible
approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.

Thank you,
Sendu.

From n.haigh at sheffield.ac.uk Wed Oct 25 09:07:47 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 25 Oct 2006 13:07:47 +0000
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F3E2B.2040309@sendu.me.uk>
References: <453F3E2B.2040309@sendu.me.uk>
Message-ID: <453F61A3.4090904@sheffield.ac.uk>

Sendu Bala wrote:
> Notification to say I'm changing the environment variable that
> Bio::Tools::Run::Alignment::Lagan expects to define the location of the
> lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the
> default variable that the lagan installation and scripts themselves look
> for.
>
> I hope this isn't too much of a burden, but it seems like the sensible
> approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.
>
> Thank you,
> Sendu.

Wouldn't it make more sense to change the test? That is what I've just
done for t/Genscan.t

It seemed to fit in with the ENV variable syntax that other modules in
Bioperl-run used.

Nath

From bix at sendu.me.uk Wed Oct 25 08:12:00 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 25 Oct 2006 13:12:00 +0100
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F61A3.4090904@sheffield.ac.uk>
References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk>
Message-ID: <453F5490.7060808@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Notification to say I'm changing the environment variable that
>> Bio::Tools::Run::Alignment::Lagan expects to define the location of the
>> lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the
>> default variable that the lagan installation and scripts themselves look
>> for.
>>
>> I hope this isn't too much of a burden, but it seems like the sensible
>> approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.
>
> Wouldn't it make more sense to change the test? That is what I've just
> done for t/Genscan.t

For Genscan.t, the test script looked at the wrong environment variable.

Here I'm talking about lagan itself (the thing you get from
http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with
Bioperl) needing the environment variable LAGAN_DIR to be set in order
to work.

Since you need to set LAGAN_DIR to make lagan work, it makes sense that
the Bioperl front-end to lagan also use the same variable.

From n.haigh at sheffield.ac.uk Wed Oct 25 09:16:16 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 25 Oct 2006 13:16:16 +0000
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F5490.7060808@sendu.me.uk>
References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk> <453F5490.7060808@sendu.me.uk>
Message-ID: <453F63A0.7040609@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
[snip]
>> Wouldn't it make more sense to change the test? That is what I've just
>> done for t/Genscan.t
>
> For Genscan.t, the test script looked at the wrong environment variable.
>
> Here I'm talking about lagan itself (the thing you get from
> http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with
> Bioperl) needing the environment variable LAGAN_DIR to be set in order
> to work.
>
> Since you need to set LAGAN_DIR to make lagan work, it makes sense
> that the Bioperl front-end to lagan also use the same variable.

Ah, OK! :-[ That'll teach me to speak up about something I know nothing
about! :-)

FYI, I've been busy this morning installing as much Bioperl-run external
software as I could (those that have tests). Will be posting results
shortly.
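
For anyone with LAGANDIR still set in a shell profile, the variable change
discussed in this thread can be sketched as a small lookup. This is a
hypothetical helper, not code from Bio::Tools::Run::Alignment::Lagan; the
function name lagan_dir and the fallback to the old LAGANDIR name are
assumptions made for illustration only (the thread describes switching to
LAGAN_DIR, not keeping a fallback).

```perl
use strict;
use warnings;

# Hypothetical helper: return the lagan directory from an environment
# hash, preferring LAGAN_DIR (the name lagan itself looks for) and
# falling back to the older LAGANDIR name with a warning.
sub lagan_dir {
    my ($env) = @_;    # hash reference, normally \%ENV
    return $env->{LAGAN_DIR} if defined $env->{LAGAN_DIR};
    if ( defined $env->{LAGANDIR} ) {
        warn "LAGANDIR is the old variable name; please set LAGAN_DIR\n";
        return $env->{LAGANDIR};
    }
    return;            # neither variable is set
}

my $dir = lagan_dir( \%ENV );
print defined $dir ? "lagan directory: $dir\n" : "LAGAN_DIR is not set\n";
```

Passing the environment in as a hash reference keeps the lookup easy to
exercise in a test script without touching the real %ENV.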
Nath

From massimo.ubaldi at gmail.com Wed Oct 25 10:28:52 2006
From: massimo.ubaldi at gmail.com (Massimo Ubaldi)
Date: Wed, 25 Oct 2006 16:28:52 +0200
Subject: [Bioperl-l] blastxml format
Message-ID: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>

Hi,
I'm using the script below to parse blastn output for multiple sequences.
I got the output from the BLAST web interface, asking for XML-formatted
output.
Everything works fine except that I cannot print the name of each input
sequence (see below).
That is, using the line (see below) $result->query_description I get just
the name of the first sequence. In fact this is defined by the
<BlastOutput_query-def> tag.
What I really want is to extract the name that is defined by the
<Iteration_query-def> tag.
I have searched the bioperl mailing list and other sources but I did not
find anything to solve this.
Can somebody help me?
Thanks a lot,
Massimo


This is an example of the output I got

MRDNA_probe
46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor form B (LOC562171), mRNA 68354945 XM_685568
81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA 68420187 XM_684078

This is what I'd like to get

MRDNA_probe
46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor form B (LOC562171), mRNA 68354945 XM_685568
VDRacterm_probe
81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633
ARalpcterm_probe
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA 68420187 XM_684078

This is the script

#!/usr/bin/perl
use strict;
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast',
                           -file   => 'Blastn_danio.bls');
open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, stopped";
my $result = $in->next_result;
print OUTFILE $result->algorithm, "\n";
print OUTFILE $result->database_name, "\n";

print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers",
    "\t", "GenBank Accession", "\n";

while($result = $in->next_result ) {
    print OUTFILE $result->query_description, "\n";
    while( my $hit = $result->next_hit ) {
        while( my $hsp = $hit->next_hsp ) {

            my $acc=$hit->name;
            my $description= $hit->description;

            $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/;

            print OUTFILE
                $hit->raw_score, "\t",   # Score
                $hit->description, "\t", # Description
                $1, "\t", $2, "\n";
        }
    }
}

From cjfields at uiuc.edu Wed Oct 25 11:04:14 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Oct 2006 10:04:14 -0500
Subject: [Bioperl-l] blastxml format
In-Reply-To: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>
Message-ID: <000301c6f846$d6227760$15327e82@pyrimidine>

Iterations (which are related to PSIBLAST) aren't currently handled in
blastxml, which is why the tag isn't being parsed. I'll give it a look but
I don't think it will be properly fixed anytime soon, since we're gearing up
for a developer release and are sorting out various bugs in relation to
that.

In the meantime, you could always try changing the relevant tag in the
%MAPPING hash in your local copy of Bio::SearchIO::blastxml from
'BlastOutput_query-def' to 'Iteration_query-def', which may do the trick for
you. I'm a bit reluctant to change this in CVS as it would be better to add
this in when iterations are handled properly by blastxml, and I'm not sure
all BLAST XML varieties have the tag.

If you want you can add this to the bioperl bugzilla as an enhancement
request to remind us:

http://bugzilla.open-bio.org/

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From massimo.ubaldi at gmail.com Wed Oct 25 11:20:49 2006
From: massimo.ubaldi at gmail.com (Massimo Ubaldi)
Date: Wed, 25 Oct 2006 17:20:49 +0200
Subject: [Bioperl-l] blastxml format
In-Reply-To: <000301c6f846$d6227760$15327e82@pyrimidine>
References: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com> <000301c6f846$d6227760$15327e82@pyrimidine>
Message-ID: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com>

Thanks for the reply. I've already tried this but I got exactly the same
results as before.
What else can I try?
Massimo

On 10/25/06, Chris Fields wrote:
[snip]

From cjfields at uiuc.edu Wed Oct 25 12:56:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Oct 2006 11:56:46 -0500
Subject: [Bioperl-l] blastxml format
In-Reply-To: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com>
Message-ID: <000001c6f856$8ee44bc0$15327e82@pyrimidine>

> Thanks for the reply. I've already tried this but I got exactly the same
> results as before.
> What else can I try?
> Massimo

If you don't mind me asking, what version of perl and Bioperl are you using,
and what version of BLAST is used?

I want to point out there are a number of problems with your script, now
that I have had a chance to look at it.

1) You have the SearchIO format set to 'blast'. It should be 'blastxml' if
you are parsing XML format.

2) Every time you call next_result() you advance to the next BLAST report.
In effect, you're doing something like this:

my $result = $in->next_result();
# ... do something here (in first BLAST report)
while ($result = $in->next_result()) { # change to second BLAST report
    # more stuff here (in second BLAST report, if there is one)
}

I don't know if it's intentional though, but it's something to point out.

3) You also use raw_score(), which doesn't return a value for me (this may
be related to the bioperl version, which is why I asked above). If you use
$hit->bits() or $hit->significance() you can get the bits or hit evalue,
respectively.

4) Also, I didn't see a difference between the two XML tags when using BLAST
2.2.15 output (WebBLAST at NCBI), which makes sense since they should
originate from the same query sequence anyway. This could be related to the
BLAST version.

Here's my version of your script, using WinXP and bioperl-live (CVS):

use Bio::SearchIO;

my $file = shift @ARGV;
my $in = new Bio::SearchIO(-format => 'blastxml',
                           -file   => $file);
# 'or' rather than '||': with '||' the die attaches to the file name
# string and never fires when open() fails
open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, stopped";

while(my $result = $in->next_result ) {
    print OUTFILE $result->algorithm, "\n";
    print OUTFILE $result->database_name, "\n";
    print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers",
        "\t", "GenBank Accession", "\n";
    print OUTFILE $result->query_description, "\n";
    while( my $hit = $result->next_hit ) {
        while( my $hsp = $hit->next_hsp ) {
            my $acc=$hit->name;
            my $description= $hit->description;
            # print only when the hit name has the expected gi|...| form
            if ($acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/) {
                print OUTFILE
                    $hit->bits, "\t",        # Score
                    $hit->description, "\t", # Description
                    $1, "\t", $2, "\n";
            }
        }
    }
}

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

...
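
The regular expression used in the scripts in this thread can be checked on
its own, without Bioperl or a BLAST report. The following is a small sketch
under stated assumptions: the helper name parse_hit_name is hypothetical,
and the example hit name is made up in the gi|number|db|accession.version|
form (only the gi number and accession come from the output quoted earlier;
the "ref" field and ".1" version are assumed).

```perl
use strict;
use warnings;

# Split an NCBI-style hit name into its gi number and unversioned
# accession, using the same pattern as the scripts in this thread.
# Returns an empty list when the name does not match, so a caller
# never reuses stale $1/$2 values from an earlier match.
sub parse_hit_name {
    my ($name) = @_;
    if ( $name =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/ ) {
        return ( $1, $2 );
    }
    return;
}

# The first name is an assumed example; the second shows the no-match case.
for my $name ( 'gi|68354945|ref|XM_685568.1|', 'not_a_gi_name' ) {
    my ( $gi, $acc ) = parse_hit_name($name);
    if ( defined $gi ) {
        print "$name\t$gi\t$acc\n";
    }
    else {
        print "$name\tno match\n";
    }
}
```

Guarding on the match result is the same design choice as the if() around
the print statement in the revised script.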
From n.haigh at sheffield.ac.uk Thu Oct 26 04:47:27 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 09:47:27 +0100
Subject: [Bioperl-l] More extensive Bioperl-run 1.5.2RC2 tests
Message-ID: <4540761F.6010904@sheffield.ac.uk>

Oops, I posted this to the Biojava list the other day by mistake!

I have recently installed some more software for which there are
bioperl-run tests and run the test suite with several versions of the
software I could find. I've added info to
http://www.bioperl.org/wiki/Release_1.5.2#bioperl-run. If there were
any fails in any of the versions I tested I've noted them together with
versions that were ok (if any).

There may be another 6 or so programs I'm trying to get hold of to run
further tests - I'll update when I get them.

Nath

From n.haigh at sheffield.ac.uk Thu Oct 26 05:14:07 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 10:14:07 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
Message-ID: <45407C5F.40104@sheffield.ac.uk>

I'm thinking that it's not wise to test for things like
overall_percentage_identity etc in alignments that are generated by
external software like T-Coffee, Clustalw etc. Changes to software
algorithms/efficiency, bug fixes etc may well alter the quality of the
alignment produced in different versions and thus affect the value
returned by such methods. Therefore, I think these methods should only
be tested from alignments loaded directly from t/data.
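
The point about fixed input can be illustrated with a toy identity measure.
This is only a sketch under stated assumptions: percent_identical_columns
is a hypothetical function with a deliberately simplified definition (it is
not Bio::SimpleAlign's overall_percentage_identity), and the alignment is
made-up data. Because the input is hard-coded, the expected value cannot
drift when an external aligner changes its defaults.

```perl
use strict;
use warnings;

# Simplified stand-in for an identity measure: the percentage of
# alignment columns in which every sequence has the same non-gap
# residue. NOT the Bio::SimpleAlign definition.
sub percent_identical_columns {
    my @rows = @_;
    die "no sequences\n" unless @rows;
    my $len = length $rows[0];
    die "empty alignment\n" unless $len;
    die "rows differ in length\n" if grep { length($_) != $len } @rows;

    my $identical = 0;
    for my $col ( 0 .. $len - 1 ) {
        # collect the distinct characters seen in this column
        my %seen = map { substr( $_, $col, 1 ) => 1 } @rows;
        my ($residue) = keys %seen;
        $identical++ if keys(%seen) == 1 && $residue ne '-';
    }
    return 100 * $identical / $len;
}

# A small hard-coded alignment (made-up data): columns 1, 2, 4 and 6
# are identical, so 4 of 6 columns count.
my @fixed = ( 'ACGT-A', 'ACGTTA', 'ACCT-A' );
printf "%.2f%% identical columns\n", percent_identical_columns(@fixed);
```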
Nath

From bix at sendu.me.uk Thu Oct 26 05:48:37 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 26 Oct 2006 10:48:37 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45407C5F.40104@sheffield.ac.uk>
References: <45407C5F.40104@sheffield.ac.uk>
Message-ID: <45408475.30903@sendu.me.uk>

Nathan Haigh wrote:
> I'm thinking that it's not wise to test for things like
> overall_percentage_identity etc in alignments that are generated by
> external software like T-Coffee, Clustalw etc. Changes to software
> algorithms/efficiency, bug fixes etc may well alter the quality of the
> alignment produced in different versions and thus affect the value
> returned by such methods. Therefore, I think these methods should only
> be tested from alignments loaded directly from t/data.

Did you discover some specific problem cases?

From n.haigh at sheffield.ac.uk Thu Oct 26 06:04:54 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 11:04:54 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408475.30903@sendu.me.uk>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
Message-ID: <45408846.1050001@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> I'm thinking that it's not wise to test for things like
>> overall_percentage_identity etc in alignments that are generated by
>> external software like T-Coffee, Clustalw etc. Changes to software
>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>> alignment produced in different versions and thus affect the value
>> returned by such methods. Therefore, I think these methods should only
>> be tested from alignments loaded directly from t/data.
>
> Did you discover some specific problem cases?

My messages seem to be taking a while to come through, but, yes.
It may be due to the software changing default parameters, but it makes testing the output for specific details pretty difficult and inconsistent. For example, running T-Coffee, the following command from t/TCoffee.t results in slightly different alignment: $aln = $factory->run('-type' => 'profile', '-profile' => $aln1, '-seq' => Bio::Root::IO->catfile("t","data","cysprot1b.fa")); Of particular note, is the gaps on the last line of the sequences. In 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in CATH_RAT/1-333 ------mwtalpllcagawllsagat----------aeltvnaiek------------fh ftswmkqhqktyss-reyshrlqvfannwrkiqahn----qrnhtfkmglnqfsdmsfae ikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqgacgscwtfs ttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqafeyilynk gimgedsypyigkngqckfnpekavafvknvv-nitlndeaamveavalynpvsfafevt -edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivknswgsnwgnn gyfliergk-nm---cglaacasypipqv >CATL_HUMAN/1-333 --------------------------------mnptlilaafclgiasatltfdhsleaq wtkwkamhnrlygmnee-gwrravweknmkmielhnqeyregkhsftmamnafgdmtsee frqvmngfqnrkpr----kgkvfqeplfyeaprsvdwrekg-yvtpvknqgqcgscwafs atgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdyafqyvqdng gldseesypyeateesckynpkysvandtgfv-dip-kqekalmkavatvgpisvaidag hesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvknswgeewgmg gyvkmakdrrnh---cgiasaasyptv-- >CATL_RAT/1-334 --------------------------------mtpllllavlclgtalatpkfdqtfnaq whqwksthrrlygtnee-ewrravweknmrmiqlhngeysngkhgftmemnafgdmtnee frqivngyrhqkhk----kgrlfqeplmlqipktvdwrekg-cvtpvknqgqcgscwafs asgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfafqyikeng gldseesypyeakdgsckyraeyavandtgfv-dip-qqekalmkavatvgpisvamdas hpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvknswgkewgmd gyikiakdrnnh---cglataasypivn- >PAPA_CARPA/1-345 mamipsiskllfvaiclfvymglsfg-------------dfsivgysqndltsterliql feswmlkhnkiyknidekiyrfeifkdnlkyidetn----kknnsywlglnvfadmsnde fkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgscgscwafs 
avvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsalqlvaqy- gihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysian-qpvsvvleaa gkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yiliknswgtgwgen gyirikrgtgnsygvcglytssfypvkn- >ALEU_HORVU/1-362 maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtrhalr farfavrygksyesaaevrrrfrifsesleevrstn----rkglpyrlginrfsdmswee fqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqahcgscwtfs ttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqafeyikyng gidteesypykgvngvchykaenaavqvldsv-nitlnaedelknavglvrpvsvafqvi -dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywliknswgadwgdn gyfkmemgk-nm---caiatcasypvvaa >CATH_HUMAN/1-335 ------mwatlpllcagawllg--------vpvcgaaelsvnslek------------fh fkswmskhrktys-teeyhhrlqtfasnwrkinahn----ngnhtfkmalnqfsdmsfae ikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqgacgscwtfs ttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqafeyilynk gimgedtypyqgkdgyckfqpgkaigfvkdva-nitiydeeamveavalynpvsfafevt -qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivknswgpqwgmn gyfliergk-nm---cglaacasypiplv >CYS1_DICDI/1-343 -----mkvillfvlavftvfvs---------------srgippeeq------------sq flefqdkfnkkys-heeylerfeifksnlgkieelnliainhkadtkfgvnkfadlssde fknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgqcgscwsfs ttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpnaynyiikng giqtessypytaetgtqcnfnsanigakisnf-tmipknetvmagyivstgplaiaadav -e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivknswgadwgeq gyiylrrgk-nt---cgvsnfvstsii-- While T-Coffee <4.45 returned: >CATH_RAT/1-333 ----------mwtalpllcagawllsagat----------aeltvnaiek---------- --fhftswmkqhqktyss-reyshrlqvfannwrkiqahn----q----rnhtfkmglnq fsdmsfaeikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqga cgscwtfsttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqa feyilynkgimgedsypyigkngqckfnpekavafvknvvn-itlndeaamveavalynp vsfafevt-edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivkns wgsnwgnngyfliergkn----mcglaacasypipqv >PAPA_CARPA/1-345 mamipsiskllfvaiclfvymglsfgdfsivgysqndltsterliqlfeswml------- 
-------------khnkiyknidekiyrf-----eifkdnlkyidetnkknnsywlglnv fadmsndefkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgs cgscwafsavvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsa lq-lvaqygihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysia-nqp vsvvleaagkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yilikns wgtgwgengyirikrgtgnsygvcglytssfypvkn- >CATL_HUMAN/1-333 -----------------------------------------mnptlilaafclgiasatl tfdhsleaqwtkwkamhnrlygmneegwrravweknmkmielhnqeyregkhsftmamna fgdmtseefrqvmngfqnrkprkgkvfqeplf----yeaprsvdwrekg-yvtpvknqgq cgscwafsatgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdya fqyvqdnggldseesypyeateesckynpkysvandtgfvd--ipkqekalmkavatvgp isvaidaghesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvkns wgeewgmggyvkmakdrrnh---cgiasaasyptv-- >CATL_RAT/1-334 -----------------------------------------mtpllllavlclgtalatp kfdqtfnaqwhqwksthrrlygtneeewrravweknmrmiqlhngeysngkhgftmemna fgdmtneefrqivngyrhqkhkkgrlfqeplm----lqipktvdwrekg-cvtpvknqgq cgscwafsasgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfa fqyikenggldseesypyeakdgsckyraeyavandtgfvd--ipqqekalmkavatvgp isvamdashpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvkns wgkewgmdgyikiakdrnnh---cglataasypivn- >ALEU_HORVU/1-362 ----maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtr halrfarfavrygksyesaaevrrrfrifsesleevrstn----r----kglpyrlginr fsdmsweefqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqah cgscwtfsttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqa feyikynggidteesypykgvngvchykaenaavqvldsvn-itlnaedelknavglvrp vsvafqvi-dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywlikns wgadwgdngyfkmemgkn----mcaiatcasypvvaa >CATH_HUMAN/1-335 ----------mwatlpllcagawllg--------vpvcgaaelsvnslek---------- --fhfkswmskhrktys-teeyhhrlqtfasnwrkinahn----n----gnhtfkmalnq fsdmsfaeikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqga cgscwtfsttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqa feyilynkgimgedtypyqgkdgyckfqpgkaigfvkdvan-itiydeeamveavalynp vsfafevt-qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivkns 
wgpqwgmngyfliergkn----mcglaacasypiplv >CYS1_DICDI/1-343 ---------mkvillfvlavftvfvs---------------srgippeeq---------- --sqflefqdkfnkkys-heeylerfeifksnlgkieelnliain----hkadtkfgvnk fadlssdefknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgq cgscwsfsttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpna ynyiiknggiqtessypytaetgtqcnfnsanigakisnft-mipknetvmagyivstgp laiaadav-e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivkns wgadwgeqgyiylrrgkn----tcgvsnfvstsii-- From sanges at biogem.it Thu Oct 26 06:26:36 2006 From: sanges at biogem.it (Remo Sanges) Date: Thu, 26 Oct 2006 11:26:36 +0100 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45408846.1050001@sheffield.ac.uk> References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> <45408846.1050001@sheffield.ac.uk> Message-ID: <45408D5C.1000305@biogem.it> Nathan Haigh wrote: > Sendu Bala wrote: > >> Nathan Haigh wrote: >> >>> I'm thinking that it's not wise to test for things like >>> overall_percentage_identity etc in alignments that are generated by >>> external software like T-Coffee, Clustalw etc. Changes to software >>> algorithms/efficiency, bug fixes etc may well alter the quality of the >>> alignment produced in different versions and thus affect the value >>> returned by such methods. Therefore, I think these methods should only >>> be tested from alignments loaded directly from t/data. >>> >> Did you discover some specific problem cases? >> > My messages seem to be taking a while to come through, but, yes. It may > be due to the software changing default parameters, but it makes testing > the output for specific details pretty difficult and inconsistent. 
For
> example, running T-Coffee, the following command from t/TCoffee.t
> results in slightly different alignment:
> $aln = $factory->run('-type' => 'profile',
>                      '-profile' => $aln1,
>                      '-seq' => Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>
> Of particular note, is the gaps on the last line of the sequences. In
> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
>

I'm not a T-Coffee user, but usually you come across
these problems when you use different scoring parameters
when aligning sequences.

Could it be possible that they have simply changed the
default parameters for gap penalties and that kind of
stuff? Is it possible to set them?

If so you can just run the test by defining
the scores in the param hash without using the default.

HTH

Remo

From n.haigh at sheffield.ac.uk Thu Oct 26 06:33:55 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 11:33:55 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408D5C.1000305@biogem.it>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> <45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it>
Message-ID: <45408F13.9020209@sheffield.ac.uk>

Remo Sanges wrote:
[snip]
> I'm not a T-Coffee user, but usually you come across
> these problems when you use different scoring parameters
> when aligning sequences.
>
> Could it be possible that they have simply changed the
> default parameters for gap penalties and that kind of
> stuff? Is it possible to set them?
>
> If so you can just run the test by defining
> the scores in the param hash without using the default.
>
> HTH
>
> Remo

That is true, but it depends on whether the wrapper is complete
enough to be able to set all the parameters provided by the software.

Nath

From n.haigh at sheffield.ac.uk Thu Oct 26 12:13:03 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 17:13:03 +0100
Subject: [Bioperl-l] Bio::Restriction::Enzyme
Message-ID: <4540DE8F.7070501@sheffield.ac.uk>

I'm in the middle of writing some code that uses
Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
Bioperl from HEAD.

I seem to find that $enzyme->is_palindromic always seems to return true.
Can anyone verify this? If needs be, I can send some code.
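
As a point of comparison for such a check, a site is palindromic when it
equals its own reverse complement. The snippet below is an independent
sketch of that definition, not the Bio::Restriction::Enzyme implementation;
the function name site_is_palindromic is hypothetical. GAATTC (the EcoRI
site) and GGATG (the FokI site) serve as a palindromic and a
non-palindromic example.

```perl
use strict;
use warnings;

# Illustrative check only: a recognition site is palindromic when it
# equals its own reverse complement. IUPAC ambiguity codes are
# complemented too (R<->Y, M<->K, B<->V, D<->H; S, W, N unchanged).
sub site_is_palindromic {
    my ($site) = @_;
    my $fwd = uc $site;
    my $rc  = reverse $fwd;    # scalar assignment: reverses the string
    $rc =~ tr/ACGTRYMKBDHVSWN/TGCAYRKMVHDBSWN/;
    return $fwd eq $rc ? 1 : 0;
}

for my $site (qw(GAATTC GGATG)) {
    printf "%s is %s\n", $site,
        site_is_palindromic($site) ? 'palindromic' : 'not palindromic';
}
```

Running a handful of known non-palindromic sites through both this and
$enzyme->is_palindromic would give a concrete case for a bug report if the
two disagree.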
Thanks Nathan From bosborne11 at verizon.net Thu Oct 26 12:37:06 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Thu, 26 Oct 2006 12:37:06 -0400 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk> Message-ID: Nathan, Perhaps because most restriction sites are palindromes. Anyway, I added tests for palindromic() and is_palindromic() where the site is not a palindrome, these tests pass (t/RestrictionAnalyis.t). Brian O. On 10/26/06 12:13 PM, "Nathan Haigh" wrote: > I'm in the middle of writing some code that uses > Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using > Bioperl from HEAD. > > I seem to find that $enzyme->is_palindromic always seems to return true. > Can anyone verify this? If needs be, I can send some code. > > Thanks > Nathan > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Thu Oct 26 12:49:48 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 26 Oct 2006 17:49:48 +0100 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: References: Message-ID: <4540E72C.5020800@sheffield.ac.uk> Brian Osborne wrote: > Nathan, > > Perhaps because most restriction sites are palindromes. Anyway, I added > tests for palindromic() and is_palindromic() where the site is not a > palindrome, these tests pass (t/RestrictionAnalyis.t). > > Brian O.
> > > On 10/26/06 12:13 PM, "Nathan Haigh" wrote: > > >> I'm in the middle of writing some code that uses >> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >> Bioperl from HEAD. >> >> I seem to find that $enzyme->is_palindromic always seems to return true. >> Can anyone verify this? If needs be, I can send some code. >> >> Thanks >> Nathan >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > Ok, thanks - nice to know :-) From cjfields at uiuc.edu Thu Oct 26 12:58:34 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 26 Oct 2006 11:58:34 -0500 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk> Message-ID: <001301c6f91f$f9611770$15327e82@pyrimidine> > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Nathan Haigh > Sent: Thursday, October 26, 2006 11:13 AM > To: Bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Bio::Restriction::Enzyme > > I'm in the middle of writing some code that uses > Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using > Bioperl from HEAD. > > I seem to find that $enzyme->is_palindromic always seems to return true. > Can anyone verify this? If needs be, I can send some code. > > Thanks > Nathan You should file a bug report if you have found a test case where this method isn't working as it should, especially if Brian's tests pass and you're still getting the wrong results. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Thu Oct 26 12:57:32 2006 From: jason at bioperl.org (Jason Stajich) Date: Thu, 26 Oct 2006 09:57:32 -0700 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45408F13.9020209@sheffield.ac.uk> References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> <45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it> <45408F13.9020209@sheffield.ac.uk> Message-ID: Nathan - I agree - the values tend to change with different versions of the applications unfortunately. It would make sense to just test that you get out sequences that are in valid alignment format and perhaps have as many ending sequences as you started with. The more restrictive tests probably aren't reliable with mixing and matching versions. One thing we do for PAML is condition tests on the version used - but of course when a new version comes out we have to add more stuff to the tests (or just have some code that skips those tests). -jason On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote: > Remo Sanges wrote: >> Nathan Haigh wrote: >>> Sendu Bala wrote: >>> >>>> Nathan Haigh wrote: >>>> >>>>> I'm thinking that it's not wise to test for things like >>>>> overall_percentage_identity etc in alignments that are >>>>> generated by >>>>> external software like T-Coffee, Clustalw etc. Changes to software >>>>> algorithms/efficiency, bug fixes etc may well alter the quality >>>>> of the >>>>> alignment produced in different versions and thus affect the value >>>>> returned by such methods. Therefore, I think these methods >>>>> should only >>>>> be tested from alignments loaded directly from t/data. >>>>> >>>> Did you discover some specific problem cases? >>>> >>> My messages seem to be taking a while to come through, but, yes. 
>>> It may >>> be due to the software changing default parameters, but it makes >>> testing >>> the output for specific details pretty difficult and >>> inconsistent. For >>> example, running T-Coffee, the following command from t/TCoffee.t >>> results in slightly different alignment: >>> $aln = $factory->run('-type' => 'profile', >>> '-profile' => $aln1, >>> '-seq' => >>> Bio::Root::IO->catfile("t","data","cysprot1b.fa")); >>> >>> Of particular note, is the gaps on the last line of the >>> sequences. In >>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in >>> >> >> I'm not a T-coffee user but usually you can come across >> these problems when you use different scoring parameters >> when align sequences. >> >> Could it be possible that they have simply changed the >> default parameters for gap penalties and that kind of >> stuff? It is possible to set them? >> >> If so you can just run the test by defining >> the scores in the param hash without using the default. >> >> HTH >> >> Remo > That is true, but it depends on the whether the wrapper is complete > enough to be able to set all the parameters provided by the software. > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From cjfields at uiuc.edu Thu Oct 26 18:01:08 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 26 Oct 2006 17:01:08 -0500 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: Message-ID: <000301c6f94a$3e2a3f10$15327e82@pyrimidine> I have been running into similar issues with EUtilities tests. 
Since the data on the server is constantly updated I have to try an future-proof the tests so they don't constantly fail. I have been using Test::More and like/unlike or cmp_ok to get around some of those 'fuzzy data' issues. If some methods consistently return a particular type of value, such as an integer, you could use: like($foo->get_value, qr{^\d+$}, 'value test'); #integer or similar. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > Nathan - > > I agree - the values tend to change with different versions of the > applications unfortunately. It would make sense to just test that > you get out sequences that are in valid alignment format and perhaps > have as many ending sequences as you started with. The more > restrictive tests probably aren't reliable with mixing and matching > versions. > > One thing we do for PAML is condition tests on the version used - but > of course when a new version comes out we have to add more stuff to > the tests (or just have some code that skips those tests). > > -jason > On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote: > > > Remo Sanges wrote: > >> Nathan Haigh wrote: > >>> Sendu Bala wrote: > >>> > >>>> Nathan Haigh wrote: > >>>> > >>>>> I'm thinking that it's not wise to test for things like > >>>>> overall_percentage_identity etc in alignments that are > >>>>> generated by > >>>>> external software like T-Coffee, Clustalw etc. Changes to software > >>>>> algorithms/efficiency, bug fixes etc may well alter the quality > >>>>> of the > >>>>> alignment produced in different versions and thus affect the value > >>>>> returned by such methods. Therefore, I think these methods > >>>>> should only > >>>>> be tested from alignments loaded directly from t/data. > >>>>> > >>>> Did you discover some specific problem cases? > >>>> > >>> My messages seem to be taking a while to come through, but, yes. 
> >>> It may > >>> be due to the software changing default parameters, but it makes > >>> testing > >>> the output for specific details pretty difficult and > >>> inconsistent. For > >>> example, running T-Coffee, the following command from t/TCoffee.t > >>> results in slightly different alignment: > >>> $aln = $factory->run('-type' => 'profile', > >>> '-profile' => $aln1, > >>> '-seq' => > >>> Bio::Root::IO->catfile("t","data","cysprot1b.fa")); > >>> > >>> Of particular note, is the gaps on the last line of the > >>> sequences. In > >>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in > >>> >>> > >> I'm not a T-coffee user but usually you can come across > >> these problems when you use different scoring parameters > >> when align sequences. > >> > >> Could it be possible that they have simply changed the > >> default parameters for gap penalties and that kind of > >> stuff? It is possible to set them? > >> > >> If so you can just run the test by defining > >> the scores in the param hash without using the default. > >> > >> HTH > >> > >> Remo > > That is true, but it depends on the whether the wrapper is complete > > enough to be able to set all the parameters provided by the software. 
> > > > Nath > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich, PhD > Miller Research Fellow > University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From gbazykin at Princeton.EDU Thu Oct 26 18:49:56 2006 From: gbazykin at Princeton.EDU (Georgii A Bazykin) Date: Thu, 26 Oct 2006 18:49:56 -0400 Subject: [Bioperl-l] about PAML running within bioperl In-Reply-To: <001901c6dbcf$9af4de50$0915020a@zchou> References: <001901c6dbcf$9af4de50$0915020a@zchou> Message-ID: <185431468.20061026184956@princeton.edu> I just had the exact same problem, which was also (as in Caleb Davis's case) was solved by switching to PAML 3.14 from 3.15. ------------------------------ Tuesday, September 19, 2006, 5:40:07 AM, you wrote: > Hello, every one, > I use code in the PAML HOWTO (running PAML fom within Bioperl) on > my Linux OS. And I set ENV as described by instructions. At the > beginning, it seems that ClustalW run smoothly. However, when the > programme run to call method "get_MLmatrix", somethign happened. The > following information was listed as follows: (What reason or How to solve these problems?) > ........ > Sequences (2:3) Aligned. Score: 87 > Sequences (2:4) Aligned. Score: 88 > Sequences (2:5) Aligned. Score: 87 > Sequences (2:6) Aligned. Score: 87 > Sequences (2:7) Aligned. Score: 87 > Sequences (2:8) Aligned. Score: 87 > Sequences (3:4) Aligned. Score: 93 > Sequences (3:5) Aligned. Score: 93 > Sequences (3:6) Aligned. Score: 93 > Sequences (3:7) Aligned. Score: 92 > Sequences (3:8) Aligned. Score: 92 > Sequences (4:5) Aligned. 
Score: 99 > Sequences (4:6) Aligned. Score: 99 > Sequences (4:7) Aligned. Score: 98 > Sequences (4:8) Aligned. Score: 98 > Sequences (5:6) Aligned. Score: 100 > Sequences (5:7) Aligned. Score: 99 > Sequences (5:8) Aligned. Score: 99 > Sequences (6:7) Aligned. Score: 99 > Sequences (6:8) Aligned. Score: 99 > Sequences (7:8) Aligned. Score: 100 > Guide tree file created: > [/home/zchou/TMPDIR/8QEqLivAKY/JU833u8OTP.dnd] > Start of Multiple Alignment > There are 7 groups > Aligning... > Group 1: Sequences: 2 Score:5875 > Group 2: Sequences: 2 Score:5877 > Group 3: Sequences: 4 Score:5864 > Group 4: Sequences: 5 Score:5537 > Group 5: Sequences: 6 Score:5727 > Group 6: Sequences: 7 Score:5608 > Group 7: Sequences: 8 Score:5607 > Alignment Score 43650 > GCG-Alignment file created > [/home/zchou/TMPDIR/8QEqLivAKY/CussPD56rZ] > aligned aa sequences were: Bio::SimpleAlign=HASH(0x87b93f4) > Can't call method "get_MLmatrix" on an undefined value at > originalpaml.pl line 57, line 332. > Zhuocheng Hou > Department of Animal Genetics and Breeding > China Agricultural University From himanshu.ardawatia at bccs.uib.no Thu Oct 26 21:54:36 2006 From: himanshu.ardawatia at bccs.uib.no (Himanshu Ardawatia) Date: Fri, 27 Oct 2006 03:54:36 +0200 Subject: [Bioperl-l] Query on tree bootstrap values Message-ID: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> Hi, 2 questions : 1. I have a phylogenetic tree and I wish to set (or modify or query) bootstrap values for all internal nodes. How do I do that using BioPerl ? 2. I tried the example script attached below for general purpose for the example newick tree with bootstrap values (also attached below) and It gives strange results even for branch length. It shows Parent ID as 0.71 which actually is the bootstrap value for the last ancestral node for human and chimp and It shows the Child node ID as 'Human' ! Am I missing something in the tree formatting ? Results also attached below. 
Also how to extract / modify/ add bootstrap values in this tree ? Thanks Himanshu EXAMPLE TREE (Newick with bootstrap values and branch lengths) : ################################# ( ('Chimp' : 0.052, 'Human' : 0.042) 0.71 : 0.007, 'Gorilla' : 0.060, ('Gibbon' : 0.124, 'Orangutan' : 0.0971) 1 : 0.038 ); ################################# EXAMPLE SCRIPT: ################################# #!/usr/bin/perl -w use Bio::Seq; # use Bio::TreeIO; use Bio::Tree::TreeI; # get a Tree::NodeI somehow # like from a TreeIO use Bio::TreeIO; # read in a clustalw NJ in phylip/newick format my $treeio = new Bio::TreeIO(-format => 'newick', -file => 'example_newick_tree.newick'); my $tree = $treeio->next_tree; # we'll assume it worked for demo purposes # you might want to test that it was defined my $rootnode = $tree->get_root_node; # process just the next generation foreach my $node ( $rootnode->each_Descendent() ) { print "branch len is ", $node->branch_length, "\n"; } # process all the children my $example_leaf_node; foreach my $node ( $rootnode->get_Descendents() ) { if( $node->is_Leaf ) { print "node is a leaf ... "; # for example use below $example_leaf_node = $node unless defined $example_leaf_node; } print "branch len is ", $node->branch_length, "\n"; } # The ancestor() method points to the parent of a node # A node can only have one parent my $parent = $example_leaf_node->ancestor; # parent won't likely have an description because it is an internal node # but child will because it is a leaf print "Parent id: ", $parent->id," child id: ", $example_leaf_node->id, "\n"; ########################################## RESULTS: branch len is 0.007 branch len is 0.060 branch len is 0.038 node is a leaf ... branch len is 0.042 node is a leaf ... branch len is 0.052 branch len is 0.007 node is a leaf ... branch len is 0.060 node is a leaf ... branch len is 0.0971 node is a leaf ... 
branch len is 0.124 branch len is 0.038 Parent id: _0.71_ child id: ___'Human'__ From n.haigh at sheffield.ac.uk Fri Oct 27 04:42:23 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 08:42:23 +0000 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: References: Message-ID: <4541C66F.1020404@sheffield.ac.uk> Hi Brian, I wonder if i'm using is_prototype() correctly as I don't seem to get any returning true: my $enz_coll = Bio::Restriction::EnzymeCollection->new(); my $prototype = 0; foreach my $enz ($enz_coll->each_enzyme) { $prototype++ if $enz->is_prototype; } print "$prototype have unique recognition sites\n"; prints: 0 have unique recognition sites Thanks Nath Brian Osborne wrote: > Nathan, > > Perhaps because most restriction sites are palindromes. Anyway, I added > tests for palindromic() and is_palindromic() where the site is not a > palindrome, these tests pass (t/RestrictionAnalyis.t). > > Brian O. > > > On 10/26/06 12:13 PM, "Nathan Haigh" wrote: > > >> I'm in the middle of writing some code that uses >> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >> Bioperl from HEAD. >> >> I seem to find that $enzyme->is_palindromic always seems to return true. >> Can anyone verify this? If needs be, I can send some code. >> >> Thanks >> Nathan >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > -- > A: Yes. >> Q: Are you sure? >> >>> A: Because it reverses the logical flow of conversation. >>> >>>> Q: Why is top posting frowned upon? >>>> Get Thunderbird From n.haigh at sheffield.ac.uk Fri Oct 27 04:47:21 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Fri, 27 Oct 2006 08:47:21 +0000 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <001301c6f91f$f9611770$15327e82@pyrimidine> References: <001301c6f91f$f9611770$15327e82@pyrimidine> Message-ID: <4541C799.4090507@sheffield.ac.uk> Chris Fields wrote: >> -----Original Message----- >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >> bounces at lists.open-bio.org] On Behalf Of Nathan Haigh >> Sent: Thursday, October 26, 2006 11:13 AM >> To: Bioperl-l at lists.open-bio.org >> Subject: [Bioperl-l] Bio::Restriction::Enzyme >> >> I'm in the middle of writing some code that uses >> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >> Bioperl from HEAD. >> >> I seem to find that $enzyme->is_palindromic always seems to return true. >> Can anyone verify this? If needs be, I can send some code. >> >> Thanks >> Nathan >> > > You should file a bug report if you have found a test case where this method > isn't working as it should, especially if Brian's tests pass and you're > still getting the wrong results. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > I was doing some filtering of the default set of enzymes and happened to removed the 2 that are not palindromic before I used is_palindromic(). Thus, I didn't see any that were not palindromic - if that makes sense! Since I know very little about restriction enzymes, I'll trust that these are correct :-) and I'm getting the correct results. Thanks Nath From n.haigh at sheffield.ac.uk Fri Oct 27 05:04:40 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Fri, 27 Oct 2006 09:04:40 +0000 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <000301c6f94a$3e2a3f10$15327e82@pyrimidine> References: <000301c6f94a$3e2a3f10$15327e82@pyrimidine> Message-ID: <4541CBA8.10006@sheffield.ac.uk> Chris Fields wrote: > I have been running into similar issues with EUtilities tests. Since the > data on the server is constantly updated I have to try an future-proof the > tests so they don't constantly fail. > > I have been using Test::More and like/unlike or cmp_ok to get around some of > those 'fuzzy data' issues. If some methods consistently return a particular > type of value, such as an integer, you could use: > > like($foo->get_value, qr{^\d+$}, 'value test'); #integer > > or similar. > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > >> Nathan - >> >> I agree - the values tend to change with different versions of the >> applications unfortunately. It would make sense to just test that >> you get out sequences that are in valid alignment format and perhaps >> have as many ending sequences as you started with. The more >> restrictive tests probably aren't reliable with mixing and matching >> versions. >> >> One thing we do for PAML is condition tests on the version used - but >> of course when a new version comes out we have to add more stuff to >> the tests (or just have some code that skips those tests). >> >> -jason >> On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote: >> >> I think it makes sense to test that data of the expected type was returned by the external resource, but not to test the specifics of what was returned. If specifics are tested, we are then in the realm of testing whether we believe the data returned by the external resource or not.
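To make the distinction concrete, here is a minimal sketch of a test file that checks the type and range of values for any version of an external program, and only checks exact values for a version whose output has been verified. The hash and the version string are hypothetical stand-ins, not output from T-Coffee or any real wrapper; only like, cmp_ok, is and the SKIP block are standard Test::More.

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical stand-ins for values a wrapper would report after running
# an external aligner; in a real test they would come from the program.
my $program_version = '4.45';
my %aln = (
    num_sequences       => 8,
    percentage_identity => 41.73,
);

# Type and range checks that should hold for any version of the program.
like($aln{num_sequences}, qr/^\d+$/, 'sequence count is an integer');
cmp_ok($aln{percentage_identity}, '>=', 0,   'identity is not negative');
cmp_ok($aln{percentage_identity}, '<=', 100, 'identity is at most 100');

# Exact-value checks run only for a version whose output has been verified.
SKIP: {
    skip('exact values only checked for version 4.45', 1)
        unless $program_version eq '4.45';
    is($aln{num_sequences}, 8, 'expected number of sequences for 4.45');
}
```

With a different version string the last check is reported as skipped rather than failed, which is roughly the behaviour Jason describes for the PAML tests.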
We should assume that the domain experts for these resources know what they are doing - in some cases this might not be true :-) but I think we should stick to testing that the objects created hold the expected type of data. I like what Chris had to say (above) but wonder whether tests would/should be tested for in the module itself - i.e. testing that a stored value is an integer and warn/throw if not? Nath From bix at sendu.me.uk Fri Oct 27 05:08:18 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 27 Oct 2006 10:08:18 +0100 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> Message-ID: <4541CC82.2040705@sendu.me.uk> Himanshu Ardawatia wrote: > Hi, > > 2 questions : > > 1. I have a phylogenetic tree and I wish to set (or modify or query) > bootstrap values for all internal nodes. How do I do that using BioPerl ? Does bootstrap() not do what you need? > 2. I tried the example script attached below for general purpose for the > example newick tree with bootstrap values (also attached below) and It gives > strange results even for branch length. It shows Parent ID as 0.71 which > actually is the bootstrap value for the last ancestral node for human and > chimp and It shows the Child node ID as 'Human' ! Am I missing something in > the tree formatting ? Results also attached below. Also how to extract / > modify/ add bootstrap values in this tree ? [snip] > EXAMPLE TREE (Newick with bootstrap values and branch lengths) : > ################################# > ( > ('Chimp' : 0.052, > 'Human' : 0.042) 0.71 : 0.007, > 'Gorilla' : 0.060, > ('Gibbon' : 0.124, > 'Orangutan' : 0.0971) 1 : 0.038 > ); > ################################# Are you sure this is in the correct format? 
For example, with the tree: ( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, 'Gorilla':0.060, ('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038); and your script (with a print "--\n" between the two printing loops for clarity) I get... > ########################################## > > RESULTS: > branch len is 0.007 > branch len is 0.060 > branch len is 0.038 > node is a leaf ... branch len is 0.042 > node is a leaf ... branch len is 0.052 > branch len is 0.007 > node is a leaf ... branch len is 0.060 > node is a leaf ... branch len is 0.0971 > node is a leaf ... branch len is 0.124 > branch len is 0.038 > Parent id: _0.71_ child id: ___'Human'__ ... branch len is 0.007 branch len is 0.060 branch len is 0.038 -- branch len is 0.007 node is a leaf ... branch len is 0.052 node is a leaf ... branch len is 0.042 node is a leaf ... branch len is 0.060 branch len is 0.038 node is a leaf ... branch len is 0.124 node is a leaf ... branch len is 0.0971 Parent id: 'Human_Chimp_Ancestor' child id: 'Chimp' This seems reasonable to me. What were you expecting? From n.haigh at sheffield.ac.uk Fri Oct 27 07:36:10 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 11:36:10 +0000 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <4541CC82.2040705@sendu.me.uk> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> Message-ID: <4541EF2A.4050600@sheffield.ac.uk> Sendu Bala wrote: > Himanshu Ardawatia wrote: > >> Hi, >> >> 2 questions : >> >> 1. I have a phylogenetic tree and I wish to set (or modify or query) >> bootstrap values for all internal nodes. How do I do that using BioPerl ? >> > > Does bootstrap() not do what you need? > > > >> 2. I tried the example script attached below for general purpose for the >> example newick tree with bootstrap values (also attached below) and It gives >> strange results even for branch length. 
It shows Parent ID as 0.71 which >> actually is the bootstrap value for the last ancestral node for human and >> chimp and It shows the Child node ID as 'Human' ! Am I missing something in >> the tree formatting ? Results also attached below. Also how to extract / >> modify/ add bootstrap values in this tree ? >> > [snip] > >> EXAMPLE TREE (Newick with bootstrap values and branch lengths) : >> ################################# >> ( >> ('Chimp' : 0.052, >> 'Human' : 0.042) 0.71 : 0.007, >> 'Gorilla' : 0.060, >> ('Gibbon' : 0.124, >> 'Orangutan' : 0.0971) 1 : 0.038 >> ); >> ################################# >> > > Are you sure this is in the correct format? > He/she may have a tree that already contains bootstrap values output from another program. If this is so, which program did you use? Without reminding myself of the formats, you should lookup newick format and whther it is possible to store bootstraps in it. In addition you should also look up the nhx format. > For example, with the tree: > ( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, > 'Gorilla':0.060, > ('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038); > > This tree does not contain any bootstrap values - only branch lengths. Sorry I can't be much more help at the moment - if i get a spare 10 mins i'll have a closer look. Nath From bix at sendu.me.uk Fri Oct 27 07:16:08 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 27 Oct 2006 12:16:08 +0100 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <4541EF2A.4050600@sheffield.ac.uk> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> <4541EF2A.4050600@sheffield.ac.uk> Message-ID: <4541EA78.3050404@sendu.me.uk> Nathan S. 
Haigh wrote: > Sendu Bala wrote: >> Himanshu Ardawatia wrote: >>> >>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) : >>> ################################# >>> ( >>> ('Chimp' : 0.052, >>> 'Human' : 0.042) 0.71 : 0.007, >>> 'Gorilla' : 0.060, >>> ('Gibbon' : 0.124, >>> 'Orangutan' : 0.0971) 1 : 0.038 >>> ); >>> ################################# >>> >> Are you sure this is in the correct format? >> > > He/she may have a tree that already contains bootstrap values output > from another program. If this is so, which program did you use? Without > reminding myself of the formats, you should lookup newick format and > whther it is possible to store bootstraps in it. In addition you should > also look up the nhx format. Ah, well from a brief google it seemed like some software do store boostrap values for internal nodes as the node ids when outputting in Newick format. I don't think Bioperl should be able to tell the difference between a normal id and a bootstrap value, so you'll have to detect that yourself and manually use bootstrap() when you get an id that looks like a number. Or should Bioperl be making this assumption for you? Is that a safe thing to do? Maybe as an option only? From n.haigh at sheffield.ac.uk Fri Oct 27 08:24:49 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 12:24:49 +0000 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <4541EA78.3050404@sendu.me.uk> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> <4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk> Message-ID: <4541FA91.3040505@sheffield.ac.uk> --snip-- > > Ah, well from a brief google it seemed like some software do store > boostrap values for internal nodes as the node ids when outputting in > Newick format. 
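A minimal sketch of the manual detection suggested in this thread, assuming an internal node id should be treated as a bootstrap value when it is a plain non-negative number. The helper name bootstrap_from_id is hypothetical; the stripping of underscores and quotes is there only because the "Parent id: _0.71_" output earlier in the thread shows that padding.

```perl
use strict;
use warnings;

# Return the numeric value if a node id looks like a bootstrap value,
# otherwise undef. Leading/trailing underscores, quotes and whitespace
# are stripped before the check.
sub bootstrap_from_id {
    my ($id) = @_;
    return undef unless defined $id;
    (my $clean = $id) =~ s/^[\s_']+|[\s_']+$//g;
    return $clean =~ /^\d+(?:\.\d+)?$/ ? $clean + 0 : undef;
}

# With BioPerl loaded, the helper could be applied to the internal nodes
# roughly like this (get_Descendents, is_Leaf, id and bootstrap are the
# node methods already used or mentioned in this thread):
#
#   for my $node ( $rootnode->get_Descendents ) {
#       next if $node->is_Leaf;
#       my $value = bootstrap_from_id( $node->id );
#       $node->bootstrap($value) if defined $value;
#   }

for my $id ( "_0.71_", "___'Human'__", "Human_Chimp_Ancestor", "1" ) {
    my $value = bootstrap_from_id($id);
    printf "%-24s %s\n", $id,
        defined $value ? "bootstrap $value" : "ordinary id";
}
```

A leaf that happens to be named with a number would also match, which is why the commented loop skips leaves.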
I don't think Bioperl should be able to tell the > difference between a normal id and a bootstrap value, so you'll have > to detect that yourself and manually use bootstrap() when you get an > id that looks like a number. If I remember rightly, in programs like Clustal you can specify where bootstrap values are stored - node or branch. I can't remember which is the default way, but TreeView can only see bootstraps if they are stored using the "non-default" setting. This "could" be the same issue here. > > Or should Bioperl be making this assumption for you? Is that a safe > thing to do? Maybe as an option only? I don't know without a closer look - I'd also need to look at the newick format definition as to whether this is an "extension" to the format or if something is just flouting the newick rules. Nath From n.haigh at sheffield.ac.uk Fri Oct 27 08:59:51 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 12:59:51 +0000 Subject: [Bioperl-l] Caching sequences Message-ID: <454202C7.1040701@sheffield.ac.uk> I have a script that is capable of downloading sequences from GenBank based on GI numbers. I retrieve them in FASTA format in order to save bandwidth, but I'd like to take this one step further and cache the sequences in case the user wants to rerun the script using some of the GIs they used previously. Does anyone have any guidance on how best to do this? Cheers Nath From bix at sendu.me.uk Fri Oct 27 08:35:13 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 27 Oct 2006 13:35:13 +0100 Subject: [Bioperl-l] Caching sequences In-Reply-To: <454202C7.1040701@sheffield.ac.uk> References: <454202C7.1040701@sheffield.ac.uk> Message-ID: <4541FD01.6090803@sendu.me.uk> Nathan S. Haigh wrote: > I have a script that is capable of downloading sequences from GenBank > based on GI numbers.
I retrieve them if fasta format in order to save > bandwidth, but I'd like to take this one step further and cache the > sequences in case the user want to rerun the script using some of the > GI's they used previously. > > Does anyone have any guidance on how best to do this? You'd probably write the sequences out in some suitable format and access them via Bio::Index Or, I'm sure bioperl-db excels at this kind of thing, but is a little more involved if this is only a simple situation. From bosborne11 at verizon.net Fri Oct 27 09:09:30 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 27 Oct 2006 09:09:30 -0400 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <4541C66F.1020404@sheffield.ac.uk> Message-ID: Nathan, I don't know how this is supposed to work, there would be different ways to make is_prototype true. One way would be to make the enzyme with the first occurrence of a given restriction site the prototype (and the next enzymes with the same site are isoschizomers). Or, one could wait until one site had appeared twice, with 2 different enzymes, then make the first the prototype, etc. I would have done it the first way myself but I took a quick look at IO/withrefm.pm and it looks like it's doing it the second way. That means one can read an enzyme file and end up with no duplicated restriction sites, or prototypes and isoschizomers. Brian O. On 10/27/06 4:42 AM, "Nathan S. Haigh" wrote: > Hi Brian, > > I wonder if i'm using is_prototype() correctly as I don't seem to get > any returning true: > > my $enz_coll = Bio::Restriction::EnzymeCollection->new(); > my $prototype = 0; > foreach my $enz ($enz_coll->each_enzyme) { > $prototype++ if $enz->is_prototype; > } > print "$prototype have unique recognition sites\n"; > > prints: > 0 have unique recognition sites > > Thanks > Nath > > Brian Osborne wrote: >> Nathan, >> >> Perhaps because most restriction sites are palindromes. 
Anyway, I added >> tests for palindromic() and is_palindromic() where the site is not a >> palindrome, these tests pass (t/RestrictionAnalyis.t). >> >> Brian O. >> >> >> On 10/26/06 12:13 PM, "Nathan Haigh" wrote: >> >> >>> I'm in the middle of writing some code that uses >>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >>> Bioperl from HEAD. >>> >>> I seem to find that $enzyme->is_palindromic always seems to return true. >>> Can anyone verify this? If needs be, I can send some code. >>> >>> Thanks >>> Nathan >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> >> > From n.haigh at sheffield.ac.uk Fri Oct 27 10:19:02 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 14:19:02 +0000 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: References: Message-ID: <45421556.9060300@sheffield.ac.uk> Brian Osborne wrote: > Nathan, > > I don't know how this is supposed to work, there would be different ways to > make is_prototype true. One way would be to make the enzyme with the first > occurrence of a given restriction site the prototype (and the next enzymes > with the same site are isoschizomers). Or, one could wait until one site had > appeared twice, with 2 different enzymes, then make the first the prototype, > etc. I would have done it the first way myself but I took a quick look at > IO/withrefm.pm and it looks like it's doing it the second way. That means > one can read an enzyme file and end up with no duplicated restriction sites, > or prototypes and isoschizomers. > > Brian O. > > Hmm, I'd have done it the first way also. Doing it the second way would mean you only ended up with something as a prototype if there were multiple enzymes with the same restriction site - is that correct biologically? 
Nath

From n.haigh at sheffield.ac.uk Fri Oct 27 10:23:20 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 14:23:20 +0000
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
Message-ID: <45421658.5000103@sheffield.ac.uk>

As you may be aware by now, I'm working with Bio::Restriction::Analysis and friends.

I'm doing restriction analysis on large sequences - chromosomes. I need to identify an appropriate enzyme based on the total length of fragments that are of a certain size (e.g. 100 - 500 bp). However, the amount of memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I have the following code (bottom) which downloads 2 thaliana chromosomes (mito and chloro - so pretty small), runs an analysis and then loops through the fragments for all enzymes in the default collection.

My memory usage just keeps on climbing and none seems to get freed up even when a $ra goes out of scope (i.e. when I start dealing with the next sequence). Is this a memory leak of some sort? Is there a way to free up memory as I go? I'd appreciate any help/advice on how to reduce the amount of memory being consumed, as I'd like to use all the thaliana chromosomes (not just mito and chloro), which at the moment probably won't work.

Cheers
Nath

use strict;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012, 26556996 );

my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
    print "Getting GI: $gi\n";
    push @seq_objs, $db->get_Seq_by_id($gi);
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
    my $tot_size = 0;
    print "Processing ", $seq->primary_id, "\n";
    my $ra = Bio::Restriction::Analysis->new(
        -seq     => $seq,
        -enzymes => $enz_Coll,
    );

    my @all_enzymes = $ra->cutters->each_enzyme;
    print " Calc total length of fragments in range: $min_fragment_size - $max_fragment_size\n";
    foreach my $enzyme ( @all_enzymes ) {
        # fragments() is a real memory hog
        foreach my $frag ($ra->fragments($enzyme)) {
            next if $min_fragment_size && (length($frag) < $min_fragment_size);
            next if $max_fragment_size && (length($frag) > $max_fragment_size);
            $tot_size += length $frag;
        }
        # do something based on value of $tot_size
        #print " ", $enzyme->name, " total = $tot_size\n";
    }
    print "DONE\n";
}

From avilella at gmail.com Fri Oct 27 09:39:41 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:39:41 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
In-Reply-To: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
References: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
Message-ID: <358f4d650610270639q14870a6erae2e3c4e9063105d@mail.gmail.com>

To answer my own question, I think I found the way:

my $tree = $treeio->next_tree;

# sum the branch lengths, skipping nodes without one (e.g. the root)
my $total_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless (defined($branch_length));
    $total_branch_length += $branch_length;
}

# rescale every defined branch length by the total
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless (defined($branch_length));
    $node->branch_length($branch_length/$total_branch_length);
}

# sanity check: the rescaled lengths should now sum to 1
my $new_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless (defined($branch_length));
    $new_branch_length += $branch_length;
}

On 10/27/06, Albert Vilella wrote:
> Hi all,
>
> I am in need of a method that would scale the different branch lengths
> of a tree so that after the scaling they all sum up to exactly 1.
>
> Any pointers? Has anyone done that before?
>
> Thanks in advance,
>
> Albert.
>

From cjfields at uiuc.edu Fri Oct 27 10:35:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 09:35:35 -0500
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <4541CBA8.10006@sheffield.ac.uk>
Message-ID: <001501c6f9d5$2e33e120$15327e82@pyrimidine>

...
> I think it makes sense to test that data of the expected type was
> returned by the external resource but not to test the specifics of what
> was returned. If specifics are tested we are then in the realm of testing
> whether we believe the data returned by the external resource or not. We
> should assume that the domain experts for these resources know what they
> are doing - in some cases this might not be true :-) but I think we
> should stick to testing that the objects created hold the expected type
> of data.
>
> I like what Chris had to say (above) but wonder whether tests
> would/should be tested for in the module itself - i.e. testing that a
> stored value is an integer and warn/throw if not?
>
> Nath

Yeah, sorry about the top post (stupid Outlook always sticks the sig at the top of the page!).

Testing in the module would be best but can be tricky for the very same reasons that writing tests is, even more so. For instance, for NCBI esummary data, I parse the data in a very generic way in order to have access to as much data as possible. For tests, I have to assume that NCBI will always return a particular type of value (string, integer, date). I can test for each of those with a regex in the module fairly simply and throw/warn, as you indicate.
However, if they decide to add new data with a data tag other that the ones I test for in the module (i.e. String, Integer, Date), I suddenly have warns/throws showing up and cluttering/clobbering the code for perfectly valid data. However, if these are caught in tests and the tests fail, no big loss. The actual module still works, even if the tests are failing based on an new unknown value being returned. For me, failed tests are sort of a warning light to let me know that something has changed, but it doesn't necessarily mean a module doesn't work. I generally use throw/warn for something truly catastrophic, like no response from the server or an error in the XML, which affects downstream methods. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Oct 27 11:09:36 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 10:09:36 -0500 Subject: [Bioperl-l] Caching sequences In-Reply-To: <454202C7.1040701@sheffield.ac.uk> Message-ID: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> > I have a script that is capable of downloading sequences from GenBank > based on GI numbers. I retrieve them if fasta format in order to save > bandwidth, but I'd like to take this one step further and cache the > sequences in case the user want to rerun the script using some of the > GI's they used previously. > > Does anyone have any guidance on how best to do this? > > Cheers > Nath There is Bio::DB::InMemoryCache, which is really an interface but appears to have several methods defined; you could look for modules which implement it. Sendu's suggestion of the Bio::Index modules and bioperl-db are also good starting points. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Oct 27 11:21:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 10:21:49 -0500 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <45421556.9060300@sheffield.ac.uk> Message-ID: <001701c6f9db$9f90d160$15327e82@pyrimidine> > Brian Osborne wrote: > > Nathan, > > > > I don't know how this is supposed to work, there would be different ways > to > > make is_prototype true. One way would be to make the enzyme with the > first > > occurrence of a given restriction site the prototype (and the next > enzymes > > with the same site are isoschizomers). Or, one could wait until one site > had > > appeared twice, with 2 different enzymes, then make the first the > prototype, > > etc. I would have done it the first way myself but I took a quick look > at > > IO/withrefm.pm and it looks like it's doing it the second way. That > means > > one can read an enzyme file and end up with no duplicated restriction > sites, > > or prototypes and isoschizomers. > > > > Brian O. > > > > > Hmm, I'd have done it the first way also. Doing it the second way would > mean you only ended up with something as a prototype if there were > multiple enzymes with the same restriction site - is that correct > biologically? > > Nath I had a look at all the Restriction::IO modules a while back; most need serious updating! It just hasn't been a top priority unfortunately. I think the prototype issue may depend on the IO format and whether or not one is defined explicitly in the file being parsed or is just chosen based on what Brian said (order in the file, similar cutting site). By the strictest definition (and cheating by looking at the Fermentas web site), the prototype is supposed to be the first enzyme discovered which cleaves a unique sequence, so it may not be the first enzyme found in the file. Isoschizomers are those discovered to cleave the same sequence subsequent to the prototype. 
Neoschizomers cleave the same sequence as a prototype but at a different site. So this calls into question whether the prototype should be defined at all unless it is specifically indicated in the file. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Fri Oct 27 12:47:53 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 16:47:53 +0000 Subject: [Bioperl-l] Caching sequences In-Reply-To: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com> References: <454202C7.1040701@sheffield.ac.uk> <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com> Message-ID: <45423839.9040503@sheffield.ac.uk> Jason Stajich wrote: > Bio::DB::FileCache does one better and lets you cache the data in a > persistent file. Not sure this index is shareable among users though > - bioperl-db is a better soln when that is desired. Thanks I'll have a look into it. No need for being sharable among users - not unless the script becomes heavily used. Thanks Nath From cjfields at uiuc.edu Fri Oct 27 12:15:00 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 11:15:00 -0500 Subject: [Bioperl-l] StandAloneFasta.t bioperl-run tests Message-ID: <000101c6f9e3$0e5e95d0$15327e82@pyrimidine> Nathan, The test fails you posted on the wiki seem to indicate that using the wrapper works but the order of the returned hits is off. Does the order of the returned hits match the actual FASTA report order? If it does then the tests need to be fixed in a way to make it more flexible, to account for some data 'fuzziness' due to variations in output based on different versions. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign

From jason at bioperl.org Fri Oct 27 12:50:54 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 27 Oct 2006 09:50:54 -0700
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EA78.3050404@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> <4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk>
Message-ID: <1230E110-01AB-4D4E-842F-20B939555299@bioperl.org>

I've answered to this effect multiple times in the past on the mailing list.

Newick format does not distinguish between internal ids and bootstrap values (or whatever else you want to attach there). Different programs have different conventions. When both values are present and encoded so that we can parse out the bootstrap like this: [BOOTSTRAP], the parser grabs it out.

If you know all the internal ids are bootstraps you can just copy the values over manually very simply:

for my $node ( grep { ! $_->is_Leaf } $tree->get_nodes ) {  # get all the internal nodes
    # copy id to bootstrap
    $node->bootstrap($node->id) if defined $node->id && length($node->id);
    $node->id('');  # set internal id to empty
}

If someone can make this clearer on a wiki page that would be great.

On Oct 27, 2006, at 4:16 AM, Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Himanshu Ardawatia wrote:
>>>>
>>>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>>>> #################################
>>>> (
>>>> ('Chimp' : 0.052,
>>>> 'Human' : 0.042) 0.71 : 0.007,
>>>> 'Gorilla' : 0.060,
>>>> ('Gibbon' : 0.124,
>>>> 'Orangutan' : 0.0971) 1 : 0.038
>>>> );
>>>> #################################
>>>>
>>> Are you sure this is in the correct format?
>>>
>>
>> He/she may have a tree that already contains bootstrap values output
>> from another program. If this is so, which program did you use?
>> Without >> reminding myself of the formats, you should lookup newick format and >> whther it is possible to store bootstraps in it. In addition you >> should >> also look up the nhx format. > > Ah, well from a brief google it seemed like some software do store > boostrap values for internal nodes as the node ids when outputting in > Newick format. I don't think Bioperl should be able to tell the > difference between a normal id and a bootstrap value, so you'll > have to > detect that yourself and manually use bootstrap() when you get an id > that looks like a number. > > Or should Bioperl be making this assumption for you? Is that a safe > thing to do? Maybe as an option only? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From avilella at gmail.com Fri Oct 27 09:23:07 2006 From: avilella at gmail.com (Albert Vilella) Date: Fri, 27 Oct 2006 14:23:07 +0100 Subject: [Bioperl-l] scale branch lengths of a tree to sum 1 Message-ID: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com> Hi all, I am in need of a method that would scale the different branch lengths of a tree so that after the scaling they all sum up to exactly 1. Any pointers? Has anyone done that before? Thanks in advance, Albert. 
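
For reference, the scaling asked about above can be sketched without any BioPerl objects at all. This is only an illustration under stated assumptions: the hash of node names to branch lengths and the helper name scale_to_unit_sum are hypothetical stand-ins, and with a real Bio::Tree object the values would be read and written through $node->branch_length as in the thread. Nodes without a branch length (typically the root) are left untouched.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a tree: node name => branch length.
# The root has no branch length, mirroring a node for which
# branch_length is undefined.
my %branch_length = (
    root  => undef,
    inner => 0.5,
    A     => 1.5,
    B     => 2.0,
);

# scale_to_unit_sum (hypothetical helper): returns a new hash in which
# every defined branch length has been divided by the total, so that
# the defined values sum to 1.
sub scale_to_unit_sum {
    my ($lengths) = @_;

    my $total = 0;
    for my $len (values %$lengths) {
        $total += $len if defined $len;
    }
    die "total branch length is zero, cannot scale\n" unless $total > 0;

    my %scaled;
    for my $name (keys %$lengths) {
        my $len = $lengths->{$name};
        $scaled{$name} = defined $len ? $len / $total : undef;
    }
    return \%scaled;
}

my $scaled = scale_to_unit_sum(\%branch_length);
for my $name (sort keys %$scaled) {
    my $len = $scaled->{$name};
    printf "%-5s %s\n", $name, defined $len ? sprintf('%.4f', $len) : 'undef';
}
```

Returning a new hash rather than modifying the input keeps the original lengths available for comparison; with floating-point values the rescaled sum may differ from 1 by a rounding error in general.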
From cjfields at uiuc.edu Fri Oct 27 14:34:57 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 13:34:57 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign Message-ID: <000001c6f9f6$9ab12710$15327e82@pyrimidine> I am working an refactoring the AlignIO::stockholm parser to get it reading and writing Pfam/Rfam alignments, and noticed that many alignments have EMBL-like annotations attached, which pertain to the entire alignment: # STOCKHOLM 1.0 #=GF ID ykkC-yxkD #=GF AC RF00442 #=GF DE ykkC-yxkD element #=GF AU Moxon SJ #=GF GA 20.0 #=GF NC 0.1 #=GF TC 59.4 #=GF SE Barrick JE, Breaker RR #=GF SS Predicted; Barrick JE, Breaker RR #=GF TP Cis-reg; riboswitch; #=GF BM cmbuild CM SEED #=GF BM cmsearch -W 175 CM SEQDB #=GF RN [1] #=GF RM 15096624 #=GF RT New RNA motifs suggest an expanded scope for riboswitches in #=GF RT bacterial genetic control. #=GF RA Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee #=GF RA M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR; #=GF RL Proc Natl Acad Sci U S A 2004;101:6421-6426. #=GF CC This family represents the bacterial ykkC/yxkD element. The function of #=GF CC this family is unclear although it has been suggested that it may function #=GF CC to switch on efflux pumps and detoxification systems in response to harmful #=GF CC environmental molecules [1]. The Thermoanaerobacter tengcongensis sequence #=GF CC EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that the two #=GF CC riboswitches may work in conjunction to regulate the the upstream gene #=GF CC which codes for Swiss:Q8RC62, a member of Pfam:PF00860 (Personal obs. Moxon #=GF CC SJ). #=GF SQ 16 SimpleAlign, as implemented, seemingly doesn't have a way to store this information. I'll work on getting the core alignment IO working, but would there be any interest in having a way to store annotations in Bio::SimpleAlign? I'm guessing the methods would be similar to the various Bio::Seq Annotation methods. 
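
To make the idea concrete, here is a minimal sketch of collecting the #=GF lines shown above into a per-alignment structure. It is not the AlignIO::stockholm implementation and uses no Bio::SimpleAlign API; the helper name parse_gf_lines and the tag-to-list hash layout are assumptions made purely for illustration.

```perl
use strict;
use warnings;

# parse_gf_lines (hypothetical helper): collect Stockholm
# "#=GF <tag> <text>" lines into a hash of tag => array of values,
# keeping the values of each tag in file order.
sub parse_gf_lines {
    my (@lines) = @_;
    my %annotation;
    for my $line (@lines) {
        # the tag is the first whitespace-delimited token after "#=GF"
        next unless $line =~ /^#=GF\s+(\S+)\s+(.*?)\s*$/;
        push @{ $annotation{$1} }, $2;
    }
    return \%annotation;
}

my @stockholm = (
    '# STOCKHOLM 1.0',
    '#=GF ID   ykkC-yxkD',
    '#=GF AC   RF00442',
    '#=GF RT   New RNA motifs suggest an expanded scope for riboswitches in',
    '#=GF RT   bacterial genetic control.',
    '#=GF SQ   16',
);

my $ann = parse_gf_lines(@stockholm);
print "ID: $ann->{ID}[0]\n";
print "AC: $ann->{AC}[0]\n";
# Multi-line fields such as RT arrive as several values; join them for display.
print "RT: ", join(' ', @{ $ann->{RT} }), "\n";
```

Keeping a list per tag means repeated tags such as RT, RA and CC do not overwrite each other, which is roughly what an annotation collection attached to the alignment would need to support.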
Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Fri Oct 27 16:23:46 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 27 Oct 2006 16:23:46 -0400 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <000001c6f9f6$9ab12710$15327e82@pyrimidine> References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> Message-ID: You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose this is what you meant by the 'various Bio::Seq Annotation methods' too.) Just to make sure I'm not misunderstanding, I suppose the annotation pertains to the entire alignment? -hilmar On Oct 27, 2006, at 2:34 PM, Chris Fields wrote: > I am working an refactoring the AlignIO::stockholm parser to get it > reading > and writing Pfam/Rfam alignments, and noticed that many alignments > have > EMBL-like annotations attached, which pertain to the entire alignment: > > # STOCKHOLM 1.0 > #=GF ID ykkC-yxkD > #=GF AC RF00442 > #=GF DE ykkC-yxkD element > #=GF AU Moxon SJ > #=GF GA 20.0 > #=GF NC 0.1 > #=GF TC 59.4 > #=GF SE Barrick JE, Breaker RR > #=GF SS Predicted; Barrick JE, Breaker RR > #=GF TP Cis-reg; riboswitch; > #=GF BM cmbuild CM SEED > #=GF BM cmsearch -W 175 CM SEQDB > #=GF RN [1] > #=GF RM 15096624 > #=GF RT New RNA motifs suggest an expanded scope for > riboswitches in > #=GF RT bacterial genetic control. > #=GF RA Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, > Collins J, > Lee > #=GF RA M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR; > #=GF RL Proc Natl Acad Sci U S A 2004;101:6421-6426. > #=GF CC This family represents the bacterial ykkC/yxkD element. The > function of > #=GF CC this family is unclear although it has been suggested > that it may > function > #=GF CC to switch on efflux pumps and detoxification systems in > response > to harmful > #=GF CC environmental molecules [1]. 
The Thermoanaerobacter > tengcongensis > sequence > #=GF CC EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that > the two > #=GF CC riboswitches may work in conjunction to regulate the the > upstream > gene > #=GF CC which codes for Swiss:Q8RC62, a member of Pfam:PF00860 > (Personal > obs. Moxon > #=GF CC SJ). > #=GF SQ 16 > > SimpleAlign, as implemented, seemingly doesn't have a way to store > this > information. > > I'll work on getting the core alignment IO working, but would there > be any > interest in having a way to store annotations in Bio::SimpleAlign? > I'm > guessing the methods would be similar to the various Bio::Seq > Annotation > methods. > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Oct 27 16:38:17 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 15:38:17 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: Message-ID: <000001c6fa07$d8659990$15327e82@pyrimidine> Hilmar Lapp wrote: > You could make SimpleAlign be a Bio::AnnotationHolderI. (I > suppose this is what you meant by the 'various Bio::Seq Annotation > methods' too.) > > Just to make sure I'm not misunderstanding, I suppose the > annotation pertains to the entire alignment? > > -hilmar ... Yes, that's correct. I would probably use Bio::Seq::Meta for the sequence-specific markup lines. I would have to add another new method to deal with non-sequence-based consensus data (like sec. structure) for now. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Fri Oct 27 11:38:05 2006 From: jason at bioperl.org (Jason Stajich) Date: Fri, 27 Oct 2006 08:38:05 -0700 Subject: [Bioperl-l] Caching sequences In-Reply-To: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> References: <454202C7.1040701@sheffield.ac.uk> <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> Message-ID: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com> Bio::DB::FileCache does one better and lets you cache the data in a persistent file. Not sure this index is shareable among users though - bioperl-db is a better soln when that is desired. -jason On 10/27/06, Chris Fields wrote: > > > I have a script that is capable of downloading sequences from GenBank > > based on GI numbers. I retrieve them if fasta format in order to save > > bandwidth, but I'd like to take this one step further and cache the > > sequences in case the user want to rerun the script using some of the > > GI's they used previously. > > > > Does anyone have any guidance on how best to do this? > > > > Cheers > > Nath > > There is Bio::DB::InMemoryCache, which is really an interface but appears > to > have several methods defined; you could look for modules which implement > it. > Sendu's suggestion of the Bio::Index modules and bioperl-db are also good > starting points. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. 
of Biochemistry > University of Illinois Urbana-Champaign > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich jason at bioperl.org http://www.duke.edu/~jes12/ From cjfields at uiuc.edu Fri Oct 27 21:57:58 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 20:57:58 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> Message-ID: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu> On Oct 27, 2006, at 3:23 PM, Hilmar Lapp wrote: > You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose > this is what you meant by the 'various Bio::Seq Annotation methods' > too.) > > Just to make sure I'm not misunderstanding, I suppose the annotation > pertains to the entire alignment? > > -hilmar BTW, was that supposed to be Bio::AnnotatableI, or Bio::AnnotationHolderI? The latter isn't present in CVS HEAD. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From eric.ross at neuro.utah.edu Sat Oct 28 17:24:30 2006 From: eric.ross at neuro.utah.edu (Eric Ross) Date: Sat, 28 Oct 2006 15:24:30 -0600 Subject: [Bioperl-l] PAML References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object. I am able to extract other data from the report, but there seems to be a conflict in the documentation. One doc implies that there should be a get_posteriors method. (It's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object. 
I have been trying various methods, in the event I'm just "confused", but I've had no luck, thus far. Anyone have suggestions?

code:

----begin code-------
#!/usr/bin/perl -w

use strict;

use Bio::Tools::Phylo::PAML;

my $parser = Bio::Tools::Phylo::PAML->new(-file => "mlc");
my $result = $parser->next_result;
my @posteriors = $result->get_posteriors();

print "@posteriors";

exit(0);
---------end code-------------

---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab

From avilella at gmail.com Sun Oct 29 05:52:04 2006
From: avilella at gmail.com (Albert Vilella)
Date: Sun, 29 Oct 2006 10:52:04 +0000
Subject: [Bioperl-l] PAML
In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>

I don't know if this method is implemented. I can't grep-find it. Maybe it's simply not there yet, but was planned when the documentation was written.

On 10/28/06, Eric Ross wrote:
> I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>
> I am able to extract other data from the report, but there seems to be a conflict in the documentation. One doc implies that there should be a get_posteriors method. (It's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object.
>
>
> I have been trying various methods, in the event I'm just "confused", but I've had no luck, thus far. Anyone have suggestions?
> > > code: > > ----begin code------- > #!/usr/bin/perl -w > > use strict; > > > use Bio::Tools::Phylo::PAML; > my $parser = new Bio::Tools::Phylo::PAML > (-file => "mlc"); > my $result = $parser->next_result; > my @posteriors = $result->get_posteriors(); > > print "@posteriors"; > > exit(0); > > ---------end code------------- > > > > --------------- > Eric Ross > Computer Analyst II > ejr at neuro.utah.edu > Howard Hughes Medical Institute > University of Utah > S?nchez Lab > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Sun Oct 29 09:23:45 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 29 Oct 2006 08:23:45 -0600 Subject: [Bioperl-l] PAML In-Reply-To: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> Message-ID: <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu> Does the data show up in the object using Data::Dumper? This should be filed as a bug since the docs imply the method exists. This could be written up fairly quickly if one had test data and and a script to work with (hint hint...) Chris On Oct 29, 2006, at 4:52 AM, Albert Vilella wrote: > I don't know if this method is implemented. I can't grep-find it. > Maybe it's simply not there yet, but was planned when the > documentation was written. > > On 10/28/06, Eric Ross wrote: >> I am trying to extract the "Naive Empirical Bayes (NEB) >> probabilities" from a Bio::Tools::Phylo::PAML::Result object. >> >> I am able to extract other data from the report, but there seems >> to be a conflict in the documentation. One doc implies that there >> should be a get_posteriors method. 
(It's used as an example in the >> Bio::Tools::Phylo::PAML doc), but the method does not appear to >> exist in the Bio::Tools::Phylo::PAML::Result object. >> >> >> I have been trying various methods, in the event I'm just >> "confused", but I've had no luck, thus far. Anyone have suggestions? >> >> >> code: >> >> ----begin code------- >> #!/usr/bin/perl -w >> >> use strict; >> >> >> use Bio::Tools::Phylo::PAML; >> my $parser = new Bio::Tools::Phylo::PAML >> (-file => "mlc"); >> my $result = $parser->next_result; >> my @posteriors = $result->get_posteriors(); >> >> print "@posteriors"; >> >> exit(0); >> >> ---------end code------------- >> >> >> >> --------------- >> Eric Ross >> Computer Analyst II >> ejr at neuro.utah.edu >> Howard Hughes Medical Institute >> University of Utah >> S?nchez Lab >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From eric.ross at neuro.utah.edu Sun Oct 29 12:06:54 2006 From: eric.ross at neuro.utah.edu (Eric Ross) Date: Sun, 29 Oct 2006 10:06:54 -0700 Subject: [Bioperl-l] PAML References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu> Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu> Thanks for all the help. I've been looking at the code for the PAML rst parser. It's a bit tricky. 
We have written a parser specific for our needs, but it looks to be a pretty complicated matter to make it generic. The output of PAML can vary a lot depending upon your options and this section can be repeated multiple times. I'm sure someone with a good grasp of the potential output of PAML could come up with something, but I'll admit to being at a loss. --------------- Eric Ross Computer Analyst II ejr at neuro.utah.edu Howard Hughes Medical Institute University of Utah S?nchez Lab -----Original Message----- From: Chris Fields [mailto:cjfields at uiuc.edu] Sent: Sun 2006-10-29 7:23 AM To: Albert Vilella Cc: Eric Ross; Bioperl-l at lists.open-bio.org Subject: Re: [Bioperl-l] PAML Does the data show up in the object using Data::Dumper? This should be filed as a bug since the docs imply the method exists. This could be written up fairly quickly if one had test data and and a script to work with (hint hint...) Chris On Oct 29, 2006, at 4:52 AM, Albert Vilella wrote: > I don't know if this method is implemented. I can't grep-find it. > Maybe it's simply not there yet, but was planned when the > documentation was written. > > On 10/28/06, Eric Ross wrote: >> I am trying to extract the "Naive Empirical Bayes (NEB) >> probabilities" from a Bio::Tools::Phylo::PAML::Result object. >> >> I am able to extract other data from the report, but there seems >> to be a conflict in the documentation. One doc implies that there >> should be a get_posteriors method. (It's used as an example in the >> Bio::Tools::Phylo::PAML doc), but the method does not appear to >> exist in the Bio::Tools::Phylo::PAML::Result object. >> >> >> I have been trying various methods, in the event I'm just >> "confused", but I've had no luck, thus far. Anyone have suggestions? 
>> >> >> code: >> >> ----begin code------- >> #!/usr/bin/perl -w >> >> use strict; >> >> >> use Bio::Tools::Phylo::PAML; >> my $parser = new Bio::Tools::Phylo::PAML >> (-file => "mlc"); >> my $result = $parser->next_result; >> my @posteriors = $result->get_posteriors(); >> >> print "@posteriors"; >> >> exit(0); >> >> ---------end code------------- >> >> >> >> --------------- >> Eric Ross >> Computer Analyst II >> ejr at neuro.utah.edu >> Howard Hughes Medical Institute >> University of Utah >> S?nchez Lab >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Sun Oct 29 12:43:20 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Sun, 29 Oct 2006 17:43:20 +0000 Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage In-Reply-To: <45421658.5000103@sheffield.ac.uk> References: <45421658.5000103@sheffield.ac.uk> Message-ID: <4544E838.7090400@sheffield.ac.uk> Sorry for the repeat post but I haven't had a response. Just wondered if anyone had any idea about this? Thanks Nath Nathan S. Haigh wrote: > As you may be aware by now, i'm working with Bio::Restriction::Analysis > and friends. > > I'm doing restriction analysis on large sequences - chromosomes. I need > to identify an appropriate enzyme based on the total length of fragments > that are of a certain size (e.g. 100 - 500 bp). However, the amount of > memory used by Bio::Restriction::Analysis::fragments() is prohibative. 
I > have the following code (bottom) which downloads 2 thaliana chromosomes > (mito and chloro - so pretty small) and runs an analysis and then loops > through the fragments for all enzymes in the default collection. > > My memory usage just keep on climbing and none seems to get freed up > even when a $ra goes out of scope (start dealing with the next > sequence). Is this a memory leak of some sort, is there a way to free up > memory as I go? I'd appreciate any help/advice on how to reduce the > amount of memory being consumed as I'd like to use all the thaliana > chromosomes (not just mito and chloro), which at the moment probably > won't work. > > Cheers > Nath > > use strict; > use Bio::DB::GenBank; > use Bio::Restriction::Analysis; > use Bio::Restriction::EnzymeCollection; > > my @seq_objs; > my @gis = ( 7525012, 26556996 ); > > my $db = Bio::DB::GenBank->new(-format => "fasta"); > foreach my $gi (@gis) { > print "Getting GI: $gi\n"; > push @seq_objs, $db->get_Seq_by_id($gi) > } > > my $min_fragment_size = 100; > my $max_fragment_size = 500; > my $enz_Coll = Bio::Restriction::EnzymeCollection->new(); > > foreach my $seq (@seq_objs) { > my $tot_size = 0; > print "Processing ", $seq->primary_id,"\n"; > my $ra = Bio::Restriction::Analysis->new( > -seq=>$seq, > -enzymes=>$enz_Coll, > ); > > my @all_enzymes = $ra->cutters->each_enzyme; > print " Calc total length of fragments in range: $min_fragment_size - > $max_fragment_size\n"; > foreach my $enzyme ( @all_enzymes ) { > # fragments() is a real memory hog > foreach my $frag ($ra->fragments($enzyme)) { > next if $min_fragment_size && (length $frag < $min_fragment_size); > next if $max_fragment_size && (length $frag > $max_fragment_size); > $tot_size += length $frag; > } > # do something based on value of $tot_size > #print " ", $enzyme->name, " total = $tot_size\n"; > } > print "DONE\n"; > } > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > 
http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Sun Oct 29 13:09:54 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 29 Oct 2006 12:09:54 -0600 Subject: [Bioperl-l] PAML In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu> References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu> Message-ID: On Oct 29, 2006, at 11:06 AM, Eric Ross wrote: > Thanks for all the help. > > I've been looking at the code for the PAML rst parser. It's a bit > tricky. > > We have written a parser specific for our needs, but it looks to be > a pretty complicated matter to make it generic. > > The output of PAML can vary a lot depending upon your options and > this section can be repeated multiple times. I'm sure someone with > a good grasp of the potential output of PAML could come up with > something, but I'll admit to being at a loss. Eric, I planned on looking at ways to integrate the protein-based PAML programs but I'm working on a different area at the moment. I agree it may be hard to adequately genericize parsing/methods to accomplish this, but if you have any ideas feel free to post them. Again, I would suggest adding any proposed enhancements or bugs to Bugzilla: http://bugzilla.open-bio.org/ Suggestions or bug reports on the list sometimes get lost in the shuffle, esp. since we're planning on a new developer release soon. Chris Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Oct 29 13:16:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 29 Oct 2006 12:16:37 -0600 Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage In-Reply-To: <4544E838.7090400@sheffield.ac.uk> References: <45421658.5000103@sheffield.ac.uk> <4544E838.7090400@sheffield.ac.uk> Message-ID: <6D9EAA04-199C-4BDD-AA60-4833BC1CE250@uiuc.edu> On Oct 29, 2006, at 11:43 AM, Nathan S. Haigh wrote: > Sorry for the repeat post but I haven't had a response. Just > wondered if > anyone had any idea about this? > > Thanks > Nath ... I think Warnock applies here. Likely no one is really sure, hence they aren't answering. It probably bears investigating by submitting and tracking as a bug. My guess is something isn't garbage-collected properly (i.e. there are circular references present), leading to a memory leak. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From chhalling at alumni.ls.berkeley.edu Sun Oct 29 14:16:36 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Sun, 29 Oct 2006 14:16:36 -0500 Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage In-Reply-To: <4544E838.7090400@sheffield.ac.uk> References: <45421658.5000103@sheffield.ac.uk> <4544E838.7090400@sheffield.ac.uk> Message-ID: <4544FE14.7030701@alumni.ls.berkeley.edu> Nathan S. Haigh wrote: > Sorry for the repeat post but I haven't had a response. Just wondered if > anyone had any idea about this? > > Thanks > Nath > > Nathan S. Haigh wrote: > >> As you may be aware by now, i'm working with Bio::Restriction::Analysis >> and friends. >> >> I'm doing restriction analysis on large sequences - chromosomes. I need >> to identify an appropriate enzyme based on the total length of fragments >> that are of a certain size (e.g. 100 - 500 bp). 
However, the amount of >> memory used by Bio::Restriction::Analysis::fragments() is prohibative. I >> have the following code (bottom) which downloads 2 thaliana chromosomes >> (mito and chloro - so pretty small) and runs an analysis and then loops >> through the fragments for all enzymes in the default collection. >> >> My memory usage just keep on climbing and none seems to get freed up >> even when a $ra goes out of scope (start dealing with the next >> sequence). Is this a memory leak of some sort, is there a way to free up >> memory as I go? I'd appreciate any help/advice on how to reduce the >> amount of memory being consumed as I'd like to use all the thaliana >> chromosomes (not just mito and chloro), which at the moment probably >> won't work. >> >> Cheers >> Nath >> >> use strict; >> use Bio::DB::GenBank; >> use Bio::Restriction::Analysis; >> use Bio::Restriction::EnzymeCollection; >> >> my @seq_objs; >> my @gis = ( 7525012, 26556996 ); >> >> my $db = Bio::DB::GenBank->new(-format => "fasta"); >> foreach my $gi (@gis) { >> print "Getting GI: $gi\n"; >> push @seq_objs, $db->get_Seq_by_id($gi) >> } >> >> my $min_fragment_size = 100; >> my $max_fragment_size = 500; >> my $enz_Coll = Bio::Restriction::EnzymeCollection->new(); >> >> foreach my $seq (@seq_objs) { >> my $tot_size = 0; >> print "Processing ", $seq->primary_id,"\n"; >> my $ra = Bio::Restriction::Analysis->new( >> -seq=>$seq, >> -enzymes=>$enz_Coll, >> ); >> >> my @all_enzymes = $ra->cutters->each_enzyme; >> print " Calc total length of fragments in range: $min_fragment_size - >> $max_fragment_size\n"; >> foreach my $enzyme ( @all_enzymes ) { >> # fragments() is a real memory hog >> foreach my $frag ($ra->fragments($enzyme)) { >> next if $min_fragment_size && (length $frag < $min_fragment_size); >> next if $max_fragment_size && (length $frag > $max_fragment_size); >> $tot_size += length $frag; >> } >> # do something based on value of $tot_size >> #print " ", $enzyme->name, " total = $tot_size\n"; 
>> }
>> print "DONE\n";
>> }
>>
>>

Try this code, which creates a new Bio::Restriction::Analysis object for
each digest. On my PowerBook, this doesn't use more than 13 Mb of memory.

Reading the code for Bio::Restriction::Analysis reveals that the
fragments() method calls the cut() method. The documentation for the cut
method states:

    Note: cut doesn't now re-initialize everything before figuring out
    cuts. This is so that you can do multiple digests, or add more data
    or whatever. You'll have to use new to reset everything.

This means there is no memory leak; it's just that the
Bio::Restriction::Analysis object is retaining cut information for each
enzyme, which takes a lot of memory.

use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012, 26556996 );

my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
    print "Getting GI: $gi\n";
    push @seq_objs, $db->get_Seq_by_id($gi)
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
    print "Processing ", $seq->primary_id, "\n";
    foreach my $enzyme ( $enz_Coll->each_enzyme() ) {
        my $ra = Bio::Restriction::Analysis->new(
            -seq     => $seq,
            -enzymes => $enzyme
        );
        my $tot_size = 0;
        print " Calc total length of fragments in range: $min_fragment_size -"
            . " $max_fragment_size\n";
        foreach my $frag ($ra->fragments($enzyme)) {
            next if $min_fragment_size && (length $frag < $min_fragment_size);
            next if $max_fragment_size && (length $frag > $max_fragment_size);
            $tot_size += length $frag;
        }
        # do something based on value of $tot_size
        print " ", $enzyme->name, " total = $tot_size\n";
    }
    print "DONE\n";
}

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu

From n.haigh at sheffield.ac.uk Mon Oct 30 03:51:49 2006
From: n.haigh at sheffield.ac.uk (Nathan S.
Haigh) Date: Mon, 30 Oct 2006 08:51:49 +0000 Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id() Message-ID: <4545BD25.3030107@sheffield.ac.uk> In my script I retrieve sequences from GenBank in FASTA format by GI numbers and optionally store the sequence in a cache using Bio::DB::Fasta. On subsequent runs of the script, the cache is first checked for the GI and returns the sequence if it is found or the sequence is obtained from GenBank as above. I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have returned a Bio::Seq object but rather it returns a Bio::PrimarySeq object which is defined within the Bio::DB::Fasta file. This is annoying, since $seq_obj in my script would be either a Bio::Seq if it was obtained from GenBank or a Bio::PrimarySeq if obtained from the cache and calling primary_id() on it doesn't do the expected thing with Bio::PrimarySeq: ID: Bio::PrimarySeq::Fasta=HASH(0x89b4508) Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object? Nath From yuhki at ncifcrf.gov Mon Oct 30 08:57:35 2006 From: yuhki at ncifcrf.gov (Naoya Yuhki) Date: Mon, 30 Oct 2006 08:57:35 -0500 Subject: [Bioperl-l] bptutorial.pl 0 Message-ID: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov> Hello, I run perl bptutorial.pl 0 and I got the following error. -------------------- WARNING --------------------- MSG: id (ROA1_HUMAN) does not exist --------------------------------------------------- Can't call method "display_id" on an undefined value at bptutorial.pl line 3945. other tests all worked. I thank any suggestions from you. NAOYA YUHKI. From cjfields at uiuc.edu Mon Oct 30 12:42:21 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 30 Oct 2006 11:42:21 -0600 Subject: [Bioperl-l] bptutorial.pl 0 In-Reply-To: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov> Message-ID: <000601c6fc4a$c3e43450$15327e82@pyrimidine> > Hello, > I run > > perl bptutorial.pl 0 > > and I got the following error. 
>
> -------------------- WARNING ---------------------
> MSG: id (ROA1_HUMAN) does not exist
> ---------------------------------------------------
> Can't call method "display_id" on an undefined value at bptutorial.pl
> line 3945.
>
> other tests all worked.
>
> I thank any suggestions from you.
>
> NAOYA YUHKI.

What version of Bioperl are you running?

As a warning, the bptutorial.pl script has been removed from CVS and will
not be included in future versions of Bioperl. It can be found on the
bioperl wiki instead:

http://www.bioperl.org/wiki/Bptutorial

chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Mon Oct 30 13:08:15 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 30 Oct 2006 10:08:15 -0800
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <29F47393-D134-4093-8751-E948BF521843@bioperl.org>

Bio::PrimarySeq makes sense because Fasta databases only provide
sequences without features. But you are actually getting a
Bio::PrimarySeq::Fasta object, which is a proxy object, since the module
won't pull a whole sequence into memory unless seq() is requested.

The problem is really why you are getting something useless set for
primary_id. What do you want it to be - the GI number? You'll need to
explicitly set it because DB::Fasta has no concept of GI numbers encoded
in the header line. AFAIK you also cannot set the primary_id to a value
of your liking because this is a proxy object.

The best bet is to create a Bio::Seq object out of one of these and set
the primary_id and display_id to values that you can compute from the
display_id. At least that has been my strategy when using this - maybe
someone wants to code something new into the object itself.

-jason

On Oct 30, 2006, at 12:51 AM, Nathan S.
Haigh wrote: > In my script I retrieve sequences from GenBank in FASTA format by GI > numbers and optionally store the sequence in a cache using > Bio::DB::Fasta. On subsequent runs of the script, the cache is first > checked for the GI and returns the sequence if it is found or the > sequence is obtained from GenBank as above. > > I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have > returned a Bio::Seq object but rather it returns a Bio::PrimarySeq > object which is defined within the Bio::DB::Fasta file. This is > annoying, since $seq_obj in my script would be either a Bio::Seq if it > was obtained from GenBank or a Bio::PrimarySeq if obtained from the > cache and calling primary_id() on it doesn't do the expected thing > with > Bio::PrimarySeq: > ID: Bio::PrimarySeq::Fasta=HASH(0x89b4508) > > Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object? > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From golharam at umdnj.edu Mon Oct 30 15:11:51 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Mon, 30 Oct 2006 15:11:51 -0500 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? Message-ID: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> I'm trying to parse some blast output w/o actually creating the output file. Instead, I'm capturing the output in a variable and would like to use IO::String to represent the file: $_ = `megablast -d somedatabase -i somesequence -D 2`; my $blast_file = new IO::String($_); my $searchio = new Bio::SearchIO(-format => 'blast', -fh => $blast_file); my $results = $searchio->next_result; my $hit = $results->next_hit; if (! 
defined($hit)) { warn "No BLAST hit for $accession on chr $chr for Seq/$orth_id/$organism\n\n"; return; } Now, when Bio::SearchIO tries to read the output line by line, instead it reads the entire output as 1 line. If I provide the output in a file and use: my $searchio = new Bio::SearchIO(-format => 'blast', -file => '/tmp/somefile.blast'); This works...so is it possible to use IO::String to provide Bio::SearchIO with BLAST output? Ryan From golharam at umdnj.edu Mon Oct 30 15:54:29 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Mon, 30 Oct 2006 15:54:29 -0500 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com> Message-ID: <00e801c6fc65$9849aee0$e6028a0a@GOLHARMOBILE1> Thanks. How are you getting the output? system()? BTW- I'm using v1.5.1... > -----Original Message----- > From: Bernd Web [mailto:bernd.web at gmail.com] > Sent: Monday, October 30, 2006 3:45 PM > To: golharam at umdnj.edu > Cc: bioperl-l > Subject: Re: [Bioperl-l] Is it possible to parse BLAST output > using IO:String? > > > Hi Ryan, > > I parse blastn output using IO::String w/o problems: > > my $stringfh = new IO::String($input); > my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh); > > however this is input does not come via backticks. > > > bernd > > On 10/30/06, Ryan Golhar wrote: > > I'm trying to parse some blast output w/o actually creating > the output > > file. Instead, I'm capturing the output in a variable and > would like > > to use IO::String to represent the file: > > > > $_ = `megablast -d somedatabase -i somesequence -D 2`; > > my $blast_file = new IO::String($_); > > my $searchio = new Bio::SearchIO(-format => 'blast', -fh => > > $blast_file); > > my $results = $searchio->next_result; > > my $hit = $results->next_hit; > > if (! 
defined($hit)) { > > warn "No BLAST hit for $accession on chr $chr for > > Seq/$orth_id/$organism\n\n"; > > return; > > } > > > > Now, when Bio::SearchIO tries to read the output line by > line, instead > > it reads the entire output as 1 line. > > > > If I provide the output in a file and use: > > > > my $searchio = new Bio::SearchIO(-format => > 'blast', -file => > > '/tmp/somefile.blast'); > > > > This works...so is it possible to use IO::String to provide > > Bio::SearchIO with BLAST output? > > > > Ryan > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > From bix at sendu.me.uk Mon Oct 30 16:27:58 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 30 Oct 2006 21:27:58 +0000 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> Message-ID: <45466E5E.9000504@sendu.me.uk> Ryan Golhar wrote: > I'm trying to parse some blast output w/o actually creating the output > file. Instead, I'm capturing the output in a variable and would like to > use IO::String to represent the file: > > $_ = `megablast -d somedatabase -i somesequence -D 2`; > my $blast_file = new IO::String($_); > my $searchio = new Bio::SearchIO(-format => 'blast', -fh => > $blast_file); > my $results = $searchio->next_result; > my $hit = $results->next_hit; > if (! defined($hit)) { > warn "No BLAST hit for $accession on chr $chr for > Seq/$orth_id/$organism\n\n"; > return; > } > > Now, when Bio::SearchIO tries to read the output line by line, instead > it reads the entire output as 1 line. > > If I provide the output in a file and use: > > my $searchio = new Bio::SearchIO(-format => 'blast', -file => > '/tmp/somefile.blast'); > > This works...so is it possible to use IO::String to provide > Bio::SearchIO with BLAST output? 
Why must it be IO::String? Why not just open() your megablast and provide $searchio the real filehandle? It would be faster that way as well. Read the docs for `. Your usage above is inappropriate. From golharam at umdnj.edu Mon Oct 30 16:54:45 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Mon, 30 Oct 2006 16:54:45 -0500 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: Message-ID: <00f901c6fc6e$03916460$e6028a0a@GOLHARMOBILE1> Hmmm. Yes, I suppose I could. I did it with the backtick because I based my code off of the "To and >From a String" from the SeqIO HOWTO... -----Original Message----- From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich Sent: Monday, October 30, 2006 4:44 PM To: Sendu Bala Cc: golharam at umdnj.edu; 'bioperl-l' Subject: Re: [Bioperl-l] Is it possible to parse BLAST output using IO:String? right - can't you just do: my $fh; open($fh, "megablast -d ... | ") || die $!; my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh); On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote: Ryan Golhar wrote: I'm trying to parse some blast output w/o actually creating the output file. Instead, I'm capturing the output in a variable and would like to use IO::String to represent the file: $_ = `megablast -d somedatabase -i somesequence -D 2`; my $blast_file = new IO::String($_); my $searchio = new Bio::SearchIO(-format => 'blast', -fh => $blast_file); my $results = $searchio->next_result; my $hit = $results->next_hit; if (! defined($hit)) { warn "No BLAST hit for $accession on chr $chr for Seq/$orth_id/$organism\n\n"; return; } Now, when Bio::SearchIO tries to read the output line by line, instead it reads the entire output as 1 line. If I provide the output in a file and use: my $searchio = new Bio::SearchIO(-format => 'blast', -file => '/tmp/somefile.blast'); This works...so is it possible to use IO::String to provide Bio::SearchIO with BLAST output? Why must it be IO::String? 
Why not just open() your megablast and provide $searchio the real filehandle? It would be faster that way as well. Read the docs for `. Your usage above is inappropriate. _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From bernd.web at gmail.com Mon Oct 30 15:44:31 2006 From: bernd.web at gmail.com (Bernd Web) Date: Mon, 30 Oct 2006 21:44:31 +0100 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> Message-ID: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com> Hi Ryan, I parse blastn output using IO::String w/o problems: my $stringfh = new IO::String($input); my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh); however this is input does not come via backticks. bernd On 10/30/06, Ryan Golhar wrote: > I'm trying to parse some blast output w/o actually creating the output > file. Instead, I'm capturing the output in a variable and would like to > use IO::String to represent the file: > > $_ = `megablast -d somedatabase -i somesequence -D 2`; > my $blast_file = new IO::String($_); > my $searchio = new Bio::SearchIO(-format => 'blast', -fh => > $blast_file); > my $results = $searchio->next_result; > my $hit = $results->next_hit; > if (! defined($hit)) { > warn "No BLAST hit for $accession on chr $chr for > Seq/$orth_id/$organism\n\n"; > return; > } > > Now, when Bio::SearchIO tries to read the output line by line, instead > it reads the entire output as 1 line. 
> > If I provide the output in a file and use: > > my $searchio = new Bio::SearchIO(-format => 'blast', -file => > '/tmp/somefile.blast'); > > This works...so is it possible to use IO::String to provide > Bio::SearchIO with BLAST output? > > Ryan > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From jason at bioperl.org Mon Oct 30 16:44:18 2006 From: jason at bioperl.org (Jason Stajich) Date: Mon, 30 Oct 2006 13:44:18 -0800 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <45466E5E.9000504@sendu.me.uk> References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> <45466E5E.9000504@sendu.me.uk> Message-ID: right - can't you just do: my $fh; open($fh, "megablast -d ... | ") || die $!; my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh); On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote: > Ryan Golhar wrote: >> I'm trying to parse some blast output w/o actually creating the >> output >> file. Instead, I'm capturing the output in a variable and would >> like to >> use IO::String to represent the file: >> >> $_ = `megablast -d somedatabase -i somesequence -D 2`; >> my $blast_file = new IO::String($_); >> my $searchio = new Bio::SearchIO(-format => 'blast', -fh => >> $blast_file); >> my $results = $searchio->next_result; >> my $hit = $results->next_hit; >> if (! defined($hit)) { >> warn "No BLAST hit for $accession on chr $chr for >> Seq/$orth_id/$organism\n\n"; >> return; >> } >> >> Now, when Bio::SearchIO tries to read the output line by line, >> instead >> it reads the entire output as 1 line. >> >> If I provide the output in a file and use: >> >> my $searchio = new Bio::SearchIO(-format => 'blast', -file => >> '/tmp/somefile.blast'); >> >> This works...so is it possible to use IO::String to provide >> Bio::SearchIO with BLAST output? > > Why must it be IO::String? 
Why not just open() your megablast and > provide $searchio the real filehandle? It would be faster that way > as well. > > Read the docs for `. Your usage above is inappropriate. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From lstein at cshl.edu Mon Oct 30 13:59:29 2006 From: lstein at cshl.edu (Lincoln Stein) Date: Mon, 30 Oct 2006 13:59:29 -0500 Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase Message-ID: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com> Hi All, I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not to validate. I have committed a new version to live and to the release candidate branch. I hope it isn't too late to get this into the release. Lincoln -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From huangyi1 at hkusua.hku.hk Tue Oct 31 00:46:20 2006 From: huangyi1 at hkusua.hku.hk (Huang Yi) Date: Tue, 31 Oct 2006 13:46:20 +0800 Subject: [Bioperl-l] bioperl1.5 and GD2.35 Message-ID: <200610310546.k9V5kQGT010481@hkusua.hku.hk> Hi, I just installed bioperl 1.4 from CPAN to my Gentoo linux computer. But the installation was failed. I had to install by force. However, the GD module couldn't be installed for some unknown reasons. I therefore use "emerge" tool of Gentoo to get bioperl and GD again. They are fine. The version of bioperl became upgrade to1.5 and GD was 2.35. 
However, when I tested it by using the program in HOWTO wiki page (http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me: Can't locate object method "png" via package "GD::Image" at /usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line 799, <> line 9. In my other computer, bioperl1.4 and GD2.34 work fine. I therefore want to remove the CPAN bioperl from the system and re-install it, but it seems to be impossible. Would you please give me some advices on how to let my GD and bioperl work. Thanks! Huang Yi From bix at sendu.me.uk Tue Oct 31 03:20:21 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 31 Oct 2006 08:20:21 +0000 Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase In-Reply-To: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com> References: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com> Message-ID: <45470745.1050605@sendu.me.uk> Lincoln Stein wrote: > Hi All, > > I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not > to validate. I have committed a new version to live and to the release > candidate branch. I hope it isn't too late to get this into the release. It isn't too late, thank you. From avilella at gmail.com Tue Oct 31 08:54:39 2006 From: avilella at gmail.com (Albert Vilella) Date: Tue, 31 Oct 2006 13:54:39 +0000 Subject: [Bioperl-l] catfile and catdir Message-ID: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com> Hi, I was testing the bioperl-run/t/PAML.t and stumbled upon this a catdir/catfile error: Can't locate object method "catdir" via package "Bio::Root::IO" at /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line 113. BEGIN failed--compilation aborted at /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line 143. Compilation failed in require at t/PAML.t line 64. BEGIN failed--compilation aborted at t/PAML.t line 64. Should be be using File::Spec for catdir and catfile instead of Root::IO? Cheers, Albert. 
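For reference on the catdir/catfile question above: File::Spec is a core Perl module, and it provides both catdir() and catfile() as class methods. The following is only a minimal sketch of calling them directly. The directory and file names are hypothetical placeholders and are not taken from Yn00.pm, and whether bioperl-run should switch to File::Spec is still the open question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical names, chosen only for illustration.
# catdir() joins directory components using the platform's separator.
my $tmpdir = File::Spec->catdir( 'tmp', 'paml_run' );

# catfile() joins directory components plus a final file name.
my $ctlfile = File::Spec->catfile( $tmpdir, 'yn00.ctl' );

print "dir:  $tmpdir\n";
print "file: $ctlfile\n";
```

Because the separator is chosen per platform, the same calls should give usable paths on Unix and on Windows without hard-coding "/" or "\".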
From Kevin.M.Brown at asu.edu Tue Oct 31 10:34:34 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Tue, 31 Oct 2006 08:34:34 -0700 Subject: [Bioperl-l] bioperl1.5 and GD2.35 Message-ID: <1A4207F8295607498283FE9E93B775B4023B5F3C@EX02.asurite.ad.asu.edu> Not really a Bioperl issue per se, but sounds like when you had Gentoo emerge GD it didn't include libpng and so didn't build the needed parts to create PNG type graphics. > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Huang Yi > Sent: Monday, October 30, 2006 10:46 PM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] bioperl1.5 and GD2.35 > > Hi, > > > > I just installed bioperl 1.4 from CPAN to my Gentoo linux > computer. But the > installation was failed. I had to install by force. > > > > However, the GD module couldn't be installed for some unknown reasons. > > > > I therefore use "emerge" tool of Gentoo to get bioperl and GD > again. They > are fine. The version of bioperl became upgrade to1.5 and GD was 2.35. > > > > However, when I tested it by using the program in HOWTO wiki page > (http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me: > > > > Can't locate object method "png" via package "GD::Image" at > /usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line > 799, <> line 9. > > > > In my other computer, bioperl1.4 and GD2.34 work fine. I > therefore want to > remove the CPAN bioperl from the system and re-install it, > but it seems to > be impossible. > > > > Would you please give me some advices on how to let my GD and > bioperl work. > > > > Thanks! 
>
>
>
> Huang Yi
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From hlapp at gmx.net  Tue Oct 31 11:21:40 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 11:21:40 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
Message-ID: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>

On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:

> BTW, was that supposed to be Bio::AnnotatableI, or
> Bio::AnnotationHolderI?

Sorry, the former. I guess I got confused with FeatureHolders. Too
bad Featureable isn't an English word.

	-hilmar
--
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================

From hlapp at gmx.net  Tue Oct 31 12:01:44 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:01:44 -0500
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net>

The only thing I would add to Jason's reply is that it is easy to do

    if (! $seq->isa("Bio::SeqI")) {
        my $bioseq = Bio::Seq->new();
        $bioseq->primary_seq($seq);
        $seq = $bioseq;
    }

and from that point on all your objects are Bio::SeqI compliant
regardless of whether they were obtained that way or not.

Aside from that I wonder why there isn't a -primary_seq option in
Bio::Seq::new - this would shorten the above into a (more perl'ish)
single line:

    $seq = Bio::Seq->new(-primary_seq=>$seq) unless $seq->isa("Bio::SeqI");

Any takers to add that capability?

	-hilmar

On Oct 30, 2006, at 3:51 AM, Nathan S.
Haigh wrote: > In my script I retrieve sequences from GenBank in FASTA format by GI > numbers and optionally store the sequence in a cache using > Bio::DB::Fasta. On subsequent runs of the script, the cache is first > checked for the GI and returns the sequence if it is found or the > sequence is obtained from GenBank as above. > > I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have > returned a Bio::Seq object but rather it returns a Bio::PrimarySeq > object which is defined within the Bio::DB::Fasta file. This is > annoying, since $seq_obj in my script would be either a Bio::Seq if it > was obtained from GenBank or a Bio::PrimarySeq if obtained from the > cache and calling primary_id() on it doesn't do the expected thing > with > Bio::PrimarySeq: > ID: Bio::PrimarySeq::Fasta=HASH(0x89b4508) > > Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object? > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 31 12:08:56 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 11:08:56 -0600 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net> Message-ID: <001401c6fd0f$4239aa50$15327e82@pyrimidine> >> BTW, was that supposed to be Bio::AnnotatableI, or >> Bio::AnnotationHolderI? > > Sorry, the former. I guess I got confused with > FeatureHolders. Too bad Featureable isn't an English word. > > -hilmar Having SimpleAlign be AnnotatableI shouldn't be too much of a burden, since the only additional implemented method is annotation(). So, I think all the various Stockholm tags can be placed somewhere. 
A bit OT: were we planning on getting rid of the various *_tag_* methods in AnnotatableI at some point? I'm a bit confused as to why they were added. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Tue Oct 31 12:09:26 2006 From: jason at bioperl.org (Jason Stajich) Date: Tue, 31 Oct 2006 09:09:26 -0800 Subject: [Bioperl-l] catfile and catdir In-Reply-To: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com> References: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com> Message-ID: <1AD4DB38-E08D-4E47-8A59-6539068474CB@bioperl.org> Yep. Unless we want this to also exist in Root::IO and delegate to File::Spec. -jason On Oct 31, 2006, at 5:54 AM, Albert Vilella wrote: > Hi, > > I was testing the bioperl-run/t/PAML.t and stumbled upon this a > catdir/catfile error: > > Can't locate object method "catdir" via package "Bio::Root::IO" at > /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line > 113. > BEGIN failed--compilation aborted at > /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line > 143. > Compilation failed in require at t/PAML.t line 64. > BEGIN failed--compilation aborted at t/PAML.t line 64. > > Should be be using File::Spec for catdir and catfile instead of > Root::IO? > > Cheers, > > Albert. 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html

From jason at bioperl.org  Tue Oct 31 12:10:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 31 Oct 2006 09:10:51 -0800
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
	<8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
Message-ID: <65F92B54-33FD-4D8F-90B7-49E2697CDBA2@bioperl.org>

It just needs to have an annotation collection - so it would be
Bio::AnnotatableI

On Oct 31, 2006, at 8:21 AM, Hilmar Lapp wrote:

>
> On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:
>
>> BTW, was that supposed to be Bio::AnnotatableI, or
>> Bio::AnnotationHolderI?
>
> Sorry, the former. I guess I got confused with FeatureHolders. Too
> bad Featureable isn't an English word.
>
> 	-hilmar
> --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html

From hlapp at gmx.net  Tue Oct 31 12:44:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:44:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To:
References:
Message-ID:

Well isn't this a result of conflating some of the SeqFeatureI
methods into the annotation collection?

If I'm not mistaken on this then those methods were introduced in
1.5.0 and hence can go away without deprecation.

	-hilmar

On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote:

> Chris,
>
> I don't think the intent was to remove the methods, rather we'd
> just call
> deprecated(). Example from AnnotatableI:
>
> sub remove_tag {
>    my ($self,@args) = @_;
>
>    #uncomment in 1.6
>    #$self->deprecated('remove_tag() is deprecated, use
> remove_Annotations()');
>
>    return $self->annotation->remove_Annotations(@args);
> }
>
> With regards to "why", I can't reconstruct the entire rationale
> myself but I
> can say that the newer names make more sense. Take that example
> above - its
> function is to remove entire Annotations not just to remove tags, so
> remove_Annotations is a better name.
>
> Brian O.
>
>
> On 10/31/06 1:08 PM, "Chris Fields" wrote:
>
>> A bit OT: were we planning on getting rid of the various *_tag_*
>> methods in
>> AnnotatableI at some point?  I'm a bit confused as to why they
>> were added.
>
>

--
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================

From bosborne11 at verizon.net  Tue Oct 31 11:37:01 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 31 Oct 2006 12:37:01 -0400
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <001401c6fd0f$4239aa50$15327e82@pyrimidine>
Message-ID:

Chris,

I don't think the intent was to remove the methods, rather we'd just call
deprecated(). Example from AnnotatableI:

sub remove_tag {
   my ($self,@args) = @_;

   #uncomment in 1.6
   #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');

   return $self->annotation->remove_Annotations(@args);
}

With regards to "why", I can't reconstruct the entire rationale myself but I
can say that the newer names make more sense. Take that example above - its
function is to remove entire Annotations not just to remove tags, so
remove_Annotations is a better name.

Brian O.


On 10/31/06 1:08 PM, "Chris Fields" wrote:

> A bit OT: were we planning on getting rid of the various *_tag_* methods in
> AnnotatableI at some point?  I'm a bit confused as to why they were added.

From cjfields at uiuc.edu  Tue Oct 31 13:44:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 12:44:02 -0600
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To:
Message-ID: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>

Hilmar Lapp wrote:
> Well isn't this a result of conflating some of the
> SeqFeatureI methods into the annotation collection?
>
> If I'm not mistaken on this then those methods were
> introduced in 1.5.0 and hence can go away without deprecation.
>
> 	-hilmar
>
> On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote:
>
>> Chris,
>>
>> I don't think the intent was to remove the methods, rather we'd just
>> call deprecated().
Example from AnnotatableI:
>>
>> sub remove_tag {
>>    my ($self,@args) = @_;
>>
>>    #uncomment in 1.6
>>    #$self->deprecated('remove_tag() is deprecated, use
>> remove_Annotations()');
>>
>>    return $self->annotation->remove_Annotations(@args); }
>>
>> With regards to "why", I can't reconstruct the entire rationale
>> myself but I can say that the newer names make more sense. Take that
>> example above - its function is to remove entire Annotations not
>> just to remove tags, so remove_Annotations is a better name.
>>
>> Brian O.
>>
>>
>> On 10/31/06 1:08 PM, "Chris Fields" wrote:
>>
>>> A bit OT: were we planning on getting rid of the various *_tag_*
>>> methods in AnnotatableI at some point?  I'm a bit confused as to why
>>> they were added.

Sorry Brian, what I meant was, based on CVS history, the various *tag*
methods in AnnotatableI were added all at once, with deprecations already
present in the commit. So the methods weren't there to begin with, then
added only to be deprecated later? Hence the confusion...

I think Hilmar's right; the CVS history indicates these were added just prior
to rel. 1.5 by Allen and seem to be related to SeqFeatureI. I'm sure the
intent was good, but they contradict methods in the Feature/Annotation HOWTO
on retrieving Annotation objects via the Annotation::Collection object. I
think that agrees with your point about the various Annotation* method names
being the more appropriate ones.

Does everybody agree we should just remove them?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 31 13:53:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 12:53:16 -0600 Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id() In-Reply-To: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net> Message-ID: <000001c6fd1d$d4359c80$15327e82@pyrimidine> > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Tuesday, October 31, 2006 11:02 AM > To: n.haigh at sheffield.ac.uk > Cc: Bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id() > > The only thing I would add to Jason's reply is that it is easy to do > > if (! $seq->isa("Bio::SeqI")) { > my $bioseq = Bio::Seq->new(); > $bioseq->primary_seq($seq); > $seq = $bioseq; > } > > and from that point on all your objects are Bio::SeqI > compliant regardless of whether they were obtained that way or not. > > Aside from that I wonder why there isn't a -primary_seq > option in Bio::Seq::new - this would shorten the above into a > (more perl'ish) single line: > > $seq = Bio::Seq->new(-primary_seq=>$seq) unless > $seq->isa("Bio::SeqI"); > > Anyone takers to add that capability? > > -hilmar Sounds good to me! Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From nhansen at nhgri.nih.gov Tue Oct 31 14:51:23 2006 From: nhansen at nhgri.nih.gov (Nancy Hansen) Date: Tue, 31 Oct 2006 14:51:23 -0500 (EST) Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling Message-ID: Hello, As sequencing centers begin to deposit trace data from "Medical Sequencing" projects into the public archives, there is now the need to "anonymize" sequence trace files by removing embedded information which might be used to identify the individual who was the original source of the DNA being sequenced. 
I was hoping I might be able to use Bio::SeqIO to manipulate the comments contained in an SCF-formatted trace file, but I'm finding that Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information. Since SCF is a widely-accepted standard for trace files, would it be reasonable to include fields like "scf_comments" and "scf_header" in a Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them? Likewise, it would be great if write_seq could pull these values right from a SequenceTrace object rather than requiring them as arguments. I'd be happy to help in this effort if necessary. Thanks, --Nancy ************************************* Nancy F. Hansen, PhD nhansen at nhgri.nih.gov Bioinformatics Group NIH Intramural Sequencing Center (NISC) 5625 Fishers Lane Rockville, MD 20852 Phone: (301) 435-1560 Fax: (301) 435-6170 From lincoln.stein at gmail.com Tue Oct 31 15:24:17 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 31 Oct 2006 15:24:17 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <000001c6f78b$d1c65a30$15327e82@pyrimidine> References: <453E309B.9090007@sendu.me.uk> <000001c6f78b$d1c65a30$15327e82@pyrimidine> Message-ID: <6dce9a0b0610311224x79256b29sf102eb5c35865caf@mail.gmail.com> Are you going to go ahead with 1.52_XX ? If so, I will code GBrowse to look for 1.52 or higher. Lincoln On 10/24/06, Chris Fields wrote: > > .. > > > > 'handle'? I think it shows up as '6.2.13' simply because it was uploaded > > with the filename Perl6-Pugs-6.2.13.tar.gz > > Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is > '6.002013'. So maybe we should follow a similar convention. Seems easier > and less confusing to me, at least. > > > As you point out, the code has the kind of $VERSION number we've been > > suggesting in this thread: > > > > > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': > > > > > > our $VERSION = 6.002013; > > > > > > That's also a very perlish-way to do it. 
And there are no developer > > > versions of Pugs, since it is always under active development. We > could > > try > > > something like: > > > > > > our $VERSION = 1.005002_01; > > > > Yes, this was already like one of my suggestions (1.0502_01), but I > > brought up the concern that 1.05 might be < 1.4. > > > > So then we have a question: do we try and fumble a 1.4 compatible number > > by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if > > it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no > > room for RC numbering, or 1.006000010 (1.6.0.10) - the first final > > release following some 1.006000_001 (1.6.0.01 == rc1) RCs? > > I would go for the clean break if it follows perl/CPAN convention. > '1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing. > > If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6 > RC1, 1.6 RC2 etc then that would be consistent and perl-compatible. > > BTW, the reason I looked at Pugs was to see what some of the Perl6 > developers were using. Who knows; they'll probably change it! > > .. > > > I don't think it would be a hassle; on the contrary it would be very > > useful to know the CPAN distribution actually works. I'm very happy with > > the idea that a release candidate gets fully tested... > > So you obviously feel strongly about it! ;> > > I don't have a problem as long as we stick with doing this from now on ( > i.e. > have a consistent versioning scheme, release policy, CPAN release policy, > etc). Would be nice for Jason/Brian/Hilmar to chime in as to the > reasoning > behind the older versioning scheme. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From hlapp at gmx.net Tue Oct 31 16:53:58 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 31 Oct 2006 16:53:58 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine> References: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine> Message-ID: On Oct 31, 2006, at 1:44 PM, Chris Fields wrote: > Does everybody agree we should just remove them? I wish you could but I'm afraid that would break stuff? Otherwise why were they added in the first place? I thought Bio::SeqFeature::Annotated needs them maybe? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 31 17:41:17 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 16:41:17 -0600 Subject: [Bioperl-l] AnnotatableI tag methods, was Rfam/Pfam annotations and SimpleAlign In-Reply-To: Message-ID: <000001c6fd3d$ae37c240$15327e82@pyrimidine> > On Oct 31, 2006, at 1:44 PM, Chris Fields wrote: > > > Does everybody agree we should just remove them? > > I wish you could but I'm afraid that would break stuff? > Otherwise why were they added in the first place? I thought > Bio::SeqFeature::Annotated needs them maybe? > > -hilmar > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== Yep, removing them clobbers a ton of tests, including anything that requires SeqIO::FTHelper. Looks like SeqFeature::Generic and a few others use them. 
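
[Editorial sketch, not part of the original message.] The pattern under
discussion, an old method name kept as a thin alias that warns and then
delegates to the new name, can be sketched in plain Perl as below. The
My::Annotatable package is a hypothetical stand-in, not BioPerl code, and a
plain warn() is used here in place of the deprecated() call that appears
commented out in the AnnotatableI example quoted in this thread.

```perl
use strict;
use warnings;

# Hypothetical stand-in class for illustration; this is not BioPerl code.
package My::Annotatable;

sub new {
    my ($class) = @_;
    return bless { annotations => {} }, $class;
}

sub add_Annotation {
    my ($self, $key, $value) = @_;
    push @{ $self->{annotations}{$key} }, $value;
    return;
}

# New-style name: removes everything stored under the given keys
# and returns the removed values.
sub remove_Annotations {
    my ($self, @keys) = @_;
    my @removed;
    for my $key (@keys) {
        my $values = delete $self->{annotations}{$key};
        push @removed, @$values if $values;
    }
    return @removed;
}

# Old-style name kept as an alias: warn, then delegate to the new method.
sub remove_tag {
    my ($self, @args) = @_;
    warn "remove_tag() is deprecated, use remove_Annotations()\n";
    return $self->remove_Annotations(@args);
}

package main;

my $obj = My::Annotatable->new;
$obj->add_Annotation(note => 'first');
$obj->add_Annotation(note => 'second');

my @removed = $obj->remove_tag('note');    # emits the warning on STDERR
print scalar(@removed), " annotation(s) removed\n";
```

Callers of the old name keep working while the warning points them at the
replacement, which is why removing such aliases outright breaks code that
still depends on them.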
I could understand if these were meant to be permanent methods, but why add
these in if they were to be deprecated in 1.6? Something that was meant to
be a transition but wasn't finished? That seems to be indicated in the
commented out lines for all the *tag* methods:

   #uncomment in 1.6
   #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From lincoln.stein at gmail.com  Tue Oct 31 18:18:07 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 31 Oct 2006 18:18:07 -0500
Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning
In-Reply-To:
References:
Message-ID: <6dce9a0b0610311518l3bec852q5d04a9b488621377@mail.gmail.com>

Hi Keith,

The current Bio/DB/GFF/Util/Binning.pm file just contains the hierarchical
binning system that I implemented some time ago. Where is the R-tree system
that you describe?

How much of an improvement did the R-tree scheme give over the hierarchical
scheme?

FYI the GFF3 implementation uses a different binning scheme in which there
is a fixed-size bin. Every time a feature overlaps a bin, it creates a new
row in a table. So big features will have multiple rows and little features
that fit inside a bin will have only one row. The query for this is simpler
and seems to give the same relative speedup as the hierarchical binning
system.

I'd really like to get these queries to go as fast as possible and would
love to work with you on this if you're interested.

Lincoln

On 10/19/06, Keith Player wrote:
>
> I know that there may be some changes resulting from new GFF3
> implementations,
> but thought I would see if the following is useful anyway.
>
> I implemented the R-tree binning schema as used by
> Bio::DB::GFF::Util::Binning
> and as mention in this article:
>
> I tested the following query on a normal table (no binning), but it
> assumes
> that you know the longest range in the table.
So for example with a table > of > human genes, where the longest gene we know of is around 2.4Mb. > > SELECT COUNT(*) as count FROM groups WHERE start > max(0,[start-2.4Mb]) > AND > g.start < [end] AND g.end > [start] AND g.chromosome = '1' > > so for 100Mb:101Mb > > SELECT COUNT(*) as count FROM groups WHERE start > 97600000 AND g.start < > 101000000 AND g.end > 100000000 AND g.chromosome = '1' > > > where [start] and [end] define the region of interest. This query > outperforms > the R-Tree implementation on all tests that I have performed (for lengths > of > 200bp to 10Mb across a whole chromsome). Could this be of some practical > use? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From bosborne11 at verizon.net Tue Oct 31 21:31:49 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 31 Oct 2006 22:31:49 -0400 Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling In-Reply-To: Message-ID: Nancy, It looks like a good place to start would be the get_header() and _get_header methods in Bio::SeqIO::scf. If you read t/scf.t you can see that the author, at some point, wanted get_header to return meaningful information but stepping through the test shows it returning a lot of UNDEF. Now I don't know if this is due to the method or the source SCF file, but you might be able to get these methods to work yourself. But to answer your questions, yes, it certainly sounds reasonable that these values would be extracted by Bio::SeqIO::scf. Brian O. 
On 10/31/06 3:51 PM, "Nancy Hansen" wrote: > > Hello, > > As sequencing centers begin to deposit trace data from "Medical > Sequencing" projects into the public archives, there is now the need to > "anonymize" sequence trace files by removing embedded information which > might be used to identify the individual who was the original source of > the DNA being sequenced. > > I was hoping I might be able to use Bio::SeqIO to manipulate the > comments contained in an SCF-formatted trace file, but I'm finding that > Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information. > Since SCF is a widely-accepted standard for trace files, would it be > reasonable to include fields like "scf_comments" and "scf_header" in a > Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them? > Likewise, it would be great if write_seq could pull these values right > from a SequenceTrace object rather than requiring them as arguments. > > I'd be happy to help in this effort if necessary. > > Thanks, > --Nancy > > ************************************* > Nancy F. 
Hansen, PhD nhansen at nhgri.nih.gov > Bioinformatics Group > NIH Intramural Sequencing Center (NISC) > 5625 Fishers Lane > Rockville, MD 20852 > Phone: (301) 435-1560 Fax: (301) 435-6170 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l
From cjfields at uiuc.edu Sun Oct 1 13:17:24 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 1 Oct 2006 12:17:24 -0500 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: References: <7A592EAB-A869-4A6C-BFA8-F73F3DFD8F5B@gmx.net> <09FB1EB0-2C1C-4FCF-8339-E78556EFEFF2@uiuc.edu> <8D75FE6D-C02D-4A86-93FA-B7256050AF11@uiuc.edu> <40155903-555A-4662-BCCE-38E5E3784118@uiuc.edu> <54E79A5F-5446-4D8E-AD26-B70894048D60@gmx.net> <1D69005A-DF0E-4F37-93FE-7577A32CC625@gmx.net> Message-ID: The '-w' flag on the shebang line is the source of those errors. I never set it anymore on Windows due to this; I just use the 'use warnings' pragma. If you use 'perl -I. t/test.t' you can normally get around the '-w' assumed by using 'make test'.
I will try running tests on bioperl-db and bioperl tomorrow on WinXP to confirm these. Chris On Sep 30, 2006, at 6:10 PM, Seth Johnson wrote: > How do I get rid of all of the warnings for "redefined subroutines" > during > the test?? It clutters the output and I can't see the errors. > > On 9/30/06, Hilmar Lapp wrote: >> >> It doesn't shed more light but it does raise an alert flag. All tests >> are supposed to pass. The fact that they don't means the problems you >> are seeing have nothing to do with your specific data or script. >> >> First off - can anyone else confirm those errors using the latest >> Bioperl-db and Bioperl? >> >> Second - Seth could you run those tests individually, e.g., using >> >> $ make test test_02species TEST_VERBOSE=1 >> >> and similarly for the other tests that have failures and post the >> output. Let's start with 02species and 03simpleseq. >> >> -hilmar >> >> On Sep 30, 2006, at 5:44 PM, Seth Johnson wrote: >> >>> There are errors during the test. Here's their summary: >>> ____________________________ >>> Failed Test Stat Wstat Total Fail Failed List of Failed >>> ------------------------------------------------------------- >>> t\02species.t 65 2 3.08% 63 65 >>> t\03simpleseq.t 1 256 59 106 179.66% 7-59 >>> t\04swiss.t 52 14 26.92% 25 27-34 38-42 >>> t\12ontology.t 2 512 738 1471 199.32% 3-738 >>> t\16obda.t 12 3 25.00% 10-12 >>> ____________________________ >>> >>> May be that can shed some light on the problem?!?! >>> >>> On 9/29/06, Hilmar Lapp < hlapp at gmx.net> wrote:This may in fact be >>> a knock-on effect of the fixes? >>> >>> Seth, did you run the test suite that comes with bioperl-db, and did >>> you get any errors? >>> >>> -hilmar >>> >>> On Sep 28, 2006, at 2:26 PM, Chris Fields wrote: >>> >>>> Seth, >>>> >>>> The organism issue is a bug and has been reported, though I thought >>>> it was fixed. 
>>>> >>>> The lack of the date and the version is a bit odd, but there have >>>> been a lot of changes lately to bioperl-live (core bioperl in CVS), >>>> and a few to bioperl-db. How old is your bioperl and bioperl-db >>>> installation. Hilmar, any additional thoughts? >>>> >>>> Chris >>>> >>>> On Sep 28, 2006, at 11:10 AM, Seth Johnson wrote: >>>> >>>>> Thank you. That takes care of that, however, I do have another >>>>> gripe. When >>>>> running my script, quoted before, with "my $out = >>>>> Bio::SeqIO->newFh('-format' => 'genbank');", I have several key >>>>> pieces of >>>>> information missing. The most important one is the version >>>>> number. There's >>>>> also a date missing, and source organism name is corrupted. >>>>> Here's what I >>>>> get: >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> LOCUS NM_014580 2145 bp dna linear >>>>> UNK >>>>> DEFINITION Homo sapiens solute carrier family 2, (facilitated >>>>> glucose >>>>> transporter) member 8 (SLC2A8), mRNA. >>>>> ACCESSION NM_014580 >>>>> SOURCE sapiens. >>>>> ORGANISM sapiens >>>>> Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; >>>>> Bilateria; >>>>> Coelomata; Deuterostomia; Chordata; Craniata; >>> Vertebrata; >>>>> Gnathostomata; Teleostomi; Euteleostomi; >>>>> Sarcopterygii; >>>>> Tetrapoda; >>>>> Amniota; Mammalia; Theria; Eutheria; Euarchontoglires; >>>>> Primates; >>>>> Haplorrhini; Simiiformes; Catarrhini; Hominoidea; >>>>> Hominidae; >>>>> Homo/Pan/Gorilla group; Homo. >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> All of the missing information is stored in BioSQL and >>>>> theoretically should >>>>> be in the outpu. Here's how NCBI genbank file looks: >>>>> >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> LOCUS NM_014580 2145 bp mRNA linear >>>>> PRI 17-OCT-2005 >>>>> DEFINITION Homo sapiens solute carrier family 2, (facilitated >>>>> glucose >>>>> transporter) member 8 (SLC2A8), mRNA. 
>>>>> ACCESSION NM_014580 >>>>> VERSION NM_014580.3 GI:51870928 >>>>> KEYWORDS . >>>>> SOURCE Homo sapiens (human) >>>>> ORGANISM Homo sapiens >>>>> >>>>> Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; >>>>> Euteleostomi; >>>>> Mammalia; Eutheria; Euarchontoglires; Primates; >>>>> Haplorrhini; >>>>> Catarrhini; Hominidae; Homo. >>>>> >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> >>>>> On 9/28/06, Chris Fields wrote: >>>>>> >>>>>> Those are from the excessively paranoid '-w' flag on the shebang >>>>>> line. If you remove the flag but add the 'use warnings' pragma >>> the >>>>>> 'subroutine x redefined' warnings go away. This, BTW, is one >>> of the >>>>>> quirks of the ActivePerl distribution; other OSs don't have the >>> same >>>>>> problem. >>>>>> >>>>>> The 'solution' described on that page is actually a workaround, >>>>>> not a >>>>>> bugfix. It causes problems with stack traces with error handling >>>>>> but >>>>>> seems harmless beyond that. I haven't been able to find a >>>>>> satisfactory fix which works on all OS's. >>>>>> >>>>>> Chris >>>>>> >>>>>> >>>>>> On Sep 28, 2006, at 10:42 AM, Seth Johnson wrote: >>>>>> >>>>>>> This is under Windows, but using ActiveState Komodo 3.5 and >>>>>>> their >>>>>>> latest Perl for Windows and latest BioPerl & BioPerl-db from >>>>>>> CVS. >>>>>>> >>>>>>> I actually just stumbled upon a solution. It's described in the >>>>>>> "Installing Bioperl on Windows" by adding a comma after >>> $class: in >>>>>>> Bio::Root::Root throw() subroutine. Thanks for hinting me about >>>>>>> what I run it on. >>>>>>> >>>>>>> The code works now, BUT it spews whole bunch of warnings about >>>>>>> "Subroutine .... redefined": >>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\BioEntry >>>>>>> .pm line 88. >>>>>>> Subroutine object_id redefined at c:/Perl/site/lib/Bio >>> \BioEntry.pm >>>>>>> line 128. 
>>>>>>> Subroutine version redefined at c:/Perl/site/lib/Bio\BioEntry.pm >>>>>>> line 150. >>>>>>> Subroutine authority redefined at c:/Perl/site/lib/Bio >>> \BioEntry.pm >>>>>>> line 171. >>>>>>> Subroutine namespace redefined at c:/Perl/site/lib/Bio >>> \BioEntry.pm >>>>>>> line 192. >>>>>>> Subroutine display_name redefined at c:/Perl/site/lib/Bio >>>>>>> \BioEntry.pm line 217. >>>>>>> Subroutine description redefined at c:/Perl/site/lib/Bio >>>>>>> \BioEntry.pm line 241. >>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\Root.pm >>> line >>>>>>> 201. >>>>>>> Subroutine verbose redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \Root.pm >>>>>>> line 234. >>>>>>> Subroutine _register_for_cleanup redefined at c:/Perl/site/lib/ >>> Bio >>>>>>> \Root\Root.pm line 246. >>>>>>> Subroutine _unregister_for_cleanup redefined at c:/Perl/site/ >>>>>>> lib/ >>>>>>> Bio >>>>>>> \Root\Root.pm line 256. >>>>>>> Subroutine _cleanup_methods redefined at c:/Perl/site/lib/Bio >>> \Root >>>>>>> \Root.pm line 263. >>>>>>> Subroutine throw redefined at c:/Perl/site/lib/Bio\Root\Root.pm >>>>>>> line 316. >>>>>>> Subroutine debug redefined at c:/Perl/site/lib/Bio\Root\Root.pm >>>>>>> line 379. >>>>>>> Subroutine _load_module redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \Root.pm line 398. >>>>>>> Subroutine DESTROY redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \Root.pm >>>>>>> line 426. >>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\RootI.pm >>> line >>>>>>> 117. >>>>>>> Subroutine _initialize redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \RootI.pm line 128. >>>>>>> ... >>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>>>> >>>>>>> >>>>>>> On 9/28/06, Chris Fields wrote: I had >>> problems >>>>>>> with bioperl-db on native WinXP (not cygwin), but I >>>>>>> did manage to get it running in cygwin with some effort. The >>> issue >>>>>>> on native WinXP was related to Bio::Root::Root::throw(), though. 
>>>>>>> >>>>>>> There is a bug and workaround filed on Bugzilla, but I haven't >>>>>>> worked >>>>>>> on it in a while (and the workaround has some problems as >>> well). I >>>>>>> may try running it again to see what happens. >>>>>>> >>>>>>> http://bugzilla.open-bio.org/show_bug.cgi?id=1938 >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On Sep 28, 2006, at 9:04 AM, Hilmar Lapp wrote: >>>>>>> >>>>>>>> Very odd. This is under Windows, presumably using Cygwin? >>>>>>>> >>>>>>>> The method Bio::Root::Root::throw() clearly exists, and >>>>>>>> PersistentObject inherits from it. The exception it was >>> trying to >>>>>>>> throw has nothing to do with failure or success to find the >>>>>>>> database >>>>>>>> row (actually it did succeed since otherwise it wouldn't >>> construct >>>>>>>> the object) but with dynamically loading a class, presumably >>>>>>>> Bio::DB::Persistent::Seq. >>>>>>>> >>>>>>>> Are you using the 1.5.x release of bioperl? >>>>>>>> >>>>>>>> Does anyone on the list have any experience with these sorts of >>>>>>>> things on Windows? >>>>>>>> >>>>>>>> (Seth, I've moved this thread to the bioperl list, since >>>>>>>> this is >>>>>>> what >>>>>>>> the problem is about.) >>>>>>>> >>>>>>>> -hilmar >>>>>>>> >>>>>>>> On Sep 27, 2006, at 1:39 PM, Seth Johnson wrote: >>>>>>>> >>>>>>>>> Hello guys, >>>>>>>>> >>>>>>>>> I successfully populated the biosql database, thanks to you. >>>>>>>>> Now, >>>>>>>>> I'm >>>>>>>>> trying to retrieve a sequence from it following the example >>> from >>>>>>>>> BOSC2003 >>>>>>>>> slides and ran into uninformative error (at least to me it >>>>>>>>> doesn't >>>>>>>>> mean >>>>>>>>> anyting). I suspect that I'm missing something and hope you >>> can >>>>>>>>> point me in >>>>>>>>> the right direction. 
Here's my source code: >>>>>>>>> >>>>>>> >>> ------------------------------------------------------------------- >>>>>>> -- >>>>>>>>> - >>>>>>>>> --- >>>>>>>>> #!/usr/bin/perl -w >>>>>>>>> use strict; >>>>>>>>> use warnings; >>>>>>>>> >>>>>>>>> use Bio::Seq; >>>>>>>>> use Bio::Seq::SeqFactory; >>>>>>>>> use Bio::DB::SimpleDBContext; >>>>>>>>> use Bio::DB::BioDB; >>>>>>>>> >>>>>>>>> my $dbc = Bio::DB::SimpleDBContext->new( >>>>>>>>> -driver => 'mysql', >>>>>>>>> -dbname => 'BioSQL_1', >>>>>>>>> -host => ' 192.168.1.3', >>>>>>>>> -user => 'xxxxx', >>>>>>>>> -pass => 'xxxxxx' >>>>>>>>> ); >>>>>>>>> >>>>>>>>> my $db = Bio::DB::BioDB->new(-database => 'biosql', >>>>>>>>> -dbcontext => $dbc); >>>>>>>>> >>>>>>>>> my $seq = Bio::Seq->new(-accession_number => 'NM_014580', - >>>>>>>>> namespace => >>>>>>>>> 'refseq_H_sapiens'); >>>>>>>>> my $seqfact = Bio::Seq::SeqFactory->new(-type => 'Bio::Seq'); >>>>>>>>> my $adp = $db->get_object_adaptor($seq); >>>>>>>>> my $dbseq = $adp->find_by_unique_key($seq, -obj_factory => >>>>>>> $seqfact); >>>>>>>>> >>>>>>>>> my $out = Bio::SeqIO->newFh('-format' => 'EMBL'); >>>>>>>>> print $out $dbseq; >>>>>>>>> >>>>>>>>> exit; >>>>>>>>> >>> ----------------------------------------------------------------- >>>>>>>>> >>>>>>>>> Just when the "find_by_unique_key" function is executed I >>> get the >>>>>>>>> following >>>>>>>>> error: >>>>>>>>> >>>>>>>>> ================================ >>>>>>>>> Undefined subroutine &Bio::Root::Root::throw called at >>>>>>>>> c:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm line >>> 199. >>>>>>>>> ================================ >>>>>>>>> >>>>>>>>> The sequence does exist in the database. I checked that. Any >>>>>>>>> ideas??? 
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From osborne1 at optonline.net Sun Oct 1 17:49:47 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Sun, 01 Oct 2006 17:49:47 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset
In-Reply-To: <20061001183214.GB12075@iucha.net>
Message-ID:

Florin,

This is fixed in CVS now. What had happened is that the DIP file had some
minimal protein (node) entries where the only id available was DIP's
internal identifier. Not ideal to have to use these as accessions but
there's no other choice.

Thank you for the note, and in the future write to bioperl-l since there may
be others who are interested in hearing about what you've encountered.

Brian O.

On 10/1/06 2:32 PM, "Florin Iucha" wrote:

> Hello,
>
> I have downloaded a CVS snapshot [1] of your module, bioperl-network, and
> I am using it to read the 20060402 edition release of the DIP [2] dataset.
> > Starting with the simple program you show in the man page: > > my $io = Bio::Network::IO->new(-format => 'psi', > -file => $ARGV[0]); > > my $network = $io->next_network; > > I get 772 instances of: > > Use of uninitialized value in string eq at > /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 326. > > I don't know if it is just an annoyance or something bad, so you might > want to take a look at it. > > Thank you for your work, > florin > > [1] http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-network/ > [2] http://dip.doe-mbi.ucla.edu/ From osborne1 at optonline.net Sun Oct 1 17:56:39 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Sun, 01 Oct 2006 17:56:39 -0400 Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset In-Reply-To: <20061001211844.GC12075@iucha.net> Message-ID: Florin, I'm not seeing any segmentation fault using the same file you're using as input (dip20060402.mif). I'm assuming you don't see this error when you use smaller files as input, like those in the t/data directory. When I watch the script in top I see Perl using about 135Mb (RSIZE) right before the script exits. How much memory do you use? Thank you for the note, and in the future write to bioperl-l since there may be others who are interested in hearing about what you've encountered. Brian O. On 10/1/06 5:18 PM, "Florin Iucha" wrote: > On Sun, Oct 01, 2006 at 01:32:14PM -0500, Florin Iucha wrote: >> I have downloaded a CVS snapshot [1] of your module, bioperl-network, and >> I am using it to read the 20060402 edition release of the DIP [2] dataset. > > Using the attached script, I am getting a segmentation fault at the > end, right after printing "That's all, Folks!" Maybe some cleanup is > going off in a wrong direction. 
> > florin From florin at iucha.net Sun Oct 1 20:24:03 2006 From: florin at iucha.net (Florin Iucha) Date: Sun, 1 Oct 2006 19:24:03 -0500 Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset In-Reply-To: References: <20061001211844.GC12075@iucha.net> Message-ID: <20061002002403.GD12075@iucha.net> On Sun, Oct 01, 2006 at 05:56:39PM -0400, Brian Osborne wrote: > I'm not seeing any segmentation fault using the same file you're using as > input (dip20060402.mif). I'm assuming you don't see this error when you use > smaller files as input, like those in the t/data directory. The t/data files are fine. Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the MINT [1] database does not produce the crash. It has a new warning, however: Can't call method "text" on an undefined value at /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290. > When I watch the script in top I see Perl using about 135Mb (RSIZE) right > before the script exits. How much memory do you use? "ps ux" tells me VSZ = 272788 and RSZ = 254992. This is on x86-64 with 64 bit perl. The box has 2 GB of physical memory so these numbers don't seem to be a concern. > Thank you for the note, and in the future write to bioperl-l since there may > be others who are interested in hearing about what you've encountered. Do'h! You have the list address loud and clear in three places, but I got your contact info from the AUTHORS. Will use the proper channel from now on! Thanks, florin [1] ftp://mint.bio.uniroma2.it/pub/release/psi1/ -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: From cjfields at uiuc.edu Mon Oct 2 00:35:22 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 1 Oct 2006 23:35:22 -0500 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: Message-ID: <000001c6e5dc$2eceabe0$15327e82@pyrimidine> Seth, What version of MySQL and perl are you using? I'm using MySQL 5.0.18 (but am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819. I ran into a few problems with bioperl-db tests which were unrelated the ones below, but I'm wondering if it is a difference in MySQL versions. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Seth Johnson > Sent: Saturday, September 30, 2006 6:35 PM > To: Hilmar Lapp > Cc: Chris Fields; Bioperl List > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > Here're complete test details: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ... > FAILED tests 10-12 > Failed 3/12 tests, 75.00% okay > Failed Test Stat Wstat Total Fail Failed List of Failed > -------------------------------------------------------------------------- > ----- > t\02species.t 65 2 3.08% 63 65 > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > t\12ontology.t 2 512 738 1471 199.32% 3-738 > t\16obda.t 12 3 25.00% 10-12 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From torsten.seemann at infotech.monash.edu.au Mon Oct 2 02:06:50 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Mon, 02 Oct 2006 16:06:50 +1000 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? 
In-Reply-To: <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
References: <451C8ED8.2060003@infotech.monash.edu.au> <451CC40D.2030401@sendu.me.uk> <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
Message-ID: <4520AC7A.1050009@infotech.monash.edu.au>

>>> I have removed all use/@ISA Bio::Root::Object references from
>>> bioperl-live, except for those in Bio::Root::* itself:

>> So I'd say they're both relics that can be removed. In fact I was
>> planning on getting rid of all references to both of these modules
>> before you did, so thanks! :)

> I think they can go. It's probably a pre-1.0 deprecation that somehow
> was never followed through on.

Today I did a fresh CVS checkout of bioperl-live, and deleted the
following modules and tests, and all tests passed with BIOPERLDEBUG=0

* Bio::Root::Err
* Bio::Root::Global
* Bio::Root::IOManager
* Bio::Root::Object
* Bio::Root::Storable
* Bio::Root::Utilities # may be used by third parties?
* Bio::Root::Vector
* Bio::Root::Xref
* t/Root-Utilities.t # need to keep if we keep Utilities.pm
* t/RootStorable.t

Should we schedule for deprecation, or deprecate immediately as Hilmar
suggested they were meant to be deprecated long ago?

--
Dr Torsten Seemann http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia

From bix at sendu.me.uk Mon Oct 2 05:40:02 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 10:40:02 +0100
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
References: <000001c6e3e6$81630010$15327e82@pyrimidine> <6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net> <79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu> <451E3707.4090400@sendu.me.uk> <0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu> <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net> <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
Message-ID: <4520DE72.4000603@sendu.me.uk>

Chris Fields wrote:
>
> The idea is to retain current behavior (remote DB access will not be
> run unless BIOPERLDEBUG is set to 1) and apply it to all tests
> requiring such access. Otherwise, just those tests are skipped (and
> not the rest of the tests, which occurs currently). If BIOPERLDEBUG
> is set, the next tests would check the URL, which passes/fails (based
> on the specific value of $@), and runs/skips tests based on the mere
> presence of $@, which indicates some URL issue. You can do this with
> Test::More, but I'm not sure this can be done with Test.pm or
> Test::Simple.

Firstly, BIOPERLDEBUG should not be abused; it should be used only when
you want to see extra debugging messages. There should be another
variable that you can set to choose if network-requiring tests are run,
and it should also be a configurable choice when you run perl
Makefile.PL.

(But changing this isn't going to happen for 1.5.2)

When the server problem is ambiguous we should not fail the test. Just
make the skip message visible and pass all ok...

> The current behavior just skips all tests based on a single failed
> URL. Then, Test::Harness, as currently set, shows skipped tests as
> passed. The last run I posted previously where XEMBL_DB.t remote DB
> tests failed, I also ran all tests (make test) and get this, which
> doesn't tell us that the remote URL failed:
>
> -----------------------------------------
>
> ...
> t/WABA.......................ok
> t/XEMBL_DB...................ok
> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext
> is not installed or is installed incorrectly - skipping ztr.t tests
> ok
> All tests successful, 5 subtests skipped.

All you have to do to make it visible is start the skip message with the
word 'Skip':

skip('Skip server may be down',1);

...
t/WABA.......................ok
t/XEMBL_DB...................ok
        1/9 skipped: server may be down
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is
not installed or is installed incorrectly - skipping ztr.t tests
t/ztr........................ok

It's nicer when using Test::More.

From bix at sendu.me.uk Mon Oct 2 05:55:27 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 10:55:27 +0100
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au>
References: <451C8ED8.2060003@infotech.monash.edu.au> <451CC40D.2030401@sendu.me.uk> <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net> <4520AC7A.1050009@infotech.monash.edu.au>
Message-ID: <4520E20F.6040406@sendu.me.uk>

Torsten Seemann wrote:
> >>> I have removed all use/@ISA Bio::Root::Object references from
> >>> bioperl-live, except for those in Bio::Root::* itself:
>
> >> So I'd say they're both relics that can be removed. In fact I was
> >> planning on getting rid of all references to both of these modules
> >> before you did, so thanks! :)
>
>> I think they can go. It's probably a pre-1.0 deprecation that somehow
>> was never followed through on.
> > Today I did a fresh CVS checkout of bioperl-live, and deleted the > following modules and tests, and all tests passed with BIOPERLDEBUG=0 > > * Bio::Root::Err > * Bio::Root::Global > * Bio::Root::IOManager > * Bio::Root::Object > * Bio::Root::Storable > * Bio::Root::Utilities # may be used by third parties? > * Bio::Root::Vector > * Bio::Root::Xref > * t/Root-Utilities.t # need to keep if we keep Utilities.pm > * t/RootStorable.t > > Should we schedule for deprecation, or deprecate immediately as Hilmar > suggested they were meant to be deprecated long ago ? I'm happy to get rid of them all straight away. Does anyone object? From florin at iucha.net Sun Oct 1 21:40:07 2006 From: florin at iucha.net (Florin Iucha) Date: Sun, 1 Oct 2006 20:40:07 -0500 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 Message-ID: <20061002014007.GG12075@iucha.net> Hello, I am trying to install bioperl-network from CVS. I found this to require bioperl from CVS, which requires bioperl-ext from CVS. I have compiled and installed io_lib 1.10.1. 
After running "perl Makefile.PL; make test" in bioperl-ext I see a lot sources being compiled, then: cc -c -I./libs -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -DVERSION=\"1.5.1\" -DXS_VERSION=\"1.5.1\" -fPIC "-I/usr/lib/perl/5.8/CORE" -DPOSIX -DNOERROR Align.c Running Mkbootstrap for Bio::Ext::Align () chmod 644 Align.bs rm -f ../blib/arch/auto/Bio/Ext/Align/Align.so cc -shared -L/usr/local/lib Align.o -o ../blib/arch/auto/Bio/Ext/Align/Align.so libs/libsw.a \ -lm \ /usr/bin/ld: libs/libsw.a(aln.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC libs/libsw.a: could not read symbols: Bad value collect2: ld returned 1 exit status make[1]: *** [../blib/arch/auto/Bio/Ext/Align/Align.so] Error 1 make[1]: Leaving directory `/scratch/dmbio/tools/bioperl-ext/Bio/Ext/Align' make: *** [subdirs] Error 2 This is on a Debian AMD64 box: florin at zeus $ gcc -v Using built-in specs. 
Target: x86_64-linux-gnu Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release x86_64-linux-gnu Thread model: posix gcc version 4.1.2 20060901 (prerelease) (Debian 4.1.1-13) florin at zeus $ perl -V Summary of my perl5 (revision 5 version 8 subversion 8) configuration: Platform: osname=linux, osvers=2.6.16-1-vserver-amd64-k8, archname=x86_64-linux-gnu-thread-multi uname='linux excelsior 2.6.16-1-vserver-amd64-k8 #2 smp tue apr 4 03:40:49 utc 2006 x86_64 gnulinux ' config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.8 -Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=define use64bitall=define uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2', cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe 
-I/usr/local/include' ccversion='', gccversion='4.1.2 20060729 (prerelease) (Debian 4.1.1-10)', gccosandvers='' intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16 ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='cc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt perllibs=-ldl -lm -lpthread -lc -lcrypt libc=/lib/libc-2.3.6.so, so=so, useshrplib=true, libperl=libperl.so.5.8.8 gnulibc_version='2.3.6' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_64_BIT_ALL USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES USE_PERLIO USE_REENTRANT_API The compiler command line for aln.o is lacking -fPIC: cc -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DPOSIX -DNOERROR -c -o aln.o aln.c Adding -fPIC to the CCFLAGS variable in Bio/Ext/Align/Makefile and Makefile seems to take build further, but it fails with a similar error in Bio/SeqIO/staden/_Inline/build/Bio/SeqIO/staden/read. That Makefile seems to be regenerated every time I run 'make test' in the top level directory. 

The error in ../staden/read is:

rm -f blib/arch/auto/Bio/SeqIO/staden/read/read.so
cc -shared -L/usr/local/lib read.o -o blib/arch/auto/Bio/SeqIO/staden/read/read.so \
   -L/usr/local/lib -lread -lz \

/usr/bin/ld: /usr/local/lib/libread.a(libread_a-Read.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
/usr/local/lib/libread.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [blib/arch/auto/Bio/SeqIO/staden/read/read.so] Error 1

So, the questions appear to be:
- should "-fPIC" be appended to CFLAGS in the generated Makefiles?
- is there anything wrong with io_lib flags?
- has anybody built bioperl-ext on AMD64?

I can help with debugging or testing if given a gentle nudge in the
right direction, but I have little experience with the interactions
between perl and static libraries on 64 bit.

Thanks,
florin

--
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.
                                -- Edsger Dijkstra

From bix at sendu.me.uk Mon Oct 2 06:52:47 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 11:52:47 +0100
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64
In-Reply-To: <20061002014007.GG12075@iucha.net>
References: <20061002014007.GG12075@iucha.net>
Message-ID: <4520EF7F.40908@sendu.me.uk>

Florin Iucha wrote:
> Hello,
>
> I am trying to install bioperl-network from CVS. I found this to
> require bioperl from CVS, which requires bioperl-ext from CVS.

I can't help with the compile problems you encountered (other than to
say I also have problems under AMD64), but from where did you get the
idea that bioperl (live/core) requires bioperl-ext? It doesn't, though
recent changes to Makefile.PL may give that impression...

From cjfields at uiuc.edu Mon Oct 2 08:26:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 07:26:57 -0500
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <4520DE72.4000603@sendu.me.uk>
References: <000001c6e3e6$81630010$15327e82@pyrimidine> <6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net> <79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu> <451E3707.4090400@sendu.me.uk> <0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu> <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net> <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu> <4520DE72.4000603@sendu.me.uk>
Message-ID:

On Oct 2, 2006, at 4:40 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>> The idea is to retain current behavior (remote DB access will not be
>> run unless BIOPERLDEBUG is set to 1) and apply it to all tests
>> requiring such access. Otherwise, just those tests are skipped (and
>> not the rest of the tests, which occurs currently). If BIOPERLDEBUG
>> is set, the next tests would check the URL, which passes/fails (based
>> on the specific value of $@), and runs/skips tests based on the mere
>> presence of $@, which indicates some URL issue. You can do this with
>> Test::More, but I'm not sure this can be done with Test.pm or
>> Test::Simple.
>
> Firstly, BIOPERLDEBUG should not be abused; it should be used only when
> you want to see extra debugging messages. There should be another
> variable that you can set to choose if network-requiring tests are run,
> and it should also be a configurable choice when you run perl
> Makefile.PL.
>
> (But changing this isn't going to happen for 1.5.2)
>
> When the server problem is ambiguous we should not fail the test. Just
> make the skip message visible and pass all ok...

I agree, as well as with your assessment of BIOPERLDEBUG (which I
alluded to in a previous post). Torsten suggested creating a new env.
variable for network tests. It's obvious this won't be done before
1.5.2, but we can make plans towards the next release.

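A minimal sketch of what such a gate might look like with Test::More,
offered only as an illustration of the idea discussed above. The
variable name BIOPERL_NETWORK_TESTS and both helper subs are
hypothetical (nothing in this thread or in BioPerl defines them), and
the "remote" call is a local stub so the script runs without a network:

```perl
use strict;
use warnings;
use Test::More tests => 3;

# ASSUMPTION: BIOPERL_NETWORK_TESTS is a made-up name. The thread only
# agrees that some variable other than BIOPERLDEBUG should do this job.
sub network_tests_enabled {
    my ($env) = @_;
    return $env->{BIOPERL_NETWORK_TESTS} ? 1 : 0;
}

# Hypothetical stand-in for a remote query. A real test script would
# call a Bio::DB::* method here instead of this stub.
sub fetch_remote_record {
    my ($simulate_failure) = @_;
    die "simulated connection failure\n" if $simulate_failure;
    return 'NM_014580';
}

# Tests that need no network always run.
ok(1, 'local-only test');

SKIP: {
    # Skip only the two network tests, not the whole script.
    skip('network tests not requested', 2)
        unless network_tests_enabled(\%ENV);

    my $acc = eval { fetch_remote_record(0) };
    if (my $err = $@) {
        chomp $err;
        # An ambiguous server problem becomes a visible skip, not a failure.
        skip("server may be down: $err", 2);
    }

    ok(defined $acc, 'remote query returned a record');
    is($acc, 'NM_014580', 'record has the expected accession');
}
```

With the variable unset the two network tests are reported as skipped
and the local test still runs; setting it to a true value runs all
three. Whether the harness summary shows the skip reason depends on the
Test::Harness version in use.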
>> The current behavior just skips all tests based on a single failed
>> URL. Then, Test::Harness, as currently set, shows skipped tests as
>> passed. The last run I posted previously where XEMBL_DB.t remote DB
>> tests failed, I also ran all tests (make test) and get this, which
>> doesn't tell us that the remote URL failed:
>>
>> -----------------------------------------
>>
>> ...
>> t/WABA.......................ok
>> t/XEMBL_DB...................ok
>> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext
>> is not installed or is installed incorrectly - skipping ztr.t tests
>> ok
>> All tests successful, 5 subtests skipped.
>
> All you have to do to make it visible is start the skip message with the
> word 'Skip':
>
> skip('Skip server may be down',1);
>
> ...
> t/WABA.......................ok
> t/XEMBL_DB...................ok
>         1/9 skipped: server may be down
> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is
> not installed or is installed incorrectly - skipping ztr.t tests
> t/ztr........................ok
>
> It's nicer when using Test::More.

Okay, if Test::Harness picks that up it would be okay. We could use
skip blocks to skip subsets of tests that require remote access (like
SeqFeature.t) as opposed to skipping all tests.

I think we want to avoid promoting running tests with BIOPERLDEBUG (or
similar) upon installation for everyday installation anyway (such as
from CPAN, which Hilmar points out). It's not something everybody
installing a new BioPerl should be running unless they run into
problems.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From florin at iucha.net Mon Oct 2 08:15:06 2006
From: florin at iucha.net (Florin Iucha)
Date: Mon, 2 Oct 2006 07:15:06 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64
In-Reply-To: <4520EF7F.40908@sendu.me.uk>
References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk>
Message-ID: <20061002121506.GB14409@iucha.net>

On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
> Florin Iucha wrote:
> > I am trying to install bioperl-network from CVS. I found this to
> > require bioperl from CVS, which requires bioperl-ext from CVS.
>
> I can't help with the compile problems you encountered (other than to
> say I also have problems under AMD64), but from where did you get the
> idea that bioperl (live/core) requires bioperl-ext? It doesn't, though
> recent changes to Makefile.PL may give that impression...

Running the tests for bioperl-live mention in some places that 'this
test has been skipped since $foo is not available' and I found the
'foos' in bioperl-ext.

florin

--
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.
                                -- Edsger Dijkstra

From bix at sendu.me.uk Mon Oct 2 10:05:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 15:05:11 +0100
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64
In-Reply-To: <20061002121506.GB14409@iucha.net>
References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk> <20061002121506.GB14409@iucha.net>
Message-ID: <45211C97.2060800@sendu.me.uk>

Florin Iucha wrote:
> On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
>> Florin Iucha wrote:
>>> I am trying to install bioperl-network from CVS. I found this to
>>> require bioperl from CVS, which requires bioperl-ext from CVS.
>> I can't help with the compile problems you encountered (other than to
>> say I also have problems under AMD64), but from where did you get the
>> idea that bioperl (live/core) requires bioperl-ext? It doesn't, though
>> recent changes to Makefile.PL may give that impression...
>
> Running the tests for bioperl-live mention in some places that 'this
> test has been skipped since $foo is not available' and I found the
> 'foos' in bioperl-ext.

Right, yes. The idea is, you'd only need to install bioperl-ext if you
wanted to use the modules that the complaining tests test. So if none of
the things that were skipped matter to you, don't install ext.

I guess this needs to be clarified in documentation somewhere.

From cjfields at uiuc.edu Mon Oct 2 10:13:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:13:56 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au>
Message-ID: <001801c6e62d$02c883d0$15327e82@pyrimidine>

> >>> I have removed all use/@ISA Bio::Root::Object references from
> >>> bioperl-live, except for those in Bio::Root::* itself:
>
> >> So I'd say they're both relics that can be removed. In fact I was
> >> planning on getting rid of all references to both of these modules
> >> before you did, so thanks! :)
>
> > I think they can go. It's probably a pre-1.0 deprecation that somehow
> > was never followed through on.
>
> Today I did a fresh CVS checkout of bioperl-live, and deleted the
> following modules and tests, and all tests passed with BIOPERLDEBUG=0
>
> * Bio::Root::Err
> * Bio::Root::Global
> * Bio::Root::IOManager
> * Bio::Root::Object
> * Bio::Root::Storable
> * Bio::Root::Utilities # may be used by third parties?
> * Bio::Root::Vector
> * Bio::Root::Xref
> * t/Root-Utilities.t # need to keep if we keep Utilities.pm
> * t/RootStorable.t
>
> Should we schedule for deprecation, or deprecate immediately as Hilmar
> suggested they were meant to be deprecated long ago ?

I vote for quick deprecation; I had also noticed that these were superfluous and added them as possible deprecations to the wiki page. However, we need to be careful about that 'third-party use' caveat you have for Bio::Root::Utilities; there's another one with Bio::Root::Storable and Ensembl:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2924/focus=2924

and it seems to have its users:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/8242/focus=8242

The others (including Bio::Root::Utilities) haven't had any major threads on the mail lists in a very long time.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> --
> Dr Torsten Seemann http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at uiuc.edu Mon Oct 2 10:16:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:16:31 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64
In-Reply-To: <20061002121506.GB14409@iucha.net>
Message-ID: <001901c6e62d$5c4fac80$15327e82@pyrimidine>

They're not absolutely necessary; the tests are skipped w/o failure because bioperl-ext is optional. These are only necessary if you want the ability to read sequence trace files.

BTW, you might have a rough time trying to install bioperl-ext depending on your platform. Note the following bug report:

http://bugzilla.open-bio.org/show_bug.cgi?id=2074

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Florin Iucha > Sent: Monday, October 02, 2006 7:15 AM > To: bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Failure to compile the CVS snapshot of bioperl- > exton AMD64 > > On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote: > > Florin Iucha wrote: > > > I am trying to install bioperl-network from CVS. I found this to > > > require bioperl from CVS, which requires bioperl-ext from CVS. > > > > I can't help with the compile problems you encountered (other than to > > say I also have problems under AMD64), but from where did you get the > > idea that bioperl (live/core) requires bioperl-ext? It doesn't, though > > recent changes to Makefile.PL may give that impression... > > Running the tests for bioperl-live mention in some places that 'this > test has been skipped since $foo is not available' and I found the > 'foos' in bioperl-ext. > > florin > > -- > If we wish to count lines of code, we should not regard them as lines > produced but as lines spent. -- Edsger Dijkstra From osborne1 at optonline.net Mon Oct 2 10:14:13 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Mon, 02 Oct 2006 10:14:13 -0400 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: <4520E20F.6040406@sendu.me.uk> Message-ID: Sendu, No objection but someone should check the scripts in examples/root to make sure that they are not used there. Brian O. On 10/2/06 5:55 AM, "Sendu Bala" wrote: > Torsten Seemann wrote: >>>>> I have removed all use/@ISA Bio::Root::Object references from >>>>> bioperl-live, except for those in Bio::Root::* itself: >> >>>> So I'd say they're both relics that can be removed. In fact I was >>>> planning on getting rid off all references to both of these modules >>>> before you did, so thanks! :) >> >>> I think they can go. 
It's probably a pre-1.0 deprecation that somehow >>> was never followed through on. >> >> Today I did a fresh CVS checkout of bioperl-live, and deleted the >> following modules and tests, and all tests passed with BIOPERLDEBUG=0 >> >> * Bio::Root::Err >> * Bio::Root::Global >> * Bio::Root::IOManager >> * Bio::Root::Object >> * Bio::Root::Storable >> * Bio::Root::Utilities # may be used by third parties? >> * Bio::Root::Vector >> * Bio::Root::Xref >> * t/Root-Utilities.t # need to keep if we keep Utilities.pm >> * t/RootStorable.t >> >> Should we schedule for deprecation, or deprecate immediately as Hilmar >> suggested they were meant to be deprecated long ago ? > > I'm happy to get rid of them all straight away. Does anyone object? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From johnson.biotech at gmail.com Mon Oct 2 10:21:50 2006 From: johnson.biotech at gmail.com (Seth Johnson) Date: Mon, 2 Oct 2006 10:21:50 -0400 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: <000001c6e5dc$2eceabe0$15327e82@pyrimidine> References: <000001c6e5dc$2eceabe0$15327e82@pyrimidine> Message-ID: I'm using MySQL 5.0.19 and Perl v5.8.7 [MSWin32-x86-multi-thread] On 10/2/06, Chris Fields wrote: > > Seth, > > What version of MySQL and perl are you using? I'm using MySQL 5.0.18 (but > am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819. > > I ran into a few problems with bioperl-db tests which were unrelated the > ones below, but I'm wondering if it is a difference in MySQL versions. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. 
of Biochemistry
> University of Illinois Urbana-Champaign
>
> > -----Original Message-----
> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > bounces at lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > ...
> > FAILED tests 10-12
> > Failed 3/12 tests, 75.00% okay
> > Failed Test       Stat Wstat Total Fail  Failed  List of Failed
> > -------------------------------------------------------------------------------
> > t\02species.t                   65    2   3.08%  63 65
> > t\03simpleseq.t      1   256    59  106 179.66%  7-59
> > t\04swiss.t                     52   14  26.92%  25 27-34 38-42
> > t\12ontology.t       2   512   738 1471 199.32%  3-738
> > t\16obda.t                      12    3  25.00%  10-12
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

--
Best Regards,

Seth Johnson
Senior Bioinformatics Associate
Ph: (202) 470-0900
Fx: (775) 251-0358

From osborne1 at optonline.net Mon Oct 2 10:08:50 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 10:08:50 -0400
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64
In-Reply-To: <20061002014007.GG12075@iucha.net>
Message-ID:

Florin,

Minor correction here, the Bioperl package does not require bioperl-ext. However we see there is a problem compiling bioperl-ext...

Brian O.

On 10/1/06 9:40 PM, "Florin Iucha" wrote:

> I am trying to install bioperl-network from CVS. I found this to
> require bioperl from CVS, which requires bioperl-ext from CVS.

From JK at novozymes.com Mon Oct 2 10:05:34 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Mon, 2 Oct 2006 16:05:34 +0200
Subject: [Bioperl-l] Blast parser.
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net> Hi. I've tried to use the blast-parser but I cannot get the original alignment out of the parser. Is it possible to get that out of the Bio::Search::HSP::BlastHSP in some way. It seems quite odd to get a clustalw alignment out when it isn't that type of alignment people are used to get from blast. Thanks Jesper From cjfields at uiuc.edu Mon Oct 2 10:36:31 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 09:36:31 -0500 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: Message-ID: <001d01c6e630$27792fb0$15327e82@pyrimidine> > Sendu, > > No objection but someone should check the scripts in examples/root to make > sure that they are not used there. > > Brian O. I suppose it's also possible that the other bioperl distributions (like bioperl-run) could use them as well. If they do we can take care of them as they pop up. These are really old and haven't been revised in a long time. The only one I worry about is Bio::Root::Storable b/c of Ensembl. Does anyone know where Will Spooner is? He's the maintainer for Bio::Root::Storable. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 2 11:01:44 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 10:01:44 -0500 Subject: [Bioperl-l] Blast parser. In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net> Message-ID: <000001c6e633$ad0a6ce0$15327e82@pyrimidine> The alignment that you get should come from GenericHSP, not BLASTHSP. Either way, the HSP alignment that is retrieved using $hsp->get_aln() should be a Bio::SimpleAlign object. You can then output that to the proper AlignIO format using an AlignIO stream object or use the Bio::SimpleAlign methods for further analysis. 
my $aln = $hsp->get_aln(); my $alnout = Bio::AlignIO->new(-format => 'msf', -fh => \*STDOUT); $alnout->write_aln($aln); Quick note: not all AlignIO formats have write_aln() support at this time, but most do. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of JK (Jesper Agerbo Krogh) > Sent: Monday, October 02, 2006 9:06 AM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Blast parser. > > > Hi. > > I've tried to use the blast-parser but I cannot get the original alignment > out of the parser. Is it possible to get that out of the > Bio::Search::HSP::BlastHSP in some way. It seems quite odd to get a > clustalw alignment out when it isn't that type of alignment people are > used to get from blast. > > Thanks > > Jesper > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From whs at ebi.ac.uk Mon Oct 2 12:00:19 2006 From: whs at ebi.ac.uk (Will Spooner) Date: Mon, 2 Oct 2006 17:00:19 +0100 (BST) Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: <001d01c6e630$27792fb0$15327e82@pyrimidine> References: <001d01c6e630$27792fb0$15327e82@pyrimidine> Message-ID: On Mon, 2 Oct 2006, Chris Fields wrote: >> Sendu, >> >> No objection but someone should check the scripts in examples/root to make >> sure that they are not used there. >> >> Brian O. > > I suppose it's also possible that the other bioperl distributions (like > bioperl-run) could use them as well. > > If they do we can take care of them as they pop up. These are really old > and haven't been revised in a long time. > > The only one I worry about is Bio::Root::Storable b/c of Ensembl. Does > anyone know where Will Spooner is? 
He's the maintainer for
> Bio::Root::Storable.
>

Hi Chris,

I'm still lurking...

If the tests for Bio::Root::Storable still pass (I assume that they do), then the module is working as advertised.

The idea behind Storable is very simple; object instances of any inheriting class can be serialised/retrieved from disk. BioPerl objects will probably not want this functionality by default, but it is trivial to implement if needed.

Will

From cjfields at uiuc.edu Mon Oct 2 13:58:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 12:58:15 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To:
Message-ID: <000601c6e64c$5746f990$15327e82@pyrimidine>

> On Mon, 2 Oct 2006, Chris Fields wrote:
>
> >> Sendu,
> >>
> >> No objection but someone should check the scripts in examples/root to make
> >> sure that they are not used there.
> >>
> >> Brian O.
> >
> > I suppose it's also possible that the other bioperl distributions (like
> > bioperl-run) could use them as well.
> >
> > If they do we can take care of them as they pop up. These are really old
> > and haven't been revised in a long time.
> >
> > The only one I worry about is Bio::Root::Storable b/c of Ensembl. Does
> > anyone know where Will Spooner is? He's the maintainer for
> > Bio::Root::Storable.
> >
>
> Hi Chris,
>
> I'm still lurking...
>
> If the tests for Bio::Root::Storable still pass (I assume that they do),
> then the module is working as advertised.
>
> The idea behind Storable is very simple; object instances of any
> inheriting class can be serialised/retrieved from disk. BioPerl objects
> will probably not want this functionality by default, but it is trivial to
> implement if needed.
>
> Will

Okay, nice to know you're listening in! Based on that we should keep it in. The rest that Torsten mentioned could probably be removed right away.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry
University of Illinois Urbana-Champaign

From osborne1 at optonline.net Mon Oct 2 13:59:58 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 13:59:58 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset
In-Reply-To: <20061002002403.GD12075@iucha.net>
Message-ID:

Florin,

OK, this is fixed in CVS now. The problem is that there's some variability in how the PSI MI "standard" is used. In this case there was a species that was not given a value for its scientific name ("fullName"), so I had to use the common name in its place. Fortunately there's an NCBI taxon id behind all this.

Thanks again,

Brian O.

On 10/1/06 8:24 PM, "Florin Iucha" wrote:

> Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the
> MINT [1] database does not produce the crash. It has a new warning, however:
>
> Can't call method "text" on an undefined value at
> /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290.

From mmacho at gmail.com Mon Oct 2 13:43:13 2006
From: mmacho at gmail.com (ende)
Date: Mon, 2 Oct 2006 19:43:13 +0200
Subject: [Bioperl-l] Variable scope
Message-ID: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>

Hi,

this may be a typical Perl topic and therefore off-topic for this list. My apologies for any inconvenience.

It is an annoying problem that is making me waste a lot of time.

I have a package with its new object, etc... and constants in it like:

#-----
use constant False => 0;
use constant True => 1;

our %CLRFG = (
    PLASMIDO => RED,
    POLY_A => GREEN,
    RESTR_SITES => BLUE,
    CONECTORS => MAGENTA,
    CONTAMINANTS => CYAN,
);

our %CLRBG = (
    PLASMIDO => "",
    POLY_A => "",
    RESTR_SITES => "",
    CONECTORS => "",
    CONTAMINANTS => "",
);
#------

These constants are included with require "h.pl" from the main package file.

I use this module from the main command line driver to test it by "using" it.
In the command line driver I can use the constants False and True directly without complaint, for example "return True", etc., without any reference to the origin of that constant.

But, with respect to the variables (I would like them to be constants too... but how?), %CLRFG and %CLRBG, I can't find a way of referring to those in the module. Finally I gave up and _copied_ the definitions to where I needed them (in the sub where I print ANSI terminal colouring sequences...). I can't find how to refer to those variables from outside the module.

I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Any help?

--
Juan Falgueras
Profesor del Depto. de Lenguajes y Ciencias de la Computación
Universidad de Málaga

From cjfields at uiuc.edu Mon Oct 2 16:52:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 15:52:11 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To:
Message-ID: <000001c6e664$a25538d0$15327e82@pyrimidine>

I have updated the Deprecation page with the Bio::Root::* modules that we plan on deprecating (note that I have them being removed for rel. 1.5.2). I have left out Bio::Root::Storable for now based on Will's response.

http://www.bioperl.org/wiki/Deprecated_modules

I'll update the DEPRECATED doc in CVS as well. There is a tentative schedule for when warnings are added for modules before they are removed.

In relation to the recent trend for house-cleaning, I noticed that all of the Bio::Tools::BP* BLAST-related modules are still present but haven't been modified or had deprecation warnings added. BPlite was marked for deprecation around rel 1.5, as were the others, since the functionality is present in Bio::SearchIO. Judging by the mail list, no one has used these in quite a while, and everyone has been redirected to use Bio::SearchIO instead. Based on that I have added warnings in CVS for deprecation to BPlite and the related modules BPpsilite and BPbl2seq.
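For readers unfamiliar with what such a deprecation warning looks like in practice, here is a rough sketch in plain Perl. It is not the code that was committed to CVS; the package name My::Old::Parser is hypothetical and stands in for a module like BPlite, and Carp is used here only because it is a core module.

```perl
package My::Old::Parser;   # hypothetical module, for illustration only
use strict;
use warnings;
use Carp qw(carp);

sub new {
    my ($class, %args) = @_;
    # Warn from the caller's point of view every time the old
    # module is instantiated, and name the replacement.
    carp __PACKAGE__ . " is deprecated; use Bio::SearchIO instead";
    return bless {%args}, $class;
}

package main;

# Collect warnings so the example can show that one was issued.
my @seen;
local $SIG{__WARN__} = sub { push @seen, @_ };

my $parser = My::Old::Parser->new(-file => 'report.bls');
print "warning issued: $seen[0]";
```

The old module keeps working, so existing scripts do not break, but each use produces a message pointing at the replacement.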
Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Brian Osborne > Sent: Monday, October 02, 2006 9:14 AM > To: Sendu Bala; bioperl-l > Subject: Re: [Bioperl-l] Do we need Bio::Root::Object anymore? > > Sendu, > > No objection but someone should check the scripts in examples/root to make > sure that they are not used there. > > Brian O. > > > On 10/2/06 5:55 AM, "Sendu Bala" wrote: > > > Torsten Seemann wrote: > >>>>> I have removed all use/@ISA Bio::Root::Object references from > >>>>> bioperl-live, except for those in Bio::Root::* itself: > >> > >>>> So I'd say they're both relics that can be removed. In fact I was > >>>> planning on getting rid off all references to both of these modules > >>>> before you did, so thanks! :) > >> > >>> I think they can go. It's probably a pre-1.0 deprecation that somehow > >>> was never followed through on. > >> > >> Today I did a fresh CVS checkout of bioperl-live, and deleted the > >> following modules and tests, and all tests passed with BIOPERLDEBUG=0 > >> > >> * Bio::Root::Err > >> * Bio::Root::Global > >> * Bio::Root::IOManager > >> * Bio::Root::Object > >> * Bio::Root::Storable > >> * Bio::Root::Utilities # may be used by third parties? > >> * Bio::Root::Vector > >> * Bio::Root::Xref > >> * t/Root-Utilities.t # need to keep if we keep Utilities.pm > >> * t/RootStorable.t > >> > >> Should we schedule for deprecation, or deprecate immediately as Hilmar > >> suggested they were meant to be deprecated long ago ? > > > > I'm happy to get rid of them all straight away. Does anyone object? 
> > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From florin at iucha.net Mon Oct 2 16:47:01 2006 From: florin at iucha.net (Florin Iucha) Date: Mon, 2 Oct 2006 15:47:01 -0500 Subject: [Bioperl-l] Variable scope In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> Message-ID: <20061002204701.GG14409@iucha.net> On Mon, Oct 02, 2006 at 07:43:13PM +0200, ende wrote: > It is a annoying problem that is making me waste lot of time. > > I have a package with its new object, etc... and constants in it like: > > #----- > use constant False => 0; > use constant True => 1; > > our %CLRFG = ( > PLASMIDO => RED, > POLY_A => GREEN, > RESTR_SITES => BLUE, > CONECTORS => MAGENTA, > CONTAMINANTS => CYAN, > ); > > our %CLRBG = ( > PLASMIDO => "", > POLY_A => "", > RESTR_SITES => "", > CONECTORS => "", > CONTAMINANTS => "", > ); > #------ > > this constants are include with require "h.pl" from the main package > file. > > I use this module from the mail command line driver to test it > "using" it. In the command line driver I can use with no gripe the > constants False and True directly, for example "return True", etc > without any reference to the origin of that constant. It is possible you get them from somewhere else. > But, with respect to the variables (I would like they also were > constants.. but how?), %CLRFG and %CLRBG I can't find the way of > refering those int the module. Finally I have desisted and _copy_ > the definitions where I have needed it (in the sub were I print Ansi > terminal colouring seqs...). I don't find how to refer those > variables out of the module. 
>
> I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Did you actually declare a package name in "h.pl"? Is there any reason you don't call the file ".pm" and load it with "use"?

I have attached a small example of importing that works.

florin

--
If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra
-------------- next part --------------
A non-text attachment was scrubbed...
Name: one.pm
Type: text/x-perl
Size: 118 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: two.pl
Type: text/x-perl
Size: 69 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL:

From Kevin.M.Brown at asu.edu Mon Oct 2 19:44:50 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 2 Oct 2006 16:44:50 -0700
Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module
Message-ID: <1A4207F8295607498283FE9E93B775B4021960CD@EX02.asurite.ad.asu.edu>

Well, for anyone that wants to know, I found a way to capture the output of ClustalW to get at things like the score.

Copy STDOUT to another handle:
    open(OUTCOPY, ">&STDOUT") or die "Couldn't dup STDOUT: $!";

Change where STDOUT goes:
    open(STDOUT, ">log.test") or die "Couldn't open log.test: $!";

Run the alignment and its output will be captured by the STDOUT redirection:
    $aln = $factory->align(\@seq);

Restore STDOUT to its normal location for the rest of the script:
    close STDOUT;
    open(STDOUT, ">&OUTCOPY");

I guess I can understand why most of this is just dropped by the ClustalW.pm module since there doesn't seem to be a way to hold it all in a SimpleAlign object.
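The same redirect-and-restore steps can be wrapped in a small helper, sketched below under a few assumptions. The helper name capture_stdout is made up for this example, the child process stands in for the ClustalW run behind $factory->align(\@seq), and the "Score:" line format is only a guess at what the program prints, so the regular expression would need adjusting against real output.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Run $code with STDOUT redirected to a temporary file and return
# everything written there, including output from child processes.
sub capture_stdout {
    my ($code) = @_;
    my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
    close $tmp_fh;

    # Duplicate the real STDOUT so it can be restored afterwards.
    open(my $saved, '>&', \*STDOUT) or die "Couldn't dup STDOUT: $!";
    open(STDOUT, '>', $tmp_name)    or die "Couldn't redirect STDOUT: $!";

    # Trap errors so STDOUT is restored even if $code dies.
    my $ok  = eval { $code->(); 1 };
    my $err = $@;

    close STDOUT;
    open(STDOUT, '>&', $saved) or die "Couldn't restore STDOUT: $!";
    die $err unless $ok;

    open(my $in, '<', $tmp_name) or die "Couldn't read $tmp_name: $!";
    local $/;                      # slurp the whole file
    my $text = <$in>;
    close $in;
    return $text;
}

# Stand-in for the alignment call: a child process that prints a score
# line. The line format is an assumption made for this sketch.
my $log = capture_stdout(sub {
    system($^X, '-e', 'print "Sequences (1:2) Aligned. Score: 87\n"');
});

my ($score) = $log =~ /Score:\s*(\d+)/;
print "captured score: $score\n";
```

Using a lexical handle for the saved copy and restoring inside the helper means a failed alignment cannot leave the rest of the script printing into the log file.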
> -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Kevin Brown > Sent: Thursday, September 28, 2006 2:48 PM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module > > I've gotten a very simple script to run using bioperl that creates an > alignment using clustalw of two sequences. I see that clustal outputs > to stdout information like the score, but I don't see any way to store > that or retrieve that from the alignment object that is > returned (unless > I'm just blind). What follows is my very basic script which used code > found in the Wiki. > > print $aln->score() spits out an error about using an uninitialized > value. > > > #!/usr/bin/perl -w > > use strict; > use Bio::SeqIO; > use Bio::Perl; > use Bio::AlignIO; > use Getopt::Long qw(:config no_ignore_case bundling pass_through); > use POSIX; > use Bio::Tools::Run::Alignment::Clustalw; > > my $fileName = ""; # filename(s) to be parsed for > information > my $output_dir = ""; > my $format = 'fasta'; # default format for SeqIO module > > GetOptions( > 'file=s' => \$fileName, > 'output=s' => \$output_dir, > ); > > # Parse the input file for the needed information > # SeqIO supports several normal formats including , and > > > my @files = split(/\|/, $fileName); > my @seq_array; > > my $stream_out = > Bio::AlignIO->new(-file => '>test.msf', -format => 'msf', -flush => > 0); > > foreach my $fileName (@files) > { > my $file = Bio::SeqIO->new(-format => $format, -file => > $fileName); > my $seq; > while ($seq = $file->next_seq()) > { > push(@seq_array, $seq); > } > } > > my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM'); > my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params); > my $ktuple = 3; > $factory->ktuple($ktuple); # change the parameter before executing > # where @seq_array is an array of {{PM|Bio::Seq}} objects > > open my $out, ">seq.txt"; > > for (my $i = 
1 ; $i <= $#seq_array ; $i++) > { > my @seq = ($seq_array[0], $seq_array[$i]); > my $aln = $factory->align(\@seq); > $stream_out->write_aln($aln); > print $aln->score; > for my $seq ($aln->each_seq) { > print $out $seq->display_id() ."\t". $seq->seq()."\n"; > } > } > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bix at sendu.me.uk Mon Oct 2 19:48:34 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 00:48:34 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 Message-ID: <4521A552.60301@sendu.me.uk> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll upload tar.gz files when I have access to the server, then reply here with links. In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for instructions on getting and testing this RC. Developers: Make sure you're in the AUTHORS file in all 4 packages, as appropriate. Users: Even though 1.5.2 is a 'developer' release, we consider it the most stable and capable version of Bioperl, and recommend that you use it in all but the most critical production environments. Please try it out and let us know of any problems or difficulties you run into. Thank you, Sendu. From lincoln.stein at gmail.com Mon Oct 2 17:53:38 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Mon, 2 Oct 2006 21:53:38 +0000 Subject: [Bioperl-l] Variable scope In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> Message-ID: <6dce9a0b0610021453va2132c7u73747b9253211a66@mail.gmail.com> Hi, Read the documentation in Export. It is much better to formally export constants, variables and functions and to import them with "use" than to use "require". Also be sure that you understand how namespaces and modules work. This is not a BioPerl topic and should have been directed to a general Perl discussion list, such as Perl Monks. 
Lincoln On 10/2/06, ende wrote: > > > Hi > > this may be a typical perl topic and then out of this list center > topic. My apologize for any inconvenience. > > It is a annoying problem that is making me waste lot of time. > > I have a package with its new object, etc... and constants in it like: > > #----- > use constant False => 0; > use constant True => 1; > > our %CLRFG = ( > PLASMIDO => RED, > POLY_A => GREEN, > RESTR_SITES => BLUE, > CONECTORS => MAGENTA, > CONTAMINANTS => CYAN, > ); > > our %CLRBG = ( > PLASMIDO => "", > POLY_A => "", > RESTR_SITES => "", > CONECTORS => "", > CONTAMINANTS => "", > ); > #------ > > this constants are include with require "h.pl" from the main package > file. > > I use this module from the mail command line driver to test it > "using" it. In the command line driver I can use with no gripe the > constants False and True directly, for example "return True", etc > without any reference to the origin of that constant. > > But, with respect to the variables (I would like they also were > constants.. but how?), %CLRFG and %CLRBG I can't find the way of > refering those int the module. Finally I have desisted and _copy_ > the definitions where I have needed it (in the sub were I print Ansi > terminal colouring seqs...). I don't find how to refer those > variables out of the module. > > I have tried %modulename::CLRFG, for example, but Perl gives me errors. > > Any help? > > > > > -- > Juan Falgueras > Profesor del Depto. de Lenguajes y Ciencias de la Computaci?n > Universidad de M?laga > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From florin at iucha.net Mon Oct 2 22:30:31 2006 From: florin at iucha.net (Florin Iucha) Date: Mon, 2 Oct 2006 21:30:31 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <20061003023031.GI14409@iucha.net> On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. > > In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. [I won't create a wiki account just to report this.] Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG not set. Lots of warnings about missing packages and all, but this looks interesting: Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. Otherwise: Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay. The failed test is: t/ESEfinder..................dubious Test returned status 255 (wstat 65280, 0xff00) DIED. FAILED test 15 florin -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra From cjfields at uiuc.edu Mon Oct 2 23:50:47 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 22:50:47 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu> So far all tests pass on Mac OS X. I'll add this to the release page. 
This RC will throw warnings for four tests I didn't remove in time (BPlite.t, BPpsilite, BPbl2seq.t, and RestrictionEnzyme.t), which correspond to their namesake deprecated Bio::Tools modules. These are no longer in CVS HEAD so should be gone by the next RC, and the relevant modules marked for deprecation. I can verify the Bio::DB::SeqFeature.t warning on Mac OS X that Florin reported, but ESEFinder.t works fine: t/BioDBSeqFeature............Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. ok .... I'll report WinXP tests tomorrow on the wiki. Chris On Oct 2, 2006, at 6:48 PM, Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. > > In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > Make sure you're in the AUTHORS file in all 4 packages, as > appropriate. > > Users: > Even though 1.5.2 is a 'developer' release, we consider it the most > stable and capable version of Bioperl, and recommend that you use > it in all but the most critical production environments. Please > try it out and let us know of any problems or difficulties you run > into. > > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 2 23:54:29 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 22:54:29 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <20061003023031.GI14409@iucha.net> References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> Message-ID: > [I won't create a wiki account just to report this.] 
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG > not set. Lots of warnings about missing packages and all, but this > looks interesting: > > Argument "+" isn't numeric in numeric lt (<) at Bio/DB/ > SeqFeature/Segment.pm line 423. This is verified on Mac OS X. > Otherwise: > > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, > 99.99% okay. > > The failed test is: > > t/ESEfinder..................dubious > Test returned status 255 (wstat 65280, 0xff00) > DIED. FAILED test 15 What do you get when you run that set of tests using 'perl -I. -w t/ ESEFinder.t'? The bad status code is odd and could be a remote server issue. Chris > > florin > > -- > If we wish to count lines of code, we should not regard them as lines > produced but as lines spent. -- Edsger Dijkstra > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From torsten.seemann at infotech.monash.edu.au Tue Oct 3 00:30:06 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Tue, 03 Oct 2006 14:30:06 +1000 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm Message-ID: <4521E74E.1040404@infotech.monash.edu.au> My understanding is that all Bioperl-compliant classes should inherit from Bio::Root::Root, not Bio::Root::RootI. Additionally, if functions such as throw() or _rearrange() are to be used without a class instance reference, they are to be used as class methods via Bio::Root::Root, not Bio::Root::RootI. Is this correct? 
My naive audit of bioperl-live CVS brought up the following statistics:

# Root.pm
/cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l
26
/cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l
346

# RootI.pm
/cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l
9
/cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l
79

My guess would be that all RootI should be changed to plain Root ?

Any help appreciated,

--
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia

From jason at bioperl.org Tue Oct 3 02:03:17 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 2 Oct 2006 23:03:17 -0700
Subject: [Bioperl-l] t/ESEFinder.t fixed on branch
Message-ID: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>

Looks like good work everyone.

All tests pass for me on OSX perl 5.8.6. w and w/o BIOPERLDEBUG=1
with RC1 except for the t/ESEFinder problem which I've fixed.

It skipped too few tests when BIOPERLDEBUG=0.

Don't forget to merge branch changes back to head for this test when
it is done. I don't want to muddy water so I'm holding off
migrating the changes to main trunk as the files is substantially
different (I presume pre-Test::More adoption?).

-jason

From bix at sendu.me.uk Tue Oct 3 03:28:48 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:28:48 +0100
Subject: [Bioperl-l] t/ESEFinder.t fixed on branch
In-Reply-To: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>
References: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>
Message-ID: <45221130.2060405@sendu.me.uk>

Jason Stajich wrote:
> Looks like good work everyone.
>
> All tests pass for me on OSX perl 5.8.6. w and w/o BIOPERLDEBUG=1
> with RC1 except for the t/ESEFinder problem which I've fixed.
>
> It skipped too few tests when BIOPERLDEBUG=0.
>
> Don't forget to merge branch changes back to head for this test when
> it is done.
I don't want to muddy water so I'm holding off > migrating the changes to main trunk as the files is substantially > different (I presume pre-Test::More adoption?). Actually, it was the same until Torsten made his own (different) fixes to HEAD but not to branch. It was my mistake and I've corrected in yet a third way, and now branch and HEAD match. No harm done :) From bix at sendu.me.uk Tue Oct 3 03:31:10 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 08:31:10 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu> References: <4521A552.60301@sendu.me.uk> <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu> Message-ID: <452211BE.6080107@sendu.me.uk> Chris Fields wrote: > So far all tests pass on Mac OS X. I'll add this to the release page. > > This RC will throw warnings for four tests I didn't remove in time > (BPlite.t, BPpsilite, BPbl2seq.t, and RestrictionEnzyme.t), which > correspond to their namesake deprecated Bio::Tools modules. These > are no longer in CVS HEAD so should be gone by the next RC, and the > relevant modules marked for deprecation. Thanks Chris. Sorry I missed these. From bix at sendu.me.uk Tue Oct 3 03:32:08 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 08:32:08 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <20061003023031.GI14409@iucha.net> References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> Message-ID: <452211F8.8040104@sendu.me.uk> Florin Iucha wrote: > On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote: >> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll >> upload tar.gz files when I have access to the server, then reply here >> with links. >> >> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for >> instructions on getting and testing this RC. > > [I won't create a wiki account just to report this.] 
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG > not set. Lots of warnings about missing packages and all, but this > looks interesting: > > Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. > > Otherwise: > > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay. > > The failed test is: > > t/ESEfinder..................dubious > Test returned status 255 (wstat 65280, 0xff00) > DIED. FAILED test 15 Thanks for your feedback Florin. The ESEfinder fail will be fixed in the next RC. From bix at sendu.me.uk Tue Oct 3 04:29:37 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 09:29:37 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <45221F71.40206@sendu.me.uk> Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. 
Live/core:
http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-1.5.2-RC1.zip

Run:
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.zip

DB:
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.zip

Network:
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.zip

Md5 checksums are in:
http://bioperl.org/DIST/SIGNATURES.md5

From jason at bioperl.org Tue Oct 3 02:11:30 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 2 Oct 2006 23:11:30 -0700
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
Message-ID: <87F9B64E-8BDA-464B-814D-3F117AA646A1@bioperl.org>

I only briefly saw your question - but RootI is for interfaces,
Root.pm is for instantiated objects.

From florin at iucha.net Tue Oct 3 07:39:12 2006
From: florin at iucha.net (Florin Iucha)
Date: Tue, 3 Oct 2006 06:39:12 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To:
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
Message-ID: <20061003113912.GJ14409@iucha.net>

On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote:
> >Otherwise:
> >
> >   Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,
> >99.99% okay.
> >
> >The failed test is:
> >
> >   t/ESEfinder..................dubious
> >      Test returned status 255 (wstat 65280, 0xff00)
> >      DIED. FAILED test 15

$ perl -I. -w t/ESEfinder.t
1..15
ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder;
ok 2 - use Data::Dumper;
ok 3 - use Bio::PrimarySeq;
ok 4 - use Bio::Seq;
ok 5
ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
# Looks like you planned 15 tests but only ran 14.

$ grep Id t/ESEfinder.t
# $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $

florin

--
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent. -- Edsger Dijkstra

From hlapp at gmx.net Tue Oct 3 08:27:46 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 3 Oct 2006 08:27:46 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au>
References: <4521E74E.1040404@infotech.monash.edu.au>
Message-ID: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net>

The interface classes (those ending in 'I') should actually inherit
from RootI, not Root.

In reality this recommendation is more theoretical than it makes that
much of a difference I think. The motivation is that interface
classes should not determine the actual implementation of a class
(hash ref, array ref, whatever), and since Root.pm contains lots of
implementation using a hash ref that decision will basically have
been made.
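The point about interface classes not fixing the object layout can be sketched roughly as follows. All package names here (My::ShapeI, My::Square, My::Circle, My::Blob) are hypothetical and are not BioPerl classes; the pattern only loosely mirrors the RootI/Root split. The interface package declares a method that dies until it is overridden and says nothing about storage, so one implementation can be a hash ref and another an array ref.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Interface: names the methods, but has no constructor and no storage.
package My::ShapeI;

sub area {
    my $self = shift;
    die ref($self) . " does not implement area()\n";
}

# Shared behaviour written only in terms of the interface.
sub describe {
    my $self = shift;
    return sprintf "%s with area %.2f", ref($self), $self->area;
}

# Implementation 1: hash-ref based object.
package My::Square;
our @ISA = ('My::ShapeI');
sub new  { my ($class, $side) = @_; return bless { side => $side }, $class }
sub area { my $self = shift; return $self->{side} ** 2 }

# Implementation 2: array-ref based object behind the same interface.
package My::Circle;
our @ISA = ('My::ShapeI');
sub new  { my ($class, $radius) = @_; return bless [ $radius ], $class }
sub area { my $self = shift; return 3.14159265 * $self->[0] ** 2 }

# Implementation 3: forgets to override area().
package My::Blob;
our @ISA = ('My::ShapeI');
sub new { my $class = shift; return bless {}, $class }

package main;

my $square = My::Square->new(2);
my $circle = My::Circle->new(1);
print $square->describe, "\n";
print $circle->describe, "\n";

# The unimplemented method is only detected when it is called.
my $blob = My::Blob->new;
my $ok   = eval { $blob->describe; 1 };
print "Blob failed as expected: $@" unless $ok;
```

If the interface package also supplied new() and assumed a hash ref, My::Circle could not choose an array ref without overriding the constructor, which is the concern raised above about inheriting interfaces from Root.pm.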
On the contrary though, RootI contains implementation too, although I'm not sure it would prescribe the object implementation as opposed to merely implementing static methods (like throw(), warn(), etc). That would need to be checked. -hilmar On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote: > My understanding is that all Bioperl-compliant classes should inherit > from Bio::Root::Root, not Bio::Root::RootI. > > Additionally, if functions such as throw() or _rearrange() are to be > used without a class instance reference, they are to be used as class > methods via Bio::Root::Root, not Bio::Root::RootI. > > Is this correct? > > My naive audit of bioperl-live CVS brought up the following > statistics: > > # Root.pm > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > 26 > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l > 346 > > # RootI.pm > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > 9 > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l > 79 > > My guess would be that all RootI should be changed to plain Root ? 
> > Any help appreciated, > > -- > Dr Torsten Seemann http://www.vicbioinformatics.com > Victorian Bioinformatics Consortium, Monash University, Australia > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 3 08:33:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 3 Oct 2006 07:33:37 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <20061003113912.GJ14409@iucha.net> References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> <20061003113912.GJ14409@iucha.net> Message-ID: <44724E16-74CD-4778-B04F-529475B47E37@uiuc.edu> Florin, Looks like this is fixed and should be working in the next release. Chris On Oct 3, 2006, at 6:39 AM, Florin Iucha wrote: > On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote: >>> Otherwise: >>> >>> Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, >>> 99.99% okay. >>> >>> The failed test is: >>> >>> t/ESEfinder..................dubious >>> Test returned status 255 (wstat 65280, 0xff00) >>> DIED. FAILED test 15 > > $ perl -I. 
-w t/ESEfinder.t > 1..15 > ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder; > ok 2 - use Data::Dumper; > ok 3 - use Bio::PrimarySeq; > ok 4 - use Bio::Seq; > ok 5 > ok 6 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 7 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 8 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 9 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 10 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 11 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 12 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 13 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 14 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > # Looks like you planned 15 tests but only ran 14. > $ grep Id t/ESEfinder.t > # $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $ > > florin > > -- > If we wish to count lines of code, we should not regard them as lines > produced but as lines spent. -- Edsger Dijkstra > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 3 10:29:51 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 3 Oct 2006 09:29:51 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net> Message-ID: <002101c6e6f8$67b4ae10$15327e82@pyrimidine> > The interface classes (those ending in 'I') should actually inherit > from RootI, not Root. 
>
> In reality this recommendation is more theoretical than it makes that
> much of a difference I think. The motivation is that interface
> classes should not determine the actual implementation of a class
> (hash ref, array ref, whatever), and since Root.pm contains lots of
> implementation using a hash ref that decision will basically have
> been made.
>
> On the contrary though, RootI contains implementation too, although
> I'm not sure it would prescribe the object implementation as opposed
> to merely implementing static methods (like throw(), warn(), etc).
> That would need to be checked.
>
> -hilmar

The constructor in Bio::Root::RootI lets one know that its use is
deprecated, so you shouldn't have any cases of 'our
qw(Bio::Root::RootI)'; there should be some way of inheriting Root
directly or indirectly. I would say that any direct use of RootI is not
good practice, though. For the current implementation we should only
inherit Bio::Root::Root, which implements RootI.

Is there any reason to shut off the warning with BIOPERLDEBUG?

From RootI:

    sub new {
        my $class = shift;
        my @args = @_;
        unless ( $ENV{'BIOPERLDEBUG'} ) {
            carp("Use of new in Bio::Root::RootI is deprecated. Please use Bio::Root::Root instead");
        }
        eval "require Bio::Root::Root";
        return Bio::Root::Root->new(@args);
    }

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

>
> On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote:
>
> > My understanding is that all Bioperl-compliant classes should inherit
> > from Bio::Root::Root, not Bio::Root::RootI.
> >
> > Additionally, if functions such as throw() or _rearrange() are to be
> > used without a class instance reference, they are to be used as class
> > methods via Bio::Root::Root, not Bio::Root::RootI.
> >
> > Is this correct?
> > > > My naive audit of bioperl-live CVS brought up the following > > statistics: > > > > # Root.pm > > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > > 26 > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l > > 346 > > > > # RootI.pm > > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > > 9 > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l > > 79 > > > > My guess would be that all RootI should be changed to plain Root ? > > > > Any help appreciated, > > > > -- > > Dr Torsten Seemann http://www.vicbioinformatics.com > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From slenk at emich.edu Tue Oct 3 13:31:47 2006 From: slenk at emich.edu (Stephen Gordon Lenk) Date: Tue, 03 Oct 2006 13:31:47 -0400 Subject: [Bioperl-l] Perl 6 has 'roles' - may be cleanly applicable to the Root/RootI issue Message-ID: <5147da5514e402.514e4025147da5@emich.edu> I looked at the Perl6 site, there is an RFC on interfaces: http://dev.perl.org/perl6/rfc/265.html Roles seem to be the Perl 6 answer to the Root/RootI issue in Bioperl. Maybe it is too early to suggest this. http://dev.perl.org/perl6/doc/design/apo/A12.html: The primary role of a class is to manage instances, that is, objects. So a class must worry about object creation and destruction, and everything that happens in between. 
Classes have a secondary role as units of software reuse, in that they can be inherited from or delegated to. However, because this is a secondary role, and because of weaknesses in models of inheritance, composition, and delegation, Perl 6 will split out the notion of software reuse into a separate class-like entity called a "role". Roles are an abstraction mechanism for use by classes that don't care about the secondary aspects of software reuse, or that (looking at it the other way) care so much about it that they want to encapsulate any decisions about implementation, composition, delegation, and maybe even inheritance. Sounds fancy, but just think of them as includes of partial classes, with some safety checks. Roles don't manage objects. They manage interfaces and other abstract behavior (like default implementations), and they help classes manage objects. As such, a role may only be composed into a class or into another role, never inherited from or delegated to. That's what classes are for. From slenk at emich.edu Tue Oct 3 12:45:15 2006 From: slenk at emich.edu (Stephen Gordon Lenk) Date: Tue, 03 Oct 2006 12:45:15 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm Message-ID: <5120d6a511f5a7.511f5a75120d6a@emich.edu> The separation of interface and implementation is generally regarded as a good idea. Right now the Bioperl community is doing this as part of the implementation of Bioperl. I suggest that this is an example of something which you might want to have as part of the Perl implementation. If Perl 6 (or even Perl 5) does not have this as a core part of the language or as a standard package (reusable by all in a common fashion), you may want to suggest to the Perl implementers that a way for interface/implementation distinctions be made part of the core language. My 2 cents, as you people are the experts on your own code. 
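Perl 5 has no built-in roles, but the idea quoted from Apocalypse 12 earlier in this thread (a role is composed into a class, never inherited from) can be approximated by hand. The sketch below is only an illustration under stated assumptions: apply_role, the @REQUIRES/@PROVIDES convention and all package names are hypothetical, and this is not how Perl 6 implements roles.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A hypothetical "role": lists what it needs from a class and what it gives.
package My::Role::Describable;
our @REQUIRES = qw(name);
our @PROVIDES = qw(describe);
sub describe { my $self = shift; return 'I am ' . $self->name }

# A class that satisfies the role's requirement.
package My::Gene;
sub new  { my ($class, $name) = @_; return bless { name => $name }, $class }
sub name { my $self = shift; return $self->{name} }

# A class that does not.
package My::Nameless;
sub new { my $class = shift; return bless {}, $class }

package main;

# Hypothetical helper: check requirements, then copy the role's methods
# into the class's symbol table (composition, no @ISA involved).
sub apply_role {
    my ($role, $class) = @_;
    no strict 'refs';
    for my $method (@{"${role}::REQUIRES"}) {
        $class->can($method)
            or die "$class must provide $method() to use $role\n";
    }
    for my $method (@{"${role}::PROVIDES"}) {
        next if $class->can($method);            # the class's own method wins
        my $code = $role->can($method)
            or die "$role does not define $method()\n";
        *{"${class}::$method"} = $code;
    }
    return;
}

apply_role('My::Role::Describable', 'My::Gene');
my $gene = My::Gene->new('lacZ');
print $gene->describe, "\n";

# Composition is refused up front when a requirement is missing.
my $ok = eval { apply_role('My::Role::Describable', 'My::Nameless'); 1 };
print "Composition refused: $@" unless $ok;
```

Unlike the abstract-base-class pattern, a missing method is reported when the role is applied rather than when the method is first called.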
From cjfields at uiuc.edu Tue Oct 3 13:49:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 3 Oct 2006 12:49:35 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <5120d6a511f5a7.511f5a75120d6a@emich.edu>
Message-ID:
<000001c6e714$4c2cbb80$15327e82@pyrimidine>

Perl6 already has added flexibility for separation of
implementation/interface (I believe they are called roles).

http://dev.perl.org/perl6/doc/design/syn/S12.html

To tell the truth, I'm not sure about Perl 5, except the way the Bioperl
devs have set up the distinction between interface and implementation.
However, I find the way we use interfaces is very simple (set up interface
with some/all methods as unimplemented, use the module as an abstract base
class, then override the unimplemented methods). It works for me.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From cmlapid at up.edu.ph Tue Oct 3 22:06:06 2006
From: cmlapid at up.edu.ph (Carlo Lapid)
Date: Wed, 4 Oct 2006 10:06:06 +0800
Subject: [Bioperl-l] genbank mirror
Message-ID:

Hi,

I'm trying to set up a local mirror of a large part of the Genbank database.
For users to access the local database, I need to create a web-based search tool, much like Entrez of NCBI or SRS of EBI, that can parse the Genbank flat files I've downloaded based on a query entered by the user. I'm trying to use Bioperl to create this from scratch, but I'm having a very hard time, especially since I want the user to have reasonable flexibility in customizing his search. The best that I've been able to accomplish is a search function that retrieves genbank sequence objects based on their primary IDs or accession numbers, using the fetch method of the Bio::Index::GenBank module. But this doesn't help users who don't know the exact IDs for the sequences they want. Can anybody suggest a way to use Bioperl to search for an ordinary word or phrase, like "16S gene", which could be matched against the description field, or the entire genbank entry? (Alternatively, is there some other freely available tool or software that can do this?) I've been scouring the Bioperl documentation, but I couldn't find anything. I just need to be pointed in the right direction. What I thought was a relatively simple problem has been driving me crazy for days; if anybody has any suggestions I would really, really appreciate it. From torsten.seemann at infotech.monash.edu.au Tue Oct 3 22:58:03 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Wed, 04 Oct 2006 12:58:03 +1000 Subject: [Bioperl-l] genbank mirror In-Reply-To: References: Message-ID: <4523233B.7030505@infotech.monash.edu.au> > I'm trying to set up a local mirror of a large part of the Genbank database. > For users to access the local database, I need to create a web-based search > tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank > flat files I've downloaded based on a query entered by the user. Have you considered bioperl-db / BioSQL?
http://www.bioperl.org/wiki/BioPerl_db http://lists.open-bio.org/pipermail/biosql-l/ -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From osborne1 at optonline.net Tue Oct 3 23:16:20 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Tue, 03 Oct 2006 23:16:20 -0400 Subject: [Bioperl-l] genbank mirror In-Reply-To: Message-ID: Carlo, You might want to look at the Bio::DB::Query::GenBank module: http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database However, this works through NCBI's own eutils API, so setting it up to query a local mirror may be very difficult. Brian O. On 10/3/06 10:06 PM, "Carlo Lapid" wrote: > Hi, > > I'm trying to set up a local mirror of a large part of the Genbank database. > For users to access the local database, I need to create a web-based search > tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank > flat files I've downloaded based on a query entered by the user. > > I'm trying to use Bioperl to create this from scratch, but I'm having a very > hard time, especially since I want the user to have reasonable flexibility > in customizing his search. The best that I've been able to accomplish is a > search function that retrieves genbank sequence objects based on their > primary IDs or accession numbers; by using the fetch method of the > Bio::Index::GenBank module. But this doesn't help users who don't know the > exact IDs for the sequences they want. > > Can anybody suggest a way to use Bioperl to search for an ordinary word or > phrase, like "16S gene", which could be matched against the description > field, or the entire genbank entry? (Alternatively, is there some other > freely available tool or software that can do this?) I've been scouring the > Bioperl documentation, but I couldn't find anything. I just need to be > pointed in the right direction.
What I thought was a relatively simple > problem has been driving me crazy for days; if anybody has any suggestions I > would really, really appreciate it. From osborne1 at optonline.net Tue Oct 3 23:28:06 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Tue, 03 Oct 2006 23:28:06 -0400 Subject: [Bioperl-l] genbank mirror In-Reply-To: <4523233B.7030505@infotech.monash.edu.au> Message-ID: Torsten and Carlo, Right. For some simple examples of using Bio::DB::Query::BioQuery to query a BioSQL db, take a look at Bio::DB::BioSQL::OBDA. You may also want to take a look at NCBI's eutils API; it's quite powerful but not local. Or the ENSEMBL API; people have set up their own local ENSEMBL dbs. There's an example of this API here: http://www.bioperl.org/wiki/Getting_Genomic_Sequences Brian O. On 10/3/06 10:58 PM, "Torsten Seemann" wrote: >> I'm trying to set up a local mirror of a large part of the Genbank database. >> For users to access the local database, I need to create a web-based search >> tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank >> flat files I've downloaded based on a query entered by the user. > > Have you considered bioperl-db / BioSQL? > > http://www.bioperl.org/wiki/BioPerl_db > http://lists.open-bio.org/pipermail/biosql-l/ From torsten.seemann at infotech.monash.edu.au Wed Oct 4 01:21:24 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Wed, 04 Oct 2006 15:21:24 +1000 Subject: [Bioperl-l] Clean-up of Bio::Root::IO Message-ID: <452344D4.8070908@infotech.monash.edu.au> Hi all, Now that we have Perl 5.6.1 as a minimum, the following modules are standard: File::Spec, File::Temp, File::Path. Bio::Root::IO has functions catfile(), tmpdir(), tempfile(), rmtree() which currently dispatch to the File:: version, or try to emulate it.
We don't need to emulate anymore. Jason Stajich suggested in a previous post that they should be deprecated, and that users should use the File:: functions directly. I have an uncommitted simplified version of Bio::Root::IO which does this, and "all tests pass". The functions currently (silently) dispatch directly to their native counterparts. The only tricky function is tempfile() which is *mostly* like File::Temp::tempfile(), but does some voodoo of converting (TEMPLATE=>'xxx') to the non-hash first parameter of the File:: version, so I'm hesitant to commit. It may do other magic - Hilmar? Comments? -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From gianluca.debellis at itb.cnr.it Wed Oct 4 05:25:26 2006 From: gianluca.debellis at itb.cnr.it (Gianluca De Bellis) Date: Wed, 04 Oct 2006 11:25:26 +0200 Subject: [Bioperl-l] Bioperl under WinXP Message-ID: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> I'm trying to use Bioperl under WinXP-SP2 (novice). Bioperl has just been downloaded (v 1.2.3). Even the simplest program with a single statement (use Bio::Perl;) ends in an error of the Perl interpreter with these details: AppName: perl.exe AppVer: 5.8.8.819 ModName: win32.dll ModVer: 0.0.0.0 Offset: 00003294 This comes from the Windows error reporting system. Where is the problem? Thanks in advance From epsteinj at mail.nih.gov Wed Oct 4 07:25:57 2006 From: epsteinj at mail.nih.gov (Epstein, Jonathan A (NIH/NICHD) [E]) Date: Wed, 4 Oct 2006 07:25:57 -0400 Subject: [Bioperl-l] genbank mirror References: Message-ID: <42504F69898FE546B3F0238C9BD03275532603@NIHCESMLBX7.nih.gov> There's SeqHound: http://seqhound.blueprint.org/report.html We set this up locally, and it's probably the most comprehensive free solution out there, but it's non-trivial to set up.
Also, since the Blueprint&BIND have lost most of their funding, I'm not sure how long you can count on SeqHound to remain operational (although for now it is being updated). Jonathan -----Original Message----- From: Carlo Lapid [mailto:cmlapid at up.edu.ph] Sent: Tue 10/3/2006 10:06 PM To: bioperl-l at bioperl.org Subject: [Bioperl-l] genbank mirror Hi, I'm trying to set up a local mirror of a large part of the Genbank database. For users to access the local database, I need to create a web-based search tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank flat files I've downloaded based on a query entered by the user. I'm trying to use Bioperl to create this from scratch, but I'm having a very hard time, especially since I want the user to have reasonable flexibility in customizing his search. The best that I've been able to accomplish is a search function that retrieves genbank sequence objects based on their primary IDs or accession numbers; by using the fetch method of the Bio::Index::GenBank module. But this doesn't help users who don't know the exact IDs for the sequences they want. Can anybody suggest a way to use Bioperl to search for an ordinary word or phrase, like "16S gene", which could be matched against the description field, or the entire genbank entry? (Alternatively, is there some other freely available tool or software that can do this?) I've been scouring the Bioperl documentation, but I couldn't find anything. I just need to be pointed in the right direction. What I thought was a relatively simple problem has been driving me crazy for days; if anybody has any suggestions I would really, really appreciate it. 
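As an illustration of the kind of free-text lookup asked about above, here is a minimal sketch in plain Perl that scans a GenBank flat file record by record and reports the accession of every entry whose DEFINITION contains a phrase. It is only an assumption about one possible approach, not an existing BioPerl feature: the function name search_definitions and the two toy records are invented for the example, and a linear scan like this would be slow on a large mirror. The accessions it returns could then be handed to the fetch method of Bio::Index::GenBank mentioned in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the accession of every record whose DEFINITION
# contains $phrase (case-insensitive). $fh is an open handle on a GenBank
# flat file.
sub search_definitions {
    my ($fh, $phrase) = @_;
    my @hits;
    local $/ = "//\n";    # GenBank records end with a line holding only "//"
    while (my $record = <$fh>) {
        my ($accession) = $record =~ /^ACCESSION\s+(\S+)/m or next;
        # DEFINITION may wrap onto continuation lines that start with spaces,
        # so capture up to the next line that begins with a keyword.
        my ($definition) = $record =~ /^DEFINITION\s+(.*?)\n(?=\S)/ms or next;
        $definition =~ s/\s+/ /g;    # collapse line wraps into single spaces
        push @hits, $accession if index(lc $definition, lc $phrase) >= 0;
    }
    return @hits;
}

# Two made-up toy records, for illustration only (not real GenBank entries).
my $toy = <<'END';
LOCUS       TOY00001     30 bp    DNA     linear   BCT 01-JAN-2000
DEFINITION  Hypothetical bacterium 16S ribosomal RNA gene, partial
            sequence.
ACCESSION   TOY00001
ORIGIN
        1 acgtacgtac gtacgtacgt acgtacgtac
//
LOCUS       TOY00002     30 bp    DNA     linear   BCT 01-JAN-2000
DEFINITION  Hypothetical bacterium recA gene, complete cds.
ACCESSION   TOY00002
ORIGIN
        1 ttgattgatt gattgattga ttgattgatt
//
END

# Read the toy data through an in-memory filehandle; a real script would
# open the downloaded flat file instead.
open my $fh, '<', \$toy or die "cannot open in-memory file: $!";
my @hits = search_definitions($fh, '16s ribosomal');
close $fh;
print "$_\n" for @hits;
```

For anything beyond a small collection, the matching would presumably be better done once at load time into a proper index or database than on every query.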
_______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Wed Oct 4 09:19:45 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 04 Oct 2006 14:19:45 +0100 Subject: [Bioperl-l] Bioperl under WinXP In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> References: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> Message-ID: <4523B4F1.3010305@sendu.me.uk> Gianluca De Bellis wrote: > I'm trying to use Bioperl under WinXP-SP2 (novice) > > Bioperl has been just downloaded (v 1.2.3) > > Even the simplest program with a single command (use Bio::Perl;) ends up in > an error of the Perl interpreter with these details > > AppName: perl.exe AppVer: 5.8.8.819 ModName: win32.dll > > ModVer: 0.0.0.0 Offset: 00003294 > > Coming from the windos reporting system > > Where is the problem? Hard to say. Do non-bioperl scripts work? Make sure to follow the Bioperl installation instructions carefully: http://bioperl.org/wiki/Installing_Bioperl_on_Windows And make sure to install at least version 1.4. 1.2.3 is ancient and effectively unsupported. From cjfields at uiuc.edu Wed Oct 4 10:03:34 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 4 Oct 2006 09:03:34 -0500 Subject: [Bioperl-l] Bioperl under WinXP In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> Message-ID: <000601c6e7bd$e22ad190$15327e82@pyrimidine> If you're using PPM, you can install a (much) newer version of BioPerl from here: http://www.gmod.org/ggb/ppm/ Add that as one of your repositories in PPM4 (seeing that you are using ActivePerl 5.8.8.819), then search for bioperl. The version should be 1.512. In a few weeks we'll be releasing a new developer release. A WinXP PPM is expected, as well as a bundled package to install all prerequisites. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Gianluca De Bellis > Sent: Wednesday, October 04, 2006 4:25 AM > To: bioperl-l at bioperl.org > Subject: [Bioperl-l] Bioperl under WinXP > > I'm trying to use Bioperl under WinXP-SP2 (novice) > > Bioperl has been just downloaded (v 1.2.3) > > Even the simplest program with a single command (use Bio::Perl;) ends up > in > an error of the Perl interpreter with these details > > AppName: perl.exe AppVer: 5.8.8.819 ModName: win32.dll > > ModVer: 0.0.0.0 Offset: 00003294 > > Coming from the windos reporting system > > Where is the problem? > > > > Thanks in advance > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at gmx.net Wed Oct 4 10:25:23 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 4 Oct 2006 10:25:23 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <002101c6e6f8$67b4ae10$15327e82@pyrimidine> References: <002101c6e6f8$67b4ae10$15327e82@pyrimidine> Message-ID: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net> On Oct 3, 2006, at 10:29 AM, Chris Fields wrote: > The constructor in Bio::Root::RootI lets one know that its use is > deprecated, so you shouldn't have any cases of 'our qw > (Bio::Root::RootI)'; Don't confuse the constructor with the inheritance tree. Interface classes should never be instantiated, hence the constructor, consistent with the documentation, should never get executed. > there should be some way of inheriting Root directly or > indirectly. I would > say that any direct use of RootI is not good practice, though. I don't know what you mean by 'directly' or 'indirectly' but inheritance from interfaces, and interfaces extending (inheriting from) other interfaces, is certainly standard practice. 
I'm not sure at all why it would be a bad one. > For the current implementation we should only inherit > Bio::Root::Root, which > implements RootI. For the implementation classes, yes. For the interface classes, no. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Wed Oct 4 10:43:54 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 4 Oct 2006 10:43:54 -0400 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <452344D4.8070908@infotech.monash.edu.au> References: <452344D4.8070908@infotech.monash.edu.au> Message-ID: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> On Oct 4, 2006, at 1:21 AM, Torsten Seemann wrote: > Bio::Root::IO has functions catfile(), tmpdir(), tmpfile(), rmtree() > which currently dispatch to the File:: version, or try to emulate > it. We > don't need to emulate anymore. Jason Stajich suggested in a previous > post that they should be deprecated, and that users should use > directly > the File:: functions themselves. I don't think there's a need to deprecate - if the methods just plain delegate to whatever File:: module is appropriate their implementation (supposedly) will become very simple and hence won't pose a maintenance burden anymore. One can still recommend for all new scripts or modules or code written to use the File:: modules directly, just I'm not sure there's a need to tell users that they should start changing their existing stuff. > > I have an uncommitted simplified version of Bio::Root::IO which does > this, and "all tests pass". The functions currently (silently) > dispatch > directly to their native counterparts. > > The only tricky function is tempfile() which is *mostly* like > File::Temp::tempfile(), but does some voodoo of converting > (TEMPLATE=>'xxx') to the non-hash first parameter of the File:: > version, > so I'm hesitant to commit. 
It may do other magic - Hilmar? Not that I would know of. If the tests pass (without having to change them!) I'd give it a try. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Oct 4 11:35:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 4 Oct 2006 10:35:16 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net> Message-ID: <001901c6e7ca$b12fd5b0$15327e82@pyrimidine> ... > Don't confuse the constructor with the inheritance tree. > > Interface classes should never be instantiated, hence the > constructor, consistent with the documentation, should never get > executed. I know that interfaces shouldn't be instantiated. I had noticed there are cases of 'our qw (Bio::Root::RootI)' where it is completely acceptable to inherit the interface. Makes sense to me now. > > there should be some way of inheriting Root directly or > > indirectly. I would > > say that any direct use of RootI is not good practice, though. > > I don't know what you mean by 'directly' or 'indirectly' but > inheritance from interfaces, and interfaces extending (inheriting > from) other interfaces, is certainly standard practice. I'm not sure > at all why it would be a bad one. I was talking specifically about inheriting RootI, and not about all Bioperl interfaces in general. I completely understand the use of interface/implementation in Bioperl. However, I missed one small fact until yesterday (of course AFTER I posted my reply), which was that interfaces may inherit RootI directly. My oops. I had understood that, in general, any Bioperl implementation should not inherit the RootI interface directly (they should inherit Root, since that implements RootI). The 'constructor' present in RootI is essentially to make sure that no one inherits from the wrong class.
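The interface/implementation split discussed above can be sketched with a few toy packages. This is only an illustration under stated assumptions: every Toy:: name is hypothetical and none of this is BioPerl code, and 'use parent -norequire' is used simply because all the packages live in one file. The point it tries to show is that an interface inherits the root interface, an implementation inherits the root implementation, and the implementation still ends up being a RootI through the inheritance tree.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# All package names below are hypothetical stand-ins, not BioPerl classes.

package Toy::RootI;    # root interface: shared behaviour, no object layout
use Carp ();
sub throw_not_implemented {
    my $self   = shift;
    my $method = (caller(1))[3] || 'unknown method';
    Carp::croak("Abstract method $method not implemented by "
                . (ref($self) || $self));
}

package Toy::Root;     # root implementation: commits to a hash-ref object
use parent -norequire, 'Toy::RootI';
sub new { my $class = shift; return bless {@_}, $class }

package Toy::SeqI;     # an interface inherits the root *interface* only
use parent -norequire, 'Toy::RootI';
sub seq_length { $_[0]->throw_not_implemented }

package Toy::Seq;      # an implementation inherits the root *implementation*
use parent -norequire, 'Toy::Root', 'Toy::SeqI';
sub seq_length { length $_[0]->{seq} }

package Toy::Unfinished;    # implementation that forgot to override seq_length
use parent -norequire, 'Toy::Root', 'Toy::SeqI';

package main;

my $seq = Toy::Seq->new(seq => 'ACGT');
print "length: ", $seq->seq_length, "\n";
print "Toy::Seq is a Toy::RootI via the tree\n" if $seq->isa('Toy::RootI');
print "Toy::SeqI never touches Toy::Root\n" unless Toy::SeqI->isa('Toy::Root');

# Calling the inherited abstract method dies with a descriptive message.
my $ok = eval { Toy::Unfinished->new->seq_length; 1 };
print "unfinished class throws: $@" unless $ok;
```

Because Toy::SeqI never inherits Toy::Root, it stays free of any decision about how objects are stored, which is the motivation given earlier in the thread for keeping interfaces on RootI.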
Probably a bad use of the terms 'direct' and 'indirect', so maybe I didn't get that across very well. What I meant was that all classes inherit Root in some way, either 'directly' (as the direct parent class) or 'indirectly' (through the inheritance tree). Probably comes from being primarily a molecular microbiologist and not a computer scientist. OT, but it would be nice to have an updated class diagram to sort out the inheritance hierarchy a bit easier. In the meantime, the Deobfuscator does help quite a bit. > > For the current implementation we should only inherit > > Bio::Root::Root, which > > implements RootI. > > For the implementation classes, yes. For the interface classes, no. I agree (see above). That's the one small bit about interfaces I missed along the way. Makes sense; they use throw_not_implemented(), which is a RootI method. > -hilmar Chris From pmiguel at purdue.edu Wed Oct 4 15:38:51 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Wed, 04 Oct 2006 15:38:51 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <45240DCB.2080204@purdue.edu> Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. > > In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > Make sure you're in the AUTHORS file in all 4 packages, as > appropriate. > > Users: > Even though 1.5.2 is a 'developer' release, we consider it the most > stable and capable version of Bioperl, and recommend that you use > it in all but the most critical production environments. Please > try it out and let us know of any problems or difficulties you run > into. > > > Thank you, > Sendu. > I didn't see any tests done under solaris, so I asked our sys admin to do the install on one of our machines. 
Just another data point: He installed this release candidate on a Sun E450 box running solaris. uname -a gives: SunOS descartes 5.10 Generic_118833-18 sun4u sparc SUNW,Ultra-4 perl -v gives: This is perl, v5.8.8 built for sun4-solaris (etc.) $ time make test PERL_DL_NONLAZY=1 /usr/local/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t t/AAChange...................ok t/AAReverseMutate............ok t/abi........................Bio::SeqIO::staden::read from bioperl-ext is not installed or is installed incorrectly - skipping abi.t tests t/abi........................ok t/ace........................ok t/AlignIO....................ok t/AlignStats.................ok t/AlignUtil..................ok t/alignUtilities.............ok t/Allele.....................ok t/Alphabet...................ok t/Annotation.................ok t/AnnotationAdaptor..........ok t/asciitree..................ok t/Assembly...................ok 1/19 skipped: t/Biblio.....................ok t/Biblio_biofetch............ok t/Biblio_eutils..............ok t/BiblioReferences...........ok t/BioDBGFF...................ok t/BioDBSeqFeature............ok 1/46Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. t/BioDBSeqFeature............ok t/BioDBSeqFeature_BDB........ok t/BioDBSeqFeature_mysql......ok 3/46prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) 
statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT sequence,offset FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname= ? AND offset >= ? AND offset <= ? ORDER BY offset ) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT sequence,offset FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname= ? AND offset >= ? AND offset <= ? ORDER BY offset ) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 t/BioDBSeqFeature_mysql......ok t/BioFetch_DB................ok t/BioGraphics................ok t/BlastIndex.................ok 1/13 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BlastIndex.................ok t/BPbl2seq................... 
-------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPbl2seq...................ok 1/108 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPbl2seq...................ok t/BPlite.....................ok 1/97 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPlite.....................ok 52/97 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPlite.....................ok 
88/97 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead STACK Bio::Tools::BPlite::new /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/Tools/BPlite.pm:197 STACK toplevel t/BPlite.t:127 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPlite.....................ok t/BPpsilite.................. -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPpsilite..................ok 4/11 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPpsilite..................ok t/bsml_sax...................ok t/Chain......................ok t/chaosxml...................ok t/cigarstring................ok t/ClusterIO..................ok t/Coalescent.................ok t/CodonTable.................ok t/Compatible.................ok t/consed.....................ok t/CoordinateGraph............ok t/CoordinateMapper...........ok t/Correlate..................ok t/ctf........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ctf.t tests t/ctf........................ok t/CytoMap....................ok t/DB.........................skipped all skipped: Skipping all tests since they require network access, set BIOPERLDEBUG=1 to test t/DBCUTG.....................ok 11/34 skipped: Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test t/DBFasta....................ok t/DNAMutation................ok t/Domcut.....................ok t/ECnumber...................ok t/ELM........................ok 1/13 -------------------- WARNING 
--------------------- MSG: sleeping for 1 seconds --------------------------------------------------- t/ELM........................ok t/embl.......................ok t/EMBL_DB....................ok t/EMBOSS_Tools...............ok t/EncodedSeq.................ok t/entrezgene.................ok 491/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok 695/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok 723/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok 824/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok t/ePCR.......................ok t/ESEfinder..................ok 1/15# Looks like you planned 15 tests but only ran 14. t/ESEfinder..................dubious Test returned status 255 (wstat 65280, 0xff00) DIED. 
FAILED test 15 Failed 1/15 tests, 93.33% okay (less 9 skipped tests: 5 okay, 33.33%) t/est2genome.................ok t/EUtilities.................skipped all skipped: Set BIOPERLDEBUG=1 to run tests t/Exception..................ok t/Exonerate..................ok t/exp........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping exp.t tests t/exp........................ok t/fasta......................ok t/FeatureIO..................ok 7/33 -------------------- WARNING --------------------- MSG: '##feature-ontology' directive handling not yet implemented --------------------------------------------------- -------------------- WARNING --------------------- MSG: '##attribute-ontology' directive handling not yet implemented --------------------------------------------------- -------------------- WARNING --------------------- MSG: '##source-ontology' directive handling not yet implemented --------------------------------------------------- t/FeatureIO..................ok t/flat.......................ok t/FootPrinter................ok t/game.......................ok t/GbrowseGFF.................ok t/gcg........................ok t/GDB........................ok t/Gel........................ok t/genbank....................ok t/GeneCoordinateMapper.......ok t/Geneid.....................ok t/Genewise...................ok 2/51 skipped: t/Genomewise.................ok t/Genpred....................ok t/GFF........................ok t/GOR4.......................ok t/GOterm.....................ok t/GraphAdaptor...............ok t/GuessSeqFormat.............ok t/hmmer......................ok t/hmmer_pull.................ok t/HNN........................ok t/HtSNP......................ok t/Index......................ok t/InstanceSite...............ok t/interpro...................ok t/InterProParser.............ok t/IUPAC......................ok t/kegg.......................ok t/largefasta.................ok 
t/LargeLocatableSeq..........ok t/largepseq..................ok t/lasergene..................ok t/LinkageMap.................ok t/LiveSeq....................ok t/LocatableSeq...............ok t/Location...................ok t/LocationFactory............ok t/LocusLink..................ok t/lucy.......................ok t/Map........................ok t/MapIO......................ok t/masta......................ok t/Matrix.....................ok t/Measure....................ok t/MeSH.......................ok t/metafasta..................ok t/MetaSeq....................ok t/MicrosatelliteMarker.......ok t/MiniMIMentry...............ok t/MitoProt...................ok t/Molphy.....................ok t/MultiFile..................ok t/multiple_fasta.............ok t/Mutation...................ok t/Mutator....................ok t/NetPhos....................ok 10/14 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test t/Node.......................ok t/obo_parser.................ok t/OddCodes...................ok t/OMIMentry..................ok t/OMIMentryAllelicVariant....ok t/OMIMparser.................ok t/Ontology...................ok t/OntologyEngine.............ok t/OntologyStore..............ok t/PAML.......................ok t/Perl.......................ok t/phd........................ok t/Phenotype..................ok t/PhylipDist.................ok t/PhysicalMap................ok t/pICalculator...............ok t/Pictogram..................ok t/pir........................ok t/pln........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping pln.t tests t/pln........................ok t/PopGen.....................ok 2/89 skipped: t/PopGenSims.................ok t/primaryqual................ok t/PrimarySeq.................ok t/primedseq..................ok t/Primer.....................ok t/primer3....................ok t/Promoterwise...............ok t/ProtDist...................ok 
t/protgraph..................ok t/ProtMatrix.................ok t/ProtPsm....................ok t/Pseudowise.................ok t/psm........................ok t/QRNA.......................ok t/qual.......................ok t/RandDistFunctions..........ok t/RandomTreeFactory..........ok t/Range......................ok t/RangeI.....................ok t/raw........................ok t/RefSeq.....................ok t/Registry...................ok t/Relationship...............ok t/RelationshipType...........ok t/RemoteBlast................ok 11/13 skipped: to avoid timeout t/RepeatMasker...............ok t/RestrictionAnalysis........ok t/RestrictionEnzyme..........ok 1/14 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::RestrictionEnzyme is deprecatedUse Bio::Restriction classes instead --------------------------------------------------- t/RestrictionEnzyme..........ok t/RestrictionIO..............ok t/RNAChange..................ok t/rnamotif...................ok t/RootI......................ok t/RootIO.....................ok 2/27 skipped: various reasons t/RootStorable...............ok t/Scansite...................ok t/scf........................ok t/SearchDist.................ok t/SearchIO...................ok t/Seg........................ok t/Seq........................ok t/seq_quality................ok t/SeqAnalysisParser..........ok t/SeqBuilder.................ok t/SeqDiff....................ok t/SeqFeatCollection..........ok t/SeqFeature.................ok t/seqfeaturePrimer...........ok t/SeqHound_DB................ok 4/14Writing into 'shoundlog' log file. 
t/SeqHound_DB................ok t/SeqIO......................ok t/SeqPattern.................ok t/seqread_fail...............ok t/SeqStats...................ok t/SequenceFamily.............ok t/sequencetrace..............ok t/SeqUtils...................ok t/SeqVersion.................ok t/seqwithquality.............ok t/SeqWords...................ok t/Sigcleave..................ok t/Signalp....................ok t/Sim4.......................ok t/SimilarityPair.............ok t/SimpleAlign................ok t/simpleGOparser.............ok t/singlet....................ok t/sirna......................ok t/SiteMatrix.................ok t/SNP........................ok t/Sopma......................ok t/Species....................ok 5/20 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test t/Spidey.....................ok t/splicedseq.................ok t/StandAloneBlast............ok t/StructIO...................ok t/Structure..................ok t/swiss......................ok t/Symbol.....................ok t/tab........................ok t/table......................ok t/TagHaplotype...............ok t/Taxonomy...................ok 44/98 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test t/TaxonTree..................ok t/Tempfile...................ok t/Term.......................ok t/tigrxml....................ok t/tinyseq....................ok t/Tmhmm......................ok t/Tools......................ok t/Tree.......................ok t/TreeBuild..................ok t/TreeIO.....................ok t/trim.......................ok t/tRNAscanSE.................ok t/UCSCParsers................ok t/Unflattener................ok t/Unflattener2...............ok t/UniGene....................ok t/Variation_IO...............ok t/WABA.......................ok t/XEMBL_DB...................ok 1/9 skipped: server may be down t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is 
installed incorrectly - skipping ztr.t tests
t/ztr........................ok
Failed Test    Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/ESEfinder.t   255 65280    15    2  13.33%  15
2 tests and 98 subtests skipped.
Failed 1/240 test scripts, 99.58% okay. 1/11910 subtests failed, 99.99% okay.
*** Error code 29
make: Fatal error: Command failed for target `test_dynamic'

real 13m10.064s
user 11m14.891s
sys 0m45.417s

$ TEST_VERBOSE=1 perl t/ESEfinder.t
1..15
ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder;
ok 2 - use Data::Dumper;
ok 3 - use Bio::PrimarySeq;
ok 4 - use Bio::Seq;
ok 5
ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
# Looks like you planned 15 tests but only ran 14.

From bix at sendu.me.uk Thu Oct 5 03:19:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 08:19:39 +0100
Subject: [Bioperl-l] EUtilities term handling
Message-ID: <4524B20B.5010703@sendu.me.uk>

This is actually a general question and not limited to EUtilities. As I see it EUtilities lets you do queries in Bioperl that you can do on a website. The question is, should a Bioperl module always work with queries that the website it is a front-end to works with?
So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is essentially a frontend onto: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term= With a web-browser you can complete that url by supplying a term. For example, the term 'BRCA2+9606[taxid]' works and returns results: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid] If you supply the exact same term to EUtilities::esearch like so: my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => "gene", -term "BRCA2+9606[taxid]"); The search fails. From my 'user' perspective this is highly unexpected. Chris (the author) and I both understand /why/ it fails, but Chris doesn't think it is a bug, or at least something than can/should be changed. What do other people think? At the very least, if something unexpected happens, I'd suggest making a note of it in the POD somewhere. Eg. "Do not use + in term strings, even though they might work on the website". Chris: what is the disadvantage of always submitting '+' as '+' to the server? From bix at sendu.me.uk Thu Oct 5 03:24:45 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 08:24:45 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4524B20B.5010703@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> Message-ID: <4524B33D.9070607@sendu.me.uk> Sendu Bala wrote: > > With a web-browser you can complete that url by supplying a term. For > example, the term 'BRCA2+9606[taxid]' works and returns results: > > http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid] > > > If you supply the exact same term to EUtilities::esearch like so: > > my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => > "gene", -term "BRCA2+9606[taxid]"); *cough* my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => "gene", -term => "BRCA2+9606[taxid]"); > The search fails. 
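
A minimal sketch of what happens to the term above, assuming (as is explained later in this thread) that the term is form-encoded before it is sent. The helper `form_encode` is hypothetical and hand-rolled for illustration; it is not the module's actual code, which is said to rely on URI.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::DB::EUtilities): form-encode one
# query value, so that reserved characters become %XX escapes and a space
# becomes '+'.
sub form_encode {
    my ($value) = @_;
    # Escape everything except unreserved characters and the space.
    $value =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord($1))/ge;
    # Only now turn spaces into '+', so a literal '+' stays distinguishable.
    $value =~ s/ /+/g;
    return $value;
}

my $base = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi'
         . '?retmode=xml&db=gene&term=';

# Compare the term typed with a '+' against the same term typed with a space.
for my $term ('BRCA2+9606[taxid]', 'BRCA2 9606[taxid]') {
    print "$term\n  -> $base", form_encode($term), "\n";
}
```

Under this assumption the '+' typed by the user goes out as %2B, a literal plus sign, whereas the space goes out as '+', which is the form the browser URL above uses.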
From m.weimer at dkfz-heidelberg.de Thu Oct 5 08:15:53 2006 From: m.weimer at dkfz-heidelberg.de (Marc Weimer) Date: Thu, 05 Oct 2006 14:15:53 +0200 Subject: [Bioperl-l] Bio::DB::SwissProt Error Message-ID: <1160050554.18691.11.camel@localhost> When running -------------------------------------------------------------- #! /usr/bin/perl -w use strict; use Bio::DB::SwissProt; my $db_obj = new Bio::DB::SwissProt(-verbose=>1); my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); ------------------------------------------------------------- using Bioperl 1.4-1 I get the error message --------------------------------------------------------------------------------- request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch Content-Length: 45 Content-Type: application/x-www-form-urlencoded format=swissprot&db=swall&style=raw&id=P43780 ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: swissprot stream with no ID. Not swissprot in my book STACK: Error::throw STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328 STACK Bio::SeqIO::swiss::next_seq /usr/share/perl5/Bio/SeqIO/swiss.pm:179 STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/share/perl5/Bio/DB/WebDBSeqI.pm:187 STACK: ./putativeGele.pl:8 ----------------------------------------------------------- -------------------------------------------------------------------------------- Any suggestions? Thanks, Marc From bix at sendu.me.uk Thu Oct 5 09:21:23 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 14:21:23 +0100 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <1160050554.18691.11.camel@localhost> References: <1160050554.18691.11.camel@localhost> Message-ID: <452506D3.5050501@sendu.me.uk> Marc Weimer wrote: [snip] > my $db_obj = new Bio::DB::SwissProt(-verbose=>1); > > my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); [snip] > using Bioperl 1.4-1 I get the error message [snip] > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: swissprot stream with no ID. 
Not swissprot in my book [snip] > Any suggestions? It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most recent official release), but 1.5.2 does (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS (http://bioperl.org/wiki/Getting_BioPerl#CVS). From m.weimer at dkfz-heidelberg.de Thu Oct 5 09:35:06 2006 From: m.weimer at dkfz-heidelberg.de (Marc Weimer) Date: Thu, 05 Oct 2006 15:35:06 +0200 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <452506D3.5050501@sendu.me.uk> References: <1160050554.18691.11.camel@localhost> <452506D3.5050501@sendu.me.uk> Message-ID: <1160055306.18691.14.camel@localhost> Works fine with 1.5.2 Thanks, Marc > Marc Weimer wrote: > [snip] > > my $db_obj = new Bio::DB::SwissProt(-verbose=>1); > > > > my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); > [snip] > > using Bioperl 1.4-1 I get the error message > [snip] > > ------------- EXCEPTION: Bio::Root::Exception ------------- > > MSG: swissprot stream with no ID. Not swissprot in my book > [snip] > > Any suggestions? > > It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most > recent official release), but 1.5.2 does > (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS > (http://bioperl.org/wiki/Getting_BioPerl#CVS). -- ######################################## Dr. Marc Weimer German Cancer Research Center Central Unit Biostatistics Im Neuenheimer Feld 280 D-69120 Heidelberg Phone: +49 (0) 6221/42-2387 Fax: +49 (0) 6221/42-2397 ######################################## From hlapp at gmx.net Thu Oct 5 09:55:58 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 09:55:58 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4524B20B.5010703@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> Message-ID: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote: > This is actually a general question and not limited to EUtilities. 
> As I > see it EUtiltiies lets you do queries in Bioperl that you can do on a > website. The question is, should a Bioperl module always work with > queries that the website it is a front-end to works with? I think yes, but stick to this definition. Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez website it will actually not work. Hence, it should be no surprise that it doesn't work either using Bio::DB::EUtilities. The URL you are using to make your point is much more an example for using a web-service (SOAP, REST, or not) than it is for using a website. Using the web-service URL with a space in place of the '+' works, but yields a different result (just searches for BRCA2), so if tested for correct result the test fails. I.e., you don't expect an input form on a website to accept URL- encoded input. Instead, you expect it to do any URL-encoding for you that needs to be done. Conversely, if you are using a URL to retrieve stuff using e.g. wget or curl, it is clear that you will need to do URL encoding yourself unless there is a command line option that lets you instruct the querying program to do so. I would be careful with mangling the two definitions into one, resulting in a module that needs to serve two masters. You could consider providing an option though that lets you turn off the URL encoding on demand. Aside from that, one of the advantages of having the service wrapped in Bioperl is in fact that you can have it accept a wider variety of parameters that the actual service would allow you to have, e.g., arrays, hashes, or whatever seems appropriate. My $0.02. 
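
The two ideas above (an on-demand switch for URL encoding, and accepting richer parameters such as arrays) could be sketched roughly as follows. Everything here is an assumption for illustration: `build_term`, `-terms` and `-no_encode` are hypothetical names and are not options of Bio::DB::EUtilities.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper; the -terms and -no_encode names are illustrative only.
sub build_term {
    my (%args) = @_;
    my $terms = $args{-terms};

    # An array reference is joined with an explicit boolean AND;
    # a plain string is used as given.
    my $term = ref($terms) eq 'ARRAY' ? join(' AND ', @$terms) : $terms;

    # The caller says the term is already URL-encoded: pass it through.
    return $term if $args{-no_encode};

    # Otherwise form-encode: escape reserved characters, then space -> '+'.
    $term =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord($1))/ge;
    $term =~ s/ /+/g;
    return $term;
}

# Terms supplied as a list; the helper decides how to AND them together.
print build_term(-terms => ['BRCA2', '9606[taxid]']), "\n";

# A ready-made query string from someone who has read the service docs.
print build_term(-terms => 'BRCA2+9606[taxid]', -no_encode => 1), "\n";
```

The point of the design is that the default path never asks the user to think about encoding, while the pass-through path stays available for those who want to send an exact URL fragment.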
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Thu Oct 5 10:08:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 15:08:01 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> Message-ID: <452511C1.5020709@sendu.me.uk> Hilmar Lapp wrote: > > On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote: > >> This is actually a general question and not limited to EUtilities. As I >> see it EUtiltiies lets you do queries in Bioperl that you can do on a >> website. The question is, should a Bioperl module always work with >> queries that the website it is a front-end to works with? > > I think yes, but stick to this definition. > > Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez > website it will actually not work. Hence, it should be no surprise that > it doesn't work either using Bio::DB::EUtilities. On the contrary, I find it a surprise because EUtilities is an interface to NCBI's eutils, not the entrez website. If I had previously read instructions on using eutils: http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls I might (do) expect that I /should/ use + in my term. > Aside from that, one of the advantages of having the service wrapped in > Bioperl is in fact that you can have it accept a wider variety of > parameters that the actual service would allow you to have, e.g., > arrays, hashes, or whatever seems appropriate. I was going to suggest that terms be supplied as an array, leaving Bioperl code to decide how to 'AND' all the terms (elements in the array) together. 
It would also further force the user not to think of how eutils normally works, but to only consider the Bioperl instructions on how to form a query. But I'm not sure of the value of all that.

From cjfields at uiuc.edu Thu Oct 5 10:06:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:06:50 -0500
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <452506D3.5050501@sendu.me.uk>
References: <1160050554.18691.11.camel@localhost> <452506D3.5050501@sendu.me.uk>
Message-ID: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>

On Oct 5, 2006, at 8:21 AM, Sendu Bala wrote:

> Marc Weimer wrote:
> [snip]
>> my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
>>
>> my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
> [snip]
>> using Bioperl 1.4-1 I get the error message
> [snip]
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: swissprot stream with no ID. Not swissprot in my book
> [snip]
>> Any suggestions?
>
> It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most
> recent official release), but 1.5.2 does
> (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS
> (http://bioperl.org/wiki/Getting_BioPerl#CVS).

Marc, you'll have to update to 1.5.2 or CVS, as Sendu suggested. There were server changes for biofetch which were fixed about 4-6 months ago (post rel. 1.5.1); I think several changes were made to Bio::SeqIO::swiss as well during this period.

I think the error here results from Bio::SeqIO::swiss trying to parse an empty byte stream. Sendu, do you think that Bio::SeqIO::swiss (and other SeqIO parsers) should throw a more specific message for getting an empty byte stream? Or is it more trouble than it's worth?

Christopher Fields Postdoctoral Researcher Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Thu Oct 5 10:14:40 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 15:14:40 +0100 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu> References: <1160050554.18691.11.camel@localhost> <452506D3.5050501@sendu.me.uk> <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu> Message-ID: <45251350.5030608@sendu.me.uk> Chris Fields wrote: > >>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>> MSG: swissprot stream with no ID. Not swissprot in my book [snip] > I think the error here results from Bio::SeqIO::swiss trying to parse an > empty byte stream. Sendu, do you think that Bio::SeqIO::swiss (and > other SeqIO parsers) should throw a more specific message for getting an > empty byte stream? Or is it more trouble than it's worth? Trouble wise, I've no idea without looking into it. Generally speaking though I can say that the error message is pretty useless and I'm always in favour of better error messages. From hlapp at gmx.net Thu Oct 5 10:21:49 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 10:21:49 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <452511C1.5020709@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> Message-ID: On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote: >> >> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote: >> >>> This is actually a general question and not limited to >>> EUtilities. As I >>> see it EUtiltiies lets you do queries in Bioperl that you can do >>> on a >>> website. The question is, should a Bioperl module always work with >>> queries that the website it is a front-end to works with? >> >> I think yes, but stick to this definition. >> >> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez >> website it will actually not work. 
Hence, it should be no surprise >> that >> it doesn't work either using Bio::DB::EUtilities. > > On the contrary, I find it a surprise because EUtilities is an > interface > to NCBI's eutils, not the entrez website. > > If I had previously read instructions on using eutils: > http://www.ncbi.nlm.nih.gov/books/bv.fcgi? > rid=coursework.section.constructing-urls > I might (do) expect that I /should/ use + in my term. This is my point - stick to your definitions. Are you wrapping a query form on a website or are you wrapping a web service (i.e., a URL)? The examples you give are about wrapping a web-service. Your original question was about wrapping a website. Yet another question is what the author of Bio::DB::EUtilities intended to wrap. The other thing to consider is user-friendliness. If you are wrapping a web-service, do you still make not URL-encoding the user input the default? What will 90% of the users probably want or expect to be able to do? URL-encode all input themselves or expect the module to do this for them unless they turn it off? As far as I'm concerned, I'll happily count myself among those who are lazy and ignorant, don't read NCBI's documentation, don't want to know how to URL encode and why this needs to be done, but just want it to work. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Thu Oct 5 10:31:06 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 09:31:06 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4524B20B.5010703@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> Message-ID: On Oct 5, 2006, at 2:19 AM, Sendu Bala wrote: > This is actually a general question and not limited to EUtilities. > As I > see it EUtiltiies lets you do queries in Bioperl that you can do on a > website. 
> The question is, should a Bioperl module always work with
> queries that the website it is a front-end to works with?
>
> So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is
> essentially a frontend onto:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=
>
> With a web-browser you can complete that url by supplying a term. For
> example, the term 'BRCA2+9606[taxid]' works and returns results:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid]
>
> If you supply the exact same term to EUtilities::esearch like so:
>
> my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
> "gene", -term "BRCA2+9606[taxid]");
>
> The search fails. From my 'user' perspective this is highly unexpected.
> Chris (the author) and I both understand /why/ it fails, but Chris
> doesn't think it is a bug, or at least something than can/should be
> changed. What do other people think? At the very least, if something
> unexpected happens, I'd suggest making a note of it in the POD
> somewhere. Eg. "Do not use + in term strings, even though they might
> work on the website".
>
> Chris: what is the disadvantage of always submitting '+' as '+' to the
> server?

A few reasons:

1) According to NCBI, you can use '+' in queries, but not as a boolean. Global changes of '+' to a space may change the meaning of the query on a few rare occasions. So, if you really wanted to search for the string 'BRCA2+ATG', NCBI looks for that term literally.

2) '+' is a URI reserved symbol for a space delimiter. Therefore, any parameters containing '+' are URI-encoded into %2B, which is decoded on NCBI's end back to '+' (this is demonstrable with current EUtilities output and the returned XML data).

3) Why not just use a space (implicit AND)? Or an explicit boolean? Or '&' (which apparently works but is not specified in the NCBI Entrez docs)?

The bug is in the query and not in the code, i.e. it is a user-generated bug, not an EUtilities bug. And it shouldn't be unexpected, as NCBI has very specific rules for building queries for Entrez (just like any other database). If I were to use nonstandard queries for MySQL, BioFetch, UCSC, or anything else, I would expect to get bad results. As the old saying goes, garbage in, garbage out.

The following link has their updated rules:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.chapter.EntrezHelp

Here is their old one:

http://www.ncbi.nlm.nih.gov/entrez/query/static/help/helpdoc.html

We could, of course, put something in POD, but you never presented that option to me before. I'll grant that the EUtilities API needs some cleaning up, which is not easy to do when the returned data varies from utility to utility. But it does get the URL encoding correct, at least in this case.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From bix at sendu.me.uk Thu Oct 5 10:32:49 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:32:49 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: 
References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk>
Message-ID: <45251791.9040409@sendu.me.uk>

Hilmar Lapp wrote:
>
> On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote:
>
>> On the contrary, I find it a surprise because EUtilities is an interface
>> to NCBI's eutils, not the entrez website.
>>
>> If I had previously read instructions on using eutils:
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
>>
>> I might (do) expect that I /should/ use + in my term.
>
> This is my point - stick to your definitions. Are you wrapping a query
> form on a website or are you wrapping a web service (i.e., a URL)?
>
> The examples you give are about wrapping a web-service. Your original
> question was about wrapping a website.

Right...
I don't see that that changes the answer to my question though does it? "The question is, should a Bioperl module always work with queries that the web-service it is a front-end to works with?" For me, the answer is still yes. > As far as I'm concerned, I'll happily count myself among those who are > lazy and ignorant, don't read NCBI's documentation, don't want to know > how to URL encode and why this needs to be done, but just want it to work. That's a reasonable attitude to take. Which comes back to the question I asked of Chris - naively, if you send + as + you can please everyone, can't you? Both people who have read the docs on the web-service and those who haven't? Or are there real queries in which a user may want to search for a phrase with a literal + in it (and where such a search works via eutils)? From bix at sendu.me.uk Thu Oct 5 10:44:33 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 15:44:33 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> Message-ID: <45251A51.6020802@sendu.me.uk> Chris Fields wrote: > The bug is in the query and not in the code, i.e. is is a > user-generated bug, not an EUtilities bug. And it shouldn't be > unexpected, as NCBI has very specific rules for building queries for > Entrez (just like any other database). So I guess this comes down to something Hilmar mentioned and I never even considered before. You consider your EUtilities stuff as a frontend to entrez, and therefore consider valid queries as queries that are valid for entrez and not eutils? If that's the case, fine. I understand why you don't think this is a bug. Again, something that might warrant a mention in the POD. Currently the naming of the modules and the explicit references to eutils (and me knowing the implementation uses eutils) got me confused. 
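
The distinction between an eutils URL term and an Entrez term can be sketched from the receiving side. This assumes the CGI end applies ordinary form decoding, which is an assumption about the server and not something confirmed in this thread; `form_decode` is a hypothetical helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: ordinary form decoding, as a CGI program might apply
# it to an incoming query value.
sub form_decode {
    my ($value) = @_;
    $value =~ tr/+/ /;                               # '+' stands for a space
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # undo %XX escapes
    return $value;
}

# First value: the term as typed into a raw eutils URL.
# Second value: the same characters after being form-encoded by a client.
for my $sent ('BRCA2+9606[taxid]', 'BRCA2%2B9606%5Btaxid%5D') {
    printf "%-26s is read as '%s'\n", $sent, form_decode($sent);
}
```

With this decoding the raw URL form is read as two words separated by a space, while the encoded form is read as a single string containing a literal '+', so the same characters mean different things depending on which layer they are handed to.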
From cjfields at uiuc.edu Thu Oct 5 10:51:28 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 09:51:28 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <452511C1.5020709@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> Message-ID: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote: >>> This is actually a general question and not limited to >>> EUtilities. As I >>> see it EUtiltiies lets you do queries in Bioperl that you can do >>> on a >>> website. The question is, should a Bioperl module always work with >>> queries that the website it is a front-end to works with? >> >> I think yes, but stick to this definition. >> >> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez >> website it will actually not work. Hence, it should be no surprise >> that >> it doesn't work either using Bio::DB::EUtilities. > > On the contrary, I find it a surprise because EUtilities is an > interface > to NCBI's eutils, not the entrez website. It uses NCBI's CGI interface for eutils, not the SOAP interface. Very different. I have considered using the NCBI SOAP-based interface, but the web services are still somewhat incomplete, unlike the CGI interface. > If I had previously read instructions on using eutils: > http://www.ncbi.nlm.nih.gov/books/bv.fcgi? > rid=coursework.section.constructing-urls > I might (do) expect that I /should/ use + in my term. You are looking at part of the naked URL on that page. Here's what that page says: "When constructing URLs for the eUtils, please use lowercase characters for all parameters except &WebEnv. There is no required order for the URL parameters in an eUtils URL, and null values or inappropriate parameters are ignored. Avoid placing spaces in the URLs, particularly in queries. 
If a space is required, use a plus sign (+) instead of a space:

* Incorrect: &id=352, 25125, 234, ...
* Correct: &id=352,25125,234,...
* Incorrect: &term=biomol mrna[properties] AND mouse[organism]
* Correct: &term=biomol+mrna[properties]+AND+mouse[organism]

Other special characters, such as the # symbol used in referring to a query key on the History server, should be represented by their URL encodings (%23 for #)."

I use URI for building the URL with the parameters. URI specifically encodes all of this for you, so spaces convert to '+' and '+' converts to %2B.

>> Aside from that, one of the advantages of having the service wrapped in
>> Bioperl is in fact that you can have it accept a wider variety of
>> parameters that the actual service would allow you to have, e.g.,
>> arrays, hashes, or whatever seems appropriate.
>
> I was going to suggest that terms be supplied as an array, leaving
> Bioperl code to decide how to 'AND' all the terms (elements in the
> array) together. It would also further force the user not to think of
> how eutils normally works, but to only consider the Bioperl instructions
> on how to form a query. But I'm not sure of the value of all that.

Why do we need to intuit what the user is thinking at any particular time? How would I know that someone actually wanted to search using the literal string 'abc+123' as opposed to 'abc 123'?

I see value in your last suggestion but I think a class or set of classes would be best suited for that:

MySQL Query  |  in                       out  | MySQL Query
Entrez Query |-----> Generic Query class----->| Entrez Query
SRS Query    |                                | SRS Query
ad infinitum...

The generic query object could then be used in DB searches as an option besides using a raw string. Though it would get tricky with SQL's complexity...

Christopher Fields Postdoctoral Researcher Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Thu Oct 5 10:54:04 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 10:54:04 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251791.9040409@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <45251791.9040409@sendu.me.uk> Message-ID: <9916EDEE-EA3C-4C55-A004-A46F37B559BF@gmx.net> On Oct 5, 2006, at 10:32 AM, Sendu Bala wrote: >> The examples you give are about wrapping a web-service. Your >> original question was about wrapping a website. > > Right... I don't see that that changes the answer to my question > though does it? > > "The question is, should a Bioperl module always work with > queries that the web-service it is a front-end to works with?" > > For me, the answer is still yes. The answer is still yes. My point was the query that works with a website is not necessarily the query that works with a web-service, even if that web-service also powers the website. > >> As far as I'm concerned, I'll happily count myself among those who >> are lazy and ignorant, don't read NCBI's documentation, don't want >> to know how to URL encode and why this needs to be done, but just >> want it to work. > > That's a reasonable attitude to take. Which comes back to the > question I asked of Chris - naively, if you send + as + you can > please everyone, can't you? Both people who have read the docs on > the web-service and those who haven't? Or are there real queries in > which a user may want to search for a phrase with a literal + in it > (and where such a search works via eutils)? So are you suggesting to URL-encode some characters but not others? This would move you into muddy waters and I'm wondering what the gain is from that, and for whom it is a gain. 
It sounds like it will mostly benefit those who have studied the NCBI documentation and know exactly the URL they want to send and want to ignore the EUtilities POD. My humble guess is the far majority of people will either not read any documentation, or read the module's POD. Maybe a better way to serve both types of people is to accept a parameter -querystring that is expected to include everything from 'term=' onwards (including 'term=' itself) which gives you complete control and freedom if you know what you are doing, and otherwise implement what you suggested before: > I was going to suggest that terms be supplied as an array, leaving > Bioperl code to decide how to 'AND' all the terms (elements in the > array) together. It would also further force the user not to think of > how eutils normally works, but to only consider the Bioperl > instructions > on how to form a query. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Thu Oct 5 11:02:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 16:02:01 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> Message-ID: <45251E69.7040507@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote: > >> On the contrary, I find it a surprise because EUtilities is an interface >> to NCBI's eutils, not the entrez website. > > It uses NCBI's CGI interface for eutils, not the SOAP interface. Very > different. I have considered using the NCBI SOAP-based interface, but > the web services are still somewhat incomplete, unlike the CGI interface. I don't know anything about the SOAP interface. 
I'm talking about the CGI interface that you use. >> If I had previously read instructions on using eutils: >> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls >> >> I might (do) expect that I /should/ use + in my term. > > You are looking at part of the naked URL on that page. Here's what that > page says: I know what it says... > * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] The correct query is the one that has +s in it. > I use URI for building the URL with the parameters. URI specifically > encodes all of this for you, so spaces convert to '+' and '+' converts > to %2B. Well, yes. This causes what I thought of as a bug. It prevents me from submitting a /correct/ eutils term. However it isn't a bug if you explain to users they shouldn't be submitting valid eutils terms, but only valid /entrez/ terms. From cjfields at uiuc.edu Thu Oct 5 11:15:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:15:49 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251A51.6020802@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <45251A51.6020802@sendu.me.uk> Message-ID: On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote: > Chris Fields wrote: >> The bug is in the query and not in the code, i.e. is is a user- >> generated bug, not an EUtilities bug. And it shouldn't be >> unexpected, as NCBI has very specific rules for building queries >> for Entrez (just like any other database). > > So I guess this comes down to something Hilmar mentioned and I > never even considered before. You consider your EUtilities stuff as > a frontend to entrez, and therefore consider valid queries as > queries that are valid for entrez and not eutils? The eutils tools access the same databases as the web page, in the same way, using the same search terms. 
From the EUtilities docs: "The eUtils access the core search and retrieval engine of the Entrez system and, therefore, are only capable of retrieving data that are already in Entrez." > If that's the case, fine. I understand why you don't think this is > a bug. Again, something that might warrant a mention in the POD. > Currently the naming of the modules and the explicit references to > eutils (and me knowing the implementation uses eutils) got me > confused. I'll note that in there is URI encoding in POD, but that should be a no-brainer. I don't think every Bio::DB* class specifies this, mainly because it is taken for granted. Pretty much anything that builds URL strings needs to encode based on the URI standard, and any server that accepts URLs is expected to decode using the same standard. So, again, why does that have to be specifically outlined in POD? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 11:24:39 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:24:39 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251E69.7040507@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> Message-ID: >> I use URI for building the URL with the parameters. URI >> specifically encodes all of this for you, so spaces convert to '+' >> and '+' converts to %2B. > > Well, yes. This causes what I thought of as a bug. It prevents me > from submitting a /correct/ eutils term. However it isn't a bug if > you explain to users they shouldn't be submitting valid eutils > terms, but only valid /entrez/ terms. I can specify in POD that URI encoding is in effect if that placates you, and maybe add a bit about how terms are to be built (based on the website). 
I also noticed that the esearch POD doesn't have a demo in the SYNOPSIS yet (my fault). However, I think this is all a bit silly. This is something most people already realize and take for granted (it's standard for any CGI interface to use URI encoding). Also, most Entrez users do not use a term like 'BRCA2+Human [ORGANISM]'. They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human [ORGANISM]', the latter which is implicit. All of this is on the Entrez website. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From MEC at stowers-institute.org Thu Oct 5 11:12:02 2006 From: MEC at stowers-institute.org (Cook, Malcolm) Date: Thu, 5 Oct 2006 10:12:02 -0500 Subject: [Bioperl-l] using nfreeze instead of freeze in Bio::SeqFeature::Store Message-ID: Lincoln, I committed a change to Bio::SeqFeature::Store to use nfreeze instead of freeze which should allow SeqFeature objects to survive database freeze/thaw cycles across architectures. I hope I was not presumptuous or in error in doing this.... Regards, Malcolm Cook Database Applications Manager - Bioinformatics Stowers Institute for Medical Research - Kansas City, Missouri From bix at sendu.me.uk Thu Oct 5 11:28:55 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 16:28:55 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> <45251A51.6020802@sendu.me.uk> Message-ID: <452524B7.5080003@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> The bug is in the query and not in the code, i.e. is is a >>> user-generated bug, not an EUtilities bug. And it shouldn't be >>> unexpected, as NCBI has very specific rules for building queries for >>> Entrez (just like any other database). >> >> So I guess this comes down to something Hilmar mentioned and I never >> even considered before. 
You consider your EUtilities stuff as a >> frontend to entrez, and therefore consider valid queries as queries >> that are valid for entrez and not eutils? > > The eutils tools access the same databases as the web page, in the same > way, using the same search terms. It doesn't. The eutils interface behaves differently with +s than does the entrez website interface. In eutils + means space, whilst in entrez, + means the plus symbol. >> If that's the case, fine. I understand why you don't think this is a >> bug. Again, something that might warrant a mention in the POD. >> Currently the naming of the modules and the explicit references to >> eutils (and me knowing the implementation uses eutils) got me confused. > > I'll note that in there is URI encoding in POD, but that should be a > no-brainer. Just that it is URI encoded isn't the problem. The problem is the difference in behaviour outlined above. > I don't think every Bio::DB* class specifies this, mainly > because it is taken for granted. Pretty much anything that builds URL > strings needs to encode based on the URI standard, and any server that > accepts URLs is expected to decode using the same standard. > > So, again, why does that have to be specifically outlined in POD? Because they're different. If I construct a valid eutils query it might not work. You ought to explain why. "EUtilities takes any valid entrez query and transforms it into a valid eutils query for submission. 
Do not try and provide a valid eutils query of your own, or the extra transformation will result in no results" From bix at sendu.me.uk Thu Oct 5 11:30:44 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 16:30:44 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> Message-ID: <45252524.7030006@sendu.me.uk> Chris Fields wrote: >>> I use URI for building the URL with the parameters. URI specifically >>> encodes all of this for you, so spaces convert to '+' and '+' >>> converts to %2B. >> >> Well, yes. This causes what I thought of as a bug. It prevents me from >> submitting a /correct/ eutils term. However it isn't a bug if you >> explain to users they shouldn't be submitting valid eutils terms, but >> only valid /entrez/ terms. > > I can specify in POD that URI encoding is in effect if that placates > you, and maybe add a bit about how terms are to be built (based on the > website). I also noticed that the esearch POD doesn't have a demo in > the SYNOPSIS yet (my fault). > > However, I think this is all a bit silly. This is something most people > already realize and take for granted (it's standard for any CGI > interface to use URI encoding). > > Also, most Entrez users do not use a term like 'BRCA2+Human[ORGANISM]'. > They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human[ORGANISM]', the > latter which is implicit. All of this is on the Entrez website. Exactly. You're assuming an entrez user and expecting an entrez query. I don't think its silly given the name of the modules for the user to assume the code needs an eutils query, which is a different thing with different behaviour /independent/ of URI encoding. 
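
The '+' confusion in this thread can be reproduced without contacting NCBI at all. The sketch below is a hand-rolled form encoder in core Perl. Note that form_encode is a hypothetical helper written only for illustration; it is not part of Bio::DB::EUtilities or of the URI module, and it merely mimics the convention described in the thread (a space becomes '+', a literal '+' becomes %2B). It shows what happens to a term whose spaces were already turned into '+' by hand before being encoded a second time.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): form-style encoding of one
# query value, following the convention described in this thread.
sub form_encode {
    my ($value) = @_;
    # Percent-encode everything except unreserved characters and spaces,
    # so a literal '+' becomes %2B.
    $value =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord($1))/ge;
    # Only now turn the remaining spaces into '+'.
    $value =~ s/ /+/g;
    return $value;
}

my @terms = (
    'biomol mrna[properties] AND mouse[organism]',    # raw, Entrez-style term
    'biomol+mrna[properties]+AND+mouse[organism]',    # spaces already replaced by hand
);

for my $term (@terms) {
    print "$term\n    -> term=", form_encode($term), "\n";
}
```

Under this assumption the first term goes out with '+' standing for each space, while the second goes out with every '+' rewritten as %2B, so the server would see one long token containing literal plus signs rather than three space-separated words.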
From cjfields at uiuc.edu Thu Oct 5 11:50:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:50:51 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251E69.7040507@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
 <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
 <452511C1.5020709@sendu.me.uk>
 <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
 <45251E69.7040507@sendu.me.uk>
Message-ID: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>

> I know what it says...

Ah, that's the Sendu I know and love.

>
>> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>
> The correct query is the one that has +s in it.

Yes, that's because it's a URL, not a raw search term string (it has
been URI-encoded so spaces are converted to '+'). If you use that as a
direct query in Entrez you will not get the same response. You do get
something if you use the new NCBI global query form on the main page,
but clicking on the nucleotide or PMC hits reveals that the URL is
malformed and no term is present. That is exactly the same response in
EUtilities:

0 0 0

Note the QueryTranslation tag is empty.

The only noticeable difference is using egquery (which I just fixed in
CVS yesterday). The returned XML gives no hits for any database, which
is true based on individual esearch queries for those databases, and is
actually more consistent than the website version.

>> I use URI for building the URL with the parameters. URI specifically
>> encodes all of this for you, so spaces convert to '+' and '+'
>> converts
>> to %2B.
>
> Well, yes. This causes what I thought of as a bug. It prevents me from
> submitting a /correct/ eutils term. However it isn't a bug if you
> explain to users they shouldn't be submitting valid eutils terms, but
> only valid /entrez/ terms.

If you mean that most users will actually use a URL-like search term,
then I would say you have a point. But that simply isn't the case.

If clarifying the docs makes it better, then so be it.
Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 11:59:53 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:59:53 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45252524.7030006@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> Message-ID: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote: > Chris Fields wrote: >>>> I use URI for building the URL with the parameters. URI >>>> specifically encodes all of this for you, so spaces convert to >>>> '+' and '+' converts to %2B. >>> >>> Well, yes. This causes what I thought of as a bug. It prevents me >>> from submitting a /correct/ eutils term. However it isn't a bug >>> if you explain to users they shouldn't be submitting valid eutils >>> terms, but only valid /entrez/ terms. >> I can specify in POD that URI encoding is in effect if that >> placates you, and maybe add a bit about how terms are to be built >> (based on the website). I also noticed that the esearch POD >> doesn't have a demo in the SYNOPSIS yet (my fault). >> However, I think this is all a bit silly. This is something most >> people already realize and take for granted (it's standard for any >> CGI interface to use URI encoding). >> Also, most Entrez users do not use a term like 'BRCA2+Human >> [ORGANISM]'. They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human >> [ORGANISM]', the latter which is implicit. All of this is on the >> Entrez website. > > Exactly. You're assuming an entrez user and expecting an entrez > query. 
I don't think its silly given the name of the modules for > the user to assume the code needs an eutils query, which is a > different thing with different behaviour /independent/ of URI > encoding. It's a silly distinction. The POD for Bio::DB::EUtilities states: Bio::DB::EUtilities - interface for handling web queries and data retrieval from NCBI's Entrez Utilities. My question is this : why would anyone (particularly the everyday bioperl user) want to use URL-encoded parameters for a query? That seems to be your main argument here. If so, wouldn't I just paste them together then send them off NCBI eutils? Would I devote ~ 10 classes to that? I could do that in a short program using an array, join, and LWP::Simple. The purpose is quite clearly stated, but if you feel that by badgering me to add something to POD I consider common sense, then you're right. You've succeeded. Bravo. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Thu Oct 5 12:02:05 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 17:02:05 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> Message-ID: <45252C7D.3050009@sendu.me.uk> Chris Fields wrote: > >>> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] >> >> The correct query is the one that has +s in it. > > Yes, that's because it's a URL, not a raw search term string (it has > been URI-encoded so spaces are converted to '+'). If you use that as a > direct query in Entrez you will not get the same response. But we're not doing Entrez queries. 
We're using a module called EUtilities to do an eutils query, which
involves forming a url in which spaces should be converted to +. That's
the source of confusion. Is the user supposed to do this, or is
EUtilities?

All you had to do 8 emails ago is tell me that EUtilities is supposed to
do that. You /still/ haven't told me that. I give up.

From cjfields at uiuc.edu Thu Oct 5 12:12:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 11:12:11 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45252C7D.3050009@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
 <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
 <452511C1.5020709@sendu.me.uk>
 <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
 <45251E69.7040507@sendu.me.uk>
 <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
 <45252C7D.3050009@sendu.me.uk>
Message-ID:

On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>>>> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>>
>>> The correct query is the one that has +s in it.
>> Yes, that's because it's a URL, not a raw search term string (it
>> has been URI-encoded so spaces are converted to '+'). If you use
>> that as a direct query in Entrez you will not get the same response.
>
> But we're not doing Entrez queries. We're using a module called
> EUtilities to do an eutils query, which involves forming a url in
> which spaces should to be converted to +. That's the source of
> confusion. Is the user supposed to do this, or is EUtilities?
>
> All you had to do 8 emails ago is tell me that EUtilities is
> supposed to do that. You /still/ haven't told me that. I give up.

It should be apparent from the documentation and the URLs posted in
debugging output the first few times you used it. Again, why would I
dedicate ~ 10 classes to pasting together URI-encoded strings?

Christopher Fields
Postdoctoral Researcher
Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Thu Oct 5 12:22:36 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 17:22:36 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> Message-ID: <4525314C.7020205@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote: > >> Exactly. You're assuming an entrez user and expecting an entrez query. >> I don't think its silly given the name of the modules for the user to >> assume the code needs an eutils query, which is a different thing with >> different behaviour /independent/ of URI encoding. > > It's a silly distinction. The POD for Bio::DB::EUtilities states: > > Bio::DB::EUtilities - interface for handling web queries and data > retrieval from NCBI's Entrez Utilities. > > My question is this : why would anyone (particularly the everyday > bioperl user) want to use URL-encoded parameters for a query? Well I'll tell you why I was trying to use URL-encoded parameters, if that helps you any. I read the pod for EUtilities but all the examples have very simple -term s defined with just a single word. So I wonder how I'm supposed to make an 'AND' term. I also have no idea what utilities I'm supposed to use, or what databases etc. I need to get the answer I want. The POD points me here: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html Combined with the EUtilities synopsis I know I'm supposed to start with esearch so I look at: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html And figure out what my terms are supposed to be. 
Then I test some example terms in my web browser using the esearch base url (http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?) to see if they work, and copy/paste the terms into my EUtilities-using perl script, replacing variable terms with perl variables. Then I find that my terms don't work, ask you about it, and you fail to tell me I should be testing my terms at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=gene. If you think I'm stupid, fine, but I'm probably not the only stupid person on the planet. Which is why I suggested a POD addition. You don't have to make any POD change if you don't want to. I simply thought it might help avoid anyone 'badgering' you in the future with a similar problem. From bix at sendu.me.uk Thu Oct 5 12:28:51 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 17:28:51 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> <45252C7D.3050009@sendu.me.uk> Message-ID: <452532C3.9030804@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> >>>>> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] >>>> >>>> The correct query is the one that has +s in it. >>> Yes, that's because it's a URL, not a raw search term string (it has >>> been URI-encoded so spaces are converted to '+'). If you use that as >>> a direct query in Entrez you will not get the same response. >> >> But we're not doing Entrez queries. We're using a module called >> EUtilities to do an eutils query, which involves forming a url in >> which spaces should to be converted to +. That's the source of >> confusion. Is the user supposed to do this, or is EUtilities? 
>> >> All you had to do 8 emails ago is tell me that EUtilities is supposed >> to do that. You /still/ haven't told me that. I give up. > > It should be apparent from the documentation and the URLs posted in > debugging output the first few times you used it. Again, why would I > dedicate ~ 10 classes to pasting together URI-encoded strings? I'm not sure how not doing URI-encoding would suddenly make your classes worthless. I find them to be very useful (even when I didn't know there was any URI-encoding, was incorrectly using +s and it happened to work anyway). From bernd.web at gmail.com Thu Oct 5 10:09:38 2006 From: bernd.web at gmail.com (Bernd Web) Date: Thu, 5 Oct 2006 16:09:38 +0200 Subject: [Bioperl-l] Eutilities Batch Message-ID: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com> Hi, I am using the new EUtilities. It looks great. I was trying to use epost followed by elink but i get an error. The same error is actually given with the example on http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html: Can't call method "get_databases" on an undefined value at EU.pl line 25. For completeness, the code is shown below too. Any suggestions what is going wrong? 
Regards,
Bernd

# chain EUtilities for complex queries

use Bio::DB::EUtilities;

my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                       -db         => 'pubmed',
                                       -term       => 'hutP',
                                       -usehistory => 'y');

$esearch->get_response; # parse the response, fetch a cookie

my $elink = Bio::DB::EUtilities->new(-eutil  => 'elink',
                                     -db     => 'protein,taxonomy',
                                     -dbfrom => 'pubmed',
                                     -cookie => $esearch->next_cookie,
                                     -cmd    => 'neighbor');

# this retrieves the Bio::DB::EUtilities::ElinkData object

my ($linkset) = $elink->next_linkset;
my @ids;

# step through IDs for each linked database in the ElinkData object

for my $db ($linkset->get_databases) {
    @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
    # do something here
}

From cjfields at uiuc.edu Thu Oct 5 13:31:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:31:33 -0500
Subject: [Bioperl-l] Eutilities Batch
In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
Message-ID:

I'll look into it. I'm busy updating the EUtilities tools now.

Chris

On Oct 5, 2006, at 9:09 AM, Bernd Web wrote:

> Hi,
>
> I am using the new EUtilities. It looks great.
> I was trying to use epost followed by elink but i get an error. The
> same error is actually given with the example on
> http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
> Can't call method "get_databases" on an undefined value at EU.pl
> line 25.
>
> For completeness, the code is shown below too.
>
> Any suggestions what is going wrong?
>
> Regards,
> Bernd
>
> # chain EUtilities for complex queries
>
> use Bio::DB::EUtilities;
>
> my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                        -db         => 'pubmed',
>                                        -term       => 'hutP',
>                                        -usehistory => 'y');
>
> $esearch->get_response; # parse the response, fetch a cookie
>
> my $elink = Bio::DB::EUtilities->new(-eutil  => 'elink',
>                                      -db     => 'protein,taxonomy',
>                                      -dbfrom => 'pubmed',
>                                      -cookie => $esearch->next_cookie,
>                                      -cmd    => 'neighbor');
>
> # this retrieves the Bio::DB::EUtilities::ElinkData object
>
> my ($linkset) = $elink->next_linkset;
> my @ids;
>
> # step through IDs for each linked database in the ElinkData object
>
> for my $db ($linkset->get_databases) {
>     @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
>     # do something here
> }
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From daniel.lang at biologie.uni-freiburg.de Thu Oct 5 13:12:02 2006
From: daniel.lang at biologie.uni-freiburg.de (Daniel Lang)
Date: Thu, 05 Oct 2006 19:12:02 +0200
Subject: [Bioperl-l] Bio::DB::SeqFeature
Message-ID: <45253CE2.1070208@biologie.uni-freiburg.de>

Hi,

we are storing Bio::SeqFeature::Gene::GeneStructure objects (with
multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db
(latest bioperl-live checkout). The
Bio::SeqFeature::Gene::GeneStructure's are generated from scratch out of
a database.

The first observation is that it seems to work (fetched objects behave
like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we
get these warnings:

Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into
lib/auto/Storable/_freeze.al) line 287, line 1.
Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into
lib/auto/Storable/_freeze.al) line 287, line 1.
(in cleanup) Not a CODE reference at
/home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1.
prepare_cached(SELECT f.id,f.object
  FROM feature as f
  WHERE (
    f.seqid=? AND f.end>=? AND f.start<=? AND
    ((f.tier=? AND f.bin between ? AND ?) OR
     (f.tier=? AND f.bin between ? AND ?) OR
     (f.tier=? AND f.bin between ? AND ?) OR
     (f.tier=? AND f.bin between ? AND ?) OR
     (f.tier=? AND f.bin between ? AND ?) OR
     (f.tier=? AND f.bin between ? AND ?) OR
     (f.tier=? AND f.bin between ? AND ?))
  )
) statement handle DBI::st=HASH(0x1c317cf0) still Active at
/home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm
line 1422
(in cleanup) Not a CODE reference at
/home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1.

Is this something serious? Does this mean that the stored object doesn't
have everything it had before freezing? Or are we using
Bio::DB::SeqFeature inappropriately?

The other question would be, if we can visualize these stored feature
objects easily using gbrowse? I didn't find a hint mentioning
Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages...
Is it working already? Will it?

Thanks in advance,
Daniel

--
Daniel Lang
University of Freiburg, Plant Biotechnology
Schaenzlestr. 1, D-79104 Freiburg
fax: +49 761 203 6945
phone: +49 761 203 6974
homepage: http://www.plant-biotech.net/
e-mail: daniel.lang at biologie.uni-freiburg.de

#################################################
My software never has bugs.
It just develops random features.
#################################################

From cjfields at uiuc.edu Thu Oct 5 13:45:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:45:40 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452532C3.9030804@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
 <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
 <452511C1.5020709@sendu.me.uk>
 <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
 <45251E69.7040507@sendu.me.uk>
 <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
 <45252C7D.3050009@sendu.me.uk>
 <452532C3.9030804@sendu.me.uk>
Message-ID: <003DD8C4-6E59-44C2-9A1C-117E036D93BC@uiuc.edu>

On Oct 5, 2006, at 11:28 AM, Sendu Bala wrote:

> I'm not sure how not doing URI-encoding would suddenly make your
> classes worthless. I find them to be very useful (even when I
> didn't know there was any URI-encoding, was incorrectly using +s
> and it happened to work anyway).

That's not my point (and sincerest apologies for the 'badgering' bit).
If you made the assumption that all the parameters had to be
URI-encoded, why couldn't I do something like:

my %param = ();   # make up your list of parameters here
my $eutil = 'esearch';
my $url   = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/$eutil.fcgi";
# join the key value pairs with '=', then join all those with &
# add to end of url
# post and retrieve via LWP::Simple

It's more user-friendly to set up the parameters so that you wouldn't
have to encode everything yourself, esp. when the most reliable way to
encode URI strings is to 'use URI'.

Christopher Fields
Postdoctoral Researcher
Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 14:11:25 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 13:11:25 -0500 Subject: [Bioperl-l] Eutilities Batch In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com> References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com> Message-ID: <4A340977-C6AD-4728-8947-BF5A8A782807@uiuc.edu> On Oct 5, 2006, at 9:09 AM, Bernd Web wrote: > Hi, > > I am using the new EUtilities. It looks great. > I was trying to use epost followed by elink but i get an error. The > same error is actually given with the example on > http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html: > Can't call method "get_databases" on an undefined value at EU.pl > line 25. > > For completeness, the code is shown below too. > > Any suggestions what is going wrong? > > Regards, > Bernd Grr...that's my error, sorry Bernd. The POD wasn't updated to match the change I made and has a few errors. The elink object, for starters, doesn't fetch the response using get_response(). Also, the ElinkData method has changed slightly but accomplishes the same thing. Odd, since I copied and pasted that from working code... Just a note: these are considered highly experimental at the moment, though they should be ready for general use and toying around. I would like any suggestions on methods and so on you may have (Sendu has made some very helpful ones off-list which I plan on implementing). Feel free to let me know if something doesn't work. Note that, because of their experimental nature, you will want to take note of any methods changes in particular as I try to solidify the API and clean up the POD, so expect some momentary 'outages'. 
I plan on setting up a remedial interface for all the container objects (like ElinkData) which will help clarify things and solidify the API in the next few weeks, at least to a point where the class methods have a consistent naming scheme. I plan on using this as a backend web agent for a general Entrez interface at some point to get data into Bio* objects. In the meantime, try this:

    use Bio::DB::EUtilities;

    my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                           -db         => 'pubmed',
                                           -term       => 'hutP',
                                           -usehistory => 'y');

    $esearch->get_response; # parse the response, fetch a cookie

    my $elink = Bio::DB::EUtilities->new(-eutil  => 'elink',
                                         -db     => 'protein,taxonomy',
                                         -dbfrom => 'pubmed',
                                         -cookie => $esearch->next_cookie,
                                         -cmd    => 'neighbor');

    $elink->get_response;

    # this retrieves the Bio::DB::EUtilities::ElinkData object
    my $linkset = $elink->next_linkset;
    my @ids;

    # step through IDs for each linked database in the ElinkData object
    for my $db ($linkset->get_all_linkdbs) {
        @ids = $linkset->get_LinkIds_by_db($db); # returns primary IDs
        print join q(,), @ids; # do something here
    }

Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign

From dmessina at wustl.edu Thu Oct 5 14:07:56 2006 From: dmessina at wustl.edu (David Messina) Date: Thu, 5 Oct 2006 13:07:56 -0500 Subject: [Bioperl-l] BioPerl Deobfuscator updated Message-ID: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>

I'm pleased to announce a revised version of the BioPerl Deobfuscator is now available. Many thanks to Mauricio Cuadra for updating bioperl.org's installation: http://bioperl.org/cgi-bin/deob_interface.cgi

I've incorporated many of the suggestions you all sent in after the first release, and many of the modules that had non-standard documentation have been updated in the meantime, too, so hopefully you'll find it much improved. There are still some issues with a few modules; please report any problems you see.
Also, it's now indexing bioperl-live instead of 1.4, which should make it a little more useful, too. A complete list of changes is below.

I welcome your bug reports and suggestions for improvements, via email, this list, Bugzilla, or the Wiki page.

Thanks, Dave

Changes

0.0.3 Mon Oct 2 20:01:45 CDT 2006

  FIX: change default $deob_detail_path to be a relative URL instead of having localhost hardcoded. Thanks to Jason Stajich for pointing this out.

  FIX: Bio::Ontology modules are no longer missing their prefix in the class list, and their methods are now shown in the lower pane as expected. Thanks to Hilmar Lapp for reporting this bug.

  FIX: can now handle (and ignore) VERSION POD section.

  FIX: missing SYNOPSIS section now handled properly. In fact, the SYNOPSIS and DESCRIPTION sections can be in reverse order now, although for consistency this is not recommended.

  FIX: Bug #2114 ("Obfuscator doesn't show 'Bio:Matrix:Generic'") has been fixed. This bug turned out to afflict multiple modules, which weren't getting parsed correctly by deob_index.pl.

  NEW: Table cells have been padded out to get rid of that "scrunched" look. Thanks to Sendu Bala for this great suggestion.

  NEW: If the 'Returns' subsection of a method's documentation contains a POD L<> link, the Deobfuscator assumes this to be a package name, and wraps it in an href for display. This feature is not robust, but seems to work well enough for now.

  NEW: the list of classes is now sorted alphabetically depth-first, so that subclasses appear just after their parent class. Thanks to Amir Karger for noticing the strange sorting behavior.

  NEW: HTML page title now 'BioPerl Deobfuscator' to distinguish it from other Deobfuscators out there. Thanks to Amir Karger for suggesting this.

  NEW: 'No match' search string now more prominent. Yep, kudos to Amir Karger again -- another great idea!

  NEW: Search box caption now explicitly states that only package names can be searched. Big ups to Amir Karger for this suggestion. The ability to search method names is planned for a future version.

  NEW: added -x option to deob_index.pl. This allows the use of an 'excluded modules' file. This feature was added to resolve an issue with four modules which rely on external modules to compile. Class::Inspector, used by the Deobfuscator, needs to load a module to traverse its inheritance tree, and modules must compile before they can be loaded.

  CHANGE: using short name now when traversing with File::Find to help identify excluded modules (deob_index.pl).

From lincoln.stein at gmail.com Thu Oct 5 14:41:08 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Thu, 5 Oct 2006 14:41:08 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> Message-ID: <6dce9a0b0610051141x6b61407ar1c0a13cf7616b35f@mail.gmail.com>

The non-numeric comparison bug in Bio::DB::SeqFeature is fixed in the latest CVS. Do I need to do anything special to get the CVS fixes into the release candidate?

Lincoln

On 10/2/06, Chris Fields wrote: > > [I won't create a wiki account just to report this.] > > > > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG > > not set. Lots of warnings about missing packages and all, but this > > looks interesting: > > > > Argument "+" isn't numeric in numeric lt (<) at Bio/DB/ > > SeqFeature/Segment.pm line 423. > > This is verified on Mac OS X. > > > Otherwise: > > > > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, > > 99.99% okay. > > > > The failed test is: > > > > t/ESEfinder..................dubious > > Test returned status 255 (wstat 65280, 0xff00) > > DIED. FAILED test 15 > > What do you get when you run that set of tests using 'perl -I. -w t/ > ESEFinder.t'? The bad status code is odd and could be a remote > server issue. > > Chris > > > > > > florin > > > > -- > > If we wish to count lines of code, we should not regard them as lines > > produced but as lines spent.
-- Edsger Dijkstra > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu

From MEC at stowers-institute.org Thu Oct 5 15:18:08 2006 From: MEC at stowers-institute.org (Cook, Malcolm) Date: Thu, 5 Oct 2006 14:18:08 -0500 Subject: [Bioperl-l] using nfreeze instead of freeze in Bio::SeqFeature::Store Message-ID:

Yes, there is overhead (cf. perldoc Storable):

"When writing in network order, all fields are written out as standard lengths, which allows full interworking, but takes longer to read and write"

And, I suppose there is also a risk of losing precision in using network order:

You can also store data in network order to allow easy sharing across multiple platforms, or when storing on a socket known to be remotely connected. The routines to call have an initial "n" prefix for *network*, as in "nstore" and "nstore_fd". At retrieval time, your data will be correctly restored so you don't have to know whether you're restoring from native or network ordered data. Double values are stored stringified to ensure portability as well, at the slight risk of losing some precision in the last decimals.

So, I agree, it should be a configuration option, perhaps defaulting to using network order.
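A minimal sketch of the trade-off described above, assuming only the core Storable module: nfreeze writes in network byte order so the frozen string can be thawed on another architecture, and thaw reads either form. The feature hash and the helper names to_native and to_portable are made up for illustration; this is not the code in Bio::DB::SeqFeature::Store.

```perl
use strict;
use warnings;
use Storable qw(freeze nfreeze thaw);

# Hypothetical stand-in for a feature object; not a real BioPerl class.
my $feature = { seqid => 'chr1', start => 100, end => 200, strand => 1 };

# Serialize in native byte order (tied to this architecture).
sub to_native   { return freeze($_[0]) }

# Serialize in network byte order (portable across architectures).
sub to_portable { return nfreeze($_[0]) }

# thaw() detects which order was used, so one reader handles both.
for my $frozen (to_native($feature), to_portable($feature)) {
    my $copy = thaw($frozen);
    print "$copy->{seqid}:$copy->{start}..$copy->{end}\n";
}
```

Because thaw accepts both forms, switching the writer from freeze to nfreeze should not require touching the reading side, which is presumably why it could be offered as an option.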
However, given the factoring of ../Bio/DB/SeqFeature/Store.pm I'm not sure how to best make it a configuration option since the two provided serializers don't share a common interface. Possibly something like:

=head1 Methods for Connecting and Initializing a Database

=head2 new

 Title   : new
 Usage   : $db = Bio::DB::SeqFeature::Store->new(@options)
 Function: connect to a database
 Returns : A descendent of Bio::DB::SeqFeature::Store
 Args    : several - see below
 Status  : public

This class method creates a new database connection. The following -name=>$value arguments are accepted: http://iowg.brcdevel.org/gff3.html#a_fasta

 Name                Value
 ----                -----
 -adaptor            The name of the Adaptor class (default DBI::mysql)
 -serializer         The name of the serializer class (default Storable)
 -network_order      Strive to 'preserve network order' (if the serializer
                     implements it). Currently, only Storable.pm does, and
                     this will cause it to use nfreeze instead of freeze.
                     (default 1)
 -index_subfeatures  Whether or not to make subfeatures searchable
                     (default true)
 -cache              Activate LRU caching feature -- size of cache
 -compress           Compresses features before storing them in database
                     using Compress::Zlib

Malcolm Cook Database Applications Manager - Bioinformatics Stowers Institute for Medical Research - Kansas City, Missouri > -----Original Message----- > From: Lincoln Stein [mailto:lincoln.stein at gmail.com] > Sent: Thursday, October 05, 2006 1:43 PM > To: Cook, Malcolm > Cc: lstein at cshl.org; bioperl-l > Subject: Re: using nfreeze instead of freeze in Bio::SeqFeature::Store > > I think it's fine unless there is a significant performance hit, in > which case the change should be made into a configuration option. Do > you know if there is any overhead on doing this?
> > Lincoln > > On 10/5/06, Cook, Malcolm wrote: > > Lincoln, > > > > I committed a change to Bio::SeqFeature::Store to use > nfreeze instead of > > freeze which should allow SeqFeature objects to survive database > > freeze/thaw cycles across architectures. > > > > I hope I was not presumptuous or in error in doing this.... > > > > Regards, > > > > Malcolm Cook > > Database Applications Manager - Bioinformatics > > Stowers Institute for Medical Research - Kansas City, Missouri > > > > > > > -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > (516) 367-8380 (voice) > (516) 367-8389 (fax) > FOR URGENT MESSAGES & SCHEDULING, > PLEASE CONTACT MY ASSISTANT, > SANDRA MICHELSEN, AT michelse at cshl.edu > From lincoln.stein at gmail.com Thu Oct 5 14:32:40 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Thu, 5 Oct 2006 14:32:40 -0400 Subject: [Bioperl-l] Bio::DB::SeqFeature In-Reply-To: <45253CE2.1070208@biologie.uni-freiburg.de> References: <45253CE2.1070208@biologie.uni-freiburg.de> Message-ID: <6dce9a0b0610051132p7d7fcf84g27578731f9727f3f@mail.gmail.com> Hi Daniel, The warnings you are seeing are occurring because Bio::SeqFeature::Gene::GeneStructure contains a CODE reference. I think it must be registering a cleanup method via its Bio::Root::Root ancestor. When Storable serializes the object, it complains that it can't serialize the CODE reference and instead converts it into the string "CODE(0xXXXXX)". Then, after you thaw the object, Bio::Root::Root is complaining that the CODE reference is invalid because it is a string, not a reference. Yuck. I think, however, that I can fix this by setting some magic variables in Storable version 2.05 that will decompile and compile the CODE references. I will try this and send you a note when the code is in CVS. GBrowse does run off Bio::DB::SeqFeature::Store and is noticeably faster than the original Bio::DB::GFF adaptor. 
Nothing really changes except that you set the db_adaptor option to Bio::DB::SeqFeature::Store. I haven't tried it using Bio::SeqFeature::Gene::GeneStructure, so no guarantees, but I am hopeful that it will work. Lincoln On 10/5/06, Daniel Lang wrote: > Hi, > > we are storing Bio::SeqFeature::Gene::GeneStructure objects (with > multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db > (latest bioperl-live checkout). > > The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch > out of a database. > > The first observation is that is seems to work (fetched objects behave > like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we > get these warnings: > > Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into > lib/auto/Storable/_freeze.al) line 287, line 1. > Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into > lib/auto/Storable/_freeze.al) line 287, line 1. > (in cleanup) Not a CODE reference at > /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1. > prepare_cached(SELECT f.id,f.object > FROM feature as f > WHERE ( f.seqid=? > AND f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?)) > ) > > ) statement handle DBI::st=HASH(0x1c317cf0) still Active at > /home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm > line 1422 > (in cleanup) Not a CODE reference at > /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1. > > Is this something serious? Does this mean that the stored object doesn't > have everything it had before freezing? Or are we using > Bio::DB::SeqFeature inappropriately? > > The other question would be, if we can visualize these stored feature > objects easily using gbrowse? 
I didn't find a hint mentioning > Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages... > Is it working already? Will it? > > Thanks in advance, > Daniel > > -- > > Daniel Lang > University of Freiburg, Plant Biotechnology > Schaenzlestr. 1, D-79104 Freiburg > fax: +49 761 203 6945 > phone: +49 761 203 6974 > homepage: http://www.plant-biotech.net/ > e-mail: daniel.lang at biologie.uni-freiburg.de > > ################################################# > My software never has bugs. > It just develops random features. > ################################################# > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From hlapp at gmx.net Thu Oct 5 16:34:49 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 16:34:49 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4525314C.7020205@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> Message-ID: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote: > If you think I'm stupid, fine, but I'm probably not the only stupid > person on the planet. That's a great suggestion that I hope we can all agree on? 
I'll happily count myself among the stupid ones too so you're not alone, and stupid people and even more so those who are lucky enough not to be stupid have an obligation to document stuff so that even the stupid can understand, no matter how silly the documentation might get. Is that agreeable without causing yet more progressive hair loss? Actually - I'm having second thoughts. Isn't it a distinguishing feature of stupid people that - among other things - they are stupid enough to believe they don't need to read documentation? You admitted publicly that you read documentation - are you just faking the stupid? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Thu Oct 5 17:11:06 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 16:11:06 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> Message-ID: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> On Oct 5, 2006, at 3:34 PM, Hilmar Lapp wrote: > > On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote: > >> If you think I'm stupid, fine, but I'm probably not the only stupid >> person on the planet. > > That's a great suggestion that I hope we can all agree on? 
I'll > happily count myself among the stupid ones too so you're not alone, > and stupid people and even more so those who are lucky enough not > to be stupid have an obligation to document stuff so that even the > stupid can understand, no matter how silly the documentation might > get. > > Is that agreeable without causing yet more progressive hair loss? > > Actually - I'm having second thoughts. Isn't it a distinguishing > feature of stupid people that - among other things - they are > stupid enough to believe they don't need to read documentation? You > admitted publicly that you read documentation - are you just faking > the stupid? > > -hilmar If lack of good documentation == stupid, I know of a few other modules in trouble besides mine. Based on that we're in for a whole lot of stupid! And I feel stupid for my earlier remarks, Sendu, so apologies. And Hilmar, you're too late on the hair loss, at least on my end. I have corrected the EUtilities POD to reflect that all text input needs to be raw as URI encoding is done in the module, which should work (I think). I plan on committing it tonight. It also indicates that EUtilities search queries need to be made as if they are regular Entrez queries. Would that be sufficient? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From pmiguel at purdue.edu Thu Oct 5 16:42:00 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Thu, 05 Oct 2006 16:42:00 -0400 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> Message-ID: <45256E18.3080103@purdue.edu> David Messina wrote: > I'm pleased to announce a revised version of the BioPerl Deobfuscator > is now available. 
Many thanks to Mauricio Cuadra for updating > bioperl.org's installation: > > http://bioperl.org/cgi-bin/deob_interface.cgi > > I've incorporated many of the suggestions you all sent in after the > first release, and many of the modules that had non-standard > documentation have been updated in the meantime, too, so hopefully > you'll find it much improved. There are still some issues with a few > modules; please report any problems you see. Also, it's now indexing > bioperl-live instead of 1.4, which should make it a little more > useful, too. A complete list of changes is below. > > I welcome your bug reports and suggestions for improvements, via > email, this list, Bugzilla, or the Wiki page. > > > Thanks, > Dave > > Here are some comments: Would be good to have the column headings for the methods table in the fixed part of the page, rather than the scroll box. That way you could always see the column headings from anywhere in the list. Second, I've noticed that there are a fair number of methods that have "not documented" for "Returns" and "Usage". But in every case I've checked both of these were documented. For example, consider methods for Bio::Seq::SeqWithQuality. The method "accession_number" is listed as "not documented". But if you click on Bio::Seq:SeqWithQuality link to the documentation, usage is defined as: "$unique_biological_key = $obj->accession_number;" and returns is defined as "A string". Finally, it would be good to have the version of bioperl being deobfuscated on the deob_interface.cgi page. Just as a quick sanity-checking measure. After poking around a bit I found that bioperl-live is being indexed in the wiki. But, I can tell, it is just the sort of thing I'm going to forget and look for every time come back to the page after a few months... Overall very nice, though. Just what is needed when I'm trying to remember "which was the method that returns subseq string and which one returns an object?" 
Phillip SanMiguel Purdue University From bix at sendu.me.uk Thu Oct 5 17:24:34 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 22:24:34 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> Message-ID: <45257812.5050008@sendu.me.uk> Chris Fields wrote: > > I have corrected the EUtilities POD to reflect that all text input needs > to be raw as URI encoding is done in the module, which should work (I > think). I plan on committing it tonight. It also indicates that > EUtilities search queries need to be made as if they are regular Entrez > queries. Would that be sufficient? You may not even need to mention anything about URI encoding, which might frighten some people. Something as simple as: =head1 SYNOPSIS use Bio::DB::EUtilities; my $esearch = Bio::DB::EUtilities->new(-eutil => 'esearch', -db => 'pubmed', -term => 'hutP AND xyz', ... and/or some POD for the new() method: =head2 new Title : new ... Args : -eutil => ... -db => ... -term => string, an entrez-style query =cut would get the point across, I think. BTW, can the term string be supplied anywhere else other than new()? It doesn't matter at all if it can't, I'm just idly wondering if I missed anything. 
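For anyone following the encoding discussion, here is a rough sketch of what URI encoding does to a raw Entrez-style term, assuming the URI module from CPAN is installed. It only builds and prints the URL; nothing is sent to NCBI, and it is not the code inside Bio::DB::EUtilities. The helper name esearch_url is hypothetical, and the parameter values are just the ones used as examples in this thread.

```perl
use strict;
use warnings;
use URI;

# Build an esearch URL from raw (unencoded) key/value pairs.
sub esearch_url {
    my @params = @_;
    my $uri = URI->new('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi');
    # query_form() escapes each key and value, so callers pass plain text.
    $uri->query_form(@params);
    return $uri->as_string;
}

print esearch_url(db => 'pubmed', term => 'hutP AND xyz'), "\n";
```

The point of the sketch is that the caller writes the term exactly as it would be typed into Entrez, and the escaping of the spaces happens in one place.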
From dmessina at wustl.edu Thu Oct 5 17:42:49 2006 From: dmessina at wustl.edu (David Messina) Date: Thu, 5 Oct 2006 16:42:49 -0500 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <45256E18.3080103@purdue.edu> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> Message-ID: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu> Thanks so much, Phillip, for taking the time to check out the new version and send your comments. I really appreciate it! I've added them to the wiki page so I can track them. Best, Dave From cjfields at uiuc.edu Thu Oct 5 17:50:11 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 16:50:11 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45257812.5050008@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> <45257812.5050008@sendu.me.uk> Message-ID: Sendu, I have the parameters all set up as get/sets at this point, but I'm open to suggestions on that. Note in the BEGIN block the heredoc eval {} block. Yes, nasty I know, but I hate AUTOLOAD. It works as a quick way of getting parameter get/sets up-and-running. I plan on making those explicit get/sets as soon as I can then sorting out particular ones to the various eutil modules where they are primarily used. Long story short, every parameter is a get/set at this time (including term()). The common ones needed for most EUtilities are initialized in the parent EUtilities::_initialize(), and eutil- specific parameters are initialized in the individual eutil plugins. 
Each eutil plugin only sets whatever parameters may be needed for operation (though you could circumvent that, since all of them are inherited via EUtilities). We could always simplify it to accept simple key-value pairs, but get/ sets (at least to me) allow more flexibility as long as you remember which parameters are set and to what. Chris On Oct 5, 2006, at 4:24 PM, Sendu Bala wrote: > Chris Fields wrote: >> I have corrected the EUtilities POD to reflect that all text input >> needs to be raw as URI encoding is done in the module, which >> should work (I think). I plan on committing it tonight. It also >> indicates that EUtilities search queries need to be made as if >> they are regular Entrez queries. Would that be sufficient? > > You may not even need to mention anything about URI encoding, which > might frighten some people. Something as simple as: > > =head1 SYNOPSIS > > use Bio::DB::EUtilities; > > my $esearch = Bio::DB::EUtilities->new(-eutil => 'esearch', > -db => 'pubmed', > -term => 'hutP AND > xyz', > ... > > and/or some POD for the new() method: > > =head2 new > > Title : new > ... > Args : -eutil => ... > -db => ... > -term => string, an entrez-style query > > =cut > > would get the point across, I think. > > BTW, can the term string be supplied anywhere else other than new > ()? It doesn't matter at all if it can't, I'm just idly wondering > if I missed anything. Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 17:51:06 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 16:51:06 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45257812.5050008@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> <45257812.5050008@sendu.me.uk> Message-ID: <5B2E844F-7B8B-4F69-9005-138826B835FB@uiuc.edu> > You may not even need to mention anything about URI encoding, which > might frighten some people. Something as simple as: > > =head1 SYNOPSIS > > use Bio::DB::EUtilities; > > my $esearch = Bio::DB::EUtilities->new(-eutil => 'esearch', > -db => 'pubmed', > -term => 'hutP AND > xyz', > ... > > and/or some POD for the new() method: > > =head2 new > > Title : new > ... > Args : -eutil => ... > -db => ... > -term => string, an entrez-style query > > =cut > > would get the point across, I think. Oops, forgot. I'll add this in and update new() when I can. Thanks! Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign

From arareko at campus.iztacala.unam.mx Thu Oct 5 18:12:49 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Thu, 05 Oct 2006 17:12:49 -0500 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <45256E18.3080103@purdue.edu> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> Message-ID: <45258361.8080803@campus.iztacala.unam.mx>

Phillip San Miguel wrote: > Finally, it would be good to have the version of bioperl being > deobfuscated on the deob_interface.cgi page. Just as a quick > sanity-checking measure. After poking around a bit I found that > bioperl-live is being indexed in the wiki. But, I can tell, it is just > the sort of thing I'm going to forget and look for every time come back > to the page after a few months...

Dave, I think this value can be stored in one of the index files and passed as an argument to the deob_index.pl script. What do you think?

Mauricio. -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM

From lincoln.stein at gmail.com Thu Oct 5 14:42:41 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Thu, 5 Oct 2006 14:42:41 -0400 Subject: [Bioperl-l] using nfreeze instead of freeze in Bio::SeqFeature::Store In-Reply-To: References: Message-ID: <6dce9a0b0610051142h56479843ofc5429d959cb6e3@mail.gmail.com>

I think it's fine unless there is a significant performance hit, in which case the change should be made into a configuration option. Do you know if there is any overhead on doing this?

Lincoln

On 10/5/06, Cook, Malcolm wrote: > Lincoln, > > I committed a change to Bio::SeqFeature::Store to use nfreeze instead of > freeze which should allow SeqFeature objects to survive database > freeze/thaw cycles across architectures.
> > I hope I was not presumptuous or in error in doing this.... > > Regards, > > Malcolm Cook > Database Applications Manager - Bioinformatics > Stowers Institute for Medical Research - Kansas City, Missouri > > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From torsten.seemann at infotech.monash.edu.au Fri Oct 6 01:26:10 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Fri, 06 Oct 2006 15:26:10 +1000 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> References: <452344D4.8070908@infotech.monash.edu.au> <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> Message-ID: <4525E8F2.1000704@infotech.monash.edu.au> Hilmar, > I don't think there's a need to deprecate - if the methods just plain > delegate to whatever File:: module is appropriate their > implementation (supposedly) will become very simple and hence won't > pose a maintenance burden anymore. >> I have an uncommitted simplified version of Bio::Root::IO which does >> this, and "all tests pass". The functions currently (silently) >> dispatch >> directly to their native counterparts. >> >> The only tricky function is tempfile() which is *mostly* like >> File::Temp::tempfile(), but does some voodoo of converting >> (TEMPLATE=>'xxx') to the non-hash first parameter of the File:: >> version, >> so I'm hesitant to commit. It may do other magic - Hilmar? > > Not that I would know of. If the tests pass (without having to change > them!) I'd give it a try. Tempfile.t had two tests that failed. It seems that Bio::Root::IO had some magic whereby it would keep a list of all tempfilenames created with UNLINK != 0 and when the Bio::Root::IO object was destroyed (eg. undef $obj) it would MANUALLY unlink each of them. 
This would occur before File::Temp got to unlink them. Not sure why it was written like this (as File::Temp will delete them at the end of the script anyway) but maybe it was legacy for when File::Temp::tempfile WASN'T available. Anyway, I've kept backward compatibility there, although I think eventually it should be removed and Tempfile.t adjusted. Although all tests pass with my new trim Bio/Root/IO.pm I am still concerned about committing as the assumption is that the BioPerl test suite is good enough to handle such a change to an important module, but the reality may be different :-) Let me know if you think I should commit anyway, Your advice is appreciated. -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From dmessina at wustl.edu Fri Oct 6 01:25:56 2006 From: dmessina at wustl.edu (David Messina) Date: Fri, 6 Oct 2006 00:25:56 -0500 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <45258361.8080803@campus.iztacala.unam.mx> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> <45258361.8080803@campus.iztacala.unam.mx> Message-ID: On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote: > I think this value can be stored in one of the index files and > passed as an argument to the deob_index.pl script. What do you think? Yep, I think that works nicely. I added this feature and committed it to CVS. Here's what the new header looks like if you do deob_index.pl -s "bioperl-live": ? Thanks for the suggestions, guys. Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: deob_header.jpg Type: image/jpeg Size: 25739 bytes Desc: not available URL: From deep_ans at yahoo.com Fri Oct 6 09:22:49 2006 From: deep_ans at yahoo.com (deepak shingan) Date: Fri, 6 Oct 2006 06:22:49 -0700 (PDT) Subject: [Bioperl-l] Sort blast file result according to evalues Message-ID: <20061006132249.49450.qmail@web51711.mail.yahoo.com> Hi , Is there any way to parse the blast file according to evalue for each hit. I want the output sorted according to hit evalue. I am using SearchIO algorithm and already tried sorting the hits according to bits, gaps, but I am not able to sort the hits by evalue. As evalues are mainly associated with hsp and each hit may have multiple hsps. waiting for help. Thanks, Dun Dansi --------------------------------- How low will we go? Check out Yahoo! Messenger?s low PC-to-Phone call rates. From hlapp at gmx.net Fri Oct 6 10:03:04 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 6 Oct 2006 10:03:04 -0400 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <4525E8F2.1000704@infotech.monash.edu.au> References: <452344D4.8070908@infotech.monash.edu.au> <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> <4525E8F2.1000704@infotech.monash.edu.au> Message-ID: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net> This is a 1.5, i.e. developers release that's in the works, and also you'd be doing this on the main trunk. If you get the tests to pass there's no reason to hold back. You may be right and in reality it has repercussions somewhere, but those will be the opportunities to improve our test suite. -hilmar On Oct 6, 2006, at 1:26 AM, Torsten Seemann wrote: > Although all tests pass with my new trim Bio/Root/IO.pm I am still > concerned about committing as the assumption is that the BioPerl > test suite is good enough to handle such a change to an important > module, but the reality may be different :-) > > Let me know if you think I should commit anyway, > > Your advice is appreciated. 
-- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Oct 6 10:58:09 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 6 Oct 2006 09:58:09 -0500 Subject: [Bioperl-l] Sort blast file result according to evalues In-Reply-To: <20061006132249.49450.qmail@web51711.mail.yahoo.com> References: <20061006132249.49450.qmail@web51711.mail.yahoo.com> Message-ID: The evalue for the hit is retrieved by the BlastHit::significance() method, if I remember correctly. So if $hit is a Bio::Search::Hit::BlastHit object, you use $hit->significance. If you want individual HSP evalues, you would use $hsp->evalue for the individual HSP objects. The output is normally sorted by the order they appear in the alignments and table, which is typically by increasing evalue or decreasing bits (score). So they are already sorted. If you wanted to run a sort yourself you could use a sort block using '{$a->significance() <=> $b->significance()} @hits', but as pointed out on the wiki it may be safer to run a Schwartzian transform instead: http://www.bioperl.org/wiki/Bioperl_Best_Practices#Sorting Chris On Oct 6, 2006, at 8:22 AM, deepak shingan wrote: > Hi , > Is there any way to parse the blast file according to evalue for > each hit. I want the output sorted according to hit evalue. I am > using SearchIO algorithm and already tried sorting the hits > according to bits, gaps, but I am not able to sort the hits by evalue. > As evalues are mainly associated with hsp and each hit may have > multiple hsps. > > waiting for help. > > Thanks, > Dun Dansi > > > > > > --------------------------------- > How low will we go? Check out Yahoo! Messenger's low PC-to-Phone > call rates.
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Oct 6 11:03:45 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 6 Oct 2006 10:03:45 -0500 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net> References: <452344D4.8070908@infotech.monash.edu.au> <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> <4525E8F2.1000704@infotech.monash.edu.au> <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net> Message-ID: <265AD609-F74E-4545-B3DD-FF94290BE0B4@uiuc.edu> On Oct 6, 2006, at 9:03 AM, Hilmar Lapp wrote: > This is a 1.5, i.e. developers release that's in the works, and also > you'd be doing this on the main trunk. If you get the tests to pass > there's no reason to hold back. > > You may be right and in reality it has repercussions somewhere, but > those will be the opportunities to improve our test suite. > > -hilmar Agreed, though I think Sendu only wants bug fixes for 1.5.2. You could always commit to CVS HEAD and it could be in 1.5.3. Let me rethink that. There were some subtle tempfile/tempdir issues that were popping up on WinXP where the some tempfiles were not being deleted b/c of permissions issues; I had planned on adding that to Bugzilla today or tomorrow. Maybe changing to File::Temp would fix that, so in essence it would be a bug fix! I'll go ahead and post the bug. Chris >> Although all tests pass with my new trim Bio/Root/IO.pm I am still >> concerned about committing as the assumption is that the BioPerl >> test suite is good enough to handle such a change to an important >> module, but the reality may be different :-) >> >> Let me know if you think I should commit anyway, >> >> Your advice is appreciated. 
> > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From pmiguel at purdue.edu Fri Oct 6 11:06:56 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Fri, 06 Oct 2006 11:06:56 -0400 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu> Message-ID: <45267110.7030905@purdue.edu> David Messina wrote: > Thanks so much, Phillip, for taking the time to check out the new > version and send your comments. I really appreciate it! I've added > them to the wiki page so I can track them. > > Best, > Dave > Dave, No problem. I've just added a "keyword" to search BioPerl Deobfuscator to my Firefox browser. That way I can just type "deob qual" in my URL bar in firefox and the browser jumps directly to BioPerl Deobfuscator (like a bookmark) but it pre-submits the search item "qual". I heard about the Firefox "keywords" in a TWiT/FLOSS episode on mozilla. You just go to any search page and right-click in the search box of interest and one of the choices is "Add a Keyword for this Search". Then you just have to fill out "Name" and "Keyword" fields and drop the keyword into whatever folder you like. The "Keyword" then becomes the word to invoke that search with parameters that follow it when it is typed into the URL bar. 
Phillip From arareko at campus.iztacala.unam.mx Fri Oct 6 11:18:02 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Fri, 06 Oct 2006 10:18:02 -0500 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> <45258361.8080803@campus.iztacala.unam.mx> Message-ID: <452673AA.7070305@campus.iztacala.unam.mx> Looks great! I'll update it during the weekend. Mauricio. David Messina wrote: > > On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote: >> I think this value can be stored in one of the index files and passed >> as an argument to the deob_index.pl script. What do you think? > > Yep, I think that works nicely. I added this feature and committed it to > CVS. Here's what the new header looks like if you do deob_index.pl -s > "bioperl-live": > > > Thanks for the suggestions, guys. > > Dave > > > ------------------------------------------------------------------------ > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From bix at sendu.me.uk Fri Oct 6 11:27:14 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 06 Oct 2006 16:27:14 +0100 Subject: [Bioperl-l] Sort blast file result according to evalues In-Reply-To: References: <20061006132249.49450.qmail@web51711.mail.yahoo.com> Message-ID: <452675D2.9090803@sendu.me.uk> Chris Fields wrote: > The evalue for the hit is retrieved by the BlastHit::significance() > method, if I remember correctly. So if $hit is a > Bio::Search::Hit::BlastHit object, you use $hit->significance. If > you want individual HSP evalues, you would use $hsp->evalue for the > individual HSP objects.
> > The output is normally sorted by the order they appear in the > alignments and table, which is typically by increasing evalue or > decreasing bits (score). So they are already sorted. Concur. > If you wanted to run a sort yourself you could use a sort block using > '{$a->significance() <=> $b->significance()} @hits' Actually, it is best to use the sort_hits() method of the result object prior to asking for any hits. (As this allows for potential optimization in the parser.) ->significance is still the thing you need to sort on though. From cjfields at uiuc.edu Fri Oct 6 11:52:57 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 6 Oct 2006 10:52:57 -0500 Subject: [Bioperl-l] Sort blast file result according to evalues In-Reply-To: <452675D2.9090803@sendu.me.uk> References: <20061006132249.49450.qmail@web51711.mail.yahoo.com> <452675D2.9090803@sendu.me.uk> Message-ID: <31A6FC3A-8BEB-42B8-B51D-66E659EF7495@uiuc.edu> On Oct 6, 2006, at 10:27 AM, Sendu Bala wrote: >> If you wanted to run a sort yourself you could use a sort block using >> '{$a->significance() <=> $b->significance()} @hits' > > Actually, it is best to use the sort_hits() method of the result > object > prior to asking for any hits. (As this allows for potential > optimization > in the parser.) Ah, forgot about that one! Chris Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Fri Oct 6 14:36:49 2006 From: jason at bioperl.org (Jason Stajich) Date: Fri, 6 Oct 2006 11:36:49 -0700 Subject: [Bioperl-l] tempfile cleanup In-Reply-To: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu> References: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu> Message-ID: <0FCEC6B2-E190-4800-AAB1-89559C552FA6@bioperl.org> I think the magic trickery in there for cleanup is that File::Temp only cleans up tempfiles when Perl exits not when the Root::IO object goes out of scope -- so this can be a problem for people on CGI scripts that stay resident in memory and don't ever have tempfiles cleaned up. The managing the list aspect allows us to call _cleanup periodically (perhaps before the start of every Blast run) to insure that tempfiles are removed. perhaps newer File::Temp versions can solve this better now but I believe that was the behavior we were trying to deal with with managing the list of to-be-deleted files by the Root::IO object. This is some hackery that also had to do with not expecting File::Temp to be installed I believe. -jason From torsten.seemann at infotech.monash.edu.au Mon Oct 9 00:52:29 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Mon, 09 Oct 2006 14:52:29 +1000 Subject: [Bioperl-l] Multiple packages in the one .pm file Message-ID: <4529D58D.1080004@infotech.monash.edu.au> Hi all, The following modules have more than one "package xxxx;" declaration in them. For small, internal classes I guess this is fine, but for others, they should be split up into the filesystem - otherwise they are troublesome to locate and the online documentation doesn't list them! eg. 
bioperl-run/Bio/Tools/Run/Analysis/Job.pm is in bioperl-run/Bio/Tools/Run/Analysis.pm Here's the culprits: % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | sed 's/:.*$//' | sort | uniq -d ; done bioperl-live/Bio/AnalysisI.pm bioperl-live/Bio/DB/Fasta.pm bioperl-live/Bio/DB/GFF.pm bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm bioperl-live/Bio/DB/SeqFeature/Store/memory.pm bioperl-live/Bio/SeqIO/interpro.pm bioperl-run/Bio/Tools/Run/Analysis.pm bioperl-run/Bio/Tools/Run/Analysis/soap.pm -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From pmiguel at purdue.edu Mon Oct 9 15:57:12 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Mon, 09 Oct 2006 15:57:12 -0400 Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC? Message-ID: <452AA998.5010104@purdue.edu> I found a bug in Bio::SeqIO::phd and am wondering if the fix will propagate into the next release candidate? The bug is here: http://bugzilla.open-bio.org/show_bug.cgi?id=2120 I also created a patch that fixes it (on my machine, anyway). It is a fairly minor change, so it seems like it would be worth propagating it into the next release candidate. -- Phillip SanMiguel From bix at sendu.me.uk Mon Oct 9 16:57:28 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 09 Oct 2006 21:57:28 +0100 Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC? In-Reply-To: <452AA998.5010104@purdue.edu> References: <452AA998.5010104@purdue.edu> Message-ID: <452AB7B8.4040404@sendu.me.uk> Phillip San Miguel wrote: > I found a bug in Bio::SeqIO::phd and am wondering if the fix will > propagate into the next release candidate? > > The bug is here: > > http://bugzilla.open-bio.org/show_bug.cgi?id=2120 > > I also created a patch that fixes it (on my machine, anyway). 
It is a > fairly minor change, so it seems like it would be worth propagating it > into the next release candidate. If it gets committed to HEAD before I make the next candidate, then yes. I'll do that if no one beats me to it (and if someone does, please add a new test for this). BTW Phillip, thank you for the bug report but in future use the attachment capabilities for files, please don't paste them into the comments box. From bix at sendu.me.uk Mon Oct 9 17:01:56 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 09 Oct 2006 22:01:56 +0100 Subject: [Bioperl-l] Analysis soap problem Message-ID: <452AB8C4.1010704@sendu.me.uk> I thought I'd 'advertise' this bug on the list so more people see it: http://bugzilla.open-bio.org/show_bug.cgi?id=2117 I don't want to make the next 1.5.2 release candidate until its fixed. Does anyone have any idea about it? Even if you can't fix it, just explaining what's (supposed) to be going on would help a lot. Thank you, Sendu. From Kevin.M.Brown at asu.edu Mon Oct 9 18:40:54 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Mon, 9 Oct 2006 15:40:54 -0700 Subject: [Bioperl-l] Analysis soap problem Message-ID: <1A4207F8295607498283FE9E93B775B40219690B@EX02.asurite.ad.asu.edu> If I had to guess from looking at the snippet provided, the variable $seq holds no data so when you try to setup the regex /^$seq$/ you end up with /^$/ (blank line) and the warning. > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Monday, October 09, 2006 2:02 PM > To: bioperl-l List > Subject: [Bioperl-l] Analysis soap problem > > I thought I'd 'advertise' this bug on the list so more people see it: > http://bugzilla.open-bio.org/show_bug.cgi?id=2117 > > I don't want to make the next 1.5.2 release candidate until > its fixed. > Does anyone have any idea about it? 
Even if you can't fix it, just > explaining what's (supposed) to be going on would help a lot. > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Mon Oct 9 22:34:23 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 9 Oct 2006 21:34:23 -0500 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <452AB8C4.1010704@sendu.me.uk> References: <452AB8C4.1010704@sendu.me.uk> Message-ID: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> I have 'fixed' this in CVS. Note the quotes; it depends on what you might consider fixed. Multiple calls to results() were returning empty hash refs, so no data was being returned. For now, I stored the hash reference in a variable then tested each one. All tests now pass, including the 'outseq' one. Maybe it's just me, but shouldn't results() either consistently return the same information, or contain documentation that it doesn't do so? Anyway, I have left the bugzilla report open for now. Chris On Oct 9, 2006, at 4:01 PM, Sendu Bala wrote: > I thought I'd 'advertise' this bug on the list so more people see it: > http://bugzilla.open-bio.org/show_bug.cgi?id=2117 > > I don't want to make the next 1.5.2 release candidate until its fixed. > Does anyone have any idea about it? Even if you can't fix it, just > explaining what's (supposed) to be going on would help a lot. > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bosborne11 at verizon.net Mon Oct 9 22:09:45 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 09 Oct 2006 22:09:45 -0400 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au> Message-ID: Torsten, Fixed interpro.pm, it could have been written more simply (or more like other SeqIO modules). Can't really address the others. Brian O. On 10/9/06 12:52 AM, "Torsten Seemann" wrote: > Hi all, > > The following modules have more than one "package xxxx;" declaration in > them. For small, internal classes I guess this is fine, but for others, > they should be split up into the filesystem - otherwise they are > troublesome to locate and the online documentation doesn't list them! > > eg. > bioperl-run/Bio/Tools/Run/Analysis/Job.pm > is in > bioperl-run/Bio/Tools/Run/Analysis.pm > > Here's the culprits: > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | > sed 's/:.*$//' | sort | uniq -d ; done > > bioperl-live/Bio/AnalysisI.pm > bioperl-live/Bio/DB/Fasta.pm > bioperl-live/Bio/DB/GFF.pm > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > bioperl-live/Bio/SeqIO/interpro.pm > > bioperl-run/Bio/Tools/Run/Analysis.pm > bioperl-run/Bio/Tools/Run/Analysis/soap.pm From bix at sendu.me.uk Tue Oct 10 03:03:20 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 08:03:20 +0100 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> References: <452AB8C4.1010704@sendu.me.uk> <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> Message-ID: <452B45B8.8010401@sendu.me.uk> Chris Fields wrote: > I have 'fixed' this in CVS. Note the quotes; it depends on what you > might consider fixed. 
Multiple calls to results() were returning > empty hash refs, so no data was being returned. For now, I stored > the hash reference in a variable then tested each one. All tests now > pass, including the 'outseq' one. > > Maybe it's just me, but shouldn't results() either consistently > return the same information, or contain documentation that it doesn't > do so? Anyway, I have left the bugzilla report open for now. Judging by the tests there seems a clear expectation that multiple calls to results() should work, and certainly that makes sense and seems natural. So I'd say that results() should be fixed and the test script reverted. From cjfields at uiuc.edu Tue Oct 10 07:42:33 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 06:42:33 -0500 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <452B45B8.8010401@sendu.me.uk> References: <452AB8C4.1010704@sendu.me.uk> <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> <452B45B8.8010401@sendu.me.uk> Message-ID: I agree, though I think Martin Senger should be contacted, at least to get his thoughts. Has anyone tried yet? Chris On Oct 10, 2006, at 2:03 AM, Sendu Bala wrote: > Chris Fields wrote: >> I have 'fixed' this in CVS. Note the quotes; it depends on what you >> might consider fixed. Multiple calls to results() were returning >> empty hash refs, so no data was being returned. For now, I stored >> the hash reference in a variable then tested each one. All tests now >> pass, including the 'outseq' one. >> >> Maybe it's just me, but shouldn't results() either consistently >> return the same information, or contain documentation that it doesn't >> do so? Anyway, I have left the bugzilla report open for now. > > Judging by the tests there seems a clear expectation that multiple > calls > to results() should work, and certainly that makes sense and seems > natural. So I'd say that results() should be fixed and the test script > reverted. 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Tue Oct 10 08:14:31 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 13:14:31 +0100 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: References: <452AB8C4.1010704@sendu.me.uk> <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> <452B45B8.8010401@sendu.me.uk> Message-ID: <452B8EA7.1080800@sendu.me.uk> Chris Fields wrote: > I agree, though I think Martin Senger should be contacted, at least to > get his thoughts. Has anyone tried yet? He's CCd on the bug report, but I haven't tried directly, no. Do you want to tackle this (contacting him and/or fixing the bug)? Cheers, Sendu. From cjfields at uiuc.edu Tue Oct 10 09:20:03 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 08:20:03 -0500 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <452B8EA7.1080800@sendu.me.uk> Message-ID: <001801c6ec6e$cc016900$15327e82@pyrimidine> I'll try giving it a closer look, just didn't have much time yesterday. I'll also try contacting Martin. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Tuesday, October 10, 2006 7:15 AM > To: bioperl-l > Subject: Re: [Bioperl-l] Analysis soap problem > > Chris Fields wrote: > > I agree, though I think Martin Senger should be contacted, at least to > > get his thoughts. Has anyone tried yet? > > He's CCd on the bug report, but I haven't tried directly, no. Do you > want to tackle this (contacting him and/or fixing the bug)? > > Cheers, > Sendu. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From pmiguel at purdue.edu Tue Oct 10 10:26:35 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Tue, 10 Oct 2006 10:26:35 -0400 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452AB7B8.4040404@sendu.me.uk> References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk> Message-ID: <452BAD9B.5010903@purdue.edu> Sendu Bala wrote: > > BTW Phillip, thank you for the bug report but in future use the > attachment capabilities for files, please don't paste them into the > comments box. > Sendu, Sounds reasonable to me. I should note, however; when I entered the bug, I was looking for some method to attach files. There is none on the "Enter Bug: Bioperl" page: http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl Also, "bug writing guidelines" makes no mention of it. I vaguely remembered there being some method to do it--but given the "bug writing guidelines" exhortations to be specific and detailed, I thought I must put the information somewhere. So I put them them the only place offered (on that page)--"Description:" I see that, once submitted, attachments can be added to a bug report. Is that normally how it is done? Doesn't each attachment result in a separate email to the bioperl guts email list? Anyway, I've just added the files to the bug report as attachments, in case someone needs them to construct a test. 
-- Phillip From bix at sendu.me.uk Tue Oct 10 11:10:25 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 16:10:25 +0100 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452BAD9B.5010903@purdue.edu> References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk> <452BAD9B.5010903@purdue.edu> Message-ID: <452BB7E1.5020200@sendu.me.uk> Phillip San Miguel wrote: > Sendu Bala wrote: >> BTW Phillip, thank you for the bug report but in future use the >> attachment capabilities for files, please don't paste them into the >> comments box. >> > Sendu, Sounds reasonable to me. I should note, however; when I > entered the bug, I was looking for some method to attach files. There > is none on the "Enter Bug: Bioperl" page: > > http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl > > Also, "bug writing guidelines" makes no mention of it. I vaguely > remembered there being some method to do it--but given the "bug > writing guidelines" exhortations to be specific and detailed, I > thought I must put the information somewhere. So I put them them the > only place offered (on that page)--"Description:" I agree that things could be better here. Who looks after bugzilla, and is this an alterable feature? > I see that, once submitted, attachments can be added to a bug report. > Is that normally how it is done? Yes, AFAIK. > Doesn't each attachment result in a separate email to the bioperl > guts email list? Yes, but that's not a problem. In fact, doing it this way means you don't email everyone subscribed to guts your big files in plain text, but instead they get a small email with a link to the download. > Anyway, I've just added the files to the bug report as attachments, > in case someone needs them to construct a test. Thank you. 
From arareko at campus.iztacala.unam.mx Tue Oct 10 11:14:00 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Tue, 10 Oct 2006 10:14:00 -0500 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452BAD9B.5010903@purdue.edu> References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk> <452BAD9B.5010903@purdue.edu> Message-ID: <452BB8B8.40409@campus.iztacala.unam.mx> Phillip San Miguel wrote: > I see that, once submitted, attachments can be added to a bug report. > Is that normally how it is done? Yes, it's the normal method: create the bug report, then attach files. > Doesn't each attachment result in a separate email to the bioperl > guts email list? Adding a file will generate an informative email per bug change (attaching the file in this case) but won't send the attachment to the list. Regards, Mauricio. -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From cjfields at uiuc.edu Tue Oct 10 11:20:55 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 10:20:55 -0500 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452BAD9B.5010903@purdue.edu> Message-ID: <002801c6ec7f$ae8d85f0$15327e82@pyrimidine> > Also, "bug writing guidelines" makes no mention of it. I vaguely > remembered there being some method to do it--but given the "bug writing > guidelines" exhortations to be specific and detailed, I thought I must > put the information somewhere. So I put them them the only place offered > (on that page)--"Description:" > I see that, once submitted, attachments can be added to a bug > report. Is that normally how it is done? Doesn't each attachment result > in a separate email to the bioperl guts email list? > Anyway, I've just added the files to the bug report as attachments, > in case someone needs them to construct a test.
Phillip, Initial bug reports only require the general description, OS used, bioperl version, etc. That's quite normal. Any relevant attachments are added afterward. We should probably make that clearer upfront on the wiki page; I don't know if anyone can make similar changes to bugzilla. Any bug changes, CVS commits, etc are mailed to bioperl-guts, yes. That isn't an issue though; it keeps the developers updated on the various bugs/commits that are going on and is a pretty common practice. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 10 12:48:22 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 11:48:22 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au> References: <4529D58D.1080004@infotech.monash.edu.au> Message-ID: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> There are a number of other bioperl-run examples (the Bio::Tools::Run::Analysis::soap issue I looked into revealed such). I agree with both points, 1) that it depends on the size of the classes, and 2) from a maintainability standpoint, it can be very frustrating when looking for documentation. Is there really any advantage to doing this? Chris On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote: > Hi all, > > The following modules have more than one "package xxxx;" > declaration in > them. For small, internal classes I guess this is fine, but for > others, > they should be split up into the filesystem - otherwise they are > troublesome to locate and the online documentation doesn't list them! > > eg. 
> bioperl-run/Bio/Tools/Run/Analysis/Job.pm > is in > bioperl-run/Bio/Tools/Run/Analysis.pm > > Here's the culprits: > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | > sed 's/:.*$//' | sort | uniq -d ; done > > bioperl-live/Bio/AnalysisI.pm > bioperl-live/Bio/DB/Fasta.pm > bioperl-live/Bio/DB/GFF.pm > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > bioperl-live/Bio/SeqIO/interpro.pm > > bioperl-run/Bio/Tools/Run/Analysis.pm > bioperl-run/Bio/Tools/Run/Analysis/soap.pm > > -- > Dr Torsten Seemann http://www.vicbioinformatics.com > Victorian Bioinformatics Consortium, Monash University, Australia > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign
From lzhtom at hotmail.com Tue Oct 10 15:42:48 2006 From: lzhtom at hotmail.com (zhihua li) Date: Tue, 10 Oct 2006 19:42:48 +0000 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? Message-ID: Hi netters. I've installed Bioperl 1.5.1, both core and run modules. But when I tried to use the Pise module, an error occured saying that there's no "new" method in this package.
My script is: use strict; use warnings; use Bio::Tools::Run::AnalysisFactory::Pise; my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); my $program=$factory->program('mfold'); $program->seq('my_input_file'); my $job = $program->run(); print STDERR $job->contect('mfold.out'); The error message I got is: Can't locate object method "new" via package "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load "Bio::Tools::Run::AnalysisFactor::Pise"?) I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm and it DOES contain a sub new. So what's going on? Anyone could give me a hint? Thanks a lot! From cjfields at uiuc.edu Tue Oct 10 16:27:27 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 15:27:27 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Message-ID: Makes sense to me. I think, as long as they're documented, it shouldn't be a problem. I think the main point is that the class methods for these don't show up using perldoc (something I ran into with Bio::DB::Fasta's inclusion of Bio::PrimarySeq::Fasta), but they do show up when using other documentation. So 'perldoc Bio::DB::Fasta' works, but 'perldoc Bio::PrimarySeq::Fasta' doesn't. So these can be problematic when looking for specific methods. However, I think pod2html handles multiple package declarations in one module, and the PDOC online do as well. Does the Deobfuscator? 
Chris On Oct 10, 2006, at 3:11 PM, Lincoln Stein wrote: > Hi, > > These ones are all mine: > > > bioperl-live/Bio/DB/Fasta.pm > > bioperl-live/Bio/DB/GFF.pm > > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > > In each case, the second modules are teeny tiny ones that implement > iterators which are at most two methods long (typically a new() and > a next()). I prefer not to split them out because they will just > clutter up the file tree with stuff that is already well documented > in the "parent ship" modules. > > Lincoln > > > On 10/10/06, Chris Fields wrote: There are a > number of other bioperl-run examples (the > Bio::Tools::Run::Analysis::soap issue I looked into revealed such). > > I agree with both points, 1) that it depends on the size of the > classes, and 2) from a maintainability standpoint, it can be very > frustrating when looking for documentation. Is there really any > advantage to doing this? > > Chris > > On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote: > > > Hi all, > > > > The following modules have more than one "package xxxx;" > > declaration in > > them. For small, internal classes I guess this is fine, but for > > others, > > they should be split up into the filesystem - otherwise they are > > troublesome to locate and the online documentation doesn't list > them! > > > > eg. 
> > bioperl-run/Bio/Tools/Run/Analysis/Job.pm > > is in > > bioperl-run/Bio/Tools/Run/Analysis.pm > > > > Here's the culprits: > > > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/ > Bio | > > sed 's/:.*$//' | sort | uniq -d ; done > > > > bioperl-live/Bio/AnalysisI.pm > > bioperl-live/Bio/DB/Fasta.pm > > bioperl-live/Bio/DB/GFF.pm > > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > > bioperl-live/Bio/SeqIO/interpro.pm > > > > bioperl-run/Bio/Tools/Run/Analysis.pm > > bioperl-run/Bio/Tools/Run/Analysis/soap.pm > > > > -- > > Dr Torsten Seemann http://www.vicbioinformatics.com > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > (516) 367-8380 (voice) > (516) 367-8389 (fax) > FOR URGENT MESSAGES & SCHEDULING, > PLEASE CONTACT MY ASSISTANT, > SANDRA MICHELSEN, AT michelse at cshl.edu Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 10 16:30:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 15:30:16 -0500 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? 
In-Reply-To: References: Message-ID: <870B7500-AA83-42D7-965B-865B91AA8E7F@uiuc.edu> On Oct 10, 2006, at 2:42 PM, zhihua li wrote: > Hi netters. > > I've installed Bioperl 1.5.1, both core and run modules. But when > I tried to use the Pise module, an error occured saying that > there's no "new" method in this package. > > My script is: > > use strict; > use warnings; > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); > my $program=$factory->program('mfold'); > $program->seq('my_input_file'); > my $job = $program->run(); > print STDERR $job->contect('mfold.out'); > > The error message I got is: > > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) > > I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/ > Pise.pm and it DOES contain a sub new. > > So what's going on? Anyone could give me a hint? > > Thanks a lot! Well, according to your error output you have AnalysisFactory misspelled ('AnalysisFactor'), which should tell you what the problem is. Look for the same thing in your script. Chris > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Tue Oct 10 16:43:06 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 21:43:06 +0100 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? In-Reply-To: References: Message-ID: <452C05DA.5050803@sendu.me.uk> zhihua li wrote: > Hi netters. > > I've installed Bioperl 1.5.1, both core and run modules. But when I > tried to use the Pise module, an error occured saying that there's no > "new" method in this package. 
> > My script is: > > use strict; > use warnings; > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); > my $program=$factory->program('mfold'); > $program->seq('my_input_file'); > my $job = $program->run(); > print STDERR $job->contect('mfold.out'); > > The error message I got is: > > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) > > I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm > and it DOES contain a sub new. > > So what's going on? Anyone could give me a hint? You have a typo. Bio::Tools::Run::AnalysisFactory::Pise, not Bio::Tools::Run::AnalysisFactor::Pise From lincoln.stein at gmail.com Tue Oct 10 16:11:00 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 10 Oct 2006 16:11:00 -0400 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> Message-ID: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Hi, These ones are all mine: > bioperl-live/Bio/DB/Fasta.pm > bioperl-live/Bio/DB/GFF.pm > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm In each case, the second modules are teeny tiny ones that implement iterators which are at most two methods long (typically a new() and a next()). I prefer not to split them out because they will just clutter up the file tree with stuff that is already well documented in the "parent ship" modules. Lincoln On 10/10/06, Chris Fields wrote: > > There are a number of other bioperl-run examples (the > Bio::Tools::Run::Analysis::soap issue I looked into revealed such). 
> > I agree with both points, 1) that it depends on the size of the > classes, and 2) from a maintainability standpoint, it can be very > frustrating when looking for documentation. Is there really any > advantage to doing this? > > Chris > > On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote: > > > Hi all, > > > > The following modules have more than one "package xxxx;" > > declaration in > > them. For small, internal classes I guess this is fine, but for > > others, > > they should be split up into the filesystem - otherwise they are > > troublesome to locate and the online documentation doesn't list them! > > > > eg. > > bioperl-run/Bio/Tools/Run/Analysis/Job.pm > > is in > > bioperl-run/Bio/Tools/Run/Analysis.pm > > > > Here's the culprits: > > > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | > > sed 's/:.*$//' | sort | uniq -d ; done > > > > bioperl-live/Bio/AnalysisI.pm > > bioperl-live/Bio/DB/Fasta.pm > > bioperl-live/Bio/DB/GFF.pm > > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > > bioperl-live/Bio/SeqIO/interpro.pm > > > > bioperl-run/Bio/Tools/Run/Analysis.pm > > bioperl-run/Bio/Tools/Run/Analysis/soap.pm > > > > -- > > Dr Torsten Seemann http://www.vicbioinformatics.com > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu
From asjo at koldfront.dk Tue Oct 10 16:04:35 2006 From: asjo at koldfront.dk (Adam Sjøgren) Date: Tue, 10 Oct 2006 22:04:35 +0200 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? References: Message-ID: <871wpglyy4.fsf@topper.koldfront.dk>
On Tue, 10 Oct 2006 19:42:48 +0000, zhihua wrote:

> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
                                               ^
                                               y
[...]

> Can't locate object method "new" via package
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load
> "Bio::Tools::Run::AnalysisFactor::Pise"?)

You missed a 'y' in "Factory".

Best wishes,

--
"We've reached a special place... Spiritually... ecumenically... grammatically."
Adam Sjøgren
asjo at koldfront.dk
From dmessina at wustl.edu Tue Oct 10 17:08:45 2006 From: dmessina at wustl.edu (David Messina) Date: Tue, 10 Oct 2006 16:08:45 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Message-ID:
> However, I think pod2html handles multiple package declarations in
> one module, and the PDOC online do as well. Does the Deobfuscator?

Nope. From my cursory examination at the time they mostly were, as Lincoln said, short and sweet, so I didn't consider it a big deal.

I do think the Deobfuscator should theoretically handle such cases anyway, though. I'll add it as a feature request on the wiki page. Or if you're chomping at the bit for it, I could certainly be beer-suaded to do it sooner rather than later...
:) Dave From cjfields at uiuc.edu Tue Oct 10 17:33:39 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 16:33:39 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Message-ID: <7F35F565-7D28-4B06-A501-4D4083652C5C@uiuc.edu> Me? I'm a lowly postdoc. Lincoln's got the cash! Chris On Oct 10, 2006, at 4:08 PM, David Messina wrote: >> However, I think pod2html handles multiple package declarations in >> one module, and the PDOC online do as well. Does the Deobfuscator? > > Nope. From my cursory examination at the time they mostly were, as > Lincoln said, short and sweet, so I didn't consider it a big deal. > > I do think the Deobfuscator should theoretically handle such cases > anyway, though. I'll add it as a feature request on the wiki page. > Or if you're chomping at the bit for it, I could certainly be beer- > suaded to do it sooner rather than later... :) > > Dave > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From sdavis2 at mail.nih.gov Wed Oct 11 05:43:35 2006 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Wed, 11 Oct 2006 05:43:35 -0400 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? In-Reply-To: References: Message-ID: <452CBCC7.30108@mail.nih.gov> zhihua li wrote: > Hi netters. > > I've installed Bioperl 1.5.1, both core and run modules. But when I > tried to use the Pise module, an error occured saying that there's no > "new" method in this package. 
> > My script is: > > use strict; > use warnings; > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); > my $program=$factory->program('mfold'); > $program->seq('my_input_file'); > my $job = $program->run(); > print STDERR $job->contect('mfold.out'); > > The error message I got is: > > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) > > I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm > and it DOES contain a sub new. > > So what's going on? Anyone could give me a hint? > > Thanks a lot! The module name is Bio::Tools::Run::AnalysisFactory::Pise. Note that it is not "factor" but "factory". That should probably fix your problem. Sean From jay at jays.net Sat Oct 7 18:34:23 2006 From: jay at jays.net (Jay Hannah) Date: Sat, 07 Oct 2006 17:34:23 -0500 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult Message-ID: <45282B6F.1030308@jays.net> I just updated my bioperl-live this morning, so I think I'm current. :) perldoc Bio::Search::Result::GenericResult ------------ SYNOPSIS # typically one gets Results from a SearchIO stream use Bio::SearchIO; my $io = new Bio::SearchIO(-format => 'blast', -file => 't/data/HUMBETGLOA.tblastx'); while( my $result = $io->next_result) { # process all search results within the input stream while( my $hit = $result->next_hits()) { ------------- Except that "next_hits()" does not exist. Should be "next_hit()". (Should I have posted a patch instead?) Thanks, j From bosborne11 at verizon.net Tue Oct 10 18:42:25 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 10 Oct 2006 18:42:25 -0400 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult In-Reply-To: <45282B6F.1030308@jays.net> Message-ID: j, No need, not for something so simple. Brian O. 
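For reference, the corrected loop from the synopsis quoted above might read as follows. This is a minimal sketch rather than the module's actual documentation: the file name is the one used in the synopsis, and the `count_hits` helper is a hypothetical name introduced only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): count the hits in a BLAST report.
sub count_hits {
    my ($file) = @_;

    # Load BioPerl's search-report parser when the helper is first called.
    require Bio::SearchIO;

    my $io = Bio::SearchIO->new(-format => 'blast', -file => $file);

    my $count = 0;
    while (my $result = $io->next_result) {
        # next_hit() (singular) is the iterator; next_hits() does not exist.
        while (my $hit = $result->next_hit) {
            $count++;
        }
    }
    return $count;
}

# Example call, assuming the test data file from the synopsis is present:
# print count_hits('t/data/HUMBETGLOA.tblastx'), "\n";
```

The only change relative to the synopsis is the method name in the inner loop; everything else follows the documented Bio::SearchIO pattern.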
On 10/7/06 6:34 PM, "Jay Hannah" wrote: > Except that "next_hits()" does not exist. Should be "next_hit()". > > (Should I have posted a patch instead?)
From zchou at cau.edu.cn Wed Oct 11 02:34:24 2006 From: zchou at cau.edu.cn (zhuocheng Hou) Date: Wed, 11 Oct 2006 14:34:24 +0800 Subject: [Bioperl-l] about retreive alinged sequence Message-ID: <000a01c6ecff$4ea4b2f0$0915020a@zchou>
Hello, everyone,

I am a new user of BioPerl. I want to align multiple DNA sequences based on their translated proteins using clustalw. However, I don't know how to retrieve the aligned sequences.

The code is as follows (from the PAML HOWTO):

#
# This code runs, and I can see the clustalw output printed on the screen
.......
my $aa_aln = $aln_factory->align(\@prots, @params);
# project the protein alignment back to CDS coordinates
my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs);
my @each = $dna_aln->each_seq();

# I wrote the following code myself, but it doesn't work. I want to get the $dna_aln contents and write them to a file.

my $in = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta');
my $aln=$dna_aln;
my $out = Bio::AlignIO->new(-file => ">out.msf" , -format => 'msf');
#print $out $_ while <$in>;
while ($aln = $in->next_aln() ) {
my $out->write_aln($aln);
}

Best regards,

Zhuocheng
CAU
From n.haigh at sheffield.ac.uk Wed Oct 11 10:00:33 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 11 Oct 2006 15:00:33 +0100 Subject: [Bioperl-l] about retreive alinged sequence In-Reply-To: <000a01c6ecff$4ea4b2f0$0915020a@zchou> References: <000a01c6ecff$4ea4b2f0$0915020a@zchou> Message-ID: <452CF901.6020409@sheffield.ac.uk>
Dear Zhuocheng,

I'm not familiar with the aa_to_dna_aln function, but it appears from your code that it returns an alignment object. Please find comments inserted below - hope they help!

Nathan

zhuocheng Hou wrote:
> Hello,everyone,
>
> I am a new user of Bioperl. I want to align mutiple DNA sequences based on translated proteins by using clustalw.
> However, I don't know how to retreive aligned sequences out.
>
> The codes as follows (from the tutorials of HOWTOPAML):
>
> #
> # These codes run and can find the screen print out of clustalw
> .......
> my $aa_aln = $aln_factory->align(\@prots, @params);
> # project the protein alignment back to CDS coordinates
> my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs);
>

$dna_aln should be a Bio::SimpleAlign object (an alignment, not a stream), so all you need to do is set up an output stream to write the alignment object, similar to what you wrote below, i.e.

my $out = Bio::AlignIO->new(-file => ">out.msf", -format => 'msf');

Then simply write the alignment ($dna_aln) to the output stream with this:

$out->write_aln($dna_aln);

Note that there is no "my" in front of $out in the second statement, since $out is already declared, and that no input stream or while loop is needed.

> my @each = $dna_aln->each_seq();
>
> # The following codes were writed by me. However, it doesn't work. I want to get teh $dna_aln contents and write it to files.
>
>
> my $in = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta');
> my $aln=$dna_aln;
> my $out = Bio::AlignIO->new(-file => ">out.msf" ,
> -format => 'msf');
> #print $out $_ while <$in>;
> while ($aln = $in->next_aln() ) {
> my $out->write_aln($aln);
> }
>
>
> Best regards,
>
> Zhuocheng
> CAU
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
From melcher at rescomp.berkeley.edu Wed Oct 11 17:09:17 2006 From: melcher at rescomp.berkeley.edu (Graham Melcher) Date: Wed, 11 Oct 2006 14:09:17 -0700 Subject: [Bioperl-l] Accessing GO through MYSQL? Message-ID: <20061011210917.GA783@rescomp.berkeley.edu>
Hey all,

Preface: This is my first post to this list; please redirect me if my questions belong elsewhere.

I need to look up GO ontology information given GO:Accessors, and I have a local mysql db that mirrors the GO db from that website.
I am not sure if the Bio::Ontology::* libraries were designed to be used in a dynamic, load-as-you-need sort of way, and am wondering how other people have gone about solving this problem. Details follow...

Right now I'm using Class::DBI to access the MySQL database, and I have made a new set of Bio::Ontology::TermI and Bio::Ontology::RelationshipI subclasses which use these Class::DBI objects to access the relevant information in the database on the fly. Unfortunately, I got stuck implementing some of the other Bio::Ontology::*I interfaces, especially Ontology. Making all of these subclasses seems infeasible, or at least enough work that it might already be available somewhere. Are MySQL accessors out there and I just haven't found them, or is Bio::Ontology possibly not the way to go?

Alternatively, if I end up having to write this sort of Bio::Ontology - Class::DBI interface, would anyone be interested in it being made generally usable and available?

Finally, I just found go-perl, but although I haven't had a lot of time to look into it, it doesn't seem to use mysql either.

Thanks!
Graham
--
Graham Melcher
From sdavis2 at mail.nih.gov Thu Oct 12 07:51:14 2006 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Thu, 12 Oct 2006 07:51:14 -0400 Subject: [Bioperl-l] Accessing GO through MYSQL? In-Reply-To: <20061011210917.GA783@rescomp.berkeley.edu> References: <20061011210917.GA783@rescomp.berkeley.edu> Message-ID: <452E2C32.7070502@mail.nih.gov>
Graham Melcher wrote:
> Finally, I just found go-perl, but although I haven't had a lot of time
> to look into it, it doesn't seem to use mysql either.
>

Yep. Keep going.
Go-perl and Go-db-perl: http://www.godatabase.org/dev/go-db-perl/doc/go-db-perl-doc.html Sean From hlapp at gmx.net Thu Oct 12 00:44:49 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 12 Oct 2006 00:44:49 -0400 Subject: [Bioperl-l] NESCent Phyloinformatics Hackathon Message-ID: <939B253E-2F87-450A-A277-78B5645D3494@gmx.net> (apologies in advance to those who receive this multiple times) The National Evolutionary Synthesis Center (NESCent) in collaboration with Arlin Stoltzfus (U. Maryland, NIST), Aaron Mackey (GSK), Rutger Vos (UBC), and Mark Holder (FSU) sponsors a Phyloinformatics Hackathon to take place Dec 11-15 in Durham, NC. The (wiki) website with more information and a formal proposal is at https://www.nescent.org/wg_phyloinformatics/ In short, the goal is to leverage the Bio* toolkits to provide the "glue" for evolutionary analyses of various types that depend on automation, interoperability, and data integration. CALL FOR INPUT: The specific objectives are driven by "use cases", that is, specific target problems of interest to evolutionary biologists (click 'Use Cases' at the above website). We invite community input in order to focus efforts on the most urgent or pervasive problems. The wiki for the hackathon allows direct editing of the use cases after registration. You may also upload data files, or add comments to the "Forum" page. Alternatively, send email to hlapp at nescent.org. You may also contact any of the organizers with questions or comments. ATTENDANCE: The hackathon is scheduled for Dec 11-15, 2006 in Durham NC. Space is limited, and attendance is by invitation. If you have not been contacted but desire to attend, please contact Hilmar Lapp (hlapp at nescent.org). 
ORGANIZERS: Hilmar Lapp (NESCent; hlapp at nescent.org) Aaron Mackey (GSK; aaron.j.mackey at gsk.com) Mark Holder (FSU; mholder at scs.fsu.edu) Arlin Stoltzfus (CARB, NIST; arlin.stoltzfus at nist.gov) Todd Vision (NESCent; tjv at bio.unc.edu) Rutger Vos (UBC; rvosa at sfu.ca) From neetisomaiya at gmail.com Thu Oct 12 02:03:20 2006 From: neetisomaiya at gmail.com (neeti somaiya) Date: Thu, 12 Oct 2006 11:33:20 +0530 Subject: [Bioperl-l] need help urgently Message-ID: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> We are using BioPerl to parse a BLAST output file, and then we want to load full alignments into a CLOB column in one of our database tables. We are trying to use sql loader for the same. Anyone has an idea how we can go about it? We have tried loading sequences into CLOB columns using sql loader, and that works fine, but the same syntax when used for loading alignments, is not working. -- -Neeti Even my blood says, B positive From neetisomaiya at gmail.com Thu Oct 12 02:03:20 2006 From: neetisomaiya at gmail.com (neeti somaiya) Date: Thu, 12 Oct 2006 11:33:20 +0530 Subject: [Bioperl-l] need help urgently Message-ID: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> We are using BioPerl to parse a BLAST output file, and then we want to load full alignments into a CLOB column in one of our database tables. We are trying to use sql loader for the same. Anyone has an idea how we can go about it? We have tried loading sequences into CLOB columns using sql loader, and that works fine, but the same syntax when used for loading alignments, is not working. 
-- -Neeti Even my blood says, B positive
From sayali_salodkar at persistent.co.in Thu Oct 12 06:16:34 2006 From: sayali_salodkar at persistent.co.in (Sayali) Date: Thu, 12 Oct 2006 15:46:34 +0530 Subject: [Bioperl-l] regarding polyphred output Message-ID: <006e01c6ede7$7f10bef0$00b1580a@persistent.co.in> Hi, I want to parse the output of polyphred http://droog.gs.washington.edu/PolyPhred.html. Is there a parser already available in Bioperl which would help me in doing the same. Thanks, Sayali
From sdavis2 at mail.nih.gov Thu Oct 12 06:40:12 2006 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Thu, 12 Oct 2006 06:40:12 -0400 Subject: [Bioperl-l] need help urgently In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> Message-ID: <200610120640.12250.sdavis2@mail.nih.gov> On Thursday 12 October 2006 02:03, neeti somaiya wrote: > We are using BioPerl to parse a BLAST output file, and then we want to load > full alignments into a CLOB column in one of our database tables. We are > trying to use sql loader for the same. Anyone has an idea how we can go > about it? > We have tried loading sequences into CLOB columns using sql loader, and > that works fine, but the same syntax when used for loading alignments, is > not working. Neeti, You'll need to be a bit more specific about what you are doing.
Can you post the code you are using and error messages? Also, what is "sql loader"? And what database are you trying to use? Sean From crabtree at tigr.ORG Thu Oct 12 07:28:06 2006 From: crabtree at tigr.ORG (Jonathan Crabtree) Date: Thu, 12 Oct 2006 07:28:06 -0400 Subject: [Bioperl-l] need help urgently In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> Message-ID: <452E26C6.6040800@tigr.org> Hi Neeti- neeti somaiya wrote: > We are using BioPerl to parse a BLAST output file, and then we want to load > full alignments into a CLOB column in one of our database tables. We are > trying to use sql loader for the same. Anyone has an idea how we can go > about it? > This doesn't sound like a BioPerl issue per se, so this list might not be the best venue for your question. Since SQL*Loader is an Oracle utility you may have better luck in a forum frequented by Oracle DBAs and/or general bioinformatics people. (Not that this isn't such a forum, but unless your difficulty is actually being caused by BioPerl, or there's some kind of SQL*Loader wrapper in BioPerl--which I don't think is the case--you run the risk of having people complain that your question doesn't have enough to do with BioPerl.) > We have tried loading sequences into CLOB columns using sql loader, and that > works fine, but the same syntax when used for loading alignments, is not > working. > It's been a while since I've done any work with SQL*Loader, but I'd guess that the reason it works with sequences and not alignments is because there are characters in the alignments (newlines, perhaps?) that SQL*Loader is incorrectly interpreting as either column (field) or row (record) delimiters. You may need to change your flat file encoding to use delimiters other than the defaults (and alter the SQL*Loader control file accordingly.) 
As Sean pointed out, however, it's difficult to be much help without seeing an example of a failed input and the corresponding error(s)! One other thing I remember about SQL*Loader (as of Oracle 8-9 or so) is that all the CLOB values had to appear *last* in the SQL*Loader record, at least if you were using variable-length fields. But since you've loaded sequences successfully, I doubt this is the issue. One final thought is that I believe SQL*Loader has an option whereby you can place your LOB values in files external to the main SQL*Loader input file, which sidesteps the field/row delimiter issue completely; you may want to look into this if you're not already loading your Oracle database this way. Jonathan From bix at sendu.me.uk Fri Oct 13 04:56:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 13 Oct 2006 09:56:01 +0100 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au> References: <4521E74E.1040404@infotech.monash.edu.au> Message-ID: <452F54A1.7010908@sendu.me.uk> Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's certainly interface-like, but doesn't follow the normal interface naming convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed WrapperBaseI? Left alone? From cjfields at uiuc.edu Fri Oct 13 08:20:58 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 13 Oct 2006 07:20:58 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <452F54A1.7010908@sendu.me.uk> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> Message-ID: <43CC4E80-8F15-4C83-929D-DDC719360C8F@uiuc.edu> I would say, according to BioPerl convention, it should be renamed WrapperBaseI. It has a few interface-like methods and (importantly) lacks a constructor. Unless someone else out there has other reasoning? Note that this will require lots of bioperl-run changes as well, at least I think it will. 
Chris On Oct 13, 2006, at 3:56 AM, Sendu Bala wrote: > Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's > certainly interface-like, but doesn't follow the normal interface > naming > convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed > WrapperBaseI? Left alone? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From avilella at gmail.com Fri Oct 13 11:26:47 2006 From: avilella at gmail.com (Albert Vilella) Date: Fri, 13 Oct 2006 16:26:47 +0100 Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method Message-ID: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com> Hi all, While using the remove_gaps method in Bio::SimpleAlign I found out that if the alignment is (bad enough for) having no columns without any gap at all, the method will give a: Use of uninitialized value in split at this line in add_seq: map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq); So my idea was to tweak this line to something like: map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || ''); But I am unsure about any other side effects this may have. Anyone? Albert. From cjfields at uiuc.edu Fri Oct 13 11:51:38 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 13 Oct 2006 10:51:38 -0500 Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method In-Reply-To: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com> References: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com> Message-ID: You can check to see if it passes all tests. I'm guessing SimpleAlign.t tests this method out in some way (though it's always safer to check). 
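A minimal plain-Perl sketch of the guard discussed in this thread, for anyone who wants to try it in isolation. It does not touch Bio::SimpleAlign: collect_symbols is a hypothetical stand-in for the symbol-tracking line in add_seq, and it tests with defined rather than || '' (an assumption about what is wanted; the two behave the same for undef and for the empty string, but defined would not discard a literal "0").

```perl
use strict;
use warnings;

# Hypothetical stand-in for the symbol bookkeeping in add_seq():
# record every distinct character seen in a sequence string.
sub collect_symbols {
    my ($symbols, $seq_string) = @_;
    # Guard: as reported above, a sequence can come back with no string
    # at all after gap removal; treat that case as an empty string.
    $seq_string = '' unless defined $seq_string;
    $symbols->{$_} = 1 for split //, $seq_string;
    return $symbols;
}

my %symbols;
collect_symbols(\%symbols, 'AC-GT');
collect_symbols(\%symbols, undef);   # no "uninitialized value" warning here
print join('', sort keys %symbols), "\n";
```

Whether an empty sequence should be added to the alignment at all is a separate question that the guard does not answer; running SimpleAlign.t against the patched module remains the real check.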
Chris On Oct 13, 2006, at 10:26 AM, Albert Vilella wrote: > Hi all, > > While using the remove_gaps method in Bio::SimpleAlign I found out > that if the alignment is (bad enough for) having no columns without > any gap at all, the method will give a: > > Use of uninitialized value in split at this line in add_seq: > > map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq); > > So my idea was to tweak this line to something like: > > map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || ''); > > But I am unsure about any other side effects this may have. > > Anyone? > > Albert. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jay at jays.net Fri Oct 13 12:09:16 2006 From: jay at jays.net (Jay Hannah) Date: Fri, 13 Oct 2006 11:09:16 -0500 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult In-Reply-To: References: Message-ID: <452FBA2C.7070003@jays.net> Thanks Brian! My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :) /home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v ---------------------------- revision 1.27 date: 2006/10/10 22:41:46; author: bosborne; state: Exp; lines: +4 -4 next_hit, not next_hits ---------------------------- I'm a simple man who takes great satisfaction in the simple things. :) j Brian Osborne wrote: > j, > > No need, not for something so simple. > > Brian O. > > > On 10/7/06 6:34 PM, "Jay Hannah" wrote: >> Except that "next_hits()" does not exist. Should be "next_hit()". >> >> (Should I have posted a patch instead?) > From jay at jays.net Fri Oct 13 12:24:48 2006 From: jay at jays.net (Jay Hannah) Date: Fri, 13 Oct 2006 11:24:48 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? 
Message-ID: <452FBDD0.2070008@jays.net> So I'm doing the following: 1) Using Bio::SeqIO to read in a genbank file and kick out fasta. 2) Reading that fasta file w/ command line formatdb. 3) Using that output for command line blastall. 4) Using Bio::SearchIO to read the blast results. (If there's a better way, do tell. -grin-) This sequence is working great for nucleotide BLASTing, but I'm stuck on step 1 when trying protein BLAST. my $seq_in = Bio::SeqIO->new( -file => " "genbank", -alphabet => "protein" ); my $seq_out_protein = Bio::SeqIO->new( -file => ">out", -format => 'fasta', -alphabet => 'protein' ); while (my $inseq = $seq_in->next_seq) { $inseq->molecule("protein"); $seq_out_protein->write_seq($inseq); } This creates a nucleotide file "out". Setting -alphabet doesn't seem to do anything. Setting molecule("protein") doesn't seem to do anything either. I was expecting that it would just pull all the CDS strings out of the genbank file and dump those into fasta format? Am I missing something obvious? Thanks, j From bosborne11 at verizon.net Fri Oct 13 12:54:02 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 13 Oct 2006 12:54:02 -0400 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <452FBDD0.2070008@jays.net> Message-ID: Jay, You're looking for the "translation" string in the CDS section, yes? You need to delve a bit into features, the CDS is considered to be a feature of the main or parent nucleotide sequence and the translation is part of CDS feature: http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank Brian O. 
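A hedged sketch of the approach Brian describes, pulling the translation tag out of each CDS feature and writing it as FASTA. The helper names fasta_record and write_cds_translations, the 60-column wrapping, and the choice of protein_id as the FASTA identifier are all assumptions made for illustration; only the Bio::SeqIO and feature calls (next_seq, get_SeqFeatures, primary_tag, has_tag, get_tag_values) are BioPerl's own.

```perl
use strict;
use warnings;

# Format one FASTA record, wrapping the sequence at 60 columns.
sub fasta_record {
    my ($id, $desc, $seq) = @_;
    my $header = (defined($desc) && length($desc)) ? ">$id $desc" : ">$id";
    (my $wrapped = $seq) =~ s/(.{1,60})/$1\n/g;
    return "$header\n$wrapped";
}

# Write the /translation of every CDS feature in a GenBank file as FASTA.
# Bio::SeqIO is loaded lazily so fasta_record() works without BioPerl.
sub write_cds_translations {
    my ($genbank_file, $out_fh) = @_;
    require Bio::SeqIO;
    my $in = Bio::SeqIO->new(-file => $genbank_file, -format => 'genbank');
    while (my $seq = $in->next_seq) {
        for my $feat ($seq->get_SeqFeatures) {
            next unless $feat->primary_tag eq 'CDS';
            next unless $feat->has_tag('translation');   # e.g. pseudogenes
            my ($protein) = $feat->get_tag_values('translation');
            my ($id) = $feat->has_tag('protein_id')
                     ? $feat->get_tag_values('protein_id')
                     : ($seq->display_id);
            my ($product) = $feat->has_tag('product')
                          ? $feat->get_tag_values('product')
                          : ('');
            print {$out_fh} fasta_record($id, $product, $protein);
        }
    }
}

# Usage (the file name is a placeholder):
#   write_cds_translations('input.gb', \*STDOUT);
```

The resulting file should be usable as formatdb input for a protein database; features without a translation tag are skipped rather than translated on the fly.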
On 10/13/06 12:24 PM, "Jay Hannah" wrote: > Am I missing something From bix at sendu.me.uk Fri Oct 13 12:59:46 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 13 Oct 2006 17:59:46 +0100 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult In-Reply-To: <452FBA2C.7070003@jays.net> References: <452FBA2C.7070003@jays.net> Message-ID: <452FC602.3080302@sendu.me.uk> Jay Hannah wrote: > Thanks Brian! > > My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :) > > /home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v > ---------------------------- > revision 1.27 > date: 2006/10/10 22:41:46; author: bosborne; state: Exp; lines: +4 -4 > next_hit, not next_hits > ---------------------------- Congratulations! :D Next it will be two byte corrections and from there, the sky's the limit! :) From hlapp at gmx.net Fri Oct 13 13:28:50 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 13 Oct 2006 13:28:50 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <452F54A1.7010908@sendu.me.uk> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> Message-ID: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> What does the POD (and the code) say about instantiating it? -hilmar On Oct 13, 2006, at 4:56 AM, Sendu Bala wrote: > Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's > certainly interface-like, but doesn't follow the normal interface > naming > convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed > WrapperBaseI? Left alone? 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From jay at jays.net Fri Oct 13 14:56:38 2006 From: jay at jays.net (Jay Hannah) Date: Fri, 13 Oct 2006 13:56:38 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: References: Message-ID: <452FE166.5080405@jays.net> Brian Osborne wrote: > You're looking for the "translation" string in the CDS section, yes? You > need to delve a bit into features, the CDS is considered to be a feature of > the main or parent nucleotide sequence and the translation is part of CDS > feature: > > http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank Yes. Thanks. I "rolled my own" -- I'm now doing this: while (my $inseq = $seq_in->next_seq) { my @features = $inseq->get_SeqFeatures(); foreach my $feat ( @features ) { next unless ($feat->primary_tag eq "CDS"); my @db_xrefs = $feat->annotation->get_Annotations("db_xref"); @db_xrefs = grep { /^GI:/ } @db_xrefs; die "Panic! More than one GI: db_xref?" if (@db_xrefs > 1); die "Panic! No GI: db_xref?" unless (@db_xrefs == 1); my $gi = $db_xrefs[0]; $gi =~ s/^GI://; my @translations = $feat->annotation->get_Annotations("translation"); die "Panic! More than one translation?" if (@translations > 1); my @protein_ids = $feat->annotation->get_Annotations("protein_id"); die "Panic! More than one protein_id?" if (@protein_ids > 1); my @product = $feat->annotation->get_Annotations("product"); die "Panic! More than one product?" if (@product > 1); print ">gi|$gi|gb|$protein_ids[0]|"; print $inseq->id . " $product[0]\n"; print "$translations[0]\n"; } } To generate a homebrew fasta file for a protein BLAST. 
I just thought that -alphabet and molecule() would do that stuff for me? What else would "protein" mean in those? Does anyone use -alphabet and/or molecule()? For what? How? Again, here's what I'm talking about: ========== my $seq_out_protein = Bio::SeqIO->new( -file => ">out", -format => 'fasta', -alphabet => 'protein' # No effect? ); while (my $inseq = $seq_in->next_seq) { $inseq->molecule("protein"); # No effect? ========== Thanks, j From bosborne11 at verizon.net Fri Oct 13 17:20:40 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 13 Oct 2006 17:20:40 -0400 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <452FE166.5080405@jays.net> Message-ID: Jay, Yes, people use the -alphabet parameter. If you set it to something then Bioperl will not try to determine whether the sequence is protein, rna, or dna and this is particularly useful when the sequence contains characters that Bioperl would object to (sequences with distasteful characters can be created by various applications, for example, or you might introduce some weird character for some reason). Setting the -alphabet would also speed up Bioperl a bit, for the same reason. Brian O. On 10/13/06 2:56 PM, "Jay Hannah" wrote: > > I just thought that -alphabet and molecule() would do that stuff for me? What > else would "protein" mean in those? From jay at jays.net Sat Oct 14 11:25:05 2006 From: jay at jays.net (Jay Hannah) Date: Sat, 14 Oct 2006 10:25:05 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: References: Message-ID: <45310151.5050901@jays.net> Brian Osborne wrote: > Yes, people use the -alphabet parameter. 
If you set it to something then > Bioperl will not try to determine whether the sequence is protein, rna, or > dna and this is particularly useful when the sequence contains characters > that Bioperl would object to (sequences with distasteful characters can be > created by various applications, for example, or you might introduce some > weird character for some reason). Setting the -alphabet would also speed up > Bioperl a bit, for the same reason. Huh. That's what I assumed when I stumbled into the -alphabet parameter. So I thought this would read the protein sequences out of my genbank file and write a fasta file for me: my $seq_in = Bio::SeqIO->new( -file => "<$file", -format => "genbank", -alphabet => "protein" # No effect? ); my $seq_out = Bio::SeqIO->new( -file => ">$outfile", -format => "fasta", -alphabet => "protein" # No effect? ); while (my $inseq = $seq_in->next_seq) { $inseq->molecule("protein"); # No effect? $seq_out->write_seq($inseq); } It didn't. Would it be a Good Thing if it did what I was expecting? (Like I said I rolled my own, but I'm always looking for ways to enhance BioPerl that other people might find useful... Someday I will contribute something useful, by golly. -grin-) (Background: I'm doing protein BLASTs from genbank files. To make formatdb happy I have to have fasta files full of the protein sequences.) j From bosborne11 at verizon.net Sat Oct 14 14:40:21 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Sat, 14 Oct 2006 14:40:21 -0400 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <45310151.5050901@jays.net> Message-ID: Jay, What you expected was that setting the -alphabet to "protein" would make Bioperl translate the input nucleotide sequence to output protein. In Bioperl this is accomplished by using the translate() method, no surprise there. 
If you take a look at the documentation on translate() in the online Bioperl Tutorial you'll see that this is a fairly sophisticated method, you can do all sorts of different things with it. So using -alphabet for this purpose won't really work, there are too many different ways to translate. Brian O. On 10/14/06 11:25 AM, "Jay Hannah" wrote: > Would it be a Good Thing if it did what I was expecting? From cjfields at uiuc.edu Sat Oct 14 20:44:04 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 14 Oct 2006 19:44:04 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <45310151.5050901@jays.net> Message-ID: <000601c6eff3$084663c0$15327e82@pyrimidine> ... > Huh. That's what I assumed when I stumbled into the -alphabet parameter. > So I thought this would read the protein sequences out of my genbank file > and write a fasta file for me: You have to think about it this way: the GenBank record you are using is for the nucleotide sequence only, and all other information in that record describes the sequence. Similarly, if you used a 'GenPept' sequence, the focus would be the protein sequence. Both normally contain annotations which describe the sequence globally, such as references, organism info, etc. Both also may contain features (or SeqFeatures), which describe a feature bound to a particular location on the sequence. However, features are not an absolute requirement for a sequence; they're sort of 'window dressing', albeit almost always essential for describing the main sequence. I would do exactly as Brian suggests. See the Feature/Annotation HOWTO for ideas on how to screen out the particular features you want and either grab the 'translation' tag data or get the sequence object from the feature and translate it directly. You should get the same result either way though getting the tag may be faster. ... > It didn't. Would it be a Good Thing if it did what I was expecting? 
(Like > I said I rolled my own, but I'm always looking for ways to enhance BioPerl > that other people might find useful... Someday I will contribute something > useful, by golly. -grin-) > > (Background: I'm doing protein BLASTs from genbank files. To make formatdb > happy I have to have fasta files full of the protein sequences.) > > j You could, theoretically, write up a method to only retrieve features which correspond to coding regions only (CDS). You may want to optionally screen out pseudogenes but that's up to you. Chris From avilella at gmail.com Sun Oct 15 07:08:23 2006 From: avilella at gmail.com (Albert Vilella) Date: Sun, 15 Oct 2006 12:08:23 +0100 Subject: [Bioperl-l] no_residues test in SimpleAlign.t Message-ID: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com> Hi all, Can somebody check the SimpleAlign.t test? perl t/SimpleAlign.t I get a few errors, I am looking at one that deals with no_residues. I don't understand if this is suposed to work: sub no_residues { my $self = shift; my $count = 0; foreach my $seq ($self->each_seq) { my $str = $seq->seq(); $count += ($str =~ s/[^A-Za-z]//g); #is this the same as: # $str =~ s/[^A-Za-z]//g; # $count += length($str); } Cheers, Albert. return $count; } From cjfields at uiuc.edu Sun Oct 15 13:53:50 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 15 Oct 2006 12:53:50 -0500 Subject: [Bioperl-l] no_residues test in SimpleAlign.t In-Reply-To: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com> References: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com> Message-ID: Albert, I get all 75 tests passing. SimpleAlign.t was recently switched over to Test::More, so you should be seeing more explicit test descriptions. It looks like test 27 is no_residues(). Were there any more that failed? I usually run 'perl -I. t/test.t' from the main bioperl directory to check individual tests from the local directory. 
Otherwise you are checking your installed version which may be older (and may not match tests and recent bug fixes). Could that be the problem? Chris On Oct 15, 2006, at 6:08 AM, Albert Vilella wrote: > Hi all, > > Can somebody check the SimpleAlign.t test? > > perl t/SimpleAlign.t > > I get a few errors, I am looking at one that deals with no_residues. I > don't understand if this is suposed to work: > > sub no_residues { > my $self = shift; > my $count = 0; > > foreach my $seq ($self->each_seq) { > my $str = $seq->seq(); > > $count += ($str =~ s/[^A-Za-z]//g); > #is this the same as: > # $str =~ s/[^A-Za-z]//g; > # $count += length($str); > } > > Cheers, > > Albert. > return $count; > } > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign
From bix at sendu.me.uk Mon Oct 16 04:08:34 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 09:08:34 +0100 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> Message-ID: <45333E02.9070808@sendu.me.uk> Hilmar Lapp wrote: > What does the POD (and the code) say about instantiating it? =head1 SYNOPSIS # do not use this object directly, it provides the following methods # for its subclasses ... =head1 DESCRIPTION This is a basic module from which to build executable wrapper modules. It has some basic methods to help when implementing new modules. There is no new() method. From bix at sendu.me.uk Mon Oct 16 09:23:41 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 14:23:41 +0100 Subject: [Bioperl-l] Bio::WebAgent sleep warning Message-ID: <453387DD.3040105@sendu.me.uk> Hi, Does anyone think it's appropriate for Bio::WebAgent to issue warnings every time it sleeps? I'd consider the sleeping part of its normal, expected and desired behaviour so I don't need to be warned about it.
> Perhaps change the $self->warn to a $self->debug? That sounds fine. Using debugging output for sleep would be similar behavior to Bio::DB::NCBIHelper and BioDB::GenericWebDBI. You may want to pass it by Heikki (I think that's his module). The only reason I would want to see sleep output, personally, is to make sure it is working properly. Almost looks like that class has the same intent that GenericWebDBI has (even down to using LWP::UserAgent as a superclass). I may look into it to see if I can use this as a superclass for GenericWebDBI. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Mon Oct 16 10:26:21 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Mon, 16 Oct 2006 15:26:21 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig Message-ID: <4533968D.6040009@sheffield.ac.uk> Did anyone reconfigure the bioperl web server (which ever server hosts http://bioperl.org/DIST) by adding the following lines to the httpd.conf file: RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*) http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1 This will be required as a workaround to a bug in ActivePerl 5.8.8.819 which will result in a failed install of Bioperl via PPM. Cheers Nath From n.haigh at sheffield.ac.uk Mon Oct 16 11:30:16 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Mon, 16 Oct 2006 16:30:16 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A257.2000207@campus.iztacala.unam.mx> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> Message-ID: <4533A588.9020505@sheffield.ac.uk> Mauricio Herrera Cuadra wrote: > Done. Could you please check if it works as it should? > > Cheers, > Mauricio. Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got someone to pop it in http://bioperl/DIST Volunteers? 
BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for the PPD? I seem to remember that there was talk about having to maintain a separate Bundle::BioPerl for each release of Bioperl. Any ideas on this front? Nath From arareko at campus.iztacala.unam.mx Mon Oct 16 11:16:39 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 16 Oct 2006 10:16:39 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533968D.6040009@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> Message-ID: <4533A257.2000207@campus.iztacala.unam.mx> Done. Could you please check if it works as it should? Cheers, Mauricio. Nathan Haigh wrote: > Did anyone reconfigure the bioperl web server (which ever server hosts > http://bioperl.org/DIST) by adding the following lines to the httpd.conf > file: > > RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*) > http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1 > > This will be required as a workaround to a bug in ActivePerl 5.8.8.819 > which will result in a failed install of Bioperl via PPM. > > Cheers > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From arareko at campus.iztacala.unam.mx Mon Oct 16 11:33:33 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 16 Oct 2006 10:33:33 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A588.9020505@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> Message-ID: <4533A64D.6040203@campus.iztacala.unam.mx> Nathan Haigh wrote: > Mauricio Herrera Cuadra wrote: >> Done.
Could you please check if it works as it should? >> >> Cheers, >> Mauricio. > Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got > someone to pop it in http://bioperl/DIST > > Volunteers? You can send it to me. -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From akarger at CGR.Harvard.edu Mon Oct 16 11:54:33 2006 From: akarger at CGR.Harvard.edu (Amir Karger) Date: Mon, 16 Oct 2006 11:54:33 -0400 Subject: [Bioperl-l] Bio::Location::Split Message-ID: I recently came across bug 2101, where Bio::Location::Split::to_FTstring gives the incorrect order for multi-sublocation locations on the minus strand. That is, I found it by getting incorrect results, and then found it in Bugzilla and in the September archives. I'm converting CDS files from one format to another. E.g., I read an EMBL file with a chromosome and CDS features, and want to output the location in a FASTA header. If I do something like: while (my $seq = $in->next_seq) { foreach my $feat ($seq->get_SeqFeatures) { print $feat->location->to_FTstring() } } I get the wrong results for multi-exon CDSs on the -1 strand, as described in the bug report. Is there a relatively easy way around this? I assume I can't get at the original string of the location, which in this case is all I need. Can I just flip the order of the exons in certain cases? Chris F, can you tell me the preliminary solution you mentioned? I must say I'm sort of surprised this wasn't found before. It seems like a not-that-rare occurrence. Oh well.
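One possible stopgap for the minus-strand ordering problem described above, sketched in plain Perl so that it does not depend on how to_FTstring behaves. ft_string is a hypothetical helper, not a BioPerl method. It assumes that every exon lies on the same strand and that the wanted feature-table form is complement(join(...)) with exons listed in ascending coordinate order; it is not the fix from the bug report.

```perl
use strict;
use warnings;

# Hypothetical helper: build a feature-table location string from exon
# coordinates. $strand is 1 or -1; @exons holds [start, end] pairs.
sub ft_string {
    my ($strand, @exons) = @_;
    # List exons in ascending order, whatever order they were supplied in.
    my @ranges = map  { "$_->[0]..$_->[1]" }
                 sort { $a->[0] <=> $b->[0] } @exons;
    my $str = @ranges > 1 ? 'join(' . join(',', @ranges) . ')' : $ranges[0];
    return $strand < 0 ? "complement($str)" : $str;
}

# With BioPerl, the pairs might be gathered like this (untested assumption):
#   my @exons = map { [ $_->start, $_->end ] } $feat->location->sub_Location;
print ft_string(-1, [300, 400], [100, 200]), "\n";
```

Because the string is rebuilt from start and end coordinates alone, fuzzy positions (such as a leading < or trailing >) and remote sub-locations would be lost, so this is only suitable where the exons are simple ranges.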
Thanks, - Amir Karger Research Computing Life Sciences Division Harvard University From bix at sendu.me.uk Mon Oct 16 12:14:39 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 17:14:39 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A588.9020505@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> Message-ID: <4533AFEF.8080103@sendu.me.uk> Nathan Haigh wrote: > Mauricio Herrera Cuadra wrote: >> Done. Could you please check if it works as it should? >> >> Cheers, >> Mauricio. > Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got > someone to pop it in http://bioperl/DIST > > Volunteers? I'm sure Mauricio would be happy to do it, but so am I. You may want to hold off a little while until I release rc2, which may be a few hours away. > BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for > the PPD? I seem to remember that there was talk about having to maintain > a separate Bundle::BioPerl for each release of Bioperl. Any ideas on > this front? It depends on what is in the PPD and what kind of auto-dependency features the ActiveState installer has. Given Perl 5.8 and your current PPD, does Bioperl install with the same or fewer number of skips if you also install Bundle::BioPerl first? That is, does Bundle::BioPerl even do anything useful anymore? If not, obviously don't bother making it a pre-req. If it does, my opinion is that you make it a pre-req. If people really don't want to install the optional stuff they can download the .zip file and install manually without even a make. From Kevin.M.Brown at asu.edu Mon Oct 16 12:14:51 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Mon, 16 Oct 2006 09:14:51 -0700 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? Message-ID: <1A4207F8295607498283FE9E93B775B402196FAA@EX02.asurite.ad.asu.edu> > > Yes, people use the -alphabet parameter. 
If you set it to something then > > Bioperl will not try to determine whether the sequence is protein, rna, or > > dna and this is particularly useful when the sequence contains characters > > that Bioperl would object to (sequences with distasteful characters can be > > created by various applications, for example, or you might introduce some > > weird character for some reason). Setting the -alphabet would also speed up > > Bioperl a bit, for the same reason. > > Huh. That's what I assumed when I stumbled into the -alphabet parameter. > So I thought this would read the protein sequences out of my genbank file > and write a fasta file for me: > > my $seq_in = Bio::SeqIO->new( > -file => "<$file", > -format => "genbank", > -alphabet => "protein" # No effect? > ); > my $seq_out = Bio::SeqIO->new( > -file => ">$outfile", > -format => "fasta", > -alphabet => "protein" # No effect? > ); > while (my $inseq = $seq_in->next_seq) { > $inseq->molecule("protein"); # No effect? > $seq_out->write_seq($inseq); > } > > It didn't. Would it be a Good Thing if it did what I was expecting? (Like > I said I rolled my own, but I'm always looking for ways to enhance BioPerl > that other people might find useful... Someday I will contribute something > useful, by golly. -grin-) > > (Background: I'm doing protein BLASTs from genbank files. To make formatdb > happy I have to have fasta files full of the protein sequences.) This might work for your needs (CDS to protein FASTA). my $seq_in = Bio::SeqIO->new( -file => "<$file", -format => "genbank", ); open my $seq_out, ">", $outfile or die "Cannot open $outfile: $!"; while (my $inseq = $seq_in->next_seq) { print $seq_out ">". $inseq->display_id(). "\n"; print $seq_out $inseq->translate()->seq() ."\n"; } From bix at sendu.me.uk Mon Oct 16 11:44:19 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 16:44:19 +0100 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
Message-ID: <4533A8D3.90709@sendu.me.uk> I think Chris recently deprecated this, but should it be? For me, its POD description justifies its existence, and perhaps more importantly, Bio::Index::Blast relies on it. I took a quick peek at the latter and it didn't seem trivial to move it over to Bio::SearchIO instead. Should it be undeprecated? From n.haigh at sheffield.ac.uk Mon Oct 16 12:39:02 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Mon, 16 Oct 2006 17:39:02 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533AFEF.8080103@sendu.me.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> <4533AFEF.8080103@sendu.me.uk> Message-ID: <4533B5A6.1070701@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: >> Mauricio Herrera Cuadra wrote: >>> Done. Could you please check if it works as it should? >>> >>> Cheers, >>> Mauricio. >> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got >> someone to pop it in http://bioperl/DIST >> >> Volunteers? > > I'm sure Mauricio would be happy to do it, but so am I. You may want > to hold off a little while until I release rc2, which may be a few > hours away. Just e-mailed Mauricio links to the files off list, It's not a big job for me to remake the bioperl PPD, so Mauricio it's up to you if you want to wait 18hrs for me to make the PPDs for 1.5.2-rc2. > > >> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for >> the PPD? I seem to remember that there was talk about having to maintain >> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on >> this front? > > It depends on what is in the PPD and what kind of auto-dependency > features the ActiveState installer has. Given Perl 5.8 and your > current PPD, does Bioperl install with the same or fewer number of > skips if you also install Bundle::BioPerl first? That is, does > Bundle::BioPerl even do anything useful anymore? 
If not, obviously > don't bother making it a pre-req. If it does, my opinion is that you > make it a pre-req. If people really don't want to install the optional > stuff they can download the .zip file and install manually without > even a make. As far as the PPDs are concerned - no tests are run during installation. PPM more or less just copies files into the correct place for Perl to find so both approaches result in the same thing. However, I've not tried making a CPAN distribution file for either Bioperl or Bundle::BioPerl - I wouldn't know where to start! Makefile.PL now only documents the prereqs in one place (%packages), and this is used to add the prereqs to the bioperl PPD when issuing "nmake ppd". This way, each release of BioPerl should be up-to-date with prereqs as long as developers add their modules' prereqs to %packages. If we have Bundle::BioPerl, most of those prereqs need to be moved from the Bioperl PPD to the Bundle::BioPerl PPD - a bit of a pain because there are no guidelines as to what should/should not go in Bundle::BioPerl. Therefore, as far as the PPDs are concerned, it is far easier to do away with Bundle::BioPerl. Nath From hlapp at gmx.net Mon Oct 16 13:04:24 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 16 Oct 2006 13:04:24 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <45333E02.9070808@sendu.me.uk> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> <45333E02.9070808@sendu.me.uk> Message-ID: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net> So it looks like an abstract base class, not an interface that defines a contract or API? Should use Root.pm then, would be my vote. -hilmar On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> What does the POD (and the code) say about instantiating it? > > =head1 SYNOPSIS > > # do not use this object directly, it provides the following > methods > # for its subclasses > > ...
> > > =head1 DESCRIPTION > > This is a basic module from which to build executable wrapper modules. > It has some basic methods to help when implementing new modules. > > > There is no new() method. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Mon Oct 16 13:08:28 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 16 Oct 2006 13:08:28 -0400 Subject: [Bioperl-l] Bio::WebAgent sleep warning In-Reply-To: <453387DD.3040105@sendu.me.uk> References: <453387DD.3040105@sendu.me.uk> Message-ID: It depends. What triggers the sleeping? If it's part of every request that it processes then I'd agree. If it is triggered by failure to precede the next try then the failure is probably not expected (though possible), and hence should be reported by warn(). If it is just part of the polling cycle then there should probably be a limit up to which the time waited is considered 'normal' and after which it is considered 'excessive' and hence should be reported through warn(). My $0.02. -hilmar On Oct 16, 2006, at 9:23 AM, Sendu Bala wrote: > Hi, > > Does anyone think it's appropriate for Bio::WebAgent to issue warnings > every time it sleeps? I'd consider the sleeping part of its normal, > expected and desired behaviour so I don't need to be warned about it. > Perhaps change the $self->warn to a $self->debug? 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Mon Oct 16 13:13:53 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 18:13:53 +0100 Subject: [Bioperl-l] Bio::WebAgent sleep warning In-Reply-To: References: <453387DD.3040105@sendu.me.uk> Message-ID: <4533BDD1.8060204@sendu.me.uk> Hilmar Lapp wrote: > It depends. What triggers the sleeping? If it's part of every request > that it processes then I'd agree. If it is triggered by failure to > precede the next try then the failure is probably not expected (though > possible), and hence should be reported by warn(). > > If it is just part of the polling cycle then there should probably be a > limit up to which the time waited is considered 'normal' and after which > it is considered 'excessive' and hence should be reported through warn(). =head2 sleep Title : sleep Usage : $self->sleep Function: sleep for a number of seconds indicated by the delay policy Returns : none Args : none NOTE: This method keeps track of the last time it was called and only imposes a sleep if it was called more recently than the delay_policy() allows. =cut It issues a warning every time it actually sleeps. I find it inappropriate that a method warns me that it did what I asked it to do. 
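A minimal sketch of the reporting policy Hilmar describes, in plain Perl: a wait inside the normal delay policy is reported only at debug level, and a wait beyond some threshold is reported with warn(). This is not the Bio::WebAgent code; the package My::ThrottledAgent, its -delay, -warn_after and -verbose arguments, and the report_level() helper are all hypothetical names made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a throttled web agent; NOT Bio::WebAgent itself.
package My::ThrottledAgent;

sub new {
    my ($class, %args) = @_;
    my %self = (
        delay      => 3,    # seconds required between two requests
        warn_after => 10,   # waits longer than this count as 'excessive'
        verbose    => 0,    # print debug messages when true
        last_call  => 0,    # epoch time of the previous request
    );
    $self{delay}      = $args{-delay}      if defined $args{-delay};
    $self{warn_after} = $args{-warn_after} if defined $args{-warn_after};
    $self{verbose}    = $args{-verbose}    if defined $args{-verbose};
    return bless \%self, $class;
}

# Decide how a pending wait of $wait seconds should be reported.
sub report_level {
    my ($self, $wait) = @_;
    return 'none' if $wait <= 0;                    # no sleep needed
    return 'warn' if $wait > $self->{warn_after};   # unexpectedly long
    return 'debug';                                 # normal throttling
}

# Sleep only if the previous call was more recent than the delay allows.
sub sleep_if_needed {
    my ($self) = @_;
    my $wait  = $self->{last_call} + $self->{delay} - time;
    my $level = $self->report_level($wait);
    if ($level eq 'warn') {
        warn "sleeping for $wait seconds (longer than expected)\n";
    }
    elsif ($level eq 'debug' && $self->{verbose}) {
        print STDERR "debug: sleeping for $wait seconds\n";
    }
    sleep $wait if $wait > 0;
    $self->{last_call} = time;
    return $wait > 0 ? $wait : 0;
}

package main;

my $agent = My::ThrottledAgent->new(-delay => 1);
$agent->sleep_if_needed;   # first call: nothing to wait for
$agent->sleep_if_needed;   # immediate second call: sleeps quietly
```

Keeping the decision in report_level() separates the policy (what counts as excessive) from the act of sleeping, so the threshold can be checked without actually waiting.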
From arareko at campus.iztacala.unam.mx Mon Oct 16 13:14:06 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 16 Oct 2006 12:14:06 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533B5A6.1070701@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> <4533AFEF.8080103@sendu.me.uk> <4533B5A6.1070701@sheffield.ac.uk> Message-ID: <4533BDDE.2040801@campus.iztacala.unam.mx> Nathan Haigh wrote: > Sendu Bala wrote: >> Nathan Haigh wrote: >>> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got >>> someone to pop it in http://bioperl/DIST >>> >>> Volunteers? >> I'm sure Mauricio would be happy to do it, but so am I. You may want >> to hold off a little while until I release rc2, which may be a few >> hours away. > > Just e-mailed Mauricio links to the files off list, It's not a big job > for me to remake the bioperl PPD, so Mauricio it's up to you if you want > to wait 18hrs for me to make the PPDs for 1.5.2-rc2. Too late, I've already placed 1.5.2-rc1 in DIST. hehe :) -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From bix at sendu.me.uk Mon Oct 16 12:32:11 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 17:32:11 +0100 Subject: [Bioperl-l] Swissprot problems Message-ID: <4533B40B.2030908@sendu.me.uk> t/Biofetch.t and t/DB.t are skipping their swissprot database fetches. Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for maintenance but is now back up. However I'm guessing the databases must have changed. I've manually looked for the test case 'YNB3_YEAST' in database 'UniProtKB' and it came back with no result, even though I can find the test case manually at the expasy website. Is this an EBI bug or deliberate change that makes sense to someone?
From m.weimer at dkfz-heidelberg.de Mon Oct 16 12:43:38 2006 From: m.weimer at dkfz-heidelberg.de (Marc Weimer) Date: Mon, 16 Oct 2006 18:43:38 +0200 Subject: [Bioperl-l] Bio::DB::SwissProt Problem Message-ID: <1161017019.5203.6.camel@localhost> Dear list members, when running ###################################################################### #! /usr/bin/perl -w use strict; use Bio::DB::SwissProt; my $db_obj = new Bio::DB::SwissProt(-verbose => 1); my $seq_obj = $db_obj->get_Seq_by_acc("O02938"); ###################################################################### using Bioperl 1.5.2 I get the following message: ########################################################################################## request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch Content-Length: 49 Content-Type: application/x-www-form-urlencoded format=swissprot&db=UniProtKB&style=raw&id=O02938 ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: acc O02938 does not exist STACK: Error::throw STACK: Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350 STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181 STACK: ./get.test.pl:8 ----------------------------------------------------------- ########################################################################################## But the accession number does exist. Surprisingly, everything worked fine a few days ago. Any ideas of what might have happened? Thanks and best regards, Marc From hlapp at gmx.net Mon Oct 16 13:15:50 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 16 Oct 2006 13:15:50 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533A8D3.90709@sendu.me.uk> References: <4533A8D3.90709@sendu.me.uk> Message-ID: The problem is it is not maintained, and there are outstanding bug reports. If you un-deprecate it, then we need a response to people who come across problems with it when using it.
Either you change the POD to say exactly who and when one should use it (or rather not) and point to the fact that it is unsupported for all other cases. Or what would you suggest? -hilmar On Oct 16, 2006, at 11:44 AM, Sendu Bala wrote: > I think Chris recently deprecated this, but should it be? For me, its > POD description justifies its existence, and perhaps more importantly, > Bio::Index::Blast relies on it. > > I took a quick peek at the latter and it didn't seem trivial to > move it > over to Bio::SearchIO instead. > > Should it be undeprecated? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Oct 16 13:21:46 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:21:46 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533A8D3.90709@sendu.me.uk> Message-ID: <000001c6f147$8efdfd60$15327e82@pyrimidine> Bio::Tools::BPlite was placed on the deprecation list a while back (~ rel 1.5); the other related Bio::Tools::BP* modules were also supposed to be on that list as well. If we want to undeprecate (de-deprecate? reprecate?) BPlite we also would need to do the same for the others. They must be updated to parse current BLAST/PSI-BLAST/bl2seq text output, something that Bio::SearchIO::blast is currently capable of (so the functionality is redundant). And someone needs to take them over. In my opinion it may be more trouble than it's worth as they haven't been touched in a while. Seems if we 'revive' BPlite we're not really moving forward esp. since you have added the PullParser recently and made substantial improvements to SearchIO. 
Maybe Bio::Index::Blast just needs to be deprecated or rewritten to use SearchIO? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Monday, October 16, 2006 10:44 AM > To: bioperl-l > Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? > > I think Chris recently deprecated this, but should it be? For me, its > POD description justifies its existence, and perhaps more importantly, > Bio::Index::Blast relies on it. > > I took a quick peek at the latter and it didn't seem trivial to move it > over to Bio::SearchIO instead. > > Should it be undeprecated? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Mon Oct 16 13:21:58 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 18:21:58 +0100 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: References: <4533A8D3.90709@sendu.me.uk> Message-ID: <4533BFB6.5070504@sendu.me.uk> Hilmar Lapp wrote: > The problem is it is not maintained, and there are outstanding been bug > reports. > > If you un-deprecate it, then we need a response to people who come > across problems with it when using it. Either you change the POD to say > exactly who and when one should use it (or rather not) and point to the > fact that it is unsupported for all other cases. > > Or what would you suggest? I'm not sure. Does Bio::Index::Blast even work correctly? Does it suffer from whatever bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should that be deprecated as well? Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO and then Bio::Tools::BPlite can be deprecated. 
But like I say, it didn't seem trivial (or even appropriate). Ultimately I just wanted to solve the warnings in the test suite. Thoughts, Chris? From cjfields at uiuc.edu Mon Oct 16 13:30:05 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:30:05 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A588.9020505@sheffield.ac.uk> Message-ID: <000101c6f148$b8538b20$15327e82@pyrimidine> > Mauricio Herrera Cuadra wrote: > > Done. Could you please check if it works as it should? > > > > Cheers, > > Mauricio. > Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got > someone to pop it in http://bioperl/DIST > > Volunteers? > > BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for > the PPD? I seem to remember that there was talk about having to maintain > a separate Bundle::BioPerl for each release of Bioperl. Any ideas on > this front? > > Nath Nathan, I think Chris Dagdigian still maintains Bundle::Bioperl on CPAN. That version should be the common basis for prereqs for any Bioperl core installation. It's relatively easy to add/remove modules to the Bundle::Bioperl. Contact Chris D. and let him know if anything needs to be changed. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 13:33:50 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:33:50 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net> Message-ID: <000201c6f149$3ed63490$15327e82@pyrimidine> > So it looks like an abstract base class, not an interface that > defines a contract or API? Should use Root.pm then, would be my vote. > > -hilmar Makes sense to me. Maybe another audit is needed to catch similar instances, or has this been done already? 
Chris > On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote: > > > Hilmar Lapp wrote: > >> What does the POD (and the code) say about instantiating it? > > > > =head1 SYNOPSIS > > > > # do not use this object directly, it provides the following > > methods > > # for its subclasses > > > > ... > > > > > > =head1 DESCRIPTION > > > > This is a basic module from which to build executable wrapper modules. > > It has some basic methods to help when implementing new modules. > > > > > > There is no new() method. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 13:57:35 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:57:35 -0500 Subject: [Bioperl-l] Bio::Location::Split In-Reply-To: Message-ID: <000301c6f14c$8fb0e060$15327e82@pyrimidine> > I recently came across bug 2101, where Bio::Location::Split::to_FTstring > gives the incorrect order for multi-sublocation locations on the minus > strand. That is, I found it by getting incorrect results, and then found > it in Bugzilla and in the September archives. > > I'm converting CDS files from one format to another. E.g., I read an > EMBL file with a chromosome and CDS features, and want to output the > location in a FASTA header. If I do something like: > > foreach (<$in>) { > foreach my $feat ($seq->getSeqFeatures) { > print $feat->location->to_FTstring() > } > } > > I get the wrong results for multi-exon CDSs on the -1 strand, as > described in the bug report. > > Is there a relatively easy way around this? I assume I can't get at the > original string of the location, which in this case is all I need. Can I > just flip the order of the exons in certain cases? Chris F, can you tell > me the preliminary solution you mentioned? > > I must say I'm sort of surprised this wasn't found before. It seems like > a not-that-rare occurrence. Oh well. 
> > Thanks, > > - Amir Karger > Research Computing > Life Sciences Division > Harvard University Could you let me know specifically which EMBL file contains the odd locations? The bug report uses theoretical locations, not actual ones, so it would be nice to have a real-world example to test against. As for the lack of catching this, the particular types of locations that cause the issue are quite rare. Note that there are two bugs for that bug report. The first (and more serious) is still unresolved. The second (where remote locations are treated differently in Location::Split, which caused more problems than it was worth) had a fix committed about a month ago. Any fixes I have made for the first bug invariably break several other methods, which use the current Location::Split object logic for retrieving sequences, building feature strings, etc. Since a new RC is imminent and the bug only affects a small number of locations, I have held off until after a final release is made (the last thing I want to do is fix something that breaks ~6-8 other methods), but I'll try looking at it again this week. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 14:29:02 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 13:29:02 -0500 Subject: [Bioperl-l] Swissprot problems In-Reply-To: <4533B40B.2030908@sendu.me.uk> Message-ID: <000401c6f150$f57dfc30$15327e82@pyrimidine> > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Monday, October 16, 2006 11:32 AM > To: bioperl-l > Subject: [Bioperl-l] Swissprot problems > > t/Biofetch.t and t/DB.t are skipping their swissprot database fetches. > Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for > maintenance but is now back up. However I'm guessing the databases must > have changed. 
I've manually looked for the test case 'YNB3_YEAST' in > database 'UniProtKB' and it came back with no result, even though I can > find the test case manually at the expasy website. > > Is this an EBI bug or deliberate change that makes sense to someone? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l I can confirm that. It's not our end, though. Entering the same data on the DBFetch web page also gets no data. I have emailed EBI about the problem and will let you know if I hear anything; I think it's related to the maintenance issue. Notably, nothing on the web page indicates any database name changes yet. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 14:29:52 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 13:29:52 -0500 Subject: [Bioperl-l] Bio::DB::SwissProt Problem In-Reply-To: <1161017019.5203.6.camel@localhost> Message-ID: <000501c6f151$12918710$15327e82@pyrimidine> We think there is a problem on the SwissProt (DBFetch) server. I have contacted them about the problem and will post something when I hear something back. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Marc Weimer > Sent: Monday, October 16, 2006 11:44 AM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Bio::DB::SwissProt Problem > > Dear list members, > > when running > > ###################################################################### > #! 
/usr/bin/perl -w > > use strict; > use Bio::DB::SwissProt; > > my $db_obj = new Bio::DB::SwissProt(-verbose => 1); > > my $seq_obj = $db_obj->get_Seq_by_acc("O02938"); > ###################################################################### > > using Bioperl 1.5.2 I get the following message: > > ########################################################################## > ################ > > request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch > Content-Length: 49 > Content-Type: application/x-www-form-urlencoded > > format=swissprot&db=UniProtKB&style=raw&id=O02938 > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: acc O02938 does not exist > STACK: Error::throw > STACK: > Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350 > STACK: > Bio::DB::WebDBSeqI::get_Seq_by_acc > /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181 > STACK: ./get.test.pl:8 > ----------------------------------------------------------- > > ########################################################################## > ################ > > But the accession number does exist. Surprisingly, everything worked > fine a few days ago. Any ideas of what might have happened? > > Thanks and best regards, > > Marc > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Mon Oct 16 14:39:28 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 13:39:28 -0500 Subject: [Bioperl-l] SwissProt Down Message-ID: <000601c6f152$6997dbd0$15327e82@pyrimidine> Looks like the swissprot problem stems from maintenance at EBI. From the EBI page http://www.ebi.ac.uk/Information/ (not on the DBFetch page, BTW): Please Note: Monday October 16th 12:00-15:00 - Due to general maintenance, some services from the EBI may be temporarily unavailable. We apologise for any inconvenience. 
At least we know that Test::More skips are working! Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Mon Oct 16 14:51:31 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 19:51:31 +0100 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: References: Message-ID: <4533D4B3.2000809@sendu.me.uk> Brian Osborne wrote: > Sendu, > > I just made a commit that makes Bio::Index::Blast use SearchIO instead of > BPlite. I was concerned about the whole id_parser thing. Did you determine that your change still allows for id_parser to be used and have the intended effect, or that id_parser is in someway meaningless and should be removed as a method? From cjfields at uiuc.edu Mon Oct 16 15:03:33 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 14:03:33 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533BFB6.5070504@sendu.me.uk> Message-ID: <000301c6f155$c7029ff0$15327e82@pyrimidine> > Hilmar Lapp wrote: > > The problem is it is not maintained, and there are outstanding been bug > > reports. > > > > If you un-deprecate it, then we need a response to people who come > > across problems with it when using it. Either you change the POD to say > > exactly who and when one should use it (or rather not) and point to the > > fact that it is unsupported for all other cases. > > > > Or what would you suggest? > > I'm not sure. > > Does Bio::Index::Blast even work correctly? Does it suffer from whatever > bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should > that be deprecated as well? > > Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO > and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't > seem trivial (or even appropriate). > > Ultimately I just wanted to solve the warnings in the test suite. > Thoughts, Chris? 
My opinion is we either have to completely support BPlite (and the others) or drop it altogether. I don't think we can state "use BPLite only with Bio::Index::Blast, use SearchIO everywhere else." That's too inconsistent. It seems simpler to deprecate the various Bio::Tools::BP* classes and either fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working on) or deprecate Bio::Index::Blast as well. The warnings in the test suite belong to BlastIndex.t, correct? I updated using Brian's Bio::Index::blast fix and it passes now w/o warnings. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From akarger at CGR.Harvard.edu Mon Oct 16 15:00:28 2006 From: akarger at CGR.Harvard.edu (Amir Karger) Date: Mon, 16 Oct 2006 15:00:28 -0400 Subject: [Bioperl-l] Bio::Location::Split Message-ID: > -----Original Message----- > From: Chris Fields [mailto:cjfields at uiuc.edu] > > > > I'm converting CDS files from one format to another. E.g., I read an > > EMBL file with a chromosome and CDS features, and want to output the > > location in a FASTA header.> > > > I get the wrong results for multi-exon CDSs on the -1 strand, as > > described in the bug report. > > > > Could you let me know specifically which EMBL file contains the odd > locations? The bug report uses theoretical locations, not > actual ones, so > it would be nice to have a real-world example to test against. I downloaded candida glabrata chromosome B from EBI: http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948 testportal>perl location.pl new_glabrata_B.embl > bio testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/' new_glabrata_B.embl > nonbio testportal>wc bio nonbio 217 217 4537 bio 217 217 4549 nonbio 434 434 9086 total testportal>diff bio nonbio 4c4 < complement(join(10632..11157,10347..10372)) --- > join(complement(10632..11157),complement(10347..10372)) Just one example here, but see below. 
> As for the lack of catching this, the particular types of > locations that > cause the issue are quite rare. Really? I guess our definitions of rare depend on which sequences we're working with. I'm doing fungal genomes, and here's a grep for a few species' entire genomes: testportal>foreach i ( *.embl ) foreach? echo $i foreach? grep CDS $i | grep join | grep -c complement foreach? end glabrata_orf.embl 29 hansenii_orf.embl 151 lactis_orf.embl 70 lipolytica_orf.embl 337 pombe_orf.embl 1137 You might like to use pombe as a test case, as it has lots of these complement joins, including ones with multiple introns. Anyway, I'd question the "rare" designation. It seems to me like any species that has introns will have situations like this in their CDSs. Not to mention any other sequence that uses Bio::Location::Split. (Since I'm not a Real Biologist, I can't think up more examples here, but I'm sure they exist.) Or are you saying it's rare to use join (complement(C..D), complement(A..B)) instead of complement(join(A..B, C..D))? In that case, I guess I just got really unlucky in that the five fungal genomes I was using decided to use the "rare" syntax. > Note that there are two bugs > for that bug > report. The first (and more serious) is still unresolved. The second > (where remote locations are treated differently in > Location::Split, which > caused more problems than it was worth) had a fix committed > about a month > ago. Sadly, it's the first (and in my case, more common (I have no remote locations.)) bug for me. > Any fixes I have made for the first bug invariably break several other > methods, which use the current Location::Split object logic > for retrieving > sequences, building feature strings, etc.
Since a new RC is > imminent and > the bug only affects a small number of locations, I have held > off until > after a final release is made (the last thing I want to do is > fix something > that breaks ~6-8 other methods), but I'll try looking at it > again this week. IMO this is a pretty serious bug (if these kinds of sequences aren't that rare as I've shown above), because you're outputting sequence descriptions that are just plain wrong. Anyone who uses FTLocationFactory to read these output description will have incorrect sequence, incorrect translated proteins, etc. And it's even more serious if other methods are depending on it. I know I can't dictate your time, and should be volunteering to work on fixing it. But if it affects other modules, then I will no doubt break things even more than you have in your attempts. -Amir > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > From bosborne11 at verizon.net Mon Oct 16 14:25:14 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 14:25:14 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533A8D3.90709@sendu.me.uk> Message-ID: Sendu, I just made a commit that makes Bio::Index::Blast use SearchIO instead of BPlite. The BlastIndex.t test is giving a few warnings so I need to take a look at that but all tests are passing. An awful lot of work has gone into the SearchIO system, for more on why its approach is deemed to be superior in the context of Bioperl see the SearchIO HOWTO. One key feature of this upcoming release is an emphasis on removing extraneous modules, I think it's safe to say that BPlite has been considered extraneous for a number of years now. Brian O. On 10/16/06 11:44 AM, "Sendu Bala" wrote: > I think Chris recently deprecated this, but should it be? For me, its > POD description justifies its existence, and perhaps more importantly, > Bio::Index::Blast relies on it. 
> > I took a quick peek at the latter and it didn't seem trivial to move it > over to Bio::SearchIO instead. > > Should it be undeprecated? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bosborne11 at verizon.net Mon Oct 16 14:59:38 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 14:59:38 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533D4B3.2000809@sendu.me.uk> Message-ID: Sendu, OK. I _think_ this change shouldn't affect id_parser() but I will test this in BlastIndex.t. The id_parser() method is relevant to all these Index* modules - don't know how much it's used but it certainly is nice to have it available. Brian O. On 10/16/06 2:51 PM, "Sendu Bala" wrote: > Brian Osborne wrote: >> Sendu, >> >> I just made a commit that makes Bio::Index::Blast use SearchIO instead of >> BPlite. > > I was concerned about the whole id_parser thing. Did you determine that > your change still allows for id_parser to be used and have the intended > effect, or that id_parser is in someway meaningless and should be > removed as a method? From cjfields at uiuc.edu Mon Oct 16 16:51:08 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 15:51:08 -0500 Subject: [Bioperl-l] Bio::Location::Split In-Reply-To: Message-ID: <000001c6f164$d1380190$15327e82@pyrimidine> ... 
> I downloaded candida glabrata chromosome B from EBI:
> http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948
>
> testportal>perl location.pl new_glabrata_B.embl > bio
> testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/' new_glabrata_B.embl > nonbio
> testportal>wc bio nonbio
>      217     217    4537 bio
>      217     217    4549 nonbio
>      434     434    9086 total
> testportal>diff bio nonbio
> 4c4
> < complement(join(10632..11157,10347..10372))
> ---
> > join(complement(10632..11157),complement(10347..10372))
>
> Just one example here, but see below.
>
> > As for the lack of catching this, the particular types of locations that
> > cause the issue are quite rare.
>
> Really? I guess our definitions of rare depend on which sequences we're
> working with. I'm doing fungal genomes, and here's a grep for a few
> species' entire genomes:
>
> testportal>foreach i ( *.embl )
> foreach? echo $i
> foreach? grep CDS $i | grep join | grep -c complement
> foreach? end
> glabrata_orf.embl
> 29
> hansenii_orf.embl
> 151
> lactis_orf.embl
> 70
> lipolytica_orf.embl
> 337
> pombe_orf.embl
> 1137
>
> You might like to use pombe as a test case, as it has lots of these
> complement joins, including ones with multiple introns.

I'll use those. I'll see if an analogous GenBank file exists as well. I can
probably make a preliminary fix for to_FTstring() so that it arranges the
sublocations correctly, but I think the best way to go is to have
FTLocationFactory not modify the various sublocations to start with, which it
currently does when it sets strand() (strand() propagates the strand info to
sublocations).

> Anyway, I'd question the "rare" designation. It seems to me like any
> species that has introns will have situations like this in their CDSs.
> Not to mention any other sequence that uses Bio::Location::Split. (Since
> I'm not a Real Biologist, I can't think up more examples here, but I'm
> sure they exist.)

I think that additional tests are definitely needed for pulling out sequences.
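
To make the reordering concrete, here is a minimal sketch in plain Perl of how the EMBL-style string in the diff above maps onto the complement(join(...)) form. It is only an illustration: complement_join_form() is a hypothetical helper, not part of Bioperl and not the fix discussed in this thread, and it assumes a flat join of simple start..end ranges (no fuzzy or remote locations).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (NOT a Bioperl method): rewrite
#   join(complement(A),complement(B),...)
# as the equivalent
#   complement(join(...,B,A))
# Anything that is not a flat join of complemented simple ranges is
# returned unchanged.
sub complement_join_form {
    my ($loc) = @_;
    my ($body) = $loc =~ /^join\((.*)\)$/ or return $loc;
    my @ranges;
    for my $part ( split /,/, $body ) {
        # every segment must be a complemented start..end range
        $part =~ /^complement\((\d+\.\.\d+)\)$/ or return $loc;
        push @ranges, $1;
    }
    # moving complement() outside the join reverses the segment order
    return 'complement(join(' . join( ',', reverse @ranges ) . '))';
}

my $embl = 'join(complement(10632..11157),complement(10347..10372))';
print complement_join_form($embl), "\n";
# prints: complement(join(10347..10372,10632..11157))
```

Note that the result differs from the string in the "bio" file only in the order of the two segments, which is the part a round-trip test would need to check.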
What I mean by 'rare' is that the majority of sequences do not have problems.
Also, this seems to be a 'silent' bug, since the error shows up in
to_FTstring() but the object sublocations seem to be processed correctly when
using the location object directly (such as via SeqFeatureI). Round-tripping
the sequence should pick it up, though, since

complement(join(10632..11157,10347..10372))

is not the same as

join(complement(10632..11157),complement(10347..10372)).

That is essentially what you are doing, correct? i.e. getting the sequences
using Bioperl, saving them (which passes them through SeqIO), reading them
again (back through SeqIO with the malformed location string).

> Or are you saying it's rare to use join (complement(C..D),
> complement(A..B)) instead of complement(join(A..B, C..D)). In that case,
> I guess I just got really unlucky in that five fungal genomes I was
> using decided to use the "rare" syntax.

Location::Split is supposed to handle all variations, but apparently it
doesn't.

> > Note that there are two bugs for that bug report. The first (and more
> > serious) is still unresolved. The second (where remote locations are
> > treated differently in Location::Split, which caused more problems than
> > it was worth) had a fix committed about a month ago.
>
> Sadly, it's the first (and in my case, more common (I have no remote
> locations.)) bug for me.
>
> > Any fixes I have made for the first bug invariably break several other
> > methods, which use the current Location::Split object logic for
> > retrieving sequences, building feature strings, etc. Since a new RC is
> > imminent and the bug only affects a small number of locations, I have
> > held off until after a final release is made (the last thing I want to
> > do is fix something that breaks ~6-8 other methods), but I'll try
> > looking at it again this week.
>
> IMO this is a pretty serious bug (if these kinds of sequences aren't
> that rare as I've shown above), because you're outputting sequence
> descriptions that are just plain wrong. Anyone who uses
> FTLocationFactory to read these output descriptions will have incorrect
> sequence, incorrect translated proteins, etc. And it's even more serious
> if other methods are depending on it.
>
> I know I can't dictate your time, and should be volunteering to work on
> fixing it. But if it affects other modules, then I will no doubt break
> things even more than you have in your attempts.
>
> -Amir

I'll give it a look over the next week. Like I mentioned above, I may be able
to fix it in Split::to_FTstring() w/o breaking other tests (in which case I'll
commit it for the 1.5.2 release), but it would be a temporary hack until I can
work out why other tests are failing.

Chris

From jason at bioperl.org Mon Oct 16 18:45:21 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Oct 2006 15:45:21 -0700
Subject: [Bioperl-l] split location problems
Message-ID: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>

The whole point of split locations is to represent genes with introns so that
is not the "rare" case.

I'm confused where the problem is. The locations that I get out with
to_FTstring on the location object are exactly the same as those input.

I have processed the genbank fungal genomes into GFF3 and have had no problems
so I'm confused where you are breaking down. If I write them out as embl I
also get the correct thing. This is using the CVS version of bioperl from the
HEAD.

I've added code to test this to bug 2101 including a C.glabrata chromosome
downloaded from genbank. Perhaps the problem is on the EMBL parsing side, I
didn't test that.

On the technical side, I still am not sure I fully know where the strand
information should be stored - the top level container or the sub-features.
I'll try and stay up on the discussion if anything has been decided that I
should know about.

-jason

From torsten.seemann at infotech.monash.edu.au Mon Oct 16 18:23:23 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 17 Oct 2006 08:23:23 +1000
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <000201c6f149$3ed63490$15327e82@pyrimidine>
References: <000201c6f149$3ed63490$15327e82@pyrimidine>
Message-ID: <4534065B.9020309@infotech.monash.edu.au>

Chris Fields wrote:
>> So it looks like an abstract base class, not an interface that
>> defines a contract or API? Should use Root.pm then, would be my vote.
>> -hilmar
>
> Makes sense to me. Maybe another audit is needed to catch similar
> instances, or has this been done already?

The purpose of my original (poorly phrased) question was to try and sort out
where Root and RootI were being used the wrong way around. I'm currently
"all-audited out" so I leave this task to another volunteer.

--
Dr Torsten Seemann http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia

From cjfields at uiuc.edu Mon Oct 16 21:07:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 20:07:55 -0500
Subject: [Bioperl-l] split location problems
In-Reply-To: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
Message-ID:

On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote:

> The whole point of split locations is to represent genes with
> introns so that is not the "rare" case.
>
> I'm confused where the problem is. The locations that I get out
> with to_FTstring on the location object are exactly the same as
> those input.

The problem is with a subset of split locations described in the bug report.
The following works:

complement(join(2691..4571,4918..5163))

whereas this:

join(complement(4918..5163),complement(2691..4571))

gives this:

complement(join(4918..5163,2691..4571))

which is not syntactically the same. It should be:

complement(join(2691..4571,4918..5163))

since 'join' implies that the order of the segments to be joined is important
('order' and 'bond' do not, I guess).

> I have processed the genbank fungal genomes into GFF3 and have had
> no problems so I'm confused where you are breaking down. If I
> write them out as embl I also get the correct thing. This is using
> the CVS version of bioperl from the HEAD.
>
> I've added code to test this to bug 2101 including a C.glabrata
> chromosome downloaded from genbank. Perhaps the problem is on the
> EMBL parsing side, I didn't test that.
>
> On the technical side, I still am not sure I fully know where the
> strand information should be stored - the top level container or
> the sub-features. I'll try and stay up on the discussion if
> anything has been decided that I should know about.
>
> -jason

Split::strand() sets the sublocations as well, which seems to confuse the
situation more but it is consistent with LocationI, as Hilmar points out. I'm
looking into a few solutions now, including a fix in Split::to_FTstring().

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Mon Oct 16 22:48:14 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Oct 2006 19:48:14 -0700
Subject: [Bioperl-l] split location problems
In-Reply-To:
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
Message-ID: <8273f6c20610161948w201537a5v2fcfa189eb809283@mail.gmail.com>

This probably was exposed by the fact that the Split object used to explicitly
sort the features by start*strand always.
But with remote locations and needing to be able to explicitly set the order (for features that are not required to be 5' -> 3') that code must have been removed. I think there is just one place that must be missing a 'reverse' on the list of sub-locations when the top-level feature is a complement. I'll wait for your fix before wading in - we probably might want to figure out a 'consolidate' method to shrink redundant and equivalent representations to the shortest possible form. Ugh this really starts to resemble trying to write a boolean logic toolkit.... -jason On 10/16/06, Chris Fields wrote: > > > On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote: > > > The whole point of split locations is to represent genes with > > introns so that is not the "rare" case. > > > > I'm confused where the problem is. The locations that I get out > > with to_FTstring on the location object are exactly the same as > > those input. > > The problem is with the a subset of split locations described in the > bug report. The following works: > > complement(join(2691..4571,4918..5163)) > > whereas this: > > join(complement(4918..5163),complement(2691..4571)) > > gives this: > > complement(join(4918..5163,2691..4571)) > > which is not syntactically the same. It should be: > > complement(join(2691..4571,4918..5163)) > > since 'join' implies that the order of the segments to be joined is > important ('order' and 'bond' do not, I guess). > > > I have processed the genbank fungal genomes into GFF3 and have had > > no problems so I'm confused where you are breaking down. If I > > write them out as embl I also get the correct thing. This is using > > the CVS version of bioperl from the HEAD. > > > > I've added code to test this to bug 2101 including a C.glabrata > > chromsome downloaded from genbank. Perhaps the problem is on the > > EMBL parsing side, I didn't test that. 
> > > > On the technical side, I still am not sure I fully know where the > > strand information should be stored - the top level container or > > the sub-features. I'll try and stay up on the discussion if > > anything has been decided that I should know about. > > > > -jason > > Split::strand() sets the sublocations as well, which seems to confuse > the situation more but it is consistent with LocationI, as Hilmar > points out. I'm looking into a few solutions now, including a fix in > Split::to_FTstring(). > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > -- Jason Stajich jason at bioperl.org http://www.duke.edu/~jes12/ From cjfields at uiuc.edu Mon Oct 16 23:34:25 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 22:34:25 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: References: Message-ID: On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote: > Chris and Sendu, > > Sendu was correct in wondering whether id_parser() in Blast.pm > would work > after the module was altered to use SearchIO but what I've found > out from my > local tests is that id_parser() didn't work when BPlite was being used > either. I can continue to work on this but it's safe to say that > removing > BPlite doesn't cause a problem with id_parser, it was already there. > > Brian O. .... It may be one reason (the main reason?) the method wasn't tested. Maybe it should be removed if it can't be easily fixed; I don't think it makes sense keeping it otherwise. Chris Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bosborne11 at verizon.net Mon Oct 16 23:24:59 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 23:24:59 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? 
In-Reply-To: <000301c6f155$c7029ff0$15327e82@pyrimidine> Message-ID: Chris and Sendu, Sendu was correct in wondering whether id_parser() in Blast.pm would work after the module was altered to use SearchIO but what I've found out from my local tests is that id_parser() didn't work when BPlite was being used either. I can continue to work on this but it's safe to say that removing BPlite doesn't cause a problem with id_parser, it was already there. Brian O. On 10/16/06 3:03 PM, "Chris Fields" wrote: >> Hilmar Lapp wrote: >>> The problem is it is not maintained, and there are outstanding been bug >>> reports. >>> >>> If you un-deprecate it, then we need a response to people who come >>> across problems with it when using it. Either you change the POD to say >>> exactly who and when one should use it (or rather not) and point to the >>> fact that it is unsupported for all other cases. >>> >>> Or what would you suggest? >> >> I'm not sure. >> >> Does Bio::Index::Blast even work correctly? Does it suffer from whatever >> bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should >> that be deprecated as well? >> >> Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO >> and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't >> seem trivial (or even appropriate). >> >> Ultimately I just wanted to solve the warnings in the test suite. >> Thoughts, Chris? > > My opinion is we either have to completely support BPlite (and the others) > or drop it altogether. I don't think we can state "use BPLite only with > Bio::Index::Blast, use SearchIO everywhere else." That's too inconsistent. > > > It seems simpler to deprecate the various Bio::Tools::BP* classes and either > fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working > on) or deprecate Bio::Index::Blast as well. > > The warnings in the test suite belong to BlastIndex.t, correct? I updated > using Brian's Bio::Index::blast fix and it passes now w/o warnings. 
> > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bosborne11 at verizon.net Mon Oct 16 23:48:56 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 23:48:56 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: Message-ID: Chris, OK. In fact there's no written guarantee that all Bio::Index* modules have an id_parser() method. It happens that most do, and it's useful. I'll fix the documentation in Bio::Index::Blast and add an enhancement request to Bugzilla, may be able to get around to before 1.5.2 release but no promises. Brian O. On 10/16/06 11:34 PM, "Chris Fields" wrote: > > On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote: > >> Chris and Sendu, >> >> Sendu was correct in wondering whether id_parser() in Blast.pm >> would work >> after the module was altered to use SearchIO but what I've found >> out from my >> local tests is that id_parser() didn't work when BPlite was being used >> either. I can continue to work on this but it's safe to say that >> removing >> BPlite doesn't cause a problem with id_parser, it was already there. >> >> Brian O. > > .... > > It may be one reason (the main reason?) the method wasn't tested. > Maybe it should be removed if it can't be easily fixed; I don't think > it makes sense keeping it otherwise. > > Chris > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Tue Oct 17 02:35:43 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh)
Date: Tue, 17 Oct 2006 07:35:43 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
Message-ID: <453479BF.90408@sheffield.ac.uk>

I'm a bit unclear as to what is happening with these files.

Are these files now superseded by the wikified versions? If so, should these
files now just simply contain a link to the wikified versions - otherwise
things could get in a mess since I updated the wiki version of INSTALL.WIN and
I see Sendu updated the cvs versions a couple of weeks ago - hopefully these
differences aren't that big.

Nath

From faruque at ebi.ac.uk Tue Oct 17 04:19:44 2006
From: faruque at ebi.ac.uk (Nadeem Faruque)
Date: Tue, 17 Oct 2006 09:19:44 +0100
Subject: [Bioperl-l] split location problems
Message-ID:

EMBL currently outputs join-complements in the format
  join(complement(30..40),complement(10..20))
instead of the Genbank preferred
  complement(join(10..20,30..40))

EMBL's may reflect what happens in the cell a little more than Genbank's, but
it is less readable and less concise.

NB I've also seen a couple of people construct these incorrectly, eg
  join(complement(10..20),complement(30..40))

I believe we are moving to the complement-join format but I can't give a date
for the transition.

Having said that, trans-splicing will still give us the joys of complex
locations, eg
  join(1..5,complement(join(10..20,30..40)))
  complement(join(30..40,10..20)) <- looks wrong (unless it is a very small
  circle) but mis-ordered exons are resolved by the trans-splicing machinery.

Nadeem

--
S.M. Nadeem N. Faruque
EMBL Nucleotide Database Curation Team
EMBL Outstation
Tel: +44 1223 494611
Fax: +44 1223 494472
The European Bioinformatics Institute
URL: http://www.ebi.ac.uk/
Email for data submissions: datasubs at ebi.ac.uk
Email for updates: update at ebi.ac.uk
========================================================

From bix at sendu.me.uk Tue Oct 17 04:59:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 09:59:36 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> <45333E02.9070808@sendu.me.uk> <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
Message-ID: <45349B78.8090905@sendu.me.uk>

Hilmar Lapp wrote:
> So it looks like an abstract base class, not an interface that
> defines a contract or API? Should use Root.pm then, would be my vote.

Agreed, that was actually what I did in my local copy when I made a new
inheriting class (so discovering the problem).

This change is harmless to other modules, but does mean they'll have redundant
use of Bio::Root::Root which will want cleaning up at some stage.

From bix at sendu.me.uk Tue Oct 17 06:32:54 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 11:32:54 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
Message-ID: <4534B156.4090501@sendu.me.uk>

Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. See
http://www.bioperl.org/wiki/Release_1.5.2 for instructions on getting and
testing this RC.

Developers: This should be the last RC before release ~next Monday. Now would
be a good time for last minute documentation updates and additions.

Users: Even though 1.5.2 is a 'developer' release, we consider it the most
stable and capable version of Bioperl, and recommend that you use it in all
but the most critical production environments.
Please try it out and let us know of any problems or difficulties you run into. Thank you, Sendu. From cjfields at uiuc.edu Tue Oct 17 07:16:47 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 06:16:47 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <453479BF.90408@sheffield.ac.uk> References: <453479BF.90408@sheffield.ac.uk> Message-ID: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu> The general consensus was to keep text versions available; we could add URL links to the wiki pages for the most up-to-dat version. BTW, I have modified INSTALL already. INSTALL.WIN is next in line (I was waiting for your changes). Chris On Oct 17, 2006, at 1:35 AM, Nathan S. Haigh wrote: > I'm a bit unclear as to what is happening with these files. > > Are these files now superseded by the wikified versions? If so, should > these files now just simply contain a link to the wikified versions - > otherwise things could get in a mess since I updated the wiki > version of > INSTALL.WIN and I see Sendu updated the cvs versions a couple of weeks > ago - hopefully these differences aren't that big. > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Tue Oct 17 07:45:45 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 12:45:45 +0100 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu> References: <453479BF.90408@sheffield.ac.uk> <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu> Message-ID: <4534C269.5050704@sheffield.ac.uk> Chris Fields wrote: > The general consensus was to keep text versions available; we could > add URL links to the wiki pages for the most up-to-dat version. 
BTW, > I have modified INSTALL already. INSTALL.WIN is next in line (I was > waiting for your changes). > Is it possible to generate these files from the wiki whenever there is a release? I now edits shouldn't be too severe or too often - but I can see things getting a little messy/annoying if edits have to be made in 2 places. Nath From cjfields at uiuc.edu Tue Oct 17 10:04:32 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:04:32 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <4534C269.5050704@sheffield.ac.uk> Message-ID: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> There isn't a very easy way since so many links have to be removed/modified. I have found a few CPAN modules that could help, but for now I just dump the text output from a text browser (elinks) using the 'printable version' page and hand-edit, which works very quickly. That works for the time being until I can find another more automated solution. Fortunately there have been very few edits to either INSTALL wiki page so they should remain relatively stable. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] > Sent: Tuesday, October 17, 2006 6:46 AM > To: Chris Fields > Cc: bioperl-l > Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN > > Chris Fields wrote: > > The general consensus was to keep text versions available; we could > > add URL links to the wiki pages for the most up-to-dat version. BTW, > > I have modified INSTALL already. INSTALL.WIN is next in line (I was > > waiting for your changes). > > > Is it possible to generate these files from the wiki whenever there is a > release? I now edits shouldn't be too severe or too often - but I can > see things getting a little messy/annoying if edits have to be made in 2 > places. 
> > Nath From cjfields at uiuc.edu Tue Oct 17 10:12:09 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:12:09 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: Message-ID: <000401c6f1f6$424b5580$15327e82@pyrimidine> > Chris, > > OK. In fact there's no written guarantee that all Bio::Index* modules have > an id_parser() method. It happens that most do, and it's useful. I'll fix > the documentation in Bio::Index::Blast and add an enhancement request to > Bugzilla, may be able to get around to before 1.5.2 release but no > promises. > > Brian O. Do the various Bio::Index* modules share a common interface? I wouldn't worry too much about it for this release, unless you really have time. It is still, after all, a developer's release, and you've noted it in Bugzilla. We could try for another dev release in winter (rel 1.5.3, I guess) to get any bug fixes or new modules added. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > On 10/16/06 11:34 PM, "Chris Fields" wrote: > > > > > On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote: > > > >> Chris and Sendu, > >> > >> Sendu was correct in wondering whether id_parser() in Blast.pm > >> would work > >> after the module was altered to use SearchIO but what I've found > >> out from my > >> local tests is that id_parser() didn't work when BPlite was being used > >> either. I can continue to work on this but it's safe to say that > >> removing > >> BPlite doesn't cause a problem with id_parser, it was already there. > >> > >> Brian O. > > > > .... > > > > It may be one reason (the main reason?) the method wasn't tested. > > Maybe it should be removed if it can't be easily fixed; I don't think > > it makes sense keeping it otherwise. > > > > Chris > > > > Christopher Fields > > Postdoctoral Researcher > > Lab of Dr. 
Robert Switzer > > Dept of Biochemistry > > University of Illinois Urbana-Champaign > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Tue Oct 17 10:15:17 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 15:15:17 +0100 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> Message-ID: <4534E575.5050308@sheffield.ac.uk> Chris Fields wrote: > There isn't a very easy way since so many links have to be removed/modified. > I have found a few CPAN modules that could help, but for now I just dump the > text output from a text browser (elinks) using the 'printable version' page > and hand-edit, which works very quickly. That works for the time being > until I can find another more automated solution. > > Fortunately there have been very few edits to either INSTALL wiki page so > they should remain relatively stable. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > So am I correct in saying that the best way is to make all updates to the wikified versions of these files, and then at regular intervals/major releases you (or someone else) will update the CVS version of the files in the way describe above? 
Cheers Nath From bix at sendu.me.uk Tue Oct 17 10:00:39 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 17 Oct 2006 15:00:39 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534E09C.9030707@genomics.dk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> Message-ID: <4534E207.8030508@sendu.me.uk> Niels Larsen wrote: > Greetings, > > I am no perl beginner, but I am a BioPerl beginner. Today I looked > for remote similarity services that can be used from Perl. I found > the EBI SOAP interface where their example script returns > > Can't find method element in the message at > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. What script exactly? There was a problem with the SOAP server that was fixed earlier today. > and the DDBJ service which (from Denmark) returns > > undef What returned undef? Specifics please. > and then the NCBI server accessed through BioPerls RemoteBlast which > seems to spin in a loop that fills TMPDIR with many tempfiles. Will > release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall > is working towards that). What version of Bioperl were you testing with? What did you do to get it to 'spin in a loop'? I can tell you that remote blasting certainly works in Bioperl 1.5.2, but you'll have to give more details on the things you tried and the problems you encountered. You can also answer the questions yourself by trying the release candidate. From B.Beckert at ibmc.u-strasbg.fr Tue Oct 17 09:59:30 2006 From: B.Beckert at ibmc.u-strasbg.fr (Bertrand Beckert) Date: Tue, 17 Oct 2006 15:59:30 +0200 Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast Message-ID: hi, I am running a large number of blasts via a connexion to ncbi blast page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi'). I try to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have some problems. I make a simple example with only one sequence in order to understand how work this module. 
This is my simple input file, a DNA sequence in fasta form:

>test
TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA

I have made some modification of the example available in doc of
bioperl. It give me a RID which contain the results of my blast but I
have a problem with the "$result=$factory->retrieve_blast($rid)" in
my script.
In the documentation it wrote that $result=$factory->retrieve_blast($rid)
return when it work a Bio::Tools::Bplite or Bio::Tools::Blast
object. In my case it returns a Bio::SearchIO::blast... I don't
understand why I don't have the good type of object return (see PART I).

I also try to resolve the problem by replace the foreach loop in my
script by a new one in order to explore the blast page result but it
also don't work (see part II).

could you help me please.

Thank you

Bertrand Beckert.

PART I:
Here is my script with a little annotation and also the shell window
printing:
------------------------------------------------------------------------
#!/usr/bin/perl -w
use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;

sub blast {
    my $prog='blastn';
    my $db='refseq_genomic';
    my $e_val='1e-10';
    my $Input='Seq.fasta';
    my @params = ('-prog' => $prog,
                  '-data' => $db,
                  '-expect' => $e_val,
                  '-readmethod' => 'SearchIO');
    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

    #changes parameters
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
    $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';

    $factory->submit_blast($Input);
    print STDERR "waiting...\n";

    while (my @rids=$factory->each_rid) {
        print "my rid: ",@rids,"\n";
        #return me the ID of the submited blast i.e. RID: 1161079157-766-185099855365.BLASTQ2
        #this page contains the result of my blast...
        foreach my $rid (@rids) {
            $result=$factory->retrieve_blast($rid);
            #line in order to understand what type of object is return by retrieve_blast
            print "rc:", $result,"\n";
        }
    }
}
&blast;
------------------------------------------------------------------------

here you can see the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...
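
For reference, the output above repeats because the RID is never dropped
from the factory, so each_rid() keeps handing it back. Below is a minimal
sketch of a polling loop that stops once every report has been fetched.
It assumes the factory behaves as the RemoteBlast synopsis describes:
retrieve_blast() returns a plain number while there is no report (0 for
"still waiting", negative on error) and an object once the report is
ready, and remove_rid() forgets a RID. The names poll_blast, MockFactory
and MockReport are hypothetical; the mock only stands in for
Bio::Tools::Run::RemoteBlast so the loop can be tried without contacting
NCBI.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Poll a RemoteBlast-style factory until every RID has been handled.
# poll_blast() is a hypothetical helper, not part of BioPerl; it only
# relies on each_rid(), retrieve_blast() and remove_rid().
sub poll_blast {
    my ($factory, $wait_seconds) = @_;
    my @reports;
    while ( my @rids = $factory->each_rid ) {
        foreach my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( !ref $rc ) {
                # No report object yet. A negative code is treated as
                # an error and the RID is dropped; otherwise wait.
                if ( $rc < 0 ) {
                    $factory->remove_rid($rid);
                    next;
                }
                sleep $wait_seconds;
            }
            else {
                # Report is ready: keep it and forget the RID so the
                # outer while loop can terminate.
                push @reports, $rc;
                $factory->remove_rid($rid);
            }
        }
    }
    return @reports;
}

# Stand-in factory (assumption, for illustration only): every RID
# answers "not ready" (0) twice and then yields a report object.
package MockFactory;

sub new {
    my ($class, @rids) = @_;
    return bless { rids => { map { ( $_ => 2 ) } @rids } }, $class;
}

sub each_rid {
    my $self = shift;
    my @rids = sort keys %{ $self->{rids} };
    return @rids;
}

sub remove_rid {
    my ($self, $rid) = @_;
    delete $self->{rids}{$rid};
    return;
}

sub retrieve_blast {
    my ($self, $rid) = @_;
    return 0 if $self->{rids}{$rid}-- > 0;
    return bless { rid => $rid }, 'MockReport';
}

package main;

my $factory = MockFactory->new( 'RID-A', 'RID-B' );
my @reports = poll_blast( $factory, 0 );
print scalar(@reports), " report(s) retrieved\n";
```

With a real factory, the object pushed onto @reports would be the
Bio::SearchIO parser, and next_result() would be called on it once per
report rather than once per polling round.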
PART II:
I also try to resolve the problem by replace the foreach loop in my
script by:
------------------------------------------------------------------------
foreach my $rid (@rids) {
    while(1) {
        $result=$factory->retrieve_blast($rid)->next_result();
        print "rc:", $result,"\n";
        if ($result) {
            print $result->num_hits(),"\n";
        }
------------------------------------------------------------------------
With this loop I could explore the result Blast page.
that is what I obtain in the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834)
----

-- 
Berrtrand BECKERT
PhD student
IBMC - UPR 9002 du CNRS - ARN
15, rue Rene Descartes
F-67084 STRASBOURG Cedex
b.beckert at ibmc.u-strasbg.fr

From niels at genomics.dk Tue Oct 17 09:54:36 2006
From: niels at genomics.dk (Niels Larsen)
Date: Tue, 17 Oct 2006 15:54:36 +0200
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4534E09C.9030707@genomics.dk>

Greetings,

I am no perl beginner, but I am a BioPerl beginner. Today I looked
for remote similarity services that can be used from Perl.
I found the EBI SOAP interface where their example script returns Can't find method element in the message at /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. and the DDBJ service which (from Denmark) returns undef and then the NCBI server accessed through BioPerls RemoteBlast which seems to spin in a loop that fills TMPDIR with many tempfiles. Will release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall is working towards that). Niels L ------------------------------------------------------------------------ Niels Larsen Danish Genome Institute Gustav Wieds vej 10 C DK-8000 Aarhus C Denmark Electronic mail: niels at genomics.dk Skype: niels_larsen_denmark Telephone: +45-8942-5268 Telefax: +45-8620-1222 ------------------------------------------------------------------------ From cjfields at uiuc.edu Tue Oct 17 10:28:40 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:28:40 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <4534E575.5050308@sheffield.ac.uk> Message-ID: <000501c6f1f8$8b78efe0$15327e82@pyrimidine> ... > So am I correct in saying that the best way is to make all updates to > the wikified versions of these files, and then at regular > intervals/major releases you (or someone else) will update the CVS > version of the files in the way describe above? > > Cheers > Nath Yes. I think the online docs will stay relatively stable. A week or so ago Mauricio and I were discussing moving the dependencies list to it's own CVS document (since they pertain to all Bioperl installations, not just UNIX'y flavors). I haven't done that yet since I was waiting on the INSTALL.WIN changes before I made any more changes. Well, that and I've been really busy doing other things. One way we could make sure that changes to the online docs would match the CVS docs would be to only allow certain wiki users (such as sysadmins) make modifications to those pages. 
That way any changes would have to go through someone who also has CVS access and could make similar changes to the distribution docs. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Tue Oct 17 10:37:38 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 15:37:38 +0100 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <000501c6f1f8$8b78efe0$15327e82@pyrimidine> References: <000501c6f1f8$8b78efe0$15327e82@pyrimidine> Message-ID: <4534EAB2.50609@sheffield.ac.uk> Chris Fields wrote: > ... > >> So am I correct in saying that the best way is to make all updates to >> the wikified versions of these files, and then at regular >> intervals/major releases you (or someone else) will update the CVS >> version of the files in the way describe above? >> >> Cheers >> Nath >> > > Yes. I think the online docs will stay relatively stable. A week or so ago > Mauricio and I were discussing moving the dependencies list to it's own CVS > document (since they pertain to all Bioperl installations, not just UNIX'y > flavors). I haven't done that yet since I was waiting on the INSTALL.WIN > changes before I made any more changes. Well, that and I've been really > busy doing other things. > Sounds good. > One way we could make sure that changes to the online docs would match the > CVS docs would be to only allow certain wiki users (such as sysadmins) make > modifications to those pages. That way any changes would have to go through > someone who also has CVS access and could make similar changes to the > distribution docs. > Ugh, not sure I like the sound of maintaining 2 copies of any files - sounds like a future headache even if they are pretty stable. It also makes it unclear which of the two file should be considered first (i.e. 
is the most up-to-date) on pages such as: http://www.bioperl.org/wiki/Installing_BioPerl It suggests that INSTALL and INSTALL.WIN should be looked at first, but there are online copies of those files available - this should now be the other way around - shouldn't it? I might just be making a mountain out of a molehill, so I'll shut up on this topic and make any future edits to the wiki pages instead. > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > From bosborne11 at verizon.net Tue Oct 17 10:48:54 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 17 Oct 2006 10:48:54 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <000401c6f1f6$424b5580$15327e82@pyrimidine> Message-ID: Chris, The Bio::Index modules either 'use base qw(Bio::Index::Abstract)' or 'use base qw(Bio::Index::AbstractSeq)'. Neither of these modules has an id_parser() method. Brian O. On 10/17/06 10:12 AM, "Chris Fields" wrote: > Do the various Bio::Index* modules share a common interface? From cjfields at uiuc.edu Tue Oct 17 10:45:53 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:45:53 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <4534EAB2.50609@sheffield.ac.uk> Message-ID: <000601c6f1fa$f260b560$15327e82@pyrimidine> ... > > One way we could make sure that changes to the online docs would match > the > > CVS docs would be to only allow certain wiki users (such as sysadmins) > make > > modifications to those pages. That way any changes would have to go > through > > someone who also has CVS access and could make similar changes to the > > distribution docs. > > > Ugh, not sure I like the sound of maintaining 2 copies of any files - > sounds like a future headache even if they are pretty stable. It also > makes it unclear which of the two file should be considered first (i.e. 
> is the most up-to-date) on pages such as: > http://www.bioperl.org/wiki/Installing_BioPerl > > It suggests that INSTALL and INSTALL.WIN should be looked at first, but > there are online copies of those files available - this should now be > the other way around - shouldn't it? I might just be making a mountain > out of a molehill, so I'll shut up on this topic and make any future > edits to the wiki pages instead. Yes that should be the other way around (the wiki would be the most up-to-date), so the CVS docs should point to the wiki, not vice-versa. Getting the docs right is as important as getting the code to work. So I don't consider it a 'mountain-out-of-a-molehill' problem. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 17 11:07:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 10:07:49 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534E207.8030508@sendu.me.uk> Message-ID: <001001c6f1fe$02fd4de0$15327e82@pyrimidine> > Niels Larsen wrote: > > Greetings, > > > > I am no perl beginner, but I am a BioPerl beginner. Today I looked > > for remote similarity services that can be used from Perl. I found > > the EBI SOAP interface where their example script returns > > > > Can't find method element in the message at > > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. > > What script exactly? There was a problem with the SOAP server that was > fixed earlier today. > > > > and the DDBJ service which (from Denmark) returns > > > > undef > > What returned undef? Specifics please. > The first problem, like Sendu mentions, was fixed on the remote server (I get them to pass now). Those were from bioperl-run, though, not the bioperl core distribution. As for DDBJ, do you mean EBI or SwissProt? I ask b/c you mention Denmark. EBI were having server maintenance outages yesterday, which was announced here. 
As Sendu mentions, please be more specific. > > and then the NCBI server accessed through BioPerls RemoteBlast which > > seems to spin in a loop that fills TMPDIR with many tempfiles. Will > > release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall > > is working towards that). > > What version of Bioperl were you testing with? What did you do to get it > to 'spin in a loop'? I can tell you that remote blasting certainly works > in Bioperl 1.5.2, but you'll have to give more details on the things you > tried and the problems you encountered. > > You can also answer the questions yourself by trying the release > candidate. The tempfiles showing up are from the repeated RID requests and are deleted after the BLAST run (at least they should be); this is quite normal. They don't 'spin in a loop' unless the BLAST query is taking a particularly long time, which can happen depending on how the BLAST query is set up, i.e. what type of BLAST program is requested, if comp-based stats are requested, length of query, database requested, etc. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Tue Oct 17 11:14:07 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 17 Oct 2006 16:14:07 +0100 Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast In-Reply-To: References: Message-ID: <4534F33F.3070809@sendu.me.uk> Bertrand Beckert wrote: > hi, > > I am running a large number of blasts via a connexion to ncbi blast > page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi'). > I try to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have > some problems. [snip] > In the documentation it wrote that $result=$factory->retrieve_blast > ($rid) return when it work a Bio::Tools::Bplite or Bio::Tools::Blast > object. In my case it returns a Bio::SearchIO::blast... I don't > understand why I don't have the good type of object return (see PART I). 
I take it you're using some old version of Bioperl where unfortunately the documentation was incorrect. In fact you're supposed to get a Bio::SearchIO object, so it is a good thing that you are. The latest version of Bioperl has (as far as I can see) correct documentation and behaviour. Bio::Tools::Bplite and Bio::Tools::Blast are deprecated. You want Bio::SearchIO::blast. All is well. > I also try to resolve the problem by replace the foreach loop in my > script by a new one in order to explore the blast page result but it > also don't work (see part II). I'm not really sure what problem you might be facing there, but take a look at some up-to-date documentation, using the new example code: http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html From n.haigh at sheffield.ac.uk Tue Oct 17 12:10:15 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 17:10:15 +0100 Subject: [Bioperl-l] [Fwd: Re: Bundle::BioPerl] Message-ID: <45350067.6070604@sheffield.ac.uk> FYI on Bundle::BioPerl Nathan -------- Original Message -------- Subject: Re: Bundle::BioPerl Date: Tue, 17 Oct 2006 11:52:00 -0400 From: Chris Dagdigian To: Nathan S. Haigh References: <45348FB8.4050009 at sheffield.ac.uk> Hi Nathan, I've updated the Bundle and uploaded it to CPAN. I *think* the rationale for keeping it still exists but I'm removed enough from Bioperl now that I'll defer to others on the decision. The basic idea was that BioPerl has a heck of a lot of dependencies that it requires of (other perl modules) in order to get all the functionality out of it. Many of these dependencies may not be present in default Perl installations. Tracking down all of the dependencies and installing them (along with all of the dependencies- of-the-dependencies) by hand is a massive pain. 
The nice thing about the Bundle is that it lists the core module dependencies and it works great with the CPAN.pm module to automate the downloading and installation of everything that BioPerl requires. The CPAN module is smart enough that when processing *our* bundle it will also track down and install anything that our bundle entries themselves list as a dependency. So for unix/Linux systems the Bundle is a great one-liner ("perl - MCPAN -e 'install Bundle::BioPerl'" ) way to auto-install or update the many perl modules that BioPerl makes use of. On the windows side, not sure if it is of any help though. Regards, Chris On Oct 17, 2006, at 4:09 AM, Nathan S. Haigh wrote: > Hi Chris > > I've been working on making a PPD for the upcoming Bioperl 1.5.2 > release. During this time I also updated Bundle::BioPerl to include > up-to-date prereqs. I was wondering if you could update the CPAN > package? The updated BioPerl.pm file is attached. > > There is some talk about why and if we need Bundle::BioPerl > anymore. What was the rationale for having it in the first place, > and does it still hold true now? > > Cheers > Nath > From plu5even at gmail.com Tue Oct 17 12:26:34 2006 From: plu5even at gmail.com (Peter H. Baenziger) Date: Tue, 17 Oct 2006 12:26:34 -0400 Subject: [Bioperl-l] LocatableSeq object vs Sequence Object Message-ID: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com> All, This is my first bioperl script (but not my first Perl script) so please forgive my naivety. I've read through documentation and looked through cookbooks and the like but to no avail. Any advice is appreciated. So...I am working with an alignment object of several sequences. My intentions is to loop through all the sequences of the alignment to find what amino acid they have at a known position in the alignment (not the position in the sequence). 
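
A minimal sketch of that column lookup, written against plain gapped
strings rather than BioPerl objects so it runs anywhere. The sequence
ids and residues are made-up data, and column_from_residue and
residue_at_column are hypothetical helper names. The assumption is that
each aligned sequence can be had as a gapped string; for sequences
coming out of an alignment object that string would come from
$seq->seq, and the residue-to-column mapping from
column_from_residue_number.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy alignment: id => gapped sequence string (made-up data).
my %aligned = (
    PROT_HUMAN => 'MK-TAYIA',
    PROT_MOUSE => 'MKQTA-IA',
    PROT_YEAST => 'M--SAYLA',
);

# Convert a 1-based residue number within one sequence into a 1-based
# alignment column by counting the non-gap characters.
sub column_from_residue {
    my ($gapped, $resnum) = @_;
    my $seen = 0;
    for my $col ( 1 .. length $gapped ) {
        next if substr( $gapped, $col - 1, 1 ) eq '-';
        return $col if ++$seen == $resnum;
    }
    return;    # residue number lies beyond the end of the sequence
}

# Character (residue or gap) a sequence shows at an alignment column.
sub residue_at_column {
    my ($gapped, $col) = @_;
    return substr( $gapped, $col - 1, 1 );
}

# Column of the 3rd residue of the human sequence, then the character
# every sequence carries in that same column.
my $col = column_from_residue( $aligned{PROT_HUMAN}, 3 );
my %aa_at_col;
$aa_at_col{$_} = residue_at_column( $aligned{$_}, $col ) for keys %aligned;

print "column $col\n";
print "$_: $aa_at_col{$_}\n" for sort keys %aa_at_col;
```

A gap character in the result means that sequence has no residue at the
column, which is worth checking before counting amino acids per species.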
I was thinking I could use:

    foreach $seq ($alignment->each_seq())

to loop through the sequences and call:

    $seq->location_from_column($pos)

on each of the sequences. However, I don't think I have
"LocatableSequences" (the type of object that has method
"location_from_columns") being returned by $alignment->each_seq().

So, how do I bridge this gap here? Or is there a better way?

My appreciation in advance!
Peter

code:

my $swissObj = $swissdb->get_Seq_by_acc($query); //put several of these in @sequenceObjects
...
my $alignFactory = Bio::Tools::Run::Alignment::Clustalw->new();
my $alignment = $alignFactory->align(\@sequenceObjects);
#print $alignment->overall_percentage_identity(); #works

#now we find the "alignment position" of the mutation we have on the human version and get the amino acid at that "alignment position" for all seq
my $humanSequence = $prefix."HUMAN";
my $pos = $alignment->column_from_residue_number($humanSequence, $aa_seqpos); #this is the "alignment position" equivalent to the mutation position

#we'll keep track of what amino acid each species has at the "alignment equivalent" location listed as being a mutation on the the human version
foreach $seq ($alignment->each_seq()) {
    #print $seq->species() . "\n"; #won't work because $alignment->each_seq() actually returns a locatableSeq object, not a normal sequence object
    $speciesAA{$species} = $seq->locatation_from_column($pos);
}

-- 
<<->>
Peter H. Baenziger

From akarger at CGR.Harvard.edu Tue Oct 17 12:53:19 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Tue, 17 Oct 2006 12:53:19 -0400
Subject: [Bioperl-l] split location problems
Message-ID:

> From: Jason Stajich [mailto:jason.stajich at gmail.com]
>
> The whole point of split locations is to represent genes with introns
> so that is not the "rare" case.

Absolutely.

> I have processed the genbank fungal genomes into GFF3 and have had no
> problems so I'm confused where you are breaking down. If I write
> them out as embl I also get the correct thing. This is using the CVS
> version of bioperl from the HEAD.
>
> I've added code to test this to bug 2101 including a C.glabrata
> chromsome downloaded from genbank. Perhaps the problem is on the
> EMBL parsing side, I didn't test that.

Well, I don't know whether it's EMBL parsing, or a bit further down the
pipe, but I downloaded C.glabrata chromosome B for GenBank (NC_005968),
and it describes the complement/joins in the way that Bioperl is
handling correctly.

GenBank:
     CDS             complement(join(10347..10372,10632..11157))
                     /locus_tag="CAGL0B00242g"

EMBL:
FT   CDS             join(complement(10632..11157),complement(10347..10372))
FT                   /locus_tag="CAGL0B00242g"

Here's the diff when I run the location-printing script I posted
yesterday:

diff biogb bio
1c1,5
< complement(join(10347..10372,10632..11157))
---
> complement(1701..2651)
> complement(2635..3345)
> complement(3980..4408)
> complement(join(10632..11157,10347..10372))
> 10379..10615
209a214,217
> 498198..498890
> 499712..500062
> 499851..500702
> 500579..501364

As you can see, the complement/join CDS is written out in a different
order, which is Bad. (I looked at at least one of the other
differences: the GB file says it's a "misc feature" and EMBL says it's
a CDS. But they don't seem to be relevant here.)

-Amir

>
> On the technical side, I still am not sure I fully know where the
> strand information should be stored - the top level container or the
> sub-features. I'll try and stay up on the discussion if anything has
> been decided that I should know about.
>
> -jason
>

From paul.boutros at utoronto.ca Tue Oct 17 12:57:19 2006
From: paul.boutros at utoronto.ca (Paul Boutros)
Date: Tue, 17 Oct 2006 12:57:19 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
Message-ID: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>

Hi,

Here's a quick make test on AIX 5.2 with Perl 5.8.8.
I get two failed tests, the first seems to be just a result of me not having DBD::mysql installed. Paul Test Summary ============ Failed Test Stat Wstat Total Fail List of Failed ------------------------------------------------------------------------------- t/BioDBSeqFeature_mysql.t 46 46 1-46 t/SearchIO.t 22 5632 1337 2671 2-1337 2 tests and 106 subtests skipped. Failed 2/236 test scripts. 1382/11688 subtests failed. Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = 159.61 CPU) BioDBSeqFeature_mysql ===================== pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t 1..46 install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at (eval 37) line 3. Perhaps the DBD::mysql perl module hasn't been fully installed, or perhaps the capitalisation of 'mysql' isn't right. Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 SearchIO ======== pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more 1..1337 ok 1 -------------------- WARNING --------------------- MSG: XML::SAX::Expat not currently supported; must have local copies of NCBI DTD docs! 
--------------------------------------------------- -------------------- WARNING --------------------- MSG: error in parsing a report: 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' does not exist file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd Handler couldn't resolve external entity at line 2, column 82, byte 104 error in processing external entity reference at line 2, column 82, byte 104 at /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line 187 --------------------------------------------------- not ok 2 # Failed test 2 in t/SearchIO.t at line 68 Can't call method "database_name" on an undefined value at t/SearchIO.t line 69. ------------------------------ Message: 10 Date: Tue, 17 Oct 2006 11:32:54 +0100 From: Sendu Bala Subject: [Bioperl-l] Bioperl 1.5.2 RC2 To: bioperl-l at bioperl.org Message-ID: <4534B156.4090501 at sendu.me.uk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. See http://www.bioperl.org/wiki/Release_1.5.2 for instructions on getting and testing this RC. Developers: This should be the last RC before release ~next monday. Now would be a good time for last minute documentaiton updates and additions. Users: Even though 1.5.2 is a 'developer' release, we consider it the most stable and capable version of Bioperl, and recommend that you use it in all but the most critical production environments. Please try it out and let us know of any problems or difficulties you run into. Thank you, Sendu. 
From barry.moore at genetics.utah.edu Tue Oct 17 12:57:48 2006 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Tue, 17 Oct 2006 10:57:48 -0600 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> Message-ID: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix does a reasonable job of textifying html. You get the links as numbered references at the bottom or: lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | perl -ane 's/\[?\[\d+\](edit\])?//g;print' to remove the links all together. Barry P.S. Looks like this: #Creative Commons copyright Installing Bioperl for Unix From BioPerl Jump to: navigation, search Contents * 1 BIOPERL INSTALLATION * 2 SYSTEM REQUIREMENTS * 3 OPTIONAL * 4 ADDITIONAL INSTALLATION INFORMATION * 5 THE BIOPERL BUNDLE * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' * 8 WHERE ARE THE MAN PAGES? * 9 EXTERNAL PROGRAMS + 9.1 Environment Variables * 10 INSTALLING BIOPERL SCRIPTS * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA * 12 INSTALLING BIOPERL MODULES THE HARD WAY * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION * 14 THE TEST SYSTEM * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE + 15.1 CONFIGURING for BSD and Solaris boxes + 15.2 INSTALLATION * 16 DEPENDENCIES AND Bundle::BioPerl BIOPERL INSTALLATION Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP, and on Mac OS X (see the PLATFORMS file for more details). Following are instructions for installing Bioperl for Unix/Linux/Mac OS X; Windows installation instructions can be found here. For installing Bioperl for Mac OS X using Fink, see Getting BioPerl. SYSTEM REQUIREMENTS * Perl 5.005 or later; version 5.6 and greater are recommended. Note that most modules will work with earlier versions of Perl. 
The only ones that will not are Bio::SimpleAlign and the Bio::Index::* modules. If you don't need these modules and you want to install Bioperl using an earlier version of Perl, edit the "require 5.005;" line in Makefile.PL as necessary. * External modules: Bioperl uses functionality provided in other Perl modules. Some of these are included in the standard perl package but some need to be obtained from the CPAN site. The list of external modules is included at the bottom of this document. The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of these external modules easy. Simply install the bundle using your CPAN shell and all necessary modules will be installed. See THE BIOPERL BUNDLE, below. OPTIONAL * ANSI C or GNU C compiler (gcc) for XS extensions (the bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext PACKAGE, below). ADDITIONAL INSTALLATION INFORMATION * Additional information on Bioperl and MAC OS: + OS 9 - http://bioperl.org/Core/mac-bioperl.html + OSX-http://www.tc.umn.edu/~cann0010/ Bioperl_OSX_install.html + OS X - Installing using Fink (in Getting BioPerl) THE BIOPERL BUNDLE You typically need root privileges to install using CPAN. If you don't have these privileges please see INSTALLING BIOPERL IN A PERSONAL MODULE AREA for additional information. Install Bundle::Bioperl using CPAN. One way: >perl -MCPAN -e "install Bundle::BioPerl" Another way: >perl -MCPAN -e shell cpan>install Bundle::BioPerl On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: > There isn't a very easy way since so many links have to be removed/ > modified. > I have found a few CPAN modules that could help, but for now I just > dump the > text output from a text browser (elinks) using the 'printable > version' page > and hand-edit, which works very quickly. That works for the time > being > until I can find another more automated solution. > > Fortunately there have been very few edits to either INSTALL wiki > page so > they should remain relatively stable. 
> > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > >> -----Original Message----- >> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] >> Sent: Tuesday, October 17, 2006 6:46 AM >> To: Chris Fields >> Cc: bioperl-l >> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN >> >> Chris Fields wrote: >>> The general consensus was to keep text versions available; we could >>> add URL links to the wiki pages for the most up-to-dat version. >>> BTW, >>> I have modified INSTALL already. INSTALL.WIN is next in line (I was >>> waiting for your changes). >>> >> Is it possible to generate these files from the wiki whenever >> there is a >> release? I now edits shouldn't be too severe or too often - but I can >> see things getting a little messy/annoying if edits have to be >> made in 2 >> places. >> >> Nath > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From niels at genomics.dk Tue Oct 17 12:58:14 2006 From: niels at genomics.dk (Niels Larsen) Date: Tue, 17 Oct 2006 18:58:14 +0200 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534E207.8030508@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> Message-ID: <45350BA6.3040102@genomics.dk> Ok, here are ways to reproduce; I sure apologize if I made the test scripts wrong. And I suppose EBI/DDBJ's interfaces are not a bioperl issue really. Niels ------------ EBI I invoked the EBI script http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip like this WSWUBlastClient.pl -p blastn -D embl test.fasta where the content of test.fasta is below, and got Can't find method element in the message at /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. >Planctomyces sp. 
282; Genbank Taxonomy ID: 79927
AATGAACGTTGGCGGCATGGATTAGGCATGCAAGTCGAGGGAGAACCCGCAAGGGGACACCGGCG
AACGGGGTAGGAATACATAGGTAACGTACCCTCAGGACGGGGATAGCCAAGGGAAACTTTGGGTA
ATACCCGATGTGATGGCAAGATGTGAATGCTTGTCATCAAAGGTGAGATTCCACCTGAGGAGCGG
CTTATGCATCATTAGCTTGTTGGCGGGGTAACGGCCCACCAAGGCTGCGATGATTAGGGGGTGTG
AGAGCATGGCCCCCACCACTGGCACTGAGACACTGGCCAGACACCTACGGGTGGCTGCAGTCGAG

I tried with this test sequence in fasta format and with just the
sequence.

------------ DDBJ

Inspired by this page,

http://xml.nig.ac.jp/doc/Blast.txt

I made this test script

------ cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

my ( $service, $seqstr, $result );

use SOAP::Lite;
use Data::Dumper;

$service = SOAP::Lite->service('http://xml.nig.ac.jp/wsdl/Blast.wsdl');

$seqstr = "MSSRIARALALVVTLLHLTRLALSTCPAACHCPLEAPKCAPGVGLVRDGCGCCKVCAKQL";

$result = $service->searchSimple( "blastp", "SWISS", $seqstr );

print Dumper( $result );
------ cut --

which for me prints undef.

------------- NCBI/Bioperl

I installed 1.5.2-RC2, looked at the RemoteBlast example in

http://www.bioperl.org/wiki/Bptutorial.pl

and then put that into this test code, more or less cut/paste,

--- cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

use Bio::Tools::Run::RemoteBlast;
use Data::Dumper;

my ( $remote_blast, $r, $rc, $rid, @rids );

$remote_blast = Bio::Tools::Run::RemoteBlast->new (
    -prog => 'blastn',
    -data => 'ecoli',
    -expect => '1e-10' );

$r = $remote_blast->submit_blast("ecoli.fasta");

while ( @rids = $remote_blast->each_rid )
{
    # print Dumper( \@rids );

    for $rid ( @rids )
    {
        $rc = $remote_blast->retrieve_blast($rid);
        # print Dumper( $rc );
    }

    sleep 10;
}
--- cut --

which saves the same blast report to TMPDIR for every 10 seconds. The
"ecoli.fasta" file contains this

>test
gggggctctgttggttctcccgcaacgctactctgtttaccaggtcaggtccggaaggaa
gcagccaaggcagatgacgcgtgtgccgggatgtagctggcagggcccccaccc

Maybe I am supposed to add a check for content in $rc and then stop
the inner loop?
I could figure that out maybe, but I wish there was a function which simply takes a single sequence + arguments and only returns a list of matches when done, and does not return until then (or until a specified timeout). ------------------------------------------------------------------------ Niels Larsen Danish Genome Institute Gustav Wieds vej 10 C DK-8000 Aarhus C Denmark Electronic mail: niels at genomics.dk Skype: niels_larsen_denmark Telephone: +45-8942-5268 Telefax: +45-8620-1222 ------------------------------------------------------------------------ From bertrand.beckert at gmail.com Tue Oct 17 10:52:36 2006 From: bertrand.beckert at gmail.com (bertrand beckert) Date: Tue, 17 Oct 2006 16:52:36 +0200 Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast Message-ID: <500217090610170752q565cfc08t5208e3b64f99ef7f@mail.gmail.com> hi, I am running a large number of blasts via a connexion to ncbi blast page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi'). I try to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have some problems. I make a simple example with only one sequence in order to understand how work this module. This is my simple input file, a DNA sequence in fasta form: >test TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA I have made some modification of the example available in doc of bioperl. It give me a RID which contain the results of my blast but I have a problem with the "$result=$factory->retrieve_blast($rid)" in my script. In the documentation it wrote that $result=$factory->retrieve_blast ($rid) return when it work a Bio::Tools::Bplite or Bio::Tools::Blast object. In my case it returns a Bio::SearchIO::blast... I don't understand why I don't have the good type of object return (see PART I). 
I also tried to resolve the problem by replacing the foreach loop in my script with a new one in order to explore the blast result page, but that doesn't work either (see PART II). Could you help me please?

Thank you
Bertrand Beckert.

PART I: Here is my script with a little annotation, followed by the shell window output:
------------------------------------------------------------------------
#!/usr/bin/perl -w
use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;

sub blast {
    my $prog='blastn';
    my $db='refseq_genomic';
    my $e_val='1e-10';
    my $Input='Seq.fasta';
    my @params = ('-prog' => $prog, '-data' => $db, '-expect' => $e_val, '-readmethod' => 'SearchIO');
    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
    #changes parameters
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
    $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';
    $factory->submit_blast($Input);
    print STDERR "waiting...\n";
    while (my @rids=$factory->each_rid) {
        print "my rid: ",@rids,"\n";
        #prints the ID of the submitted blast, i.e. RID: 1161079157-766-185099855365.BLASTQ2
        #this page contains the result of my blast...
        foreach my $rid (@rids) {
            my $result=$factory->retrieve_blast($rid);
            #print the return value to see what type of object retrieve_blast returns
            print "rc:", $result,"\n";
        }
    }
}
&blast;
------------------------------------------------------------------------
Here you can see the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...

PART II: I also tried to resolve the problem by replacing the foreach loop in my script with:
------------------------------------------------------------------------
foreach my $rid (@rids) {
    while(1) {
        $result=$factory->retrieve_blast($rid)->next_result();
        print "rc:", $result,"\n";
        if ($result) {
            print $result->num_hits(),"\n";
        }
------------------------------------------------------------------------
With this loop I could explore the blast result page. This is what I obtain in the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834) ---- -- Berrtrand BECKERT PhD student IBMC - UPR 9002 du CNRS - ARN 15, rue Rene Descartes F-67084 STRASBOURG Cedex b.beckert at ibmc.u-strasbg.fr bertrand.beckert at gmail.com From cjfields at uiuc.edu Tue Oct 17 13:50:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 12:50:49 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu> Message-ID: <001201c6f214$c8934440$15327e82@pyrimidine> (Apologies for the top post, but I thought my response might get lost below) I use elinks in a similar fashion. It tends to format the tables a bit better than lynx. Chris > -----Original Message----- > From: Barry Moore [mailto:barry.moore at genetics.utah.edu] > Sent: Tuesday, October 17, 2006 11:58 AM > To: Chris Fields > Cc: 'Nathan S. Haigh'; 'bioperl-l' > Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN > > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix > > does a reasonable job of textifying html. You get the links as > numbered references at the bottom or: > > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | > perl -ane 's/\[?\[\d+\](edit\])?//g;print' > > to remove the links all together. > > Barry > > P.S. Looks like this: > > #Creative Commons copyright > > Installing Bioperl for Unix > > From BioPerl > > Jump to: navigation, search > > Contents > > * 1 BIOPERL INSTALLATION > * 2 SYSTEM REQUIREMENTS > * 3 OPTIONAL > * 4 ADDITIONAL INSTALLATION INFORMATION > * 5 THE BIOPERL BUNDLE > * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN > * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' > * 8 WHERE ARE THE MAN PAGES? 
> * 9 EXTERNAL PROGRAMS > + 9.1 Environment Variables > * 10 INSTALLING BIOPERL SCRIPTS > * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA > * 12 INSTALLING BIOPERL MODULES THE HARD WAY > * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION > * 14 THE TEST SYSTEM > * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE > + 15.1 CONFIGURING for BSD and Solaris boxes > + 15.2 INSTALLATION > * 16 DEPENDENCIES AND Bundle::BioPerl > > > BIOPERL INSTALLATION > > Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP, > and on Mac OS X (see the PLATFORMS file for more details). > Following are > instructions for installing Bioperl for Unix/Linux/Mac OS X; > Windows > installation instructions can be found here. For installing > Bioperl for > Mac OS X using Fink, see Getting BioPerl. > > > SYSTEM REQUIREMENTS > > * Perl 5.005 or later; version 5.6 and greater are recommended. > Note > that most modules will work with earlier versions of Perl. > The only ones > that will not are Bio::SimpleAlign and the Bio::Index::* > modules. If > you don't need these modules and you want to install Bioperl > using an > earlier version of Perl, edit the "require 5.005;" line in > Makefile.PL > as necessary. > > * External modules: Bioperl uses functionality provided in > other Perl > modules. Some of these are included in the standard perl > package but > some need to be obtained from the CPAN site. The list of > external > modules is included at the bottom of this document. > > The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of > these > external modules easy. Simply install the bundle using your CPAN > shell and > all necessary modules will be installed. See THE BIOPERL BUNDLE, > below. > > > OPTIONAL > > * ANSI C or GNU C compiler (gcc) for XS extensions (the > bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext > PACKAGE, below). 
> > > > ADDITIONAL INSTALLATION INFORMATION > > * Additional information on Bioperl and MAC OS: > + OS 9 - http://bioperl.org/Core/mac-bioperl.html > + OSX-http://www.tc.umn.edu/~cann0010/ > Bioperl_OSX_install.html > + OS X - Installing using Fink (in Getting BioPerl) > > > > THE BIOPERL BUNDLE > > You typically need root privileges to install using CPAN. If you > don't > have these privileges please see INSTALLING BIOPERL IN A PERSONAL > MODULE > AREA for additional information. > > Install Bundle::Bioperl using CPAN. One way: > >perl -MCPAN -e "install Bundle::BioPerl" > > Another way: > >perl -MCPAN -e shell > cpan>install Bundle::BioPerl > > > > On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: > > > There isn't a very easy way since so many links have to be removed/ > > modified. > > I have found a few CPAN modules that could help, but for now I just > > dump the > > text output from a text browser (elinks) using the 'printable > > version' page > > and hand-edit, which works very quickly. That works for the time > > being > > until I can find another more automated solution. > > > > Fortunately there have been very few edits to either INSTALL wiki > > page so > > they should remain relatively stable. > > > > Christopher Fields > > Postdoctoral Researcher - Switzer Lab > > Dept. of Biochemistry > > University of Illinois Urbana-Champaign > > > >> -----Original Message----- > >> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] > >> Sent: Tuesday, October 17, 2006 6:46 AM > >> To: Chris Fields > >> Cc: bioperl-l > >> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN > >> > >> Chris Fields wrote: > >>> The general consensus was to keep text versions available; we could > >>> add URL links to the wiki pages for the most up-to-dat version. > >>> BTW, > >>> I have modified INSTALL already. INSTALL.WIN is next in line (I was > >>> waiting for your changes). > >>> > >> Is it possible to generate these files from the wiki whenever > >> there is a > >> release? 
I now edits shouldn't be too severe or too often - but I can > >> see things getting a little messy/annoying if edits have to be > >> made in 2 > >> places. > >> > >> Nath > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Tue Oct 17 13:52:36 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 12:52:36 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca> Message-ID: <001301c6f215$07a9a070$15327e82@pyrimidine> What do you get when you run the SearchIO.t test by itself using 'perl -I. t/SearchIO.t'? It looks like something pretty catastrophic happened. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Paul Boutros > Sent: Tuesday, October 17, 2006 11:57 AM > To: bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 > > Hi, > Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed > tests, the first seems to be just a result of me not having DBD::mysql > installed. > Paul > > Test Summary > ============ > > Failed Test Stat Wstat Total Fail List of Failed > -------------------------------------------------------------------------- > ----- > t/BioDBSeqFeature_mysql.t 46 46 1-46 > t/SearchIO.t 22 5632 1337 2671 2-1337 > 2 tests and 106 subtests skipped. > Failed 2/236 test scripts. 1382/11688 subtests failed. 
> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = > 159.61 CPU) > > BioDBSeqFeature_mysql > ===================== > pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t > 1..46 > install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC > contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t > /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 > /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi > /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at > (eval 37) line 3. > Perhaps the DBD::mysql perl module hasn't been fully installed, > or perhaps the capitalisation of 'mysql' isn't right. > Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. > at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 > > SearchIO > ======== > pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more > 1..1337 > ok 1 > > -------------------- WARNING --------------------- > MSG: XML::SAX::Expat not currently supported; must have local copies > of NCBI DTD docs! > --------------------------------------------------- > > -------------------- WARNING --------------------- > MSG: error in parsing a report: > > 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' > does not exist > file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd > Handler couldn't resolve external entity at line 2, column 82, byte 104 > error in processing external entity reference at line 2, column 82, > byte 104 at > /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line > 187 > > --------------------------------------------------- > not ok 2 > # Failed test 2 in t/SearchIO.t at line 68 > Can't call method "database_name" on an undefined value at > t/SearchIO.t line 69. 
> > ------------------------------ > > Message: 10 > Date: Tue, 17 Oct 2006 11:32:54 +0100 > From: Sendu Bala > Subject: [Bioperl-l] Bioperl 1.5.2 RC2 > To: bioperl-l at bioperl.org > Message-ID: <4534B156.4090501 at sendu.me.uk> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. > See http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > This should be the last RC before release ~next monday. Now would > be a good time for last minute documentaiton updates and additions. > > Users: > Even though 1.5.2 is a 'developer' release, we consider it the most > stable and capable version of Bioperl, and recommend that you use > it in all but the most critical production environments. Please > try it out and let us know of any problems or difficulties you run > into. > > > Thank you, > Sendu. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From paul.boutros at utoronto.ca Tue Oct 17 13:59:33 2006 From: paul.boutros at utoronto.ca (Paul Boutros) Date: Tue, 17 Oct 2006 13:59:33 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine> References: <001301c6f215$07a9a070$15327e82@pyrimidine> Message-ID: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca> Hi Chris, Here it is: pcboutro at ccb690[643] >> perl -I. t/SearchIO.t 1..1337 ok 1 -------------------- WARNING --------------------- MSG: XML::SAX::Expat not currently supported; must have local copies of NCBI DTD docs! 
--------------------------------------------------- -------------------- WARNING --------------------- MSG: error in parsing a report: 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' does not exist file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd Handler couldn't resolve external entity at line 2, column 82, byte 104 error in processing external entity reference at line 2, column 82, byte 104 at /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line 187 --------------------------------------------------- not ok 2 # Failed test 2 in t/SearchIO.t at line 68 Can't call method "database_name" on an undefined value at t/SearchIO.t line 69. Quoting Chris Fields : > What do you get when you run the SearchIO.t test by itself using 'perl -I. > t/SearchIO.t'? It looks like something pretty catastrophic happened. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > >> -----Original Message----- >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros >> Sent: Tuesday, October 17, 2006 11:57 AM >> To: bioperl-l at lists.open-bio.org >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 >> >> Hi, >> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed >> tests, the first seems to be just a result of me not having DBD::mysql >> installed. >> Paul >> >> Test Summary >> ============ >> >> Failed Test Stat Wstat Total Fail List of Failed >> -------------------------------------------------------------------------- >> ----- >> t/BioDBSeqFeature_mysql.t 46 46 1-46 >> t/SearchIO.t 22 5632 1337 2671 2-1337 >> 2 tests and 106 subtests skipped. >> Failed 2/236 test scripts. 1382/11688 subtests failed. 
>> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = >> 159.61 CPU) >> >> BioDBSeqFeature_mysql >> ===================== >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t >> 1..46 >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at >> (eval 37) line 3. >> Perhaps the DBD::mysql perl module hasn't been fully installed, >> or perhaps the capitalisation of 'mysql' isn't right. >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. >> at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 >> >> SearchIO >> ======== >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more >> 1..1337 >> ok 1 >> >> -------------------- WARNING --------------------- >> MSG: XML::SAX::Expat not currently supported; must have local copies >> of NCBI DTD docs! >> --------------------------------------------------- >> >> -------------------- WARNING --------------------- >> MSG: error in parsing a report: >> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' >> does not exist >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >> Handler couldn't resolve external entity at line 2, column 82, byte 104 >> error in processing external entity reference at line 2, column 82, >> byte 104 at >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line >> 187 >> >> --------------------------------------------------- >> not ok 2 >> # Failed test 2 in t/SearchIO.t at line 68 >> Can't call method "database_name" on an undefined value at >> t/SearchIO.t line 69. 
>> >> ------------------------------ >> >> Message: 10 >> Date: Tue, 17 Oct 2006 11:32:54 +0100 >> From: Sendu Bala >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2 >> To: bioperl-l at bioperl.org >> Message-ID: <4534B156.4090501 at sendu.me.uk> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >> >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. >> See http://www.bioperl.org/wiki/Release_1.5.2 for >> instructions on getting and testing this RC. >> >> Developers: >> This should be the last RC before release ~next monday. Now would >> be a good time for last minute documentaiton updates and additions. >> >> Users: >> Even though 1.5.2 is a 'developer' release, we consider it the most >> stable and capable version of Bioperl, and recommend that you use >> it in all but the most critical production environments. Please >> try it out and let us know of any problems or difficulties you run >> into. >> >> >> Thank you, >> Sendu. >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From barry.moore at genetics.utah.edu Tue Oct 17 14:07:12 2006 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Tue, 17 Oct 2006 12:07:12 -0600 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: References: Message-ID: <588DE26B-8F18-4540-BAEE-2B479CBDE8B3@genetics.utah.edu> In fact, I think it was you who taught me that trick in the first place. B On Oct 17, 2006, at 11:40 AM, Brian Osborne wrote: > Barry, > > I second that. lynx does the best job of converting HTML to text > I've seen. > > Brian O. > > > On 10/17/06 12:57 PM, "Barry Moore" > wrote: > >> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix >> >> does a reasonable job of textifying html. 
You get the links as >> numbered references at the bottom or: >> >> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | >> perl -ane 's/\[?\[\d+\](edit\])?//g;print' >> >> to remove the links all together. >> >> Barry >> >> P.S. Looks like this: >> >> #Creative Commons copyright >> >> Installing Bioperl for Unix >> >> From BioPerl >> >> Jump to: navigation, search >> >> Contents >> >> * 1 BIOPERL INSTALLATION >> * 2 SYSTEM REQUIREMENTS >> * 3 OPTIONAL >> * 4 ADDITIONAL INSTALLATION INFORMATION >> * 5 THE BIOPERL BUNDLE >> * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN >> * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' >> * 8 WHERE ARE THE MAN PAGES? >> * 9 EXTERNAL PROGRAMS >> + 9.1 Environment Variables >> * 10 INSTALLING BIOPERL SCRIPTS >> * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA >> * 12 INSTALLING BIOPERL MODULES THE HARD WAY >> * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION >> * 14 THE TEST SYSTEM >> * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE >> + 15.1 CONFIGURING for BSD and Solaris boxes >> + 15.2 INSTALLATION >> * 16 DEPENDENCIES AND Bundle::BioPerl >> >> >> BIOPERL INSTALLATION >> >> Bioperl has been installed on many forms of Unix, Win9X/NT/ >> 2000/XP, >> and on Mac OS X (see the PLATFORMS file for more details). >> Following are >> instructions for installing Bioperl for Unix/Linux/Mac OS X; >> Windows >> installation instructions can be found here. For installing >> Bioperl for >> Mac OS X using Fink, see Getting BioPerl. >> >> >> SYSTEM REQUIREMENTS >> >> * Perl 5.005 or later; version 5.6 and greater are recommended. >> Note >> that most modules will work with earlier versions of Perl. >> The only ones >> that will not are Bio::SimpleAlign and the Bio::Index::* >> modules. If >> you don't need these modules and you want to install Bioperl >> using an >> earlier version of Perl, edit the "require 5.005;" line in >> Makefile.PL >> as necessary. 
>> >> * External modules: Bioperl uses functionality provided in >> other Perl >> modules. Some of these are included in the standard perl >> package but >> some need to be obtained from the CPAN site. The list of >> external >> modules is included at the bottom of this document. >> >> The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of >> these >> external modules easy. Simply install the bundle using your CPAN >> shell and >> all necessary modules will be installed. See THE BIOPERL BUNDLE, >> below. >> >> >> OPTIONAL >> >> * ANSI C or GNU C compiler (gcc) for XS extensions >> (the >> bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext >> PACKAGE, below). >> >> >> >> ADDITIONAL INSTALLATION INFORMATION >> >> * Additional information on Bioperl and MAC OS: >> + OS 9 - http://bioperl.org/Core/mac-bioperl.html >> + OSX-http://www.tc.umn.edu/~cann0010/ >> Bioperl_OSX_install.html >> + OS X - Installing using Fink (in Getting BioPerl) >> >> >> >> THE BIOPERL BUNDLE >> >> You typically need root privileges to install using CPAN. If you >> don't >> have these privileges please see INSTALLING BIOPERL IN A PERSONAL >> MODULE >> AREA for additional information. >> >> Install Bundle::Bioperl using CPAN. One way: >>> perl -MCPAN -e "install Bundle::BioPerl" >> >> Another way: >>> perl -MCPAN -e shell >> cpan>install Bundle::BioPerl >> >> >> >> On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: >> >>> There isn't a very easy way since so many links have to be removed/ >>> modified. >>> I have found a few CPAN modules that could help, but for now I just >>> dump the >>> text output from a text browser (elinks) using the 'printable >>> version' page >>> and hand-edit, which works very quickly. That works for the time >>> being >>> until I can find another more automated solution. >>> >>> Fortunately there have been very few edits to either INSTALL wiki >>> page so >>> they should remain relatively stable. 
>>> >>> Christopher Fields >>> Postdoctoral Researcher - Switzer Lab >>> Dept. of Biochemistry >>> University of Illinois Urbana-Champaign >>> >>>> -----Original Message----- >>>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] >>>> Sent: Tuesday, October 17, 2006 6:46 AM >>>> To: Chris Fields >>>> Cc: bioperl-l >>>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN >>>> >>>> Chris Fields wrote: >>>>> The general consensus was to keep text versions available; we >>>>> could >>>>> add URL links to the wiki pages for the most up-to-dat version. >>>>> BTW, >>>>> I have modified INSTALL already. INSTALL.WIN is next in line >>>>> (I was >>>>> waiting for your changes). >>>>> >>>> Is it possible to generate these files from the wiki whenever >>>> there is a >>>> release? I now edits shouldn't be too severe or too often - but >>>> I can >>>> see things getting a little messy/annoying if edits have to be >>>> made in 2 >>>> places. >>>> >>>> Nath >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bix at sendu.me.uk Tue Oct 17 14:07:04 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 17 Oct 2006 19:07:04 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca> References: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca> Message-ID: <45351BC8.9080507@sendu.me.uk> Paul Boutros wrote: > Hi, > Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed > tests, the first seems to be just a result of me not having DBD::mysql > installed. [snip] Thanks for those, very useful. Not something that's come up before afaik; I'll look into them. 
From cjfields at uiuc.edu Tue Oct 17 14:31:51 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 13:31:51 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca> Message-ID: <001401c6f21a$836f9fc0$15327e82@pyrimidine> Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX backend parser. For some reason BLAST XML parsing doesn't work with that parser (it tries to verify the XML first before parsing, hence the DTD error). I may try getting this to work again, but so far I haven't found an easy way to prevent XML verification via XML::SAX::Expat. There are two options: 1) install XML::SAX::ExpatXS (the better option), which works AND is 4x faster than XML::SAX::Expat, or 2) set the default parser in the ParserDetails.ini file in your local installation to use XML::SAX::PurePerl. BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just hasn't officially happened yet); the latter hasn't had significant development in about three years. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: Paul Boutros [mailto:paul.boutros at utoronto.ca] > Sent: Tuesday, October 17, 2006 1:00 PM > To: Chris Fields > Cc: bioperl-l at lists.open-bio.org > Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2 > > Hi Chris, > > Here it is: > pcboutro at ccb690[643] >> perl -I. t/SearchIO.t > 1..1337 > ok 1 > > -------------------- WARNING --------------------- > MSG: XML::SAX::Expat not currently supported; must have local copies > of NCBI DTD docs!
> --------------------------------------------------- > > -------------------- WARNING --------------------- > MSG: error in parsing a report: > > 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' > does not exist > file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd > Handler couldn't resolve external entity at line 2, column 82, byte 104 > error in processing external entity reference at line 2, column 82, > byte 104 at > /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line > 187 > > --------------------------------------------------- > not ok 2 > # Failed test 2 in t/SearchIO.t at line 68 > Can't call method "database_name" on an undefined value at > t/SearchIO.t line 69. > > > Quoting Chris Fields : > > > What do you get when you run the SearchIO.t test by itself using 'perl - > I. > > t/SearchIO.t'? It looks like something pretty catastrophic happened. > > > > Christopher Fields > > Postdoctoral Researcher - Switzer Lab > > Dept. of Biochemistry > > University of Illinois Urbana-Champaign > > > > > >> -----Original Message----- > >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros > >> Sent: Tuesday, October 17, 2006 11:57 AM > >> To: bioperl-l at lists.open-bio.org > >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 > >> > >> Hi, > >> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed > >> tests, the first seems to be just a result of me not having DBD::mysql > >> installed. > >> Paul > >> > >> Test Summary > >> ============ > >> > >> Failed Test Stat Wstat Total Fail List of Failed > >> ----------------------------------------------------------------------- > --- > >> ----- > >> t/BioDBSeqFeature_mysql.t 46 46 1-46 > >> t/SearchIO.t 22 5632 1337 2671 2-1337 > >> 2 tests and 106 subtests skipped. > >> Failed 2/236 test scripts. 1382/11688 subtests failed. 
> >> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = > >> 159.61 CPU) > >> > >> BioDBSeqFeature_mysql > >> ===================== > >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t > >> 1..46 > >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC > >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t > >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 > >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi > >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at > >> (eval 37) line 3. > >> Perhaps the DBD::mysql perl module hasn't been fully installed, > >> or perhaps the capitalisation of 'mysql' isn't right. > >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. > >> at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 > >> > >> SearchIO > >> ======== > >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more > >> 1..1337 > >> ok 1 > >> > >> -------------------- WARNING --------------------- > >> MSG: XML::SAX::Expat not currently supported; must have local copies > >> of NCBI DTD docs! > >> --------------------------------------------------- > >> > >> -------------------- WARNING --------------------- > >> MSG: error in parsing a report: > >> > >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' > >> does not exist > >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd > >> Handler couldn't resolve external entity at line 2, column 82, byte 104 > >> error in processing external entity reference at line 2, column 82, > >> byte 104 at > >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line > >> 187 > >> > >> --------------------------------------------------- > >> not ok 2 > >> # Failed test 2 in t/SearchIO.t at line 68 > >> Can't call method "database_name" on an undefined value at > >> t/SearchIO.t line 69. 
> >> > >> ------------------------------ > >> > >> Message: 10 > >> Date: Tue, 17 Oct 2006 11:32:54 +0100 > >> From: Sendu Bala > >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2 > >> To: bioperl-l at bioperl.org > >> Message-ID: <4534B156.4090501 at sendu.me.uk> > >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed > >> > >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. > >> See http://www.bioperl.org/wiki/Release_1.5.2 for > >> instructions on getting and testing this RC. > >> > >> Developers: > >> This should be the last RC before release ~next monday. Now would > >> be a good time for last minute documentaiton updates and additions. > >> > >> Users: > >> Even though 1.5.2 is a 'developer' release, we consider it the most > >> stable and capable version of Bioperl, and recommend that you use > >> it in all but the most critical production environments. Please > >> try it out and let us know of any problems or difficulties you run > >> into. > >> > >> > >> Thank you, > >> Sendu. > >> > >> > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > From cjfields at uiuc.edu Tue Oct 17 15:05:59 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 14:05:59 -0500 Subject: [Bioperl-l] split location problems In-Reply-To: Message-ID: <001b01c6f21f$48640420$15327e82@pyrimidine> > > From: Jason Stajich [mailto:jason.stajich at gmail.com] > > > > The whole point of split locations is to represent genes with > > introns > > so that is not the "rare" case. > > Absolutely. Right, but that specific kind of join statement is not commonly used in GenBank files, which seems to be the format predominately used (no offense to EBI). This may explain why we haven't seen this pop up more often. 
I believe what we're seeing is a difference in the way these locations are
described at NCBI vs EBI, which Nadeem Faruque seems to corroborate. He
indicated that EBI may move to using similar GenBank-like location strings.
Regardless, FTLocationFactory and Bio::Location::Split should handle both
forms if they are present, but they only seem to like the GenBank version.

> > I've added code to test this to bug 2101 including a C.glabrata
> > chromosome downloaded from genbank. Perhaps the problem is on the
> > EMBL parsing side, I didn't test that.
>
> Well, I don't know whether it's EMBL parsing, or a bit further down the
> pipe, but I downloaded C.glabrata chromosome B for GenBank (NC_005968),
> and it describes the complement/joins in the way that Bioperl is
> handling correctly.
>
> GenBank:
>      CDS             complement(join(10347..10372,10632..11157))
>                      /locus_tag="CAGL0B00242g"
>
> EMBL:
> FT   CDS
> join(complement(10632..11157),complement(10347..10372))
> FT                   /locus_tag="CAGL0B00242g"

Yes, something that I found out independently (and corroborated by Nadeem).

> Here's the diff when I run the location-printing script I posted
> yesterday:
>
> diff biogb bio
> 1c1,5
> < complement(join(10347..10372,10632..11157))
> ---
> > complement(1701..2651)
> > complement(2635..3345)
> > complement(3980..4408)
> > complement(join(10632..11157,10347..10372))
> > 10379..10615
> 209a214,217
> > 498198..498890
> > 499712..500062
> > 499851..500702
> > 500579..501364
>
> As you can see, the complement/join CDS is written out in a different
> order, which is Bad.

I think this can be handled directly in to_FTstring(). I'll have to add a
method to get the strand info from the Split object w/o going through
strand(). However, I'm thinking about trying a different tack which is a
bit simpler and, if it proves fruitful, may simplify Split locations
somewhat. It won't be ready for 1.5.2 but maybe the next release.
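To make the ordering difference concrete, here is a minimal sketch of how the EMBL-style string quoted above maps onto the GenBank-style one. It is plain string handling, not what FTLocationFactory or to_FTstring() do; the function name embl_to_genbank_location is made up for illustration, and it assumes every segment is a simple complement(start..end).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): rewrite an EMBL-style
#   join(complement(a..b),complement(c..d))
# as the GenBank-style
#   complement(join(c..d,a..b))
# Anything that is not a join of plain complement(start..end) segments
# is returned unchanged.
sub embl_to_genbank_location {
    my ($loc) = @_;

    # Must be a join(...) wrapper around a comma-separated list.
    return $loc unless $loc =~ /^join\((.+)\)$/;
    my @parts = split /,/, $1;

    my @segments;
    for my $part (@parts) {
        # Give up if a segment is not complement(start..end).
        return $loc unless $part =~ /^complement\((\d+\.\.\d+)\)$/;
        push @segments, $1;
    }

    # The segment order flips when complement() moves outside the join.
    return 'complement(join(' . join(',', reverse @segments) . '))';
}

print embl_to_genbank_location(
    'join(complement(10632..11157),complement(10347..10372))'
), "\n";
```

With the CAGL0B00242g example this prints complement(join(10347..10372,10632..11157)), i.e. the form in the GenBank record. A real fix would presumably work on the Split location object rather than on strings.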
> (I looked at at least one of the other differences: the GB file says
> it's a "misc feature" and EMBL says it's a CDS. But they don't seem to
> be relevant here.)
> -Amir

Probably not but something to keep in mind.

-c

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From er at xs4all.nl Tue Oct 17 15:01:48 2006
From: er at xs4all.nl (Erikjan)
Date: Tue, 17 Oct 2006 21:01:48 +0200 (CEST)
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
Message-ID: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>

Hello,

I noticed a little problem with the Annotation "DBLink" from GenBank
entries.

When I run:

perl -MBio::DB::GenBank -e 'my $gi = 56205924;
  $db = Bio::DB::GenBank->new(-format => "genbank");
  my $seqio = $db->get_Stream_by_id($gi);
  my $seq = $seqio->next_seq;
  my $ac = $seq->annotation();
  my @annotations = $ac->get_Annotations("dblink");
  for (@annotations) { print $_, "\n"; }
  print $INC{ "Bio/Annotation/DBLink.pm" }, "\n"; '

This yields:

GenBank:AL591065.17.17

and the place where the used Bio/Annotation/DBLink.pm resides.

Can others repeat this?

I have dug into the source a little and Bio::Annotation::DBLink seems to
be the place where this happens: it has a concatenation which leads to
that repeated version number.

Is this something that I should fix "client-side", so to speak, or is it
worthwhile to add some logic to that concatenation to prevent this?

Thanks,

Eric

From bosborne11 at verizon.net Tue Oct 17 13:40:54 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 17 Oct 2006 13:40:54 -0400
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>
Message-ID:

Barry,

I second that. lynx does the best job of converting HTML to text I've seen.

Brian O.
On 10/17/06 12:57 PM, "Barry Moore" wrote: > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix > > does a reasonable job of textifying html. You get the links as > numbered references at the bottom or: > > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | > perl -ane 's/\[?\[\d+\](edit\])?//g;print' > > to remove the links all together. > > Barry > > P.S. Looks like this: > > #Creative Commons copyright > > Installing Bioperl for Unix > > From BioPerl > > Jump to: navigation, search > > Contents > > * 1 BIOPERL INSTALLATION > * 2 SYSTEM REQUIREMENTS > * 3 OPTIONAL > * 4 ADDITIONAL INSTALLATION INFORMATION > * 5 THE BIOPERL BUNDLE > * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN > * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' > * 8 WHERE ARE THE MAN PAGES? > * 9 EXTERNAL PROGRAMS > + 9.1 Environment Variables > * 10 INSTALLING BIOPERL SCRIPTS > * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA > * 12 INSTALLING BIOPERL MODULES THE HARD WAY > * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION > * 14 THE TEST SYSTEM > * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE > + 15.1 CONFIGURING for BSD and Solaris boxes > + 15.2 INSTALLATION > * 16 DEPENDENCIES AND Bundle::BioPerl > > > BIOPERL INSTALLATION > > Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP, > and on Mac OS X (see the PLATFORMS file for more details). > Following are > instructions for installing Bioperl for Unix/Linux/Mac OS X; > Windows > installation instructions can be found here. For installing > Bioperl for > Mac OS X using Fink, see Getting BioPerl. > > > SYSTEM REQUIREMENTS > > * Perl 5.005 or later; version 5.6 and greater are recommended. > Note > that most modules will work with earlier versions of Perl. > The only ones > that will not are Bio::SimpleAlign and the Bio::Index::* > modules. 
If > you don't need these modules and you want to install Bioperl > using an > earlier version of Perl, edit the "require 5.005;" line in > Makefile.PL > as necessary. > > * External modules: Bioperl uses functionality provided in > other Perl > modules. Some of these are included in the standard perl > package but > some need to be obtained from the CPAN site. The list of > external > modules is included at the bottom of this document. > > The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of > these > external modules easy. Simply install the bundle using your CPAN > shell and > all necessary modules will be installed. See THE BIOPERL BUNDLE, > below. > > > OPTIONAL > > * ANSI C or GNU C compiler (gcc) for XS extensions (the > bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext > PACKAGE, below). > > > > ADDITIONAL INSTALLATION INFORMATION > > * Additional information on Bioperl and MAC OS: > + OS 9 - http://bioperl.org/Core/mac-bioperl.html > + OSX-http://www.tc.umn.edu/~cann0010/ > Bioperl_OSX_install.html > + OS X - Installing using Fink (in Getting BioPerl) > > > > THE BIOPERL BUNDLE > > You typically need root privileges to install using CPAN. If you > don't > have these privileges please see INSTALLING BIOPERL IN A PERSONAL > MODULE > AREA for additional information. > > Install Bundle::Bioperl using CPAN. One way: >> perl -MCPAN -e "install Bundle::BioPerl" > > Another way: >> perl -MCPAN -e shell > cpan>install Bundle::BioPerl > > > > On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: > >> There isn't a very easy way since so many links have to be removed/ >> modified. >> I have found a few CPAN modules that could help, but for now I just >> dump the >> text output from a text browser (elinks) using the 'printable >> version' page >> and hand-edit, which works very quickly. That works for the time >> being >> until I can find another more automated solution. 
>> >> Fortunately there have been very few edits to either INSTALL wiki >> page so >> they should remain relatively stable. >> >> Christopher Fields >> Postdoctoral Researcher - Switzer Lab >> Dept. of Biochemistry >> University of Illinois Urbana-Champaign >> >>> -----Original Message----- >>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] >>> Sent: Tuesday, October 17, 2006 6:46 AM >>> To: Chris Fields >>> Cc: bioperl-l >>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN >>> >>> Chris Fields wrote: >>>> The general consensus was to keep text versions available; we could >>>> add URL links to the wiki pages for the most up-to-dat version. >>>> BTW, >>>> I have modified INSTALL already. INSTALL.WIN is next in line (I was >>>> waiting for your changes). >>>> >>> Is it possible to generate these files from the wiki whenever >>> there is a >>> release? I now edits shouldn't be too severe or too often - but I can >>> see things getting a little messy/annoying if edits have to be >>> made in 2 >>> places. >>> >>> Nath >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Tue Oct 17 16:30:15 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 15:30:15 -0500 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> Message-ID: <0FB91820-B2A1-4F7F-866C-8D4791DD8306@uiuc.edu> I can confirm this using bioperl-live: GenBank:AL591065.17.17 /Users/cjfields/src/bioperl-live/Bio/Annotation/DBLink.pm Could you file a bug report via bugzilla? 
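For reference, the kind of guard Eric asks about could look roughly like the sketch below. This is only an illustration: dblink_string is a made-up helper, not the code in Bio::Annotation::DBLink, and it assumes the doubled suffix comes from appending a version to a primary id that already ends in ".version".

```perl
use strict;
use warnings;

# Hypothetical helper, not the actual Bio::Annotation::DBLink code:
# build "database:primary_id.version", but leave the id alone when it
# already ends in ".version".
sub dblink_string {
    my ($database, $primary_id, $version) = @_;

    my $id = $primary_id;
    if (defined $version && length $version && $id !~ /\.\Q$version\E$/) {
        $id .= ".$version";
    }
    return "$database:$id";
}

print dblink_string('GenBank', 'AL591065.17', 17), "\n";  # id already versioned
print dblink_string('GenBank', 'AL591065',    17), "\n";  # version appended once
```

Both calls print GenBank:AL591065.17. Whether the check belongs in the concatenation or in the parser that fills in the primary id is a separate question for the bug report.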
Chris On Oct 17, 2006, at 2:01 PM, Erikjan wrote: > Hello, > > I noticed a little problem with the Annotation "DBLink" from > GenBank entries > > When I run: > > perl -MBio::DB::GenBank -e 'my $gi = > 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio = > $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > ("dblink"); > for(@annotations) { print $_, "\n";} print $INC{ > "Bio/Annotation/DBLink.pm" }, "\n"; ' > > This yields: > > GenBank:AL591065.17.17 > > and the place where the used Bio/Annotation/DBLink.pm resides. > > Can others repeat this? > > I have dug into the source a little and Bio::Annotation::DBLink > seems to > be the place where this happens: it has a concatenation which leads to > that repeated version number. > > It this something that I should fix "client-side", so to speak, or > is it > worthwhile to add some logic to that concatenation to prevent this? > > > Thanks, > > Eric > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From paul.boutros at utoronto.ca Tue Oct 17 19:49:52 2006 From: paul.boutros at utoronto.ca (Paul Boutros) Date: Tue, 17 Oct 2006 19:49:52 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <001401c6f21a$836f9fc0$15327e82@pyrimidine> References: <001401c6f21a$836f9fc0$15327e82@pyrimidine> Message-ID: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> Hi Chris, Yup, that's it. I installed XML::SAX::ExpatXS (make test output below). Should there be a note somewhere in the INSTALL docs saying basically what you just wrote? Or maybe it's already there somewhere and I missed it. 
Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
if DBD::mysql can be loaded, and if not doesn't run the test. Since
the file is only one-line long, here's the modified file rather than a
patch:
################################################################
BEGIN {
    # DBD::mysql is required
    eval {
        require DBD::mysql;
    };
    if ( $@ ) {
        # Load Test::More at run time: a 'use' here would be executed at
        # compile time whether or not the eval failed, skipping every run
        require Test::More;
        Test::More->import( skip_all => "DBD::mysql is not installed or is installed incorrectly - skipping BioDBSeqFeature_mysql.t" );
        exit(0);
    }
}

system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 -dsn test";
################################################################

And when I run it I get:
t/BioDBSeqFeature_mysql......skipped
        all skipped: DBD::mysql is not installed or is installed incorrectly - skipping BioDBSeqFeature_mysql.t

And for the overall make test:
All tests successful, 3 tests and 106 subtests skipped.
Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys = 164.24 CPU)

Hope this helps,
Paul

Quoting Chris Fields :

> Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
> backend parser. For some reason BLAST XML parsing doesn't work with that
> parser (it tries to verify the XML first before parsing, hence the DTD
> error). I may try getting this to work again, but so far I haven't found an
> easy way to prevent XML verification via XML::SAX::Expat.
>
> There are two options: 1) install XML::SAX::ExpatXS (the better option),
> which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
> parser in the ParserDetails.ini file in your local installation to use
> XML::SAX::PurePerl.
>
> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
> hasn't officially happened yet); the latter hasn't had significant
> development in about three years.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign

From cjfields at uiuc.edu Tue Oct 17 20:51:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 19:51:35 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine> <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
Message-ID: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>

On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote:

> Hi Chris,
>
> Yup, that's it. I installed XML::SAX::ExpatXS (make test output
> below). Should there be a note somewhere in the INSTALL docs saying
> basically what you just wrote? Or maybe it's already there somewhere
> and I missed it.

The INSTALL docs should have this, yes. I'll double-check though.
Pretty much anything that plugs into XML::SAX except XML::SAX::Expat
works (XML::LibXML also works, I found).

> Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
> if DBD::mysql can be loaded, and if not doesn't run the test.
Since
> the file is only one-line long, here's the modified file rather than a
> patch:
> ################################################################
> BEGIN {
>     # DBD::mysql is required
>     eval {
>         require DBD::mysql;
>     };
>     if ( $@ ) {
>         # Load Test::More at run time: a 'use' here would be executed at
>         # compile time whether or not the eval failed, skipping every run
>         require Test::More;
>         Test::More->import( skip_all => "DBD::mysql is not installed or is installed incorrectly - skipping BioDBSeqFeature_mysql.t" );
>         exit(0);
>     }
> }
>
> system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 -dsn test";
> ################################################################
>
> And when I run it I get:
> t/BioDBSeqFeature_mysql......skipped
>         all skipped: DBD::mysql is not installed or is installed
> incorrectly - skipping BioDBSeqFeature_mysql.t
>
> And for the overall make test:
> All tests successful, 3 tests and 106 subtests skipped.
> Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys =
> 164.24 CPU)

It should check this when using 'perl Makefile.PL', since the tests
are only set up if MySQL is present (so you would assume that it checks
for DBD::mysql). I'll look into it.

Chris

> Hope this helps,
> Paul

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From bix at sendu.me.uk Wed Oct 18 02:52:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 07:52:05 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4535CF15.4090502@sendu.me.uk>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
> See http://www.bioperl.org/wiki/Release_1.5.2 for
> instructions on getting and testing this RC.
>
> Developers:
> This should be the last RC before release ~next monday. Now would
> be a good time for last minute documentation updates and additions.

Given the few issues that have come up, it would be prudent to have
another RC, so expect one around the time the 'Needs investigation'
issues on the release page have been solved.

If you think there are more things that need investigation, please add
them, but note the bias toward things that affect the successful
completion of the test suite as opposed to general bugs which should go
to Bugzilla as normal.
From bix at sendu.me.uk Wed Oct 18 04:55:21 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 09:55:21 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <45350BA6.3040102@genomics.dk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> Message-ID: <4535EBF9.1090706@sendu.me.uk> Niels Larsen wrote: > ------------ EBI > > I invoked the EBI script > > http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip > > like this > > WSWUBlastClient.pl -p blastn -D embl test.fasta > > where the content of test.fasta is below, and got > > Can't find method element in the message at > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. As you admit, this is not a Bioperl issue. I would suggest you contact EBI support. In the mean time/alternatively I'd suggest investigating the Bioperl interface to the SOAP server, which is part of the Bioperl-run package. http://doc.bioperl.org/releases/bioperl-current/bioperl-run/Bio/Tools/Run/Analysis.html > ------------ DDBJ > > Inspired by this page, > > http://xml.nig.ac.jp/doc/Blast.txt > > I made this test script [snip] > which for me prints undef. Again, not something I can really help you with. You'll need to triple-check your code and then seek support from the providers of that SOAP service. > ------------- NCBI/Bioperl > > I installed 1.5.2-RC2, looked at the RemoteBlast example in > > http://www.bioperl.org/wiki/Bptutorial.pl > > and then put that into this test code, more or less cut/paste, [snip] > Maybe I am supposed to add a check for content in $rc and then stop > the inner loop? Yes, the wiki page example isn't really adequate. I'll update it. 
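As a rough illustration of the loop being discussed (keep polling, stop as soon as a result arrives, and give up after a timeout), here is a small generic sketch. wait_for_result is a hypothetical helper, not part of Bioperl, and the callback is a stand-in: with RemoteBlast it would presumably wrap retrieve_blast($rid), which according to its documentation returns an object once the report is ready, 0 while waiting, and a negative value on error.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): call $check repeatedly until it
# returns a reference (the finished result) or a negative number (error),
# or until $timeout seconds have passed. $interval is the pause, in
# seconds, between polls.
sub wait_for_result {
    my ($check, $timeout, $interval) = @_;
    my $deadline = time + $timeout;

    while (1) {
        my $rc = $check->();
        return $rc if ref $rc;                         # result is ready
        die "remote job failed\n" if $rc < 0;          # error: stop polling
        die "timed out after $timeout seconds\n" if time >= $deadline;
        sleep $interval;
    }
}

# Stand-in for a remote job that becomes ready on the third poll.
my $polls  = 0;
my $result = wait_for_result(
    sub { return ++$polls < 3 ? 0 : +{ hits => [ 'hit1', 'hit2' ] } },
    10,    # timeout in seconds
    0,     # no pause needed for this fake job
);
print "ready after $polls polls, ", scalar @{ $result->{hits} }, " hits\n";
```

The caller wraps the call in eval to tell a timeout from a failed job. A real version would also need to remove the finished or failed RID from the factory's list so the outer loop can end.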
For a better code example see the RemoteBlast documentation:
http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html

> I could figure that out maybe, but I wish there was a
> function which simply takes a single sequence + arguments and only
> returns a list of matches when done, and does not return until then
> (or until a specified timeout).

Yes, I don't find dealing with RIDs that pleasant either. You might like
to add a feature request to Bugzilla.

From n.haigh at sheffield.ac.uk Wed Oct 18 05:58:00 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 10:58:00 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
Message-ID: <4535FAA8.2050506@sheffield.ac.uk>

I get all tests passing except for BioDBSeqFeature_mysql which fails all
tests (1-46).

During perl Makefile.PL I get:
"I see you have Berkeleydb installed. I will create the DBD tests for
Bio::DB::SeqFeature::Store..."

I notice under the "needs investigation" heading there is mention of tests
being generated even if DBD::mysql isn't installed. I assume this is the
problem? If this is the problem should DBD::mysql be added to the
dependencies in Makefile.PL?

Is there an easy way to find out what tests are being skipped due to
absent modules?

Cheers
Nath

From n.haigh at sheffield.ac.uk Wed Oct 18 07:34:21 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 12:34:21 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4535EBF9.1090706@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk>
Message-ID: <4536113D.1080307@sheffield.ac.uk>

I've just added test results for 1.5.2 RC2 to the wiki. There are lots
of fails for packages other than bioperl-live. I'm not sure exactly how
the test fails/skips are/should be handled since my setups are as follows.
Clean WinXP Pro: This is a clean install of WinXP Pro SP2 with no major software installed, other than ActivePerl 5.8.8.819 and a few tools for archive extracting, anti virus etc. Therefore, I'm unsure how tests in bioperl-network and bioperl-db should return. For example, I have made no effort to setup biosql-schema but I thought that maybe there would be a test that would detect this, and fail, then skip over other tests gracefully - like the bioperl-run tests when a piece of software is not installed??? Debian Linux: This is a Bio-Linux machine with quite a lot of bioinformatics software installed in the Path. So most of the tests in bioperl-run should probably have passed. The same goes for bioperl-network and bioperl-db as with my Windows setup. If my thoughts are totally wrong - let me know! Nath From bix at sendu.me.uk Wed Oct 18 08:03:11 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 13:03:11 +0100 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <4535FAA8.2050506@sheffield.ac.uk> References: <4535FAA8.2050506@sheffield.ac.uk> Message-ID: <453617FF.9080508@sendu.me.uk> Nathan Haigh wrote: > I get all tests passing except for BioDBSeqFeature_mysql which fails all > tests (1-46). > > During perl Makefile.PL I get: > "I see you have Berkeleydb installed. I will create the DBD tests for > Bio::DB::SeqFeature::Store..." > > I notice under the "needs investigation" there is mention about tests > been generated even if DBD::mysql isn't installed. I assume this is the > problem? Probably. I'm looking into it. Not sure why it wasn't causing a problem before now. > If this is the problem should DBD::mysql be added to the > dependencies in Makefile.PL? No. You can use the modules in question without mysql (presumably; ie. you have a different sql setup), so it makes no sense to warn people they don't have a module they absolutely do not need. > Is there an easy way to find out what tests are being skipped due to > absent modules? 
Ideally, when the skip occurs the test script will issue a message. I think that happens in most, if not all cases. From bix at sendu.me.uk Wed Oct 18 09:02:50 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 14:02:50 +0100 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <453617FF.9080508@sendu.me.uk> References: <4535FAA8.2050506@sheffield.ac.uk> <453617FF.9080508@sendu.me.uk> Message-ID: <453625FA.6090907@sendu.me.uk> Sendu Bala wrote: > Nathan Haigh wrote: ? >> I notice under the "needs investigation" there is mention about tests >> been generated even if DBD::mysql isn't installed. I assume this is the >> problem? > > Probably. I'm looking into it. Not sure why it wasn't causing a problem > before now. > > > If this is the problem should DBD::mysql be added to the > > dependencies in Makefile.PL? > > No. You can use the modules in question without mysql (presumably; ie. > you have a different sql setup), so it makes no sense to warn people > they don't have a module they absolutely do not need. Oops. It /is/ in the pre-reqs in Makefile.PL. Maybe DBD::mysql is the only supported driver? From bix at sendu.me.uk Wed Oct 18 09:16:24 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 14:16:24 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu> References: <001401c6f21a$836f9fc0$15327e82@pyrimidine> <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu> Message-ID: <45362928.8070104@sendu.me.uk> Chris Fields wrote: > On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote: > >> Hi Chris, >> >> Yup, that's it. I installed XML::SAX::ExpatXS (make test output >> below). Should there be a note somewhere in the INSTALL docs saying >> basically what you just wrote? Or maybe it's already there somewhere >> and I missed it. > > The INSTALL docs should have this, yes. I'll double-check though. 
> > Pretty much anything that plugs into XML::SAX except XML::SAX::Expat > works (XML::LibXML also works, I found). > >> Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks >> if DBD::mysql can be loaded, [snip] > It should check this when using 'perl Makefile.PL', since the tests > are only set up if MySQL is present (so you would assume that it > checks for DBD::mysql). I'll look into it. This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in my t directory when I packed it up for release. I'm tweaking Makefile.PL right now in any case; there are a few errors and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean. From cjfields at uiuc.edu Wed Oct 18 09:55:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 08:55:37 -0500 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ Message-ID: <001e01c6f2bd$1769aca0$15327e82@pyrimidine> Ding dong the witch is dead! As announced previously, from the latest GenBank release (156.0): ----------------------------------------------- 1.3.8 Feature location syntax X.Y no longer supported The Feature Table has supported feature locations of the form 'X.Y', to represent a base position which is greater or equal to X, and less than or equal to Y. For example: misc_feature 1.10..20 misc_feature join(100..150,200.210..250) In the first example, the misc_feature starts somewhere between bases 1 and 10 (inclusive), and ends at basepair 20. In the second, the 51 bases from 100..150 are joined together with a second basepair interval, which could be anywhere from 200..250 to 210..250 . Although this syntax seems like a reasonable way to capture an uncertain interval, it is used for features on a vanishingly small number of sequence records, most database submission mechanisms don't support it, and the meaning of its use in a join() context is not entirely clear. As of October 2006, this type of location is no longer supported.
Those records with features which utilize X.Y locations will be reviewed and converted to a non-uncertain format. ----------------------------------------------- EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. Not sure about UniProt/SwissProt. I guess we're keeping this in for backwards compatibility only, but how do we handle any bugs that pop up related to this? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Oct 18 10:10:07 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 09:10:07 -0500 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <453617FF.9080508@sendu.me.uk> Message-ID: <001f01c6f2bf$20737270$15327e82@pyrimidine> > Nathan Haigh wrote: > > I get all tests passing except for BioDBSeqFeature_mysql which fails all > > tests (1-46). > > > > During perl Makefile.PL I get: > > "I see you have Berkeleydb installed. I will create the DBD tests for > > Bio::DB::SeqFeature::Store..." > > > > I notice under the "needs investigation" there is mention about tests > > been generated even if DBD::mysql isn't installed. I assume this is the > > problem? > > Probably. I'm looking into it. Not sure why it wasn't causing a problem > before now. Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP because 'perl Makefile.PL' doesn't detect my MySQL installation, so the MySQL-based tests don't run even though I have DBD::mysql installed. I thought this might just be a WinXP issue, but apparently not. If I can get to it I'll run a few checks. > > If this is the problem should DBD::mysql be added to the > > dependencies in Makefile.PL? > > No. You can use the modules in question without mysql (presumably; ie. > you have a different sql setup), so it makes no sense to warn people > they don't have a module they absolutely do not need. 
Agreed, though I don't know if other relational DB's are supported like PostgreSQL. > > Is there an easy way to find out what tests are being skipped due to > > absent modules? > > Ideally, when the skip occurs the test script will issue a message. I > think that happens in most, if not all cases. Yes, though we may run into the same issue we had with XEMBL tests not reporting the reasons it skipped. Each test suite should run an eval{} to check the required modules, then only skip blocks of tests that rely on those modules. I think we have caught most of those, but who knows w/o doing a complete test suite audit? Our eventual complete switchover to Test::More should hopefully clean these up. I don't consider it a pressing issue for this release, though Sendu may feel differently. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Oct 18 10:12:52 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 09:12:52 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <45362928.8070104@sendu.me.uk> Message-ID: <002001c6f2bf$807849c0$15327e82@pyrimidine> ... > This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in > my t directory when I packed it up for release. > > I'm tweaking Makefile.PL right now in any case; there are a few errors > and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean. Okay, makes sense now. No big deal, it's still an RC (a developer's RC at that!). Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Oct 18 10:17:35 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 18 Oct 2006 15:17:35 +0100 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <001f01c6f2bf$20737270$15327e82@pyrimidine> References: <001f01c6f2bf$20737270$15327e82@pyrimidine> Message-ID: <4536377F.6000408@sheffield.ac.uk> Chris Fields wrote: >> Nathan Haigh wrote: >> >>> I get all tests passing except for BioDBSeqFeature_mysql which fails all >>> tests (1-46). >>> >>> During perl Makefile.PL I get: >>> "I see you have Berkeleydb installed. I will create the DBD tests for >>> Bio::DB::SeqFeature::Store..." >>> >>> I notice under the "needs investigation" there is mention about tests >>> been generated even if DBD::mysql isn't installed. I assume this is the >>> problem? >>> >> Probably. I'm looking into it. Not sure why it wasn't causing a problem >> before now. >> > > Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP > because 'perl Makefile.PL' doesn't detect my MySQL installation, so the > MySQL-based tests don't run even though I have DBD::mysql installed. I > thought this might just be a WinXP issue, but apparently not. If I can get > to it I'll run a few checks. > > This was on WinXP. >> > If this is the problem should DBD::mysql be added to the >> > dependencies in Makefile.PL? >> >> No. You can use the modules in question without mysql (presumably; ie. >> you have a different sql setup), so it makes no sense to warn people >> they don't have a module they absolutely do not need. >> > > Agreed, though I don't know if other relational DB's are supported like > PostgreSQL. > > >>> Is there an easy way to find out what tests are being skipped due to >>> absent modules? >>> >> Ideally, when the skip occurs the test script will issue a message. I >> think that happens in most, if not all cases. 
>> > > Yes, though we may run into the same issue we had with XEMBL tests not > reporting the reasons it skipped. Each test suite should run an eval{} to > check the required modules, then only skip blocks of tests that rely on > those modules. I think we have caught most of those, but who knows w/o > doing a complete test suite audit? > > Our eventual complete switchover to Test::More should hopefully clean these > up. I don't consider it a pressing issue for this release, though Sendu may > feel differently. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > From hlapp at gmx.net Wed Oct 18 10:36:31 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 18 Oct 2006 10:36:31 -0400 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine> References: <001e01c6f2bd$1769aca0$15327e82@pyrimidine> Message-ID: On Oct 18, 2006, at 9:55 AM, Chris Fields wrote: > how do we handle any bugs that pop up related to this? By an evil grin, followed by deflecting the blame to NCBI, followed by another evil grin. -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Oct 18 10:43:31 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 09:43:31 -0500 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: Message-ID: <002401c6f2c3$c83c7e30$15327e82@pyrimidine> > On Oct 18, 2006, at 9:55 AM, Chris Fields wrote: > > > how do we handle any bugs that pop up related to this? > > By an evil grin, followed by deflecting the blame to NCBI, followed > by another evil grin. 
> -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== Sounds good to me! One less thing to worry about. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Oct 18 10:45:57 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 18 Oct 2006 15:45:57 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4536113D.1080307@sheffield.ac.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> Message-ID: <45363E25.8010806@sheffield.ac.uk> Nathan Haigh wrote: > I've just added test results for 1.5.2 RC2 to the wiki. > > There are lots of fails for packages other than bioperl-live. I'm not > sure excatly how the test fails/skipps are/should be handled since my > setups are as follows. > > Clean WinXP Pro: > This is a clean install of WinXP Pro SP2 with no major software > installed, other than ActivePerl 5.8.8.819 and a few tools for archive > extracting, anti virus etc. Therefore, I'm unsure how tests in > bioperl-network and bioperl-db should return. For example, I have made > no effort to setup biosql-schema but I thought that maybe there would be > a test that would detect this, and fail, then skip over other tests > gracefully - like the bioperl-run tests when a piece of software is not > installed??? > > Debian Linux: > This is a Bio-Linux machine with quite a lot of bioinformatics software > installed in the Path. So most of the tests in bioperl-run should > probably have passed. The same goes for bioperl-network and bioperl-db > as with my Windows setup. > > If my thoughts are totally wrong - let me know! 
> Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > Just looking into the failed Linux tests. Several of the tests result in errors like: ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Unallowed parameter: ARGUMENTS ! STACK: Error::throw STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328 STACK: Bio::Tools::Run::Alignment::Exonerate::AUTOLOAD /home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:126 STACK: Bio::Tools::Run::Alignment::Exonerate::new /home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:154 STACK: t/Exonerate.t:32 ----------------------------------------------------------- ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Unallowed parameter: 'arguments' ! STACK: Error::throw STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328 STACK: Bio::Tools::Run::Hmmer::AUTOLOAD Bio/Tools/Run/Hmmer.pm:172 STACK: Bio::Tools::Run::Hmmer::_run Bio/Tools/Run/Hmmer.pm:253 STACK: Bio::Tools::Run::Hmmer::run Bio/Tools/Run/Hmmer.pm:228 STACK: t/Hmmer.t:54 ----------------------------------------------------------- ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: Unallowed parameter: ARGUMENTS ! STACK: Error::throw STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328 STACK: Bio::Tools::Run::Phrap::AUTOLOAD Bio/Tools/Run/Phrap.pm:137 STACK: Bio::Tools::Run::Phrap::new Bio/Tools/Run/Phrap.pm:165 STACK: t/Phrap.t:34 ----------------------------------------------------------- Any ideas?? 
Nath From hlapp at gmx.net Wed Oct 18 10:51:36 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 18 Oct 2006 10:51:36 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4536113D.1080307@sheffield.ac.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> Message-ID: On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote: > For example, I have made > no effort to setup biosql-schema but I thought that maybe there > would be > a test that would detect this I'm afraid there isn't. Bioperl-db is meaningless without biosql-schema. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bosborne11 at verizon.net Wed Oct 18 10:43:06 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 18 Oct 2006 10:43:06 -0400 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine> Message-ID: Chris, I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all of the more recent examples in t/LocationFactory.t come from there. Brian O. On 10/18/06 9:55 AM, "Chris Fields" wrote: > EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. > Not sure about UniProt/SwissProt. From cjfields at uiuc.edu Wed Oct 18 11:00:30 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 10:00:30 -0500 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: Message-ID: <002501c6f2c6$27625540$15327e82@pyrimidine> Do they still use the X.Y notations? Those are the most troublesome. I guess we still don't support the ones containing '?'. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: Brian Osborne [mailto:bosborne11 at verizon.net] > Sent: Wednesday, October 18, 2006 9:43 AM > To: Chris Fields; bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in > GenBank/EMBL/DDBJ > > Chris, > > I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all > of the more recent examples in t/LocationFactory.t come from there. > > Brian O. > > > On 10/18/06 9:55 AM, "Chris Fields" wrote: > > > EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. > > Not sure about UniProt/SwissProt. From Kevin.M.Brown at asu.edu Wed Oct 18 11:16:50 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Wed, 18 Oct 2006 08:16:50 -0700 Subject: [Bioperl-l] Blast information Message-ID: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu> I just recently upgraded to 1.5.1 on WinXP to bring this version closer to live to parse some locally created blast files. I'm trying to find the method that returns the values that are underneath the Identities and Positives information as I'm trying to replicate the output of an old blast parser we have here written in RealBasic which is showing its age. Once I have it replicating the old output I then intend to add more features in terms of filtering returned hits (like not returning self->self hits or a->b so don't show b->a). Example: I'm looking for the methods that will return 117 from identities and 117 from positives. I can't just use num_identical/percent_identity as that isn't 100% accurate. 
>BurkM_2016 Length = 241 Score = 43.2 bits (88), Expect = 7e-005 Identities = 26/117 (22%), Positives = 51/117 (43%) Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL 357 Q F F + A+ ++ + + + L +R GL + P E + A+L Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL 170 Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 Thanks, Kevin From cjfields at uiuc.edu Wed Oct 18 11:25:59 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 10:25:59 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4536113D.1080307@sheffield.ac.uk> Message-ID: <002601c6f2c9$b6d04a90$15327e82@pyrimidine> > I've just added test results for 1.5.2 RC2 to the wiki. > > There are lots of fails for packages other than bioperl-live. I'm not > sure excatly how the test fails/skipps are/should be handled since my > setups are as follows. > > Clean WinXP Pro: > This is a clean install of WinXP Pro SP2 with no major software > installed, other than ActivePerl 5.8.8.819 and a few tools for archive > extracting, anti virus etc. Therefore, I'm unsure how tests in > bioperl-network and bioperl-db should return. For example, I have made > no effort to setup biosql-schema but I thought that maybe there would be > a test that would detect this, and fail, then skip over other tests > gracefully - like the bioperl-run tests when a piece of software is not > installed??? > > Debian Linux: > This is a Bio-Linux machine with quite a lot of bioinformatics software > installed in the Path. So most of the tests in bioperl-run should > probably have passed. The same goes for bioperl-network and bioperl-db > as with my Windows setup. > > If my thoughts are totally wrong - let me know! 
> Nath The bioperl-db tests rely on a local BioSQL database and on having a properly set up configuration file (these are detailed in the bioperl-db INSTALL doc). Furthermore, there are serious problems with bioperl-db and WinXP (see Bug 1938 in Bugzilla). There is a workaround, but it isn't perfect by any means. http://bugzilla.open-bio.org/show_bug.cgi?id=1938 Many of the bioperl-run tests rely on environment variables being set properly, so maybe that's why they failed. These should all be detailed in the INSTALL file (but maybe they aren't?). I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac OS X yet but intend to do this within the week. The INSTALL file details the requirements for the packages (Graph 0.80 is the only one for bioperl-network, for instance, and there isn't a PPM for that version available yet). It would be nice to skip the tests based on absence of the particular modules or installed programs, and I think the eventual goal is to do this. However, all of the bioperl-related distributions have their own documentation, which outlines their installation, requirements, and use. At least we can point to that, which works for now. We could always start up a wiki page for the various bioperl distributions to monitor problems or issues with each based on OS, proposed enhancements/ideas, etc. Also, most (if not all, including core) have been primarily tested on some *nix-related system, which means that they may not work on Win32 systems. Though the Windows support is light-years ahead of what it used to be circa rel 0.7, I don't think it is foolproof yet, as witnessed by the bioperl-db bug. Frankly, we need more WinXP users for those packages willing to test them out and offer suggestions. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept.
of Biochemistry University of Illinois Urbana-Champaign l From bosborne11 at verizon.net Wed Oct 18 11:13:51 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 18 Oct 2006 11:13:51 -0400 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: <002501c6f2c6$27625540$15327e82@pyrimidine> Message-ID: Chris, No, I don't think they use the form X.Y. See below, from t/LocationFactory.t, we do support most of the forms using ?. Supposedly these tests accommodate all of the possible fuzzy locations encountered in Swissprot, I wrote these a year or so ago. Brian O. # UNCERTAIN locations and positions (Swissprot) "?2465..2774" => [$fuzzy_impl, 2465, 2465, "UNCERTAIN", 2774, 2774, "EXACT", "EXACT", 1, 1], "22..?64" => [$fuzzy_impl, 22, 22, "EXACT", 64, 64, "UNCERTAIN", "EXACT", 1, 1], "?22..?64" => [$fuzzy_impl, 22, 22, "UNCERTAIN", 64, 64, "UNCERTAIN", "EXACT", 1, 1], "?..>393" => [$fuzzy_impl, undef, undef, "UNCERTAIN", 393, undef, "AFTER", "UNCERTAIN", 1, 1], "<1..?" => [$fuzzy_impl, undef, 1, "BEFORE", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1], "?..536" => [$fuzzy_impl, undef, undef, "UNCERTAIN", 536, 536, "EXACT", "UNCERTAIN", 1, 1], "1..?" => [$fuzzy_impl, 1, 1, "EXACT", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1], "?..?" => [$fuzzy_impl, undef, undef, "UNCERTAIN", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1], # Not working yet: #"12..?1" => [$fuzzy_impl, # 1, 1, "UNCERTAIN", 12, 12, "EXACT", "EXACT", 1, 1] On 10/18/06 11:00 AM, "Chris Fields" wrote: > Do they still use the X.Y notations? Those are the most troublesome. I > guess we still don't support the ones containing '?'. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. 
of Biochemistry > University of Illinois Urbana-Champaign > > >> -----Original Message----- >> From: Brian Osborne [mailto:bosborne11 at verizon.net] >> Sent: Wednesday, October 18, 2006 9:43 AM >> To: Chris Fields; bioperl-l at lists.open-bio.org >> Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in >> GenBank/EMBL/DDBJ >> >> Chris, >> >> I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all >> of the more recent examples in t/LocationFactory.t come from there. >> >> Brian O. >> >> >> On 10/18/06 9:55 AM, "Chris Fields" wrote: >> >>> EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. >>> Not sure about UniProt/SwissProt. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Wed Oct 18 12:56:07 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 11:56:07 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <002601c6f2c9$b6d04a90$15327e82@pyrimidine> Message-ID: <000401c6f2d6$5144e2f0$15327e82@pyrimidine> ... > I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac > OS X yet but intended on doing this within the week. The INSTALL file > details > the requirements for the packages (Graph 0.80 is the only one for > bioperl-network, for instance, and there isn't a PPM for that version > available yet). ... All, As a follow-up on this, I tried bioperl-network and had similar failed tests with Graph 0.79 (the only PPM available from ActiveState). However, the INSTALL docs state that Graph 0.80 is needed, and the test run gave several warnings about not having Graph 0.80 installed. I made a PPM of Graph 0.80, installed it, reran the bioperl-network tests, and everything passed. Maybe we need to have a Graph PPM available for those who want bioperl-network?
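A minimal sketch of how a test script could make the Graph 0.80 check itself and skip only the tests that depend on it, rather than fail. It uses nothing beyond core Test::More. The helper module_at_least() is hypothetical, written for this illustration, and is not part of BioPerl or bioperl-network; the single Graph test is a placeholder for whatever the real script exercises.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical helper: true if $module loads and reports at least
# $min_version, false otherwise. Never dies.
sub module_at_least {
    my ($module, $min_version) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    my $ok = eval {
        require $file;
        $module->VERSION($min_version);       # dies if installed version is older
        1;
    };
    return $ok ? 1 : 0;
}

# Tests that do not depend on Graph always run.
ok(1, 'tests that do not need Graph still run');

SKIP: {
    # Skip just this block, with a visible reason, when Graph is
    # missing or too old.
    skip 'Graph 0.80 or later is not installed', 1
        unless module_at_least('Graph', '0.80');

    # Placeholder for the tests that really need Graph.
    ok(Graph->new->isa('Graph'), 'built a Graph object');
}
```

The skip reason is printed in the test output, so a run on a machine with Graph 0.79 reports why the block was skipped instead of listing failures.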
As for bioperl-run, all tests passed from a new CVS checkout even though I have none of the programs installed, so they seem to skip properly. The test run also printed warnings when a program wasn't available or installed. Chris From bosborne11 at verizon.net Wed Oct 18 13:10:34 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 18 Oct 2006 13:10:34 -0400 Subject: [Bioperl-l] Blast information In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu> Message-ID: Kevin, Are you looking for hsp_length()? See the SearchIO HOWTO for a list of methods: http://www.bioperl.org/wiki/HOWTO:SearchIO Brian O. On 10/18/06 11:16 AM, "Kevin Brown" wrote: > I just recently upgraded to 1.5.1 on WinXP to bring this version closer > to live to parse some locally created blast files. I'm trying to find > the method that returns the values that are underneath the Identities > and Positives information as I'm trying to replicate the output of an > old blast parser we have here written in RealBasic which is showing its > age. Once I have it replicating the old output I then intend to add > more features in terms of filtering returned hits (like not returning > self->self hits or a->b so don't show b->a). > > Example: > I'm looking for the methods that will return 117 from identities and 117 > from positives. I can't just use num_identical/percent_identity as that > isn't 100% accurate. 
> >> BurkM_2016 > Length = 241 > > Score = 43.2 bits (88), Expect = 7e-005 > Identities = 26/117 (22%), Positives = 51/117 (43%) > > Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL > 357 > Q F F + A+ ++ + + + L +R GL + P E + A+L > Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL > 170 > > Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 > A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L > Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 > > Thanks, > Kevin > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From Kevin.M.Brown at asu.edu Wed Oct 18 17:25:48 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Wed, 18 Oct 2006 14:25:48 -0700 Subject: [Bioperl-l] Blast information Message-ID: <1A4207F8295607498283FE9E93B775B4022A71C3@EX02.asurite.ad.asu.edu> Yes, that does indeed look like what I was after. > -----Original Message----- > From: Brian Osborne [mailto:bosborne11 at verizon.net] > Sent: Wednesday, October 18, 2006 10:11 AM > To: Kevin Brown; bioperl-l > Subject: Re: [Bioperl-l] Blast information > > Kevin, > > Are you looking for hsp_length()? See the SearchIO HOWTO for a list of > methods: > > http://www.bioperl.org/wiki/HOWTO:SearchIO > > > Brian O. > > > On 10/18/06 11:16 AM, "Kevin Brown" wrote: > > > I just recently upgraded to 1.5.1 on WinXP to bring this > version closer > > to live to parse some locally created blast files. I'm > trying to find > > the method that returns the values that are underneath the > Identities > > and Positives information as I'm trying to replicate the > output of an > > old blast parser we have here written in RealBasic which is > showing its > > age. 
Once I have it replicating the old output I then intend to add > > more features in terms of filtering returned hits (like not > returning > > self->self hits or a->b so don't show b->a). > > > > Example: > > I'm looking for the methods that will return 117 from > identities and 117 > > from positives. I can't just use > num_identical/percent_identity as that > > isn't 100% accurate. > > > >> BurkM_2016 > > Length = 241 > > > > Score = 43.2 bits (88), Expect = 7e-005 > > Identities = 26/117 (22%), Positives = 51/117 (43%) > > > > Query: 298 > QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL > > 357 > > Q F F + A+ ++ + + + L +R GL + > P E + A+L > > Sbjct: 111 > QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL > > 170 > > > > Query: 358 > MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 > > A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L > > Sbjct: 171 > KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 > > > > Thanks, > > Kevin > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > From n.appleby at uq.edu.au Wed Oct 18 17:58:06 2006 From: n.appleby at uq.edu.au (Nikki Appleby) Date: Thu, 19 Oct 2006 07:58:06 +1000 Subject: [Bioperl-l] CONTIG dealing Message-ID: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD> I have just entered the wonderful new world of BioPerl, so the answer to my question may be obvious to any of the gurus reading this. I need to collect sequence features and ontology annotations. Here goes. I am retrieving sequences from SwissProt via Bio::DB::SwissProt and get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into an RDBMS format that I am happy with I can get at the xref ids. In this case, they are AP003451; BAB86144.1; -; Genomic_DNA. AP008207; BAF07116.1; -; Genomic_DNA. AB103395; BAC81207.1; -; mRNA. 
I can happily go off and fetch those from Bio::DB::GenBank (first column), and Bio::DB::GenPept (second). All good, except... AP008207 is a contig. I don't want to get all of the features for the entire thing, just the single contig that actually matches the original sequence. It takes a couple of hours to get at it and then it gives me way too much. I will come across this problem with other sequences. How do I (a) find out if it is a contig without downloading it in its entirety and (b) extract the list of sequences that are about to be contigged together? I have searched the web for answers, including this list, but see nothing. Help! Nikki Appleby. From bosborne11 at verizon.net Wed Oct 18 20:54:04 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 18 Oct 2006 20:54:04 -0400 Subject: [Bioperl-l] LocatableSeq object vs Sequence Object In-Reply-To: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com> Message-ID: Peter, I'm not understanding your question, partly because your letter and your code are saying different things. You say you want to call location_from_column() but your code shows you calling species(). What happens when you call location_from_column()? Do you see errors? Brian O. On 10/17/06 12:26 PM, "Peter H. Baenziger" wrote: > I was thinking I could use: > foreach $seq ($alignment->each_seq()) > to loop through the sequences and call: > $seq->location_from_column($pos) > on each of the sequences. From cjfields at uiuc.edu Wed Oct 18 22:46:14 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 21:46:14 -0500 Subject: [Bioperl-l] CONTIG dealing In-Reply-To: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD> References: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD> Message-ID: On Oct 18, 2006, at 4:58 PM, Nikki Appleby wrote: > > I have just entered the wonderful new world of BioPerl, so the > answer to my > question may be obvious to any of the gurus reading this.
> > I need to collect sequence features and ontology annotations. Here > goes. > > I am retrieving sequences from SwissProt via Bio::DB::SwissProt and > get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into > an RDBMS > format that I am happy with I can get at the xref ids. In this > case, they > are > > AP003451; BAB86144.1; -; Genomic_DNA. > AP008207; BAF07116.1; -; Genomic_DNA. > AB103395; BAC81207.1; -; mRNA. > > I can happily go off and fetch those from Bio::DB::GenBank (first > column), > and Bio::DB::GenPept (second). All good, except... > > AP008207 is a contig. I don't want to get all of the features for > the entire > thing, just the single contig that actually matches the original > sequence. > It takes a couple of hours to get at it and then it gives me way > too much. > > I will come across this problem with other sequences. How do I (a) > find out > if it is a contig without downloading it in it's entirety and (b) > extract > the list of sequences that are about to be contigged together. > > I have searched the web for answers, including this list, but see > nothing. > Help! > > Nikki Appleby. The default setting for the retrieval format for GenBank is 'gbwithparts' (which gets the full sequence at all times). You can set this to 'gb' using request_format() to retrieve the sequence file with the contig information instead of the sequence, if it contains such (otherwise it just retrieves the sequence anyway). However, I have noticed this particular file does not represent a true contig record but is the entire chromosome sequence. The contig information is in the comments section, probably b/c the record is converted over. You could just download the sequence record and run regexp to grab the comments section, then parse out the contigs (a pain) if you really want that. Or you could try to find the equivalent GenBank record, such as the ones derived from the WGS records. 
I did notice the list of dbxrefs in your swissprot record indicate three EMBL sequences. If the order is consistent for the SwissProt entries you want, they probably represent: The contig (what you want): AP003451; BAB86144.1; -; Genomic_DNA. The supercontig (chromosome) : AP008207; BAF07116.1; -; Genomic_DNA. The cDNA : AB103395; BAC81207.1; -; mRNA. I checked the first one (AP003451), which seems to confirm this. Since the chromosome supercontig is built from the smaller sequence contigs you could just grab the first EMBL dbxref instead of all of them. It parses much faster than the chromosome file. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Wed Oct 18 11:47:14 2006 From: jason at bioperl.org (Jason Stajich) Date: Wed, 18 Oct 2006 08:47:14 -0700 Subject: [Bioperl-l] Blast information In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu> References: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu> Message-ID: <6B7D24F3-69F1-498D-AB53-B4CEB14E4F3D@bioperl.org> I think this will work for you. The seq_inds method parses the middle homology sequence and classifies each alignment column and returns a list of the columns meeting the criteria. You can interrogate query or hit in this case since you are requiring it to be identical my $identicalbases = scalar $hsp->seq_inds('query', 'identical'); my $conservedbases = scalar $hsp->seq_inds('query','conserved'); Conserved returns those identical or conserved, if you want just those with conservative replacements use 'conserved-not-identical' See http://bioperl.org/wiki/HOWTO:SearchIO#Table_of_Methods for more info. -jason On Oct 18, 2006, at 8:16 AM, Kevin Brown wrote: > I just recently upgraded to 1.5.1 on WinXP to bring this version > closer > to live to parse some locally created blast files. 
I'm trying to find > the method that returns the values that are underneath the Identities > and Positives information as I'm trying to replicate the output of an > old blast parser we have here written in RealBasic which is showing > its > age. Once I have it replicating the old output I then intend to add > more features in terms of filtering returned hits (like not returning > self->self hits or a->b so don't show b->a). > > Example: > I'm looking for the methods that will return 117 from identities > and 117 > from positives. I can't just use num_identical/percent_identity as > that > isn't 100% accurate. > >> BurkM_2016 > Length = 241 > > Score = 43.2 bits (88), Expect = 7e-005 > Identities = 26/117 (22%), Positives = 51/117 (43%) > > Query: 298 > QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL > 357 > Q F F + A+ ++ + + + L +R GL + P E + > A+L > Sbjct: 111 > QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL > 170 > > Query: 358 > MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 > A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L > Sbjct: 171 > KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 > > Thanks, > Kevin > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From jason at bioperl.org Thu Oct 19 01:00:28 2006 From: jason at bioperl.org (Jason Stajich) Date: Wed, 18 Oct 2006 22:00:28 -0700 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> Message-ID: 
<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> So I'm unsure what we should do here. We can certainly fix the problem you report, which comes from relying on the "" (stringification) method -- if you were instead to do: print $_->database, ":", $_->primary_id, "\n"; you'll get the right answer. At a minimum we can just fix the auto-string converting method to do The Right Thing. But I am not sure if we should keep the version out of the primary_id field. This will require some rejiggering in several modules when it comes to printing DBlinks and I don't want to do this before the release. I also am not sure if there was an explicit reason why someone did put the version information in the primary_id. (I hope it wasn't me because I don't think I'm going to remember why). Does anyone else have a strong feeling? -jason On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > Hello, > > I noticed a little problem with the Annotation "DBLink" from > GenBank entries > > When I run: > > perl -MBio::DB::GenBank -e 'my $gi = > 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio = > $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > ("dblink"); > for(@annotations) { print $_, "\n";} print $INC{ > "Bio/Annotation/DBLink.pm" }, "\n"; ' > > This yields: > > GenBank:AL591065.17.17 > > and the place where the used Bio/Annotation/DBLink.pm resides. > > Can others repeat this? > > I have dug into the source a little and Bio::Annotation::DBLink > seems to > be the place where this happens: it has a concatenation which leads to > that repeated version number. > > It this something that I should fix "client-side", so to speak, or > is it > worthwhile to add some logic to that concatenation to prevent this?
> > > Thanks, > > Eric > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From n.haigh at sheffield.ac.uk Thu Oct 19 02:41:02 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 19 Oct 2006 07:41:02 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <000401c6f2d6$5144e2f0$15327e82@pyrimidine> References: <000401c6f2d6$5144e2f0$15327e82@pyrimidine> Message-ID: <45371DFE.6050306@sheffield.ac.uk> > As a followup in this, I tried bioperl-network and had similar failed tests > with Graph 0.79 (the only PPM available from ActiveState). However, the > INSTALL docs state that Graph 0.80 is needed, and the test run gave several > warnings about not having Graph 0.80 installed. > > I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and > everything passed. Maybe we need to have a Graph PPM available for those > who want bioperl-network? > > As for bioperl-run, all tests passed from a new CVS checkout even though I > have none of the programs installed, so they seem to skip properly. The > test run also printed warnings when a program wasn't available or installed. > > > Chris > > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make modifications to integrate them into the package.xml file for PPM4 clients. Nath From n.haigh at sheffield.ac.uk Thu Oct 19 06:40:21 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Thu, 19 Oct 2006 11:40:21 +0100 Subject: [Bioperl-l] bioperl-run t/EMBOSS.t Message-ID: <45375615.1020603@sheffield.ac.uk> Should line 25 read: require Bio::Factory::EMBOSS instead of: require Bio::EMBOSS::Factory; Nath From hlapp at gmx.net Thu Oct 19 09:56:05 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 19 Oct 2006 09:56:05 -0400 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> Message-ID: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> Here is the overload code: use overload '""' => sub { (($_[0]->database ? $_[0]->database . ':' : '' ) . ($_[0]->primary_id ? $_[0]->primary_id : '') . ($_[0]->version ? '.' . $_[0]->version : '')) || '' }; Except that the last '||' is redundant and unnecessary (it either does nothing or replaces an empty string with an empty string), I don't see the potential for duplicating the version number here - unless primary_id() did that, which I don't see it doing. So, to me this seems to come from a parsing error in the beginning, rather than an erroneous mangling of version into primary_id later. Is someone in the position to confirm this? -hilmar On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > So I'm unsure what we should do here. > > We can certainly fix the problem which you report which is relying on > the "" method -- if you were to do instead: > print $_->database, ":", $_->primary_id, "\n"; > > you'll get the right answer. We at a minimum just fix the auto- > string converting method to do The Right Thing. > > But I am not sure if we should keep the version out of the primary_id > field. This will require some rejiggering in several modules when it > comes to printing DBlinks and I don't want to do this before the > release. 
I also am not sure if there was an explicit reason why > someone did put the version information in the primary_id. (I hope it > wasn't me because I don't think I'm going to remember why). > > Does anyone else have a strong feeling? > > -jason > On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > >> Hello, >> >> I noticed a little problem with the Annotation "DBLink" from >> GenBank entries >> >> When I run: >> >> perl -MBio::DB::GenBank -e 'my $gi = >> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio = >> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >> ("dblink"); >> for(@annotations) { print $_, "\n";} print $INC{ >> "Bio/Annotation/DBLink.pm" }, "\n"; ' >> >> This yields: >> >> GenBank:AL591065.17.17 >> >> and the place where the used Bio/Annotation/DBLink.pm resides. >> >> Can others repeat this? >> >> I have dug into the source a little and Bio::Annotation::DBLink >> seems to >> be the place where this happens: it has a concatenation which >> leads to >> that repeated version number. >> >> It this something that I should fix "client-side", so to speak, or >> is it >> worthwhile to add some logic to that concatenation to prevent this? 
>> >> >> Thanks, >> >> Eric >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich, PhD > Miller Research Fellow > University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From dmessina at wustl.edu Thu Oct 19 09:55:31 2006 From: dmessina at wustl.edu (David Messina) Date: Thu, 19 Oct 2006 08:55:31 -0500 Subject: [Bioperl-l] missing documentation (request for help) Message-ID: <69453D5F-7794-4DC7-BAE1-A8B2191752E6@wustl.edu> Hi all, There are a few modules missing a one-line description, and by one- line description, I'm referring to the part that comes after the module name in the POD. e.g. in =head1 NAME Bio::SearchIO - Driver for parsing Sequence Database Searches (BLAST, FASTA, ...) =head1 SYNOPSIS [etc...] "Driver for parsing Sequence Database Searches (BLAST, FASTA, ...)" is the one-line description (even though it falls onto two lines) :). I fixed the modules that I knew something about, but there are some I haven't used. Perhaps the author, or someone else familiar with these modules, could fill in an appropriate short description? 
Here is the list of affected modules: Bio::DB::Expression Bio::Expression::Contact Bio::Expression::DataSet Bio::Expression::Platform Bio::Expression::Sample Bio::Search::Processor Bio::DB::EUtilities::ElinkData Bio::DB::GFF::Adaptor::memory::feature_serializer Bio::DB::SeqFeature::Store::DBI::Iterator Bio::Expression::FeatureGroup::FeatureGroupMas50 Bio::Expression::FeatureSet::FeatureSetMas50 Bio::Matrix::PSM::PsmHeaderI Bio::OntologyIO::Handlers::BaseSAXHandler Some of these are missing other POD parts as well -- please add those too if you can. Thanks, Dave From mckays at cshl.edu Thu Oct 19 09:51:18 2006 From: mckays at cshl.edu (Sheldon McKay) Date: Thu, 19 Oct 2006 09:51:18 -0400 Subject: [Bioperl-l] chromosome ideograms Message-ID: <6b0de00426b3c04b0d0d7641bc8e14e3@cshl.edu> Hi, Sorry for the late reply. I have been working on a karyotype drawing tool as part of the Generic Genome Browser that may be useful. In addition to drawing features next to chromosome ideograms, it also supports making chromosome 'bands' from any kind of scored features to create a sort of heat map on the chromosome itself. I have a demo running at http://mckay.cshl.edu/cgi-bin/gbrowse_karyotype and the source is available from the GMOD CVS HEAD http://www.gmod.org/cvs Sheldon -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Sheldon McKay, PhD Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 From n.haigh at sheffield.ac.uk Thu Oct 19 11:37:31 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 19 Oct 2006 15:37:31 +0000 Subject: [Bioperl-l] bioperl-run t/EMBOSS.t In-Reply-To: <45375615.1020603@sheffield.ac.uk> References: <45375615.1020603@sheffield.ac.uk> Message-ID: <45379BBB.1040400@sheffield.ac.uk> Thanks for committing that change Brian. 
Now the tests proceed from this point, I get the following error: ------------- EXCEPTION: Bio::Root::NotImplemented ------------- MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not implemented by package Bio::Tools::Run::EMBOSSApplication. This is not your fault - author of Bio::Tools::Run::EMBOSSApplication should be blamed! STACK: Error::throw STACK: Bio::Root::Root::throw /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350 STACK: Bio::Root::RootI::throw_not_implemented /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522 STACK: Bio::Tools::Run::WrapperBase::program_dir /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346 STACK: Bio::Tools::Run::WrapperBase::program_path /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327 STACK: Bio::Tools::Run::WrapperBase::executable /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297 STACK: t/EMBOSS.t:58 ---------------------------------------------------------------- From N.Haigh at sheffield.ac.uk Thu Oct 19 11:03:00 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 16:03:00 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45379BBB.1040400@sheffield.ac.uk> References: <45375615.1020603@sheffield.ac.uk> <45379BBB.1040400@sheffield.ac.uk> Message-ID: <1161270180.453793a432e4f@webmail.shef.ac.uk> I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be consistent with other tests. Failing that - Is there a good test writing style I should follow in one of the other test files? Thanks Nathan From bosborne11 at verizon.net Thu Oct 19 11:06:08 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Thu, 19 Oct 2006 11:06:08 -0400 Subject: [Bioperl-l] bioperl-run t/EMBOSS.t In-Reply-To: <45379BBB.1040400@sheffield.ac.uk> Message-ID: Nathan, Yes, I see. 
Those EMBOSS programs work a bit differently from the typical app run by bioperl-run, there's no need for WrapperBase methods like program_dir(), executable(), it seems. Well, I can try and take a look at this tonight but there's probably someone better suited to this than me, I've spent very little time with bioperl-run. Volunteer? Brian O. On 10/19/06 11:37 AM, "Nathan S. Haigh" wrote: > Thanks for committing that change Brian. Now the tests proceed from this > point, I get the following error: > > ------------- EXCEPTION: Bio::Root::NotImplemented ------------- > MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not > implemented by package Bio::Tools::Run::EMBOSSApplication. > This is not your fault - author of Bio::Tools::Run::EMBOSSApplication > should be blamed! > > STACK: Error::throw > STACK: Bio::Root::Root::throw > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350 > STACK: Bio::Root::RootI::throw_not_implemented > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522 > STACK: Bio::Tools::Run::WrapperBase::program_dir > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346 > STACK: Bio::Tools::Run::WrapperBase::program_path > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327 > STACK: Bio::Tools::Run::WrapperBase::executable > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297 > STACK: t/EMBOSS.t:58 > ---------------------------------------------------------------- > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From niels at genomics.dk Thu Oct 19 11:16:37 2006 From: niels at genomics.dk (Niels Larsen) Date: Thu, 19 Oct 2006 17:16:37 +0200 Subject: [Bioperl-l] From EBI support re WU-Blast SOAP service In-Reply-To: <4535EBF9.1090706@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> 
<45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> Message-ID: <453796D5.2070808@genomics.dk> Sendu Bala wrote: >> I invoked the EBI script >> >> http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip >> >> like this >> >> WSWUBlastClient.pl -p blastn -D embl test.fasta >> >> where the content of test.fasta is below, and got >> >> Can't find method element in the message at >> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. > > As you admit, this is not a Bioperl issue. I would suggest you contact > EBI support. > To use EBI's WU-blast SOAP interface from perl, EBI support says one must use SOAP::Lite v 0.60 (no later version) and include '--email you.example.com' on the command line. This is evident neither from their web pages nor from the script usage statement, but they promised to fix this. ------------------------------------------------------------------------ Niels Larsen Danish Genome Institute Gustav Wieds vej 10 C DK-8000 Aarhus C Denmark Electronic mail: niels at genomics.dk Skype: niels_larsen_denmark Telephone: +45-8942-5268 Telefax: +45-8620-1222 ------------------------------------------------------------------------ From cjfields at uiuc.edu Thu Oct 19 11:31:45 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 10:31:45 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <45371DFE.6050306@sheffield.ac.uk> Message-ID: <001501c6f393$b66bd4a0$15327e82@pyrimidine> > > As a followup in this, I tried bioperl-network and had similar failed > tests > > with Graph 0.79 (the only PPM available from ActiveState). However, the > > INSTALL docs state that Graph 0.80 is needed, and the test run gave > several > > warnings about not having Graph 0.80 installed. > > > > I made a PPM of Graph 0.80, installed, retried bioperl-network tests, > and > > everything passed. Maybe we need to have a Graph PPM available for > those > > who want bioperl-network?
> > > > As for bioperl-run, all tests passed from a new CVS checkout even though > I > > have none of the programs installed, so they seem to skip properly. The > > test run also printed warnings when a program wasn't available or > installed. > > > > > > Chris > > > > > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make > modifications to integrate them into the package.xml file for PPM4 > clients. > > Nath Will do. Should these be forwarded to Mauricio? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From N.Haigh at sheffield.ac.uk Thu Oct 19 11:38:05 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 16:38:05 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <001501c6f393$b66bd4a0$15327e82@pyrimidine> References: <001501c6f393$b66bd4a0$15327e82@pyrimidine> Message-ID: <1161272285.45379bdd1aea4@webmail.shef.ac.uk> > > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make > > modifications to integrate them into the package.xml file for PPM4 > > clients. > > > > Nath > > Will do. Should these be forwarded to Mauricio? > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > If you don't have access to the web, you can send them to me - I now have an account on that server. Cheers Nath From cjfields at uiuc.edu Thu Oct 19 11:45:00 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 10:45:00 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk> Message-ID: <001601c6f395$8a752ed0$15327e82@pyrimidine> > I thought I'd have my first proper try at writing some tests. I was > wondering if there is a template test file that I should use/study in > order to be > consistent with other tests. > > Failing that - Is there a good test writing style I should follow in one > of the other test files? 
> > Thanks > Nathan I would start with the Test::Simple and Test::More perldoc; they're pretty self-explanatory. You can look at the various test suites using Test::More as well for pointers. By far, most tests will use is(). You can use SKIP blocks to skip tests that have a requirement, or skip all tests if they all require something. Pretty flexible. We should probably get a wiki page for the developers underway, maybe a HOWTO on writing tests. At least have these focus on BioPerl, OOP, remote DB tests, etc. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 19 12:23:40 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 11:23:40 -0500 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> Message-ID: <001b01c6f39a$f0288ba0$15327e82@pyrimidine> > Here is the overload code: > > use overload '""' => sub { > (($_[0]->database ? $_[0]->database . ':' : '' ) > . ($_[0]->primary_id ? $_[0]->primary_id : '') > . ($_[0]->version ? '.' . $_[0]->version : '')) > || '' }; > > Except that the last '||' is redundant and unnecessary (it either > does nothing or replaces an empty string with an empty string), I > don't see the potential for duplicating the version number here - > unless primary_id() did that, which I don't see it doing. > > So, to me this seems to come from a parsing error in the beginning, > rather than an erroneous mangling of version into primary_id later. > > Is someone in the position to confirm this? > > -hilmar I have attached a script to the bug report on Bugzilla, as well as the test output sequence and the actual GenBank record. There are a number of problems: 1) primary_id() is assigned both the id and version. 2) version() is still assigned the version. Together these explain the repeated version seen when printing the object directly via the overload (it concatenates them).
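A minimal sketch of the effect described in points 1) and 2), in plain Perl with no BioPerl needed. Demo::DBLink is a hypothetical stand-in class, not the real Bio::Annotation::DBLink; it only copies the concatenation from the overload quoted in this thread, and the field values are taken from the reported output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Annotation::DBLink: only the three
# accessors used by the stringification overload are modelled.
package Demo::DBLink;

# Same concatenation as the overload quoted in this thread.
use overload '""' => sub {
    my $self = shift;
    return ( $self->database   ? $self->database . ':'  : '' )
         . ( $self->primary_id ? $self->primary_id      : '' )
         . ( $self->version    ? '.' . $self->version   : '' );
}, fallback => 1;

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}
sub database   { $_[0]{database} }
sub primary_id { $_[0]{primary_id} }
sub version    { $_[0]{version} }

package main;

# As reported: the version ends up in primary_id AND in version.
my $as_parsed = Demo::DBLink->new(
    database   => 'GenBank',
    primary_id => 'AL591065.17',
    version    => 17,
);

# For comparison: the bare accession in primary_id.
my $bare_id = Demo::DBLink->new(
    database   => 'GenBank',
    primary_id => 'AL591065',
    version    => 17,
);

print "as parsed: $as_parsed\n";    # GenBank:AL591065.17.17
print "bare id:   $bare_id\n";      # GenBank:AL591065.17
```

The overload itself is doing what it was written to do; the repeated ".17" only appears because the version is already part of primary_id when the object is built.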
However, there are a few more issues. The ID is printed normally (accession.version), but the source DB is not present when SeqIO handles the sequence. I have attached the output and the original GenBank record to the bug report. I can look into it but it won't be today; got my hands full with enzyme assays. Chris From N.Haigh at sheffield.ac.uk Thu Oct 19 12:50:57 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 17:50:57 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine> References: <001601c6f395$8a752ed0$15327e82@pyrimidine> Message-ID: <1161276657.4537acf1edc80@webmail.shef.ac.uk> Quoting Chris Fields : > > I thought I'd have my first proper try at writing some tests. I was > > wondering if there is a template test file that I should use/study in > > order to be > > consistent with other tests. > > > > Failing that - Is there a good test writing style I should follow in one > > of the other test files? > > > > Thanks > > Nathan > > I would start with the Test::Simple and Test::More perldoc; they're pretty > self-explanatory. You can look at the various test suites using Test::More > as well for pointers. By far, most tests will use is(). You can use SKIP > blocks to skip tests that have a requirement, or skip all tests if they all > require something. Pretty flexible. > > We should probably get a wiki page for the developers underway, maybe a > HOWTO on writing tests. At least have these focus on BioPerl, OOP, remote > DB tests, etc. > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > > Just working through some test things now, I thought I'd start on the bioperl-run stuff as I thought it might be a bit more straight forward, i'm familiar with some of them and they seem to get neglected. I'm heavily commenting my tests with the thought of starting a wiki guide to testing Bioperl modules. 
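As a starting point for such a guide, here is a minimal sketch of a Test::More file using is() and a SKIP block, as suggested earlier in the thread. The function revcom() and the module name Some::Optional::Module are hypothetical placeholders for illustration only, not BioPerl code; a real test would load the module under test instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical function under test (placeholder for a real module method).
sub revcom {
    my ($dna) = @_;
    my $rc = scalar reverse $dna;      # reverse the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;      # complement each base
    return $rc;
}

# Most tests compare a result with an expected value using is().
is( revcom('ATGC'), 'GCAT', 'reverse complement of a short sequence' );
is( revcom(''),     '',     'empty string stays empty' );

SKIP: {
    # Skip, rather than fail, when an optional requirement is missing.
    eval { require Some::Optional::Module; 1 }
        or skip( 'Some::Optional::Module not installed', 2 );

    ok( Some::Optional::Module->can('new'), 'constructor is available' );
    ok( 1, 'further tests that need the module go here' );
}
```

The count passed to skip() has to match the number of tests inside the block, so that the plan (4 here) still adds up when the requirement is missing.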
See how far I get! Nath From hlapp at gmx.net Thu Oct 19 13:11:27 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 19 Oct 2006 13:11:27 -0400 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org> Message-ID: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net> Actually you did that Jason: http://tinyurl.com/ye2edk Apparently the motivation was to "parse swissprot fields in genpept file (dbsource)"? It clearly looks wrong to add the version. You've probably had a reason why you did this at the time but if we (you :) can't recover that I guess it's best to just fix it to do the right thing (in both places obviously). -hilmar On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote: > Well there is explicit addition of the version to the primary id so > it isn't so much a parsing error as a deliberate decision to append > it. > see Bio::SeqIO::genbank > > to make the dblink > $annotation- > >add_Annotation > ('dblink', > > Bio::Annotation::DBLink->new > (-primary_id > => $id . "." . $version, > -version => > $version, > -database => > $db, > -tagname => > 'dblink')); > > and the code to print the dblink back out in the writer already > assumes the version number is appended... > > foreach my $ref ( $seq->annotation->get_Annotations > ('dblink') ) { > # if ($ref->comment eq 'DBSOURCE') { > $self->_print('DBSOURCE accession ', > $ref->primary_id, "\n"); > # } > } > > On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > >> Here is the overload code: >> >> use overload '""' => sub { >> (($_[0]->database ? $_[0]->database . ':' : '' ) >> . ($_[0]->primary_id ? $_[0]->primary_id : '') >> . ($_[0]->version ? '.' . 
$_[0]->version : '')) >> || '' }; >> >> Except that the last '||' is redundant and unnecessary (it either >> does nothing or replaces an empty string with an empty string), I >> don't see the potential for duplicating the version number here - >> unless primary_id() did that, which I don't see it doing. >> >> So, to me this seems to come from a parsing error in the >> beginning, rather than an erroneous mangling of version into >> primary_id later. >> >> Is someone in the position to confirm this? >> >> -hilmar >> >> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: >> >>> So I'm unsure what we should do here. >>> >>> We can certainly fix the problem which you report which is >>> relying on >>> the "" method -- if you were to do instead: >>> print $_->database, ":", $_->primary_id, "\n"; >>> >>> you'll get the right answer. We at a minimum just fix the auto- >>> string converting method to do The Right Thing. >>> >>> But I am not sure if we should keep the version out of the >>> primary_id >>> field. This will require some rejiggering in several modules >>> when it >>> comes to printing DBlinks and I don't want to do this before the >>> release. I also am not sure if there was an explicit reason why >>> someone did put the version information in the primary_id. (I >>> hope it >>> wasn't me because I don't think I'm going to remember why). >>> >>> Does anyone else have a strong feeling? 
>>> >>> -jason >>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: >>> >>>> Hello, >>>> >>>> I noticed a little problem with the Annotation "DBLink" from >>>> GenBank entries >>>> >>>> When I run: >>>> >>>> perl -MBio::DB::GenBank -e 'my $gi = >>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my >>>> $seqio = >>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >>>> ("dblink"); >>>> for(@annotations) { print $_, "\n";} print $INC{ >>>> "Bio/Annotation/DBLink.pm" }, "\n"; ' >>>> >>>> This yields: >>>> >>>> GenBank:AL591065.17.17 >>>> >>>> and the place where the used Bio/Annotation/DBLink.pm resides. >>>> >>>> Can others repeat this? >>>> >>>> I have dug into the source a little and Bio::Annotation::DBLink >>>> seems to >>>> be the place where this happens: it has a concatenation which >>>> leads to >>>> that repeated version number. >>>> >>>> It this something that I should fix "client-side", so to speak, or >>>> is it >>>> worthwhile to add some logic to that concatenation to prevent this? 
>>>> >>>> >>>> Thanks, >>>> >>>> Eric >>>> >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> -- >>> Jason Stajich, PhD >>> Miller Research Fellow >>> University of California >>> Dept of Plant and Microbial Biology >>> 321 Koshland Hall #3102 >>> Berkeley, CA 94720-3102 >>> lab: 510.642.8441 >>> http://pmb.berkeley.edu/~taylor/people/js.html >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> > > -- > Jason Stajich, PhD > Miller Research Fellow > University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From N.Haigh at sheffield.ac.uk Thu Oct 19 13:17:33 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 18:17:33 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine> References: <001601c6f395$8a752ed0$15327e82@pyrimidine> Message-ID: <1161278253.4537b32dd3d15@webmail.shef.ac.uk> Quoting Chris Fields : > > I thought I'd have my first proper try at writing some tests. I was > > wondering if there is a template test file that I should use/study in > > order to be > > consistent with other tests. > > > > Failing that - Is there a good test writing style I should follow in one > > of the other test files? 
> >
> > Thanks
> > Nathan
>
> I would start with the Test::Simple and Test::More perldoc; they're pretty
> self-explanatory.  You can look at the various test suites using Test::More
> as well for pointers.  By far, most tests will use is().  You can use SKIP
> blocks to skip tests that have a requirement, or skip all tests if they all
> require something.  Pretty flexible.
>
> We should probably get a wiki page for the developers underway, maybe a
> HOWTO on writing tests.  At least have these focus on BioPerl, OOP, remote
> DB tests, etc.
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>

Just wrote a partial and small test script for t/Amap.t in bioperl-run. When I
run "perl -I. t/Amap.t" I get the following output:

1..10
ok 1 - use Bio::Tools::Run::Alignment::Amap;
ok 2 - use Bio::AlignIO;
ok 3 - use Bio::SeqIO;
ok 4 - use Bio::Root::IO;
ok 5 - All the required modules are present
ok 6 - new() returned something
ok 7 -  and its the right class
not ok 8 - executable() got the correct filename
#   Failed test 'executable() got the correct filename'
#   in t/Amap.t at line 90.
#          got: undef
#     expected: 'filename'
ok 9 # skip Got incorrect filename for executable
ok 10 # skip Got incorrect filename for executable
# Looks like you failed 1 test of 10.

So far this looks good (well, in that it's failing and passing the tests I
expected). However, when I run "make test" the output is unexpected and I don't
know why. It seems to die and produce the results of the testing before the rest
of the test suite is run:

t/Amap....................NOK 8
#   Failed test 'executable() got the correct filename'
#   in t/Amap.t at line 90.
#          got: undef
#     expected: 'filename'
# Looks like you failed 1 test of 10.
t/Amap....................dubious
        Test returned status 1 (wstat 256, 0x100)
DIED. FAILED test 8
        Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay, 70.00%)
t/Analysis_soap...........ok 7/17make: *** wait: No child processes.  Stop.

Is there something I'm missing?? If it's something less obvious, let me know and
I'll post the whole test file.

Nath

From cjfields at uiuc.edu Thu Oct 19 13:26:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 12:26:45 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161278253.4537b32dd3d15@webmail.shef.ac.uk>
Message-ID: <002001c6f3a3$c00b9080$15327e82@pyrimidine>

...
> Just wrote a partial and small test script for t/Amap.t in bioperl-run.
> When I run "perl -I. t/Amap.t" I get the following output:
> 1..10
> ok 1 - use Bio::Tools::Run::Alignment::Amap;
> ok 2 - use Bio::AlignIO;
> ok 3 - use Bio::SeqIO;
> ok 4 - use Bio::Root::IO;
> ok 5 - All the required modules are present
> ok 6 - new() returned something
> ok 7 -  and its the right class
> not ok 8 - executable() got the correct filename
> #   Failed test 'executable() got the correct filename'
> #   in t/Amap.t at line 90.
> #          got: undef
> #     expected: 'filename'
> ok 9 # skip Got incorrect filename for executable
> ok 10 # skip Got incorrect filename for executable
> # Looks like you failed 1 test of 10.
>
>
> So far this looks good (well, in that it's failing and passing the tests I
> expected). However, when I run "make test" the output is unexpected and I
> don't know why. It seems to die and produce the results of the testing
> before the rest of the test suite is run:
> t/Amap....................NOK 8
> #   Failed test 'executable() got the correct filename'
> #   in t/Amap.t at line 90.
> #          got: undef
> #     expected: 'filename'
> # Looks like you failed 1 test of 10.
> t/Amap....................dubious
>         Test returned status 1 (wstat 256, 0x100)
> DIED. FAILED test 8
>         Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay,
> 70.00%)
> t/Analysis_soap...........ok 7/17make: *** wait: No child processes.
> Stop.
> > > > Is there something I'm missing?? If it's something less obvious, let me > know and i'll post whole test file. > Nath Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be the problem. The only issue I can think of is that Test::More TODO blocks require a newer version of Test::Harness (which most users have anyway). Are you using a TODO block? You can send me Amap.t and I'll give it a try, but I can't promise I'll get to it immediately (busy day). Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From N.Haigh at sheffield.ac.uk Thu Oct 19 13:38:25 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 18:38:25 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine> References: <002001c6f3a3$c00b9080$15327e82@pyrimidine> Message-ID: <1161279505.4537b811e143f@webmail.shef.ac.uk> > Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be > the problem. The only issue I can think of is that Test::More TODO blocks > require a newer version of Test::Harness (which most users have anyway). > Are you using a TODO block? > > You can send me Amap.t and I'll give it a try, but I can't promise I'll get > to it immediately (busy day). > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > No TODO blocks. I must have done something wrong - it's the first time I've seen this - but then again, I don't look that closely at the output of "make test" unless something shows as a fail. Anyway, below is the short bit of code. 
Thanks
Nath


use strict;
use Bio::Root::IO; # can't test for this, might be needed to get Test::More

BEGIN {
    # Things to do ASAP once the script is run
    # even before anything else in the file is parsed
    use vars qw($NUMTESTS $DEBUG $error);
    $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;

    # Use installed Test module, otherwise fall back
    # to copy of Test.pm located in the t dir
    eval { require Test::More; };
    if ( $@ ) {
        use lib Bio::Root::IO->catfile('t','lib');
    }

    # Currently no errors
    $error = 0;

    # Setup the number of tests to be run
    # what about using:
    # use Test::More 'no_plan';
    use Test::More;
    $NUMTESTS = 10;
    plan tests => $NUMTESTS;

    # Use modules that are needed in this test that are from
    # any of the Bioperl packages: Bioperl-core, Bioperl-run ... etc
    # use_ok('');
    use_ok('Bio::Tools::Run::Alignment::Amap');
    use_ok('Bio::AlignIO');
    use_ok('Bio::SeqIO');
    use_ok('Bio::Root::IO');
}

# Multiple END blocks are run in reverse order of their definition
# Last In, First Out (LIFO)
END {
    # Things to do right at the very end, just
    # when the interpreter finishes/exits
    # E.g. deleting intermediate files produced during the test

    foreach my $file ( qw(cysprot.dnd cysprot1a.dnd) ) {
        unlink $file;
        # check it was deleted
    }
    #unlink qw(cysprot.dnd cysprot1a.dnd)
}

END {
    # Not sure what this is doing?
    #for ( $Test::ntest..$NUMTESTS ) {
    #    skip("Amap program not found. Skipping.\n",1);
    #}
}

# if we got to here, that's OK!
# is this really needed?
ok( 1, 'All the required modules are present');

# setup input files etc
my $inputfilename = Bio::Root::IO->catfile("t","data","cysprot.fa");

# setup output files etc
# none in this test

# setup global objects that are to be used in more than one test
# Also test they were initialised correctly
my @params = ();
my $aln;
my $factory = Bio::Tools::Run::Alignment::Amap->new(@params);
ok( defined $factory, 'new() returned something' );
ok( $factory->isa('Bio::Tools::Run::Alignment::Amap'), ' and its the right class' );

# Now onto the nitty gritty tests of the module's methods
my $executable_file = $factory->executable();
#is( $factory->executable(), 'filename', 'executable() got the correct filename' );

# block of tests to skip if you know the tests will fail
# under some condition. E.g.:
#   Need network access,
#   Won't work on particular OS,
#   Can't find the executable
# Do not just skip tests that seem to fail for an unknown reason
SKIP: {
    # condition used to skip this block of tests
    #skip($why, $how_many_in_block);
    skip("Got incorrect filename for executable", 2)
        unless is($factory->executable(), 'filename', 'executable() got the correct filename');
    ok( -e $executable_file, 'Found executable' );
    ok( $factory->version >= 2.0, 'Code tested on Amap versions >= 2.0' );
}

From jason at bioperl.org Thu Oct 19 13:44:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 10:44:51 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org> <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID:

Yikes - I was worried that it might have been me.....
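
For anyone following the thread, the doubled version can be reproduced without BioPerl at all. This is only a minimal sketch: My::DBLink is a hypothetical stand-in (not the real Bio::Annotation::DBLink) that carries nothing but the stringification overload quoted in this thread, and the hash keys are assumptions made for the illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Annotation::DBLink. It only mimics the
# '""' overload quoted in this thread; it is not the real class.
package My::DBLink;

use overload '""' => sub {
    my $self = shift;
    return ( $self->{database}   ? $self->{database} . ':' : '' )
         . ( $self->{primary_id} ? $self->{primary_id}     : '' )
         . ( $self->{version}    ? '.' . $self->{version}  : '' );
};

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

package main;

# What the genbank parser is reported to do: the version is appended to
# primary_id AND stored separately in version.
my $as_parsed = My::DBLink->new(
    database   => 'GenBank',
    primary_id => 'AL591065' . '.' . '17',
    version    => 17,
);

# What the object would hold if primary_id carried only the accession.
my $as_fixed = My::DBLink->new(
    database   => 'GenBank',
    primary_id => 'AL591065',
    version    => 17,
);

print "$as_parsed\n";    # GenBank:AL591065.17.17
print "$as_fixed\n";     # GenBank:AL591065.17
```

Under these assumptions the overload itself is doing what it says; the repeat only appears because primary_id already contains the version when the object is built.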
Okay I'll look into fixing it -- ChrisF - check in with me before diving in, in case I've gotten it done and I expect your enzyme assays might take up the time. -jason On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote: > Actually you did that Jason: http://tinyurl.com/ye2edk > > Apparently the motivation was to "parse swissprot fields in genpept > file (dbsource)"? > > It clearly looks wrong to add the version. You've probably had a > reason why you did this at the time but if we (you :) can't recover > that I guess it's best to just fix it to do the right thing (in > both places obviously). > > -hilmar > > On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote: > >> Well there is explicit addition of the version to the primary id >> so it isn't so much a parsing error as a deliberate decision to >> append it. >> see Bio::SeqIO::genbank >> >> to make the dblink >> $annotation- >> >add_Annotation >> ('dblink', >> >> Bio::Annotation::DBLink->new >> (-primary_id >> => $id . "." . $version, >> -version => >> $version, >> -database => >> $db, >> -tagname => >> 'dblink')); >> >> and the code to print the dblink back out in the writer already >> assumes the version number is appended... >> >> foreach my $ref ( $seq->annotation->get_Annotations >> ('dblink') ) { >> # if ($ref->comment eq 'DBSOURCE') { >> $self->_print('DBSOURCE accession ', >> $ref->primary_id, "\n"); >> # } >> } >> >> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: >> >>> Here is the overload code: >>> >>> use overload '""' => sub { >>> (($_[0]->database ? $_[0]->database . ':' : '' ) >>> . ($_[0]->primary_id ? $_[0]->primary_id : '') >>> . ($_[0]->version ? '.' . $_[0]->version : '')) >>> || '' }; >>> >>> Except that the last '||' is redundant and unnecessary (it either >>> does nothing or replaces an empty string with an empty string), I >>> don't see the potential for duplicating the version number here - >>> unless primary_id() did that, which I don't see it doing. 
>>> >>> So, to me this seems to come from a parsing error in the >>> beginning, rather than an erroneous mangling of version into >>> primary_id later. >>> >>> Is someone in the position to confirm this? >>> >>> -hilmar >>> >>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: >>> >>>> So I'm unsure what we should do here. >>>> >>>> We can certainly fix the problem which you report which is >>>> relying on >>>> the "" method -- if you were to do instead: >>>> print $_->database, ":", $_->primary_id, "\n"; >>>> >>>> you'll get the right answer. We at a minimum just fix the auto- >>>> string converting method to do The Right Thing. >>>> >>>> But I am not sure if we should keep the version out of the >>>> primary_id >>>> field. This will require some rejiggering in several modules >>>> when it >>>> comes to printing DBlinks and I don't want to do this before the >>>> release. I also am not sure if there was an explicit reason why >>>> someone did put the version information in the primary_id. (I >>>> hope it >>>> wasn't me because I don't think I'm going to remember why). >>>> >>>> Does anyone else have a strong feeling? >>>> >>>> -jason >>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: >>>> >>>>> Hello, >>>>> >>>>> I noticed a little problem with the Annotation "DBLink" from >>>>> GenBank entries >>>>> >>>>> When I run: >>>>> >>>>> perl -MBio::DB::GenBank -e 'my $gi = >>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my >>>>> $seqio = >>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >>>>> ("dblink"); >>>>> for(@annotations) { print $_, "\n";} print $INC{ >>>>> "Bio/Annotation/DBLink.pm" }, "\n"; ' >>>>> >>>>> This yields: >>>>> >>>>> GenBank:AL591065.17.17 >>>>> >>>>> and the place where the used Bio/Annotation/DBLink.pm resides. >>>>> >>>>> Can others repeat this? 
>>>>> >>>>> I have dug into the source a little and Bio::Annotation::DBLink >>>>> seems to >>>>> be the place where this happens: it has a concatenation which >>>>> leads to >>>>> that repeated version number. >>>>> >>>>> It this something that I should fix "client-side", so to speak, or >>>>> is it >>>>> worthwhile to add some logic to that concatenation to prevent >>>>> this? >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Eric >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> -- >>>> Jason Stajich, PhD >>>> Miller Research Fellow >>>> University of California >>>> Dept of Plant and Microbial Biology >>>> 321 Koshland Hall #3102 >>>> Berkeley, CA 94720-3102 >>>> lab: 510.642.8441 >>>> http://pmb.berkeley.edu/~taylor/people/js.html >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >> >> -- >> Jason Stajich, PhD >> Miller Research Fellow >> University of California >> Dept of Plant and Microbial Biology >> 321 Koshland Hall #3102 >> Berkeley, CA 94720-3102 >> lab: 510.642.8441 >> http://pmb.berkeley.edu/~taylor/people/js.html >> >> > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From cjfields at uiuc.edu 
Thu Oct 19 14:03:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:03:52 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID: <000001c6f3a8$f0a46a00$15327e82@pyrimidine>

It also seems that the DBSOURCE line isn't caught correctly and is stuffed by
default into a GenBank dblink (the dbsource in the test case is EMBL, not
GenBank).

http://bugzilla.open-bio.org/show_bug.cgi?id=2124

It looks like NCBI may now be using:

DBSOURCE    embl accession Z49548.1

instead of the old version:

DBSOURCE    embl locus SCYJR048W, accession Z49548.1

I don't recall NCBI mentioning changes regarding DBSOURCE in any of the
recent release notes.

Chris

> Actually you did that Jason: http://tinyurl.com/ye2edk
>
> Apparently the motivation was to "parse swissprot fields in genpept
> file (dbsource)"?
>
> It clearly looks wrong to add the version. You've probably had a
> reason why you did this at the time but if we (you :) can't recover
> that I guess it's best to just fix it to do the right thing (in both
> places obviously).
>
> 	-hilmar
>
> On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote:
>
> > Well there is explicit addition of the version to the primary id so
> > it isn't so much a parsing error as a deliberate decision to append
> > it.
> > see Bio::SeqIO::genbank
> >
> > to make the dblink
> >     $annotation->add_Annotation
> >         ('dblink',
> >          Bio::Annotation::DBLink->new
> >              (-primary_id => $id . "." . $version,
> >               -version    => $version,
> >               -database   => $db,
> >               -tagname    => 'dblink'));
> >
> > and the code to print the dblink back out in the writer already
> > assumes the version number is appended...
> > > > foreach my $ref ( $seq->annotation->get_Annotations > > ('dblink') ) { > > # if ($ref->comment eq 'DBSOURCE') { > > $self->_print('DBSOURCE accession ', > > $ref->primary_id, "\n"); > > # } > > } > > > > On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > > > >> Here is the overload code: > >> > >> use overload '""' => sub { > >> (($_[0]->database ? $_[0]->database . ':' : '' ) > >> . ($_[0]->primary_id ? $_[0]->primary_id : '') > >> . ($_[0]->version ? '.' . $_[0]->version : '')) > >> || '' }; > >> > >> Except that the last '||' is redundant and unnecessary (it either > >> does nothing or replaces an empty string with an empty string), I > >> don't see the potential for duplicating the version number here - > >> unless primary_id() did that, which I don't see it doing. > >> > >> So, to me this seems to come from a parsing error in the > >> beginning, rather than an erroneous mangling of version into > >> primary_id later. > >> > >> Is someone in the position to confirm this? > >> > >> -hilmar > >> > >> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > >> > >>> So I'm unsure what we should do here. > >>> > >>> We can certainly fix the problem which you report which is > >>> relying on > >>> the "" method -- if you were to do instead: > >>> print $_->database, ":", $_->primary_id, "\n"; > >>> > >>> you'll get the right answer. We at a minimum just fix the auto- > >>> string converting method to do The Right Thing. > >>> > >>> But I am not sure if we should keep the version out of the > >>> primary_id > >>> field. This will require some rejiggering in several modules > >>> when it > >>> comes to printing DBlinks and I don't want to do this before the > >>> release. I also am not sure if there was an explicit reason why > >>> someone did put the version information in the primary_id. (I > >>> hope it > >>> wasn't me because I don't think I'm going to remember why). > >>> > >>> Does anyone else have a strong feeling? 
> >>> > >>> -jason > >>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > >>> > >>>> Hello, > >>>> > >>>> I noticed a little problem with the Annotation "DBLink" from > >>>> GenBank entries > >>>> > >>>> When I run: > >>>> > >>>> perl -MBio::DB::GenBank -e 'my $gi = > >>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my > >>>> $seqio = > >>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > >>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > >>>> ("dblink"); > >>>> for(@annotations) { print $_, "\n";} print $INC{ > >>>> "Bio/Annotation/DBLink.pm" }, "\n"; ' > >>>> > >>>> This yields: > >>>> > >>>> GenBank:AL591065.17.17 > >>>> > >>>> and the place where the used Bio/Annotation/DBLink.pm resides. > >>>> > >>>> Can others repeat this? > >>>> > >>>> I have dug into the source a little and Bio::Annotation::DBLink > >>>> seems to > >>>> be the place where this happens: it has a concatenation which > >>>> leads to > >>>> that repeated version number. > >>>> > >>>> It this something that I should fix "client-side", so to speak, or > >>>> is it > >>>> worthwhile to add some logic to that concatenation to prevent this? 
> >>>> > >>>> > >>>> Thanks, > >>>> > >>>> Eric > >>>> > >>>> > >>>> > >>>> _______________________________________________ > >>>> Bioperl-l mailing list > >>>> Bioperl-l at lists.open-bio.org > >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>> > >>> -- > >>> Jason Stajich, PhD > >>> Miller Research Fellow > >>> University of California > >>> Dept of Plant and Microbial Biology > >>> 321 Koshland Hall #3102 > >>> Berkeley, CA 94720-3102 > >>> lab: 510.642.8441 > >>> http://pmb.berkeley.edu/~taylor/people/js.html > >>> > >>> > >>> _______________________________________________ > >>> Bioperl-l mailing list > >>> Bioperl-l at lists.open-bio.org > >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>> > >> > >> -- > >> =========================================================== > >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > >> =========================================================== > >> > >> > >> > >> > >> > > > > -- > > Jason Stajich, PhD > > Miller Research Fellow > > University of California > > Dept of Plant and Microbial Biology > > 321 Koshland Hall #3102 > > Berkeley, CA 94720-3102 > > lab: 510.642.8441 > > http://pmb.berkeley.edu/~taylor/people/js.html > > > > > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From N.Haigh at sheffield.ac.uk Thu Oct 19 14:06:11 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 19:06:11 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine> References: <002001c6f3a3$c00b9080$15327e82@pyrimidine> Message-ID: <1161281171.4537be93b63c9@webmail.shef.ac.uk> > > Well, Analysis_soap.t ran immediately after, so 
Amap.t tests shouldn't be
> the problem.  The only issue I can think of is that Test::More TODO blocks
> require a newer version of Test::Harness (which most users have anyway).
> Are you using a TODO block?
>
> You can send me Amap.t and I'll give it a try, but I can't promise I'll get
> to it immediately (busy day).
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>

Never mind about this - it's working as expected! I got confused because a
previous run threw errors that weren't included in the final table of failed
tests - working now.

Nath

From N.Haigh at sheffield.ac.uk Thu Oct 19 14:14:54 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:14:54 +0100
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>

I have a few questions about how bioperl-run modules work.

1) How does a module define the name of the executable it uses?
2) Is there a way to test what this is?
3) Does $factory->executable return this, or does it only return the name if
it successfully found it?

Thanks
Nath

From cjfields at uiuc.edu Thu Oct 19 14:15:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:15:08 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To:
Message-ID: <000001c6f3aa$82845ba0$15327e82@pyrimidine>

Go for it.  I haven't got the time to spare at the moment, sucky protein
assays....

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Jason Stajich > Sent: Thursday, October 19, 2006 12:45 PM > To: Hilmar Lapp > Cc: bioperl-l at lists.open-bio.org; Erikjan > Subject: Re: [Bioperl-l] Annotation-DBLink- version numbers repeating > > Yikes - I was worried that it might have been me..... > > Okay I'll look into fixing it -- ChrisF - check in with me before > diving in, in case I've gotten it done and I expect your enzyme > assays might take up the time. > > -jason > On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote: > > > Actually you did that Jason: http://tinyurl.com/ye2edk > > > > Apparently the motivation was to "parse swissprot fields in genpept > > file (dbsource)"? > > > > It clearly looks wrong to add the version. You've probably had a > > reason why you did this at the time but if we (you :) can't recover > > that I guess it's best to just fix it to do the right thing (in > > both places obviously). > > > > -hilmar > > > > On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote: > > > >> Well there is explicit addition of the version to the primary id > >> so it isn't so much a parsing error as a deliberate decision to > >> append it. > >> see Bio::SeqIO::genbank > >> > >> to make the dblink > >> $annotation- > >> >add_Annotation > >> ('dblink', > >> > >> Bio::Annotation::DBLink->new > >> (-primary_id > >> => $id . "." . $version, > >> -version => > >> $version, > >> -database => > >> $db, > >> -tagname => > >> 'dblink')); > >> > >> and the code to print the dblink back out in the writer already > >> assumes the version number is appended... 
> >> > >> foreach my $ref ( $seq->annotation->get_Annotations > >> ('dblink') ) { > >> # if ($ref->comment eq 'DBSOURCE') { > >> $self->_print('DBSOURCE accession ', > >> $ref->primary_id, "\n"); > >> # } > >> } > >> > >> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > >> > >>> Here is the overload code: > >>> > >>> use overload '""' => sub { > >>> (($_[0]->database ? $_[0]->database . ':' : '' ) > >>> . ($_[0]->primary_id ? $_[0]->primary_id : '') > >>> . ($_[0]->version ? '.' . $_[0]->version : '')) > >>> || '' }; > >>> > >>> Except that the last '||' is redundant and unnecessary (it either > >>> does nothing or replaces an empty string with an empty string), I > >>> don't see the potential for duplicating the version number here - > >>> unless primary_id() did that, which I don't see it doing. > >>> > >>> So, to me this seems to come from a parsing error in the > >>> beginning, rather than an erroneous mangling of version into > >>> primary_id later. > >>> > >>> Is someone in the position to confirm this? > >>> > >>> -hilmar > >>> > >>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > >>> > >>>> So I'm unsure what we should do here. > >>>> > >>>> We can certainly fix the problem which you report which is > >>>> relying on > >>>> the "" method -- if you were to do instead: > >>>> print $_->database, ":", $_->primary_id, "\n"; > >>>> > >>>> you'll get the right answer. We at a minimum just fix the auto- > >>>> string converting method to do The Right Thing. > >>>> > >>>> But I am not sure if we should keep the version out of the > >>>> primary_id > >>>> field. This will require some rejiggering in several modules > >>>> when it > >>>> comes to printing DBlinks and I don't want to do this before the > >>>> release. I also am not sure if there was an explicit reason why > >>>> someone did put the version information in the primary_id. (I > >>>> hope it > >>>> wasn't me because I don't think I'm going to remember why). 
> >>>> > >>>> Does anyone else have a strong feeling? > >>>> > >>>> -jason > >>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > >>>> > >>>>> Hello, > >>>>> > >>>>> I noticed a little problem with the Annotation "DBLink" from > >>>>> GenBank entries > >>>>> > >>>>> When I run: > >>>>> > >>>>> perl -MBio::DB::GenBank -e 'my $gi = > >>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my > >>>>> $seqio = > >>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > >>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > >>>>> ("dblink"); > >>>>> for(@annotations) { print $_, "\n";} print $INC{ > >>>>> "Bio/Annotation/DBLink.pm" }, "\n"; ' > >>>>> > >>>>> This yields: > >>>>> > >>>>> GenBank:AL591065.17.17 > >>>>> > >>>>> and the place where the used Bio/Annotation/DBLink.pm resides. > >>>>> > >>>>> Can others repeat this? > >>>>> > >>>>> I have dug into the source a little and Bio::Annotation::DBLink > >>>>> seems to > >>>>> be the place where this happens: it has a concatenation which > >>>>> leads to > >>>>> that repeated version number. > >>>>> > >>>>> It this something that I should fix "client-side", so to speak, or > >>>>> is it > >>>>> worthwhile to add some logic to that concatenation to prevent > >>>>> this? 
> >>>>> > >>>>> > >>>>> Thanks, > >>>>> > >>>>> Eric > >>>>> > >>>>> > >>>>> > >>>>> _______________________________________________ > >>>>> Bioperl-l mailing list > >>>>> Bioperl-l at lists.open-bio.org > >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>>> > >>>> -- > >>>> Jason Stajich, PhD > >>>> Miller Research Fellow > >>>> University of California > >>>> Dept of Plant and Microbial Biology > >>>> 321 Koshland Hall #3102 > >>>> Berkeley, CA 94720-3102 > >>>> lab: 510.642.8441 > >>>> http://pmb.berkeley.edu/~taylor/people/js.html > >>>> > >>>> > >>>> _______________________________________________ > >>>> Bioperl-l mailing list > >>>> Bioperl-l at lists.open-bio.org > >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>>> > >>> > >>> -- > >>> =========================================================== > >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > >>> =========================================================== > >>> > >>> > >>> > >>> > >>> > >> > >> -- > >> Jason Stajich, PhD > >> Miller Research Fellow > >> University of California > >> Dept of Plant and Microbial Biology > >> 321 Koshland Hall #3102 > >> Berkeley, CA 94720-3102 > >> lab: 510.642.8441 > >> http://pmb.berkeley.edu/~taylor/people/js.html > >> > >> > > > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > -- > Jason Stajich, PhD > Miller Research Fellow > University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Thu Oct 19 14:35:08 2006 From: cjfields at uiuc.edu (Chris Fields) 
Date: Thu, 19 Oct 2006 13:35:08 -0500 Subject: [Bioperl-l] bioperl-run executable In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk> Message-ID: <000301c6f3ad$4ec9af10$15327e82@pyrimidine> I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase but I'm not sure. I haven't used them very much myself but plan on making wrappers at some point soon for some programs I use. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: Nathan Haigh [mailto:N.Haigh at sheffield.ac.uk] > Sent: Thursday, October 19, 2006 1:15 PM > To: Chris Fields > Cc: 'bioperl-l' > Subject: bioperl-run executable > > I have a few questions about How bioperl-run modules. > > 1) How do modules define what the name of the executable is that it uses? > 2) Is there a way to test what this is? > 3) Does $factory->executable return this or does it only return the name > if it successfully found it? > > Thanks > Nath From N.Haigh at sheffield.ac.uk Thu Oct 19 14:47:01 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 19:47:01 +0100 Subject: [Bioperl-l] bioperl-run executable In-Reply-To: <000301c6f3ad$4ec9af10$15327e82@pyrimidine> References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine> Message-ID: <1161283620.4537c82501c43@webmail.shef.ac.uk> Quoting Chris Fields : > I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase > but I'm not sure. I haven't used them very much myself but plan on making > wrappers at some point soon for some programs I use. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > On closer inspection of a couple of other modules (Clustalw.pm and TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME and have a sub (program_name) that simply returns this value. 
I'd like to see the program_name become a getter/setter so users can change the
default and have the string stored in the factory object.

Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core not
bioperl-run? I suppose not since bioperl-core is a prereq for bioperl-run but
wouldn't it make sense to go in bioperl-run?

Nath

From cjfields at uiuc.edu Thu Oct 19 15:07:05 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 14:07:05 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To:
Message-ID: <000701c6f3b1$c5914230$15327e82@pyrimidine>

Jason, Hilmar,

How about changing the default parsed dblink in SeqIO::genbank (line 520) to

    if( $dbsource =~ /^(\S*?)\s*accession\s+(\S+)\.(\d+)/ ) {
        my ($db,$id,$version) = ($1,$2,$3);
        $annotation->add_Annotation
            ('dblink',
             Bio::Annotation::DBLink->new
             (-primary_id => $id,
              -version    => $version,
              -database   => $db || 'GenBank',
              -tagname    => 'dblink'));
    }

It passes tests and catches the optional database ('embl' for the bugzilla
report). The output sequence still doesn't print the DB if it isn't GenBank
via write_seq(), but that shouldn't be too hard to fix (famous last words).

Okay, okay, back to the assays...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Jason Stajich
> Sent: Thursday, October 19, 2006 12:45 PM
> To: Hilmar Lapp
> Cc: bioperl-l at lists.open-bio.org; Erikjan
> Subject: Re: [Bioperl-l] Annotation-DBLink- version numbers repeating
>
> Yikes - I was worried that it might have been me.....
>
> Okay I'll look into fixing it -- ChrisF - check in with me before
> diving in, in case I've gotten it done and I expect your enzyme
> assays might take up the time.
> > -jason > On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote: > > > Actually you did that Jason: http://tinyurl.com/ye2edk > > > > Apparently the motivation was to "parse swissprot fields in genpept > > file (dbsource)"? > > > > It clearly looks wrong to add the version. You've probably had a > > reason why you did this at the time but if we (you :) can't recover > > that I guess it's best to just fix it to do the right thing (in > > both places obviously). > > > > -hilmar > > > > On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote: > > > >> Well there is explicit addition of the version to the primary id > >> so it isn't so much a parsing error as a deliberate decision to > >> append it. > >> see Bio::SeqIO::genbank > >> > >> to make the dblink > >> $annotation- > >> >add_Annotation > >> ('dblink', > >> > >> Bio::Annotation::DBLink->new > >> (-primary_id > >> => $id . "." . $version, > >> -version => > >> $version, > >> -database => > >> $db, > >> -tagname => > >> 'dblink')); > >> > >> and the code to print the dblink back out in the writer already > >> assumes the version number is appended... > >> > >> foreach my $ref ( $seq->annotation->get_Annotations > >> ('dblink') ) { > >> # if ($ref->comment eq 'DBSOURCE') { > >> $self->_print('DBSOURCE accession ', > >> $ref->primary_id, "\n"); > >> # } > >> } > >> > >> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > >> > >>> Here is the overload code: > >>> > >>> use overload '""' => sub { > >>> (($_[0]->database ? $_[0]->database . ':' : '' ) > >>> . ($_[0]->primary_id ? $_[0]->primary_id : '') > >>> . ($_[0]->version ? '.' . $_[0]->version : '')) > >>> || '' }; > >>> > >>> Except that the last '||' is redundant and unnecessary (it either > >>> does nothing or replaces an empty string with an empty string), I > >>> don't see the potential for duplicating the version number here - > >>> unless primary_id() did that, which I don't see it doing. 
> >>> > >>> So, to me this seems to come from a parsing error in the > >>> beginning, rather than an erroneous mangling of version into > >>> primary_id later. > >>> > >>> Is someone in the position to confirm this? > >>> > >>> -hilmar > >>> > >>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > >>> > >>>> So I'm unsure what we should do here. > >>>> > >>>> We can certainly fix the problem which you report which is > >>>> relying on > >>>> the "" method -- if you were to do instead: > >>>> print $_->database, ":", $_->primary_id, "\n"; > >>>> > >>>> you'll get the right answer. We at a minimum just fix the auto- > >>>> string converting method to do The Right Thing. > >>>> > >>>> But I am not sure if we should keep the version out of the > >>>> primary_id > >>>> field. This will require some rejiggering in several modules > >>>> when it > >>>> comes to printing DBlinks and I don't want to do this before the > >>>> release. I also am not sure if there was an explicit reason why > >>>> someone did put the version information in the primary_id. (I > >>>> hope it > >>>> wasn't me because I don't think I'm going to remember why). > >>>> > >>>> Does anyone else have a strong feeling? > >>>> > >>>> -jason > >>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > >>>> > >>>>> Hello, > >>>>> > >>>>> I noticed a little problem with the Annotation "DBLink" from > >>>>> GenBank entries > >>>>> > >>>>> When I run: > >>>>> > >>>>> perl -MBio::DB::GenBank -e 'my $gi = > >>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my > >>>>> $seqio = > >>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > >>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > >>>>> ("dblink"); > >>>>> for(@annotations) { print $_, "\n";} print $INC{ > >>>>> "Bio/Annotation/DBLink.pm" }, "\n"; ' > >>>>> > >>>>> This yields: > >>>>> > >>>>> GenBank:AL591065.17.17 > >>>>> > >>>>> and the place where the used Bio/Annotation/DBLink.pm resides. 
> >>>>> > >>>>> Can others repeat this? > >>>>> > >>>>> I have dug into the source a little and Bio::Annotation::DBLink > >>>>> seems to > >>>>> be the place where this happens: it has a concatenation which > >>>>> leads to > >>>>> that repeated version number. > >>>>> > >>>>> It this something that I should fix "client-side", so to speak, or > >>>>> is it > >>>>> worthwhile to add some logic to that concatenation to prevent > >>>>> this? > >>>>> > >>>>> > >>>>> Thanks, > >>>>> > >>>>> Eric > >>>>> > >>>>> > >>>>> > >>>>> _______________________________________________ > >>>>> Bioperl-l mailing list > >>>>> Bioperl-l at lists.open-bio.org > >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>>> > >>>> -- > >>>> Jason Stajich, PhD > >>>> Miller Research Fellow > >>>> University of California > >>>> Dept of Plant and Microbial Biology > >>>> 321 Koshland Hall #3102 > >>>> Berkeley, CA 94720-3102 > >>>> lab: 510.642.8441 > >>>> http://pmb.berkeley.edu/~taylor/people/js.html > >>>> > >>>> > >>>> _______________________________________________ > >>>> Bioperl-l mailing list > >>>> Bioperl-l at lists.open-bio.org > >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>>> > >>> > >>> -- > >>> =========================================================== > >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > >>> =========================================================== > >>> > >>> > >>> > >>> > >>> > >> > >> -- > >> Jason Stajich, PhD > >> Miller Research Fellow > >> University of California > >> Dept of Plant and Microbial Biology > >> 321 Koshland Hall #3102 > >> Berkeley, CA 94720-3102 > >> lab: 510.642.8441 > >> http://pmb.berkeley.edu/~taylor/people/js.html > >> > >> > > > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > -- > Jason Stajich, PhD > Miller Research Fellow > 
University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From jason at bioperl.org Thu Oct 19 14:48:28 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 11:48:28 -0700
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine> <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
Message-ID: <67650240-D61B-4842-AE7C-75F15F608F6F@bioperl.org>

program_name()
    Should return the name of the program.

executable()
    Is a function that you don't have to mess with; it tries to find
    the executable named by program_name() based on your PATH.

-jason

On Oct 19, 2006, at 11:14 AM, Nathan Haigh wrote:

> I have a few questions about how bioperl-run modules work.
>
> 1) How do modules define what the name of the executable is that it
> uses?
> 2) Is there a way to test what this is?
> 3) Does $factory->executable return this or does it only return the
> name if it successfully found it?
>
> Thanks
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html

From jason at bioperl.org Thu Oct 19 17:06:43 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 14:06:43 -0700
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161283620.4537c82501c43@webmail.shef.ac.uk>
References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine> <1161283620.4537c82501c43@webmail.shef.ac.uk>
Message-ID:

It can be reset now, but of course this is not a very nice way of doing it:

    $Bio::Tools::Run::Alignment::Clustalw::PROGRAM_NAME = 'clustalw_smp';

I am not sure if there are pros and cons to making it a getter/setter, but
if you want to run with it, please do. It has been hard to keep people
adhering to a standard across the whole run system (and the standard has
changed a bit), so some auditing is warranted.

-jason

On Oct 19, 2006, at 11:47 AM, Nathan Haigh wrote:

> Quoting Chris Fields :
>
>> I think a lot of the bioperl-run modules use
>> Bio::Tools::Run::WrapperBase
>> but I'm not sure. I haven't used them very much myself but plan
>> on making
>> wrappers at some point soon for some programs I use.
>>
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>
> On closer inspection of a couple of other modules (Clustalw.pm and
> TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME
> and have a sub
> (program_name) that simply returns this value. I'd like to see the
> program_name become a getter/setter so users can change the default
> and have the
> string stored in the factory object.
> > Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core > not bioperl-run? I suppose not since bioperl-core is a prerep for > bioperl-run but > wouldn't it make sence to go in bioperl-run? > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From torsten.seemann at infotech.monash.edu.au Thu Oct 19 19:24:03 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Fri, 20 Oct 2006 09:24:03 +1000 Subject: [Bioperl-l] test::more template In-Reply-To: <1161279505.4537b811e143f@webmail.shef.ac.uk> References: <002001c6f3a3$c00b9080$15327e82@pyrimidine> <1161279505.4537b811e143f@webmail.shef.ac.uk> Message-ID: <45380913.3070506@infotech.monash.edu.au> Nathan, > use strict; > use Bio::Root::IO; # cant test for this, might be needed to get Test::More use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway, and File::Spec is "guaranteed" to be installed with Perl 5.6+. > use lib Bio::Root::IO->catfile('t','lib'); Simpler as: use lib 't/lib'; I understand the 'lib.pm' accepts Unix style directories REGARDLESS of native platform. -- Torsten Seemann Victorian Bioinformatics Consortium, Monash University, Australia From prabubio at gmail.com Thu Oct 19 20:11:36 2006 From: prabubio at gmail.com (Prabu Raja) Date: 20 Oct 2006 00:11:36 -0000 Subject: [Bioperl-l] Prabu Raja sent you this link Message-ID: <20061020001136.86586.qmail@x05.namesdatabase.com> Remember your link from Prabu Raja: http://namesdatabase.com/2m.pl?k2=40892642637&s=nsender2 1 -> Use Prabu Raja's link by clicking above. 2 -> Enter your info for a membership connected to Prabu. 
3 -> Share links with other friends, family and co-workers. 4 -> Use the members-only people search tools. Prabu selected you for this on 09-02-2004 22:52 ET. prabubio at gmail.com (Prabu Raja) initiated this to bioperl-l at lists.open-bio.org at 10-05-2006 02:30 on namesdatabase.com from the IP address 59.92.88.99. If you do not know a Prabu Raja, use http://namesdatabase.com/u.pl?bb2=40892642637&s=nsender2 to halt more reminders about this. For reference, the address of The Names Database is 1253 N. Research Way, Suite Q-2500, Orem, UT 84097. From cjfields at uiuc.edu Thu Oct 19 20:29:11 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 19:29:11 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <45380913.3070506@infotech.monash.edu.au> Message-ID: <000f01c6f3de$c3d91170$15327e82@pyrimidine> > Nathan, > > > use strict; > > use Bio::Root::IO; # cant test for this, might be needed to get > Test::More > > use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway, > and File::Spec is "guaranteed" to be installed with Perl 5.6+. > > > use lib Bio::Root::IO->catfile('t','lib'); > > Simpler as: > use lib 't/lib'; > I understand the 'lib.pm' accepts Unix style directories REGARDLESS of > native > platform. > > -- > Torsten Seemann > Victorian Bioinformatics Consortium, Monash University, Australia That is true, at least for WinXP (not sure about older Windows versions out there). I was using 'Root::IO->catfile' but found 'use lib 't/lib' works. I may have a few of the 'catfile' versions floating around out there, which may be where that originated. Note that if you plan on using Test::More with the bioperl-run test suite, you should add it to the bioperl-run CVS distribution directory in 't/lib'. Most people will have it installed, but you never know. 
Chris From cjfields at uiuc.edu Thu Oct 19 20:33:22 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 19:33:22 -0500 Subject: [Bioperl-l] Prabu Raja sent you this link In-Reply-To: <20061020001136.86586.qmail@x05.namesdatabase.com> Message-ID: <001001c6f3df$598a24c0$15327e82@pyrimidine> That Prabu Raja sure gets around... Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Prabu Raja > Sent: Thursday, October 19, 2006 7:12 PM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Prabu Raja sent you this link > > Remember your link from Prabu Raja: > > http://namesdatabase.com/2m.pl?k2=40892642637&s=nsender2 > > > 1 -> Use Prabu Raja's link by clicking above. > > 2 -> Enter your info for a membership connected to Prabu. > > 3 -> Share links with other friends, family and co-workers. > > 4 -> Use the members-only people search tools. > > Prabu selected you for this on 09-02-2004 22:52 ET. > > > prabubio at gmail.com (Prabu Raja) initiated this to bioperl-l at lists.open- > bio.org > at 10-05-2006 02:30 on namesdatabase.com from the IP address 59.92.88.99. > If you do not know a Prabu Raja, use > http://namesdatabase.com/u.pl?bb2=40892642637&s=nsender2 to halt more > reminders about this. > For reference, the address of The Names Database is 1253 N. Research Way, > Suite Q-2500, Orem, UT 84097. 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From keithplayer at hotmail.com Thu Oct 19 22:13:52 2006
From: keithplayer at hotmail.com (Keith Player)
Date: Fri, 20 Oct 2006 02:13:52 +0000 (UTC)
Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning
Message-ID:

I know that there may be some changes resulting from new GFF3
implementations, but thought I would see if the following is useful anyway.

I implemented the R-tree binning schema as used by
Bio::DB::GFF::Util::Binning and as mentioned in this article:

I tested the following query on a normal table (no binning), but it assumes
that you know the longest range in the table. For example, with a table of
human genes, the longest gene we know of is around 2.4Mb:

    SELECT COUNT(*) as count FROM groups g
    WHERE g.start > max(0,[start-2.4Mb])
      AND g.start < [end]
      AND g.end > [start]
      AND g.chromosome = '1'

so for 100Mb:101Mb

    SELECT COUNT(*) as count FROM groups g
    WHERE g.start > 97600000
      AND g.start < 101000000
      AND g.end > 100000000
      AND g.chromosome = '1'

where [start] and [end] define the region of interest.

This query outperforms the R-Tree implementation on all tests that I have
performed (for lengths of 200bp to 10Mb across a whole chromosome).

Could this be of some practical use?
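
The window-bounding idea behind the query above can be illustrated in plain Perl over an in-memory list instead of SQL. This is only a sketch under stated assumptions: the feature coordinates are made up, the names first_start_after and count_overlaps are hypothetical, and coordinates are assumed to be 1-based so that a lower bound of 0 never excludes a real feature. Because no feature is longer than $max_len, anything overlapping the region must start after [start] - $max_len, so a binary search on the sorted starts can skip everything earlier.

```perl
use strict;
use warnings;

# Hypothetical features on one chromosome, as [start, end] pairs,
# kept sorted by start (this plays the role of an index on "start").
my @features = sort { $a->[0] <=> $b->[0] } (
    [ 100, 250 ], [ 900, 1500 ], [ 1200, 1300 ], [ 1400, 4000 ], [ 5000, 5100 ],
);

# Longest feature in the list; the approach assumes this is known up front.
my $max_len = 0;
for my $f (@features) {
    my $len = $f->[1] - $f->[0];
    $max_len = $len if $len > $max_len;
}

# Binary search: index of the first feature whose start is > $value.
sub first_start_after {
    my ($value) = @_;
    my ($lo, $hi) = (0, scalar @features);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($features[$mid][0] > $value) { $hi = $mid }
        else                             { $lo = $mid + 1 }
    }
    return $lo;
}

# Count features overlapping the region [$qstart, $qend].
sub count_overlaps {
    my ($qstart, $qend) = @_;
    my $lower = $qstart - $max_len;      # start > max(0, [start] - max length)
    $lower = 0 if $lower < 0;
    my $count = 0;
    for (my $i = first_start_after($lower); $i < @features; $i++) {
        my ($s, $e) = @{ $features[$i] };
        last if $s >= $qend;             # start < [end]
        $count++ if $e > $qstart;        # end > [start]
    }
    return $count;
}

print count_overlaps(1250, 1450), "\n";  # prints 3
```

The scan only touches features whose start falls inside the bounded window, which is the same reason the SQL form can use an ordinary index on the start column.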
From jason at bioperl.org Thu Oct 19 11:50:49 2006 From: jason at bioperl.org (Jason Stajich) Date: Thu, 19 Oct 2006 08:50:49 -0700 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> Message-ID: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org> Well there is explicit addition of the version to the primary id so it isn't so much a parsing error as a deliberate decision to append it. see Bio::SeqIO::genbank to make the dblink $annotation- >add_Annotation ('dblink', Bio::Annotation::DBLink->new (-primary_id => $id . "." . $version, -version => $version, -database => $db, -tagname => 'dblink')); and the code to print the dblink back out in the writer already assumes the version number is appended... foreach my $ref ( $seq->annotation->get_Annotations ('dblink') ) { # if ($ref->comment eq 'DBSOURCE') { $self->_print('DBSOURCE accession ', $ref->primary_id, "\n"); # } } On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > Here is the overload code: > > use overload '""' => sub { > (($_[0]->database ? $_[0]->database . ':' : '' ) > . ($_[0]->primary_id ? $_[0]->primary_id : '') > . ($_[0]->version ? '.' . $_[0]->version : '')) > || '' }; > > Except that the last '||' is redundant and unnecessary (it either > does nothing or replaces an empty string with an empty string), I > don't see the potential for duplicating the version number here - > unless primary_id() did that, which I don't see it doing. > > So, to me this seems to come from a parsing error in the beginning, > rather than an erroneous mangling of version into primary_id later. > > Is someone in the position to confirm this? > > -hilmar > > On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > >> So I'm unsure what we should do here. 
>> >> We can certainly fix the problem which you report which is relying on >> the "" method -- if you were to do instead: >> print $_->database, ":", $_->primary_id, "\n"; >> >> you'll get the right answer. We at a minimum just fix the auto- >> string converting method to do The Right Thing. >> >> But I am not sure if we should keep the version out of the primary_id >> field. This will require some rejiggering in several modules when it >> comes to printing DBlinks and I don't want to do this before the >> release. I also am not sure if there was an explicit reason why >> someone did put the version information in the primary_id. (I hope it >> wasn't me because I don't think I'm going to remember why). >> >> Does anyone else have a strong feeling? >> >> -jason >> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: >> >>> Hello, >>> >>> I noticed a little problem with the Annotation "DBLink" from >>> GenBank entries >>> >>> When I run: >>> >>> perl -MBio::DB::GenBank -e 'my $gi = >>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my >>> $seqio = >>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >>> ("dblink"); >>> for(@annotations) { print $_, "\n";} print $INC{ >>> "Bio/Annotation/DBLink.pm" }, "\n"; ' >>> >>> This yields: >>> >>> GenBank:AL591065.17.17 >>> >>> and the place where the used Bio/Annotation/DBLink.pm resides. >>> >>> Can others repeat this? >>> >>> I have dug into the source a little and Bio::Annotation::DBLink >>> seems to >>> be the place where this happens: it has a concatenation which >>> leads to >>> that repeated version number. >>> >>> It this something that I should fix "client-side", so to speak, or >>> is it >>> worthwhile to add some logic to that concatenation to prevent this? 
>>> >>> >>> Thanks, >>> >>> Eric >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> Jason Stajich, PhD >> Miller Research Fellow >> University of California >> Dept of Plant and Microbial Biology >> 321 Koshland Hall #3102 >> Berkeley, CA 94720-3102 >> lab: 510.642.8441 >> http://pmb.berkeley.edu/~taylor/people/js.html >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From n.haigh at sheffield.ac.uk Fri Oct 20 04:35:03 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 20 Oct 2006 08:35:03 +0000 Subject: [Bioperl-l] test::more template In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> Message-ID: <45388A37.7040505@sheffield.ac.uk> Chris Fields wrote: >> Nathan, >> >> >>> use strict; >>> use Bio::Root::IO; # cant test for this, might be needed to get >>> >> Test::More >> >> use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway, >> and File::Spec is "guaranteed" to be installed with Perl 5.6+. >> >> >>> use lib Bio::Root::IO->catfile('t','lib'); >>> >> Simpler as: >> use lib 't/lib'; >> I understand the 'lib.pm' accepts Unix style directories REGARDLESS of >> native >> platform. 
>> >> -- >> Torsten Seemann >> Victorian Bioinformatics Consortium, Monash University, Australia >> > > That is true, at least for WinXP (not sure about older Windows versions out > there). I was using 'Root::IO->catfile' but found 'use lib 't/lib' works. > I may have a few of the 'catfile' versions floating around out there, which > may be where that originated. > > Note that if you plan on using Test::More with the bioperl-run test suite, > you should add it to the bioperl-run CVS distribution directory in 't/lib'. > Most people will have it installed, but you never know. > > Chris > > > What is the reason for including Test::More in 't/lib' rather than having it as a prereq? -- > A: Yes. >> Q: Are you sure? >> >>> A: Because it reverses the logical flow of conversation. >>> >>>> Q: Why is top posting frowned upon? >>>> Get Thunderbird From n.haigh at sheffield.ac.uk Fri Oct 20 05:27:19 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Fri, 20 Oct 2006 10:27:19 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> Message-ID: <45389677.1000709@sheffield.ac.uk> Is it really necessary to specify the number of tests that are to be conducted in advance? It seems a bit annoying to have to count the number of tests in the script or to run the test just to see how many tests were done, we could just use: use Test::More 'no_plan'; And then it's up to Test::More to keep a track of how many tests it's run. The only thing then to worry about is how many tests are in a SKIP block if the skip criteria are met. This is unless there is a good reason to use it that I am unaware of. 
Thanks Nath From bix at sendu.me.uk Fri Oct 20 06:01:09 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 11:01:09 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45389677.1000709@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45389677.1000709@sheffield.ac.uk> Message-ID: <45389E65.6080908@sendu.me.uk> Nathan Haigh wrote: > Is it really necessary to specify the number of tests that are to be > conducted in advance? It seems a bit annoying to have to count the > number of tests in the script or to run the test just to see how many > tests were done, we could just use: > use Test::More 'no_plan'; It's very important to have a plan. That way you know all the tests actually ran and weren't skipped (either due to an actual SKIP block or an if block that returned false due to a bug, or a for/foreach/while that didn't loop enough times due to a bug, or any number of other reasons). From bix at sendu.me.uk Fri Oct 20 06:04:48 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 11:04:48 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45388A37.7040505@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45388A37.7040505@sheffield.ac.uk> Message-ID: <45389F40.5060601@sendu.me.uk> Nathan S. Haigh wrote: > Chris Fields wrote: > >> Note that if you plan on using Test::More with the bioperl-run test suite, >> you should add it to the bioperl-run CVS distribution directory in 't/lib'. >> Most people will have it installed, but you never know. > > What is the reason for including Test::More in 't/lib' rather than > having it as a prereq? Because we want to ensure that the test suite runs and tells you real problems (if any) about the code (Bioperl) that it is testing, not problems about actually running the tests (which are NOT required for using Bioperl, so cannot be considered 'pre-requisites'). 
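
The points made in this thread (an explicit plan, use lib 't/lib', and skipping rather than failing when a resource is unavailable) can be combined in one small test script. This is only a sketch: the test bodies are placeholders, and gating the remote tests on the BIOPERLDEBUG environment variable follows the convention described earlier on the list rather than any particular BioPerl test file.

```perl
use strict;
use warnings;

# Unix-style path, as suggested in the thread; a bundled copy of
# Test::More in t/lib would be picked up from here if present.
use lib 't/lib';

# Explicit plan: if a loop or block silently runs fewer tests than
# expected, the harness reports the mismatch instead of passing.
use Test::More tests => 4;

ok(1, 'placeholder for a check that needs no network');
is(uc('acgt'), 'ACGT', 'a purely local check');

SKIP: {
    # The second argument to skip() must match the number of tests
    # inside the block so the plan still adds up when they are skipped.
    skip 'set BIOPERLDEBUG=1 to run remote database tests', 2
        unless $ENV{BIOPERLDEBUG};

    ok(1, 'placeholder for a remote query');
    ok(1, 'placeholder for checking the remote result');
}
```

With 'no_plan' the same script would still pass if the SKIP block were accidentally emptied or a loop ran too few times, which is the failure mode the explicit count is meant to catch.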
From n.haigh at sheffield.ac.uk Fri Oct 20 06:54:30 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Fri, 20 Oct 2006 11:54:30 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45389E65.6080908@sendu.me.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk> Message-ID: <4538AAE6.5070600@sheffield.ac.uk> If there are known bugs in a particular version of software, what is the best approach for dealing with tests that would fail due to this bug? Simply skip those tests that would be affected by the bug, or to fail if the affected version is detected and report the reason so the user is informed? Or simply bump the minimum version to one above the affected versions? For example, t/Clustalw has a test for at least version 1.8. It then has some profile alignment tests that are only run if version > 1.82 is installed. It states that versions 1.81 and 1.82 are affected by a profile alignment bug - which i assume would make the tests fail. Cheers Nath From bix at sendu.me.uk Fri Oct 20 07:06:07 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 12:06:07 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <4538AAE6.5070600@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk> <4538AAE6.5070600@sheffield.ac.uk> Message-ID: <4538AD9F.8040003@sendu.me.uk> Nathan Haigh wrote: > If there are known bugs in a particular version of software, what is the > best approach for dealing with tests that would fail due to this bug? > Simply skip those tests that would be affected by the bug, or to fail if > the affected version is detected and report the reason so the user is > informed? Or simply bump the minimum version to one above the affected > versions? > > For example, t/Clustalw has a test for at least version 1.8. 
It then has > some profile alignment tests that are only run if version > 1.82 is > installed. It states that versions 1.81 and 1.82 are affected by a > profile alignment bug - which i assume would make the tests fail. Specific cases like this, I'd discuss on the list/ with the author of the module in question. Maybe there is some great need to allow usage with <1.81? My view, based purely on what you've said above, bump the pre-requisite to a version that works. From cjfields at uiuc.edu Fri Oct 20 08:36:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 07:36:37 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <45388A37.7040505@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45388A37.7040505@sheffield.ac.uk> Message-ID: <80A2D210-B0DB-4CD2-9B56-A38097F4F63F@uiuc.edu> >> ,,, >> > What is the reason for including Test::More in 't/lib' rather than > having it as a prereq? We could do that. Many CPAN modules include it in 't/lib' b/c it is only needed for testing purposes. Chris > > -- >> A: Yes. >>> Q: Are you sure? >>> >>>> A: Because it reverses the logical flow of conversation. >>>> >>>>> Q: Why is top posting frowned upon? >>>>> > Get Thunderbird Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Oct 20 10:44:29 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 15:44:29 +0100 Subject: [Bioperl-l] Updated Makefile.PL Message-ID: <4538E0CD.1030908@sendu.me.uk> Hi, I've just committed an updated Makefile.PL to HEAD for bioperl-live. Could some people test it on multiple platforms and confirm it is ok (try out the different possible options as well)? (NB. 
in the below, 'pre-reqs' are things the makefile considers optional dependencies) Note that some pre-reqs have been removed: # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end up requiring it but only after the user makes an explicit choice by typing 'DBD::mysql' in their own code to supply as an option to Bioperl code) # File::Temp (standard in 5.6.1) This pre-req was wrong: # Data::Stag::Writer and has been replaced with: Data::Stag::XMLWriter Also, I note that very many Bioperl modules need IO::String, including Bio::SeqIO, so I'm not sure to what extent we can pretend it is an optional module. I didn't make any change though. I don't know if these changes affect the Windows ppm Nathan, or anything else (Bundle?)? The INSTALL docs need updating with these new and improved pre-reqs (note that some pre-reqs had wrong/not enough Bioperl modules listed as needing them); does someone want to correct the wiki (based on the new Makefile.PL) and then Chris can re-create the text version? From hlapp at gmx.net Fri Oct 20 11:03:34 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 20 Oct 2006 11:03:34 -0400 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <4538E0CD.1030908@sendu.me.uk> References: <4538E0CD.1030908@sendu.me.uk> Message-ID: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote: > Also, I note that very many Bioperl modules need IO::String, including > Bio::SeqIO, so I'm not sure to what extent we can pretend it is an > optional module. I didn't make any change though. I agree. There's really not that many terribly useful things you can do with Bioperl w/o having IO::String installed, which is in stark contrast to many other dependencies. I don't have a problem with making it (and a few others used all over the place) required, to better contrast them with the dependencies that are really optional (and not needed for 90% of users). 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Oct 20 11:18:32 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 10:18:32 -0500 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <4538E0CD.1030908@sendu.me.uk> Message-ID: <001501c6f45b$019103c0$15327e82@pyrimidine> > Hi, > I've just committed an updated Makefile.PL to HEAD for bioperl-live. > Could some people test it on multiple platforms and confirm it is ok > (try out the different possible options as well)? > > (NB. in the below, 'pre-reqs' are things the makefile considers optional > dependencies) > > Note that some pre-reqs have been removed: > # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end > up requiring it but only after the user makes an explicit choice by > typing 'DBD::mysql' in their own code to supply as an option to Bioperl > code) > # File::Temp (standard in 5.6.1) I'll try it out on WinXP and Mac OS X. BTW, do any of Lincoln's Bio::DB* use DBD::mySQL? Bio::DB::GFF comes to mind. I don't think it should be an absolute requirement, though. If we plan on removing those, then we should also remove them from Bundle::Bioperl (if they are present). > This pre-req was wrong: > # Data::Stag::Writer > and has been replaced with: > Data::Stag::XMLWriter > > > Also, I note that very many Bioperl modules need IO::String, including > Bio::SeqIO, so I'm not sure to what extent we can pretend it is an > optional module. I didn't make any change though. Do they all require IO::String or is it an option? There are a few instances (WebDBSeqI-implementing, for instance) where this is presented as an option for most OS's (along with the default, pipeline, and tempfile). However, it is currently used by default with Windows due to lack of pipe/fork support at the time. 
BTW, the latter may now work with WinXP ActivePerl. ActiveState has been working on WinXP fork() emulation for a while, but I think it is still somewhat experimental. > I don't know if these changes affect the Windows ppm, Nathan, or anything > else (Bundle?)? > > The INSTALL docs need updating with these new and improved pre-reqs > (note that some pre-reqs had wrong/not enough Bioperl modules listed as > needing them); does someone want to correct the wiki (based on the new > Makefile.PL) and then Chris can re-create the text version? Easier to just modify the text version based on what is changed in the wiki, at least for the time being. The text dumping from elinks/lynx isn't foolproof re: tables and such, which is one reason I think we should move the prereqs to a separate file as it's easier to maintain long-term (this seems to be where most changes occur anyway). Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Oct 20 11:23:38 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 16:23:38 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk> References: <45375615.1020603@sheffield.ac.uk> <45379BBB.1040400@sheffield.ac.uk> <1161270180.453793a432e4f@webmail.shef.ac.uk> Message-ID: <4538E9FA.60701@sendu.me.uk> Nathan Haigh wrote: > I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be > consistent with other tests. > > Failing that - Is there a good test writing style I should follow in one of the other test files?
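For the kind of template being asked about here, a minimal sketch might look like the following. It is not taken from any existing Bioperl test file; it only illustrates skipping remote tests unless BIOPERLDEBUG is set, and skipping them when the remote call itself fails. The helper fetch_remote_record() is hypothetical and stands in for whatever remote query a real test would make; the fall-back to t/lib for Test::More is not shown.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Tests that need no network access always run.
ok(1, 'local test always runs');

SKIP: {
    # Remote tests only run when the user has asked for them.
    skip('BIOPERLDEBUG not set, skipping remote tests', 2)
        unless $ENV{BIOPERLDEBUG};

    # Trap any exception from the remote call; a server that is down
    # then leads to skipped tests rather than an aborted script.
    my $result = eval { fetch_remote_record() };
    skip("remote server problem: $@", 2) if $@;

    ok(defined $result, 'remote query returned a record');
    like($result, qr/\S/, 'record is not empty');
}

# Hypothetical stand-in for a real remote query (assumption, not a
# Bioperl function); a real test would contact the server here.
sub fetch_remote_record {
    return 'dummy record';
}
```

Because the skip count (2) matches the number of tests inside the SKIP block, the plan of 3 holds whether or not the remote part runs.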
I originally based mine on one of Chris's EUtilities tests, but now refer to t/ESEfinder.t since it is small and demonstrates all the major tricky things you might have to do - skip remote tests if no BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests under some condition, fall back to t/lib for Test::More if necessary. (Though I just spotted an oops in the latter...) From cjfields at uiuc.edu Fri Oct 20 11:38:02 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 10:38:02 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <4538E9FA.60701@sendu.me.uk> Message-ID: <001601c6f45d$bb824350$15327e82@pyrimidine> > Nathan Haigh wrote: > > I thought I'd have my first proper try at writing some tests. I was > wondering if there is a template test file that I should use/study in > order to be > > consistent with other tests. > > > > Failing that - Is there a good test writing style I should follow in one > of the other test files? > > I originally based mine on one of Chris's EUtilities tests, but now > refer to t/ESEfinder.t since it is small and demonstrates all the major > tricky things you might have to do - skip remote tests if no > BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests > under some condition, fall-back to t/lib for Test::More if necessary. > > (Though I just spotted an oops in the latter...) I agree. The EUtilities tests are quite long. I plan on eventually cutting out some of them. Making them somewhat less prone to changes in returned XML data has also been a pain, as demonstrated by some of the tests from MAIN now failing... d'oh! Christopher Fields Postdoctoral Researcher - Switzer Lab Dept.
of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Oct 20 11:39:32 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 16:39:32 +0100 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <001501c6f45b$019103c0$15327e82@pyrimidine> References: <001501c6f45b$019103c0$15327e82@pyrimidine> Message-ID: <4538EDB4.3030500@sendu.me.uk> Chris Fields wrote: > BTW, do any of Lincoln's Bio::DB* > use DBD::mySQL? Bio::DB::GFF comes to mind. No, just a require on a user-passed variable as I described. >> Also, I note that very many Bioperl modules need IO::String, including >> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an >> optional module. I didn't make any change though. > > Do they all require IO::String or is it an option? Oops, I take that back. Bio::SeqIO doesn't use IO::String. That's what you get for relying on grep output... It's still many modules that use it, but I suppose you could do useful things without. So actually, let's keep it optional. From cjfields at uiuc.edu Fri Oct 20 16:32:32 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 15:32:32 -0500 Subject: [Bioperl-l] Error retrieving sequence from BioSQL Message-ID: <000001c6f486$df508930$15327e82@pyrimidine> Seth, Did you work out the problem here? There was a recent CVS update to OBDA tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests apparently left data from tests in the database, which caused problems with repeated test runs. Chris > > -----Original Message----- > > From: bioperl-l-bounces lists.open-bio.org [mailto:bioperl-l- > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > Sent: Saturday, September 30, 2006 6:35 PM > > To: Hilmar Lapp > > Cc: Chris Fields; Bioperl List > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > Here're complete test details: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > ... 
> > > FAILED tests 10-12 > > Failed 3/12 tests, 75.00% okay > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > -------------------------------------------------------------------------- > > ----- > > t\02species.t 65 2 3.08% 63 65 > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > t\16obda.t 12 3 25.00% 10-12 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- Best Regards, Seth Johnson Senior Bioinformatics Associate Ph: (202) 470-0900 Fx: (775) 251-0358 Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From olenka.m at gmail.com Fri Oct 20 17:47:15 2006 From: olenka.m at gmail.com (Olena Morozova) Date: Fri, 20 Oct 2006 14:47:15 -0700 Subject: [Bioperl-l] GO annotations Message-ID: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com> Dear all, Does anyone know an easy way to get GO-BP annotations for ensembl genes?
Thank you very much for your help, Olena From sdavis2 at mail.nih.gov Sat Oct 21 11:05:26 2006 From: sdavis2 at mail.nih.gov (Davis, Sean (NIH/NCI) [E]) Date: Sat, 21 Oct 2006 11:05:26 -0400 Subject: [Bioperl-l] GO annotations References: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com> Message-ID: <014DBF86B19310419F0DF8910FC56457240CE3@nihcesmlbx10.nih.gov> You can use the ensembl perl API, or (more simply) use the Ensembl MART interface: http://www.ensembl.org/Multi/martview Sean -----Original Message----- From: Olena Morozova [mailto:olenka.m at gmail.com] Sent: Fri 10/20/2006 5:47 PM To: bioperl-l Subject: [Bioperl-l] GO annotations Dear all, Does anyone know an easy way to get GO-BP annotations for ensembl genes? Thank you very much for your help, Olena _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Sun Oct 22 06:34:51 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Sun, 22 Oct 2006 10:34:51 +0000 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> References: <4538E0CD.1030908@sendu.me.uk> <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> Message-ID: <453B494B.7040702@sheffield.ac.uk> Hilmar Lapp wrote: > On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote: > > >> Also, I note that very many Bioperl modules need IO::String, including >> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an >> optional module. I didn't make any change though. >> > > I agree. There's really not that many terribly useful things you can > do with Bioperl w/o having IO::String installed, which is in stark > contrast to many other dependencies. > > I don't have a problem with making it (and a few others used all over > the place) required, to better contrast them with the dependencies > that are really optional (and not needed for 90% of users). 
> > -hilmar > > Is it possible to make a distinction in Makefile.PL between those modules that are an absolute must for Bioperl-core and those which are optional and should go into Bundle::BioPerl? Once I'm sure what should be "optional" I'll do the Bundle::BioPerl package and PPDs. Cheers Nath From vitacolonna at appliedgenomics.org Sun Oct 22 09:04:48 2006 From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna) Date: Sun, 22 Oct 2006 15:04:48 +0200 Subject: [Bioperl-l] Submission proposal: ABIF module Message-ID: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> Hi everybody, I would like to submit to CPAN a module for reading and parsing the ABIF files (with .ab1 suffix) produced by Applied Biosystems sequencers. The need for such a module arose in our lab because the existing ABI module we found on CPAN had too limited functionality. As an example, our module allows us to easily produce analysis reports similar to the ones generated by the Sequencing Analysis software. May I call the module Bio::ABIF? Or should I follow other conventions? Nicola From cjfields at uiuc.edu Sun Oct 22 09:54:51 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 22 Oct 2006 08:54:51 -0500 Subject: [Bioperl-l] Submission proposal: ABIF module In-Reply-To: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> Message-ID: On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote: > Hi everybody, > I would like to submit to CPAN a module for reading and parsing the > ABIF files (with .ab1 suffix) produced by Applied Biosystems > sequencers. The need for such a module arose in our lab because the > existing ABI module we found on CPAN had too limited functionality. > As an example, our module allows us to easily produce analysis > reports similar to the ones generated by the Sequencing Analysis > software. > > May I call the module Bio::ABIF? Or should I follow other conventions?
> > Nicola It depends. Does it interact with bioperl in any way? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Oct 22 09:57:18 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 22 Oct 2006 08:57:18 -0500 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <453B494B.7040702@sheffield.ac.uk> References: <4538E0CD.1030908@sendu.me.uk> <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> <453B494B.7040702@sheffield.ac.uk> Message-ID: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu> On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote: > Is it possible to make a distinction in Makefile.PL between those > modules that are an absolute must for Bioperl-core and those which are > optional and should go into Bundle::BioPerl? > > Once I'm sure what should be "option" I'll do the Bundle::BioPerl > package and PPD's. > > Cheers > Nath We probably should steer this way eventually. Do you aim on placing prereqs required for bioperl core in the bioperl PPD and the 'optional' ones with the bundle? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From vitacolonna at appliedgenomics.org Sun Oct 22 10:16:26 2006 From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna) Date: Sun, 22 Oct 2006 16:16:26 +0200 Subject: [Bioperl-l] Submission proposal: ABIF module In-Reply-To: References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> Message-ID: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org> On 22/ott/06, at 15:54, Chris Fields wrote: > > On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote: > >> Hi everybody, >> I would like to submit to CPAN a module for reading and parsing the >> ABIF files (with .ab1 suffix) [...] >> May I call the module Bio::ABIF? Or should I follow other >> conventions? > > It depends. Does it interact with bioperl in any way? No. 
Can you suggest a suitable pattern for the name? Nicola From cjfields at uiuc.edu Sun Oct 22 10:55:46 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 22 Oct 2006 09:55:46 -0500 Subject: [Bioperl-l] Submission proposal: ABIF module In-Reply-To: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org> References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org> Message-ID: On Oct 22, 2006, at 9:16 AM, Nicola Vitacolonna wrote: > On 22/ott/06, at 15:54, Chris Fields wrote: > >> >> On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote: >> >>> Hi everybody, >>> I would like to submit to CPAN a module for reading and parsing the >>> ABIF files (with .ab1 suffix) [...] >>> May I call the module Bio::ABIF? Or should I follow other >>> conventions? >> >> It depends. Does it interact with bioperl in any way? > > No. Can you suggest a suitable pattern for the name? > > Nicola I don't think it will be a problem to name it Bio::ABIF; there is already a Bio::ASN1::EntrezGene, and Rutger Vos's Bio::Phylo modules (the latter doesn't require BioPerl either). That said, if you plan on contributing more CPAN modules with similar functionality (such as parsing other trace files), you might want to consider using a namespace that isn't limiting but doesn't conflict with Bioperl core (like Bio::Trace or similar, then name your module Bio::Trace::ABIF). You can use search.cpan.org to check namespaces for conflicts. Just as a note: we have bioperl-ext, which also parses ABI and other trace file formats. It's a bit old now and needs updating, but is supposed to be quite fast (it uses the Staden io_lib C library via PerlXS). -c Christopher Fields Postdoctoral Researcher Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From arareko at campus.iztacala.unam.mx Sun Oct 22 13:26:37 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Sun, 22 Oct 2006 12:26:37 -0500 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <4538E0CD.1030908@sendu.me.uk> References: <4538E0CD.1030908@sendu.me.uk> Message-ID: <453BA9CD.4060107@campus.iztacala.unam.mx> Works fine on FreeBSD. Mauricio. Sendu Bala wrote: > Hi, > I've just committed an updated Makefile.PL to HEAD for bioperl-live. > Could some people test it on multiple platforms and confirm it is ok > (try out the different possible options as well)? > > (NB. in the below, 'pre-reqs' are things the makefile considers optional > dependencies) > > Note that some pre-reqs have been removed: > # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end > up requiring it but only after the user makes an explicit choice by > typing 'DBD::mysql' in their own code to supply as an option to Bioperl > code) > # File::Temp (standard in 5.6.1) > > > This pre-req was wrong: > # Data::Stag::Writer > and has been replaced with: > Data::Stag::XMLWriter > > > Also, I note that very many Bioperl modules need IO::String, including > Bio::SeqIO, so I'm not sure to what extent we can pretend it is an > optional module. I didn't make any change though. > > > I don't know if these changes affect the Windows ppm Nathan, or anything > else (Bundle?)? > > The INSTALL docs need updating with these new and improved pre-reqs > (note that some pre-reqs had wrong/not enough Bioperl modules listed as > needing them); does someone want to correct the wiki (based on the new > Makefile.PL) and then Chris can re-create the text version? 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From n.haigh at sheffield.ac.uk Sun Oct 22 15:37:07 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Sun, 22 Oct 2006 20:37:07 +0100 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu> References: <4538E0CD.1030908@sendu.me.uk> <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> <453B494B.7040702@sheffield.ac.uk> <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu> Message-ID: <453BC863.4090803@sheffield.ac.uk> Chris Fields wrote: > > On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote: > >> Is it possible to make a distinction in Makefile.PL between those >> modules that are an absolute must for Bioperl-core and those which are >> optional and should go into Bundle::BioPerl? >> >> Once I'm sure what should be "option" I'll do the Bundle::BioPerl >> package and PPD's. >> >> Cheers >> Nath > > We probably should steer this way eventually. Do you aim on placing > prereqs required for bioperl core in the bioperl PPD and the > 'optional' ones with the bundle? > That's correct. However, PPM will always try to update packages to the latest available. Therefore, if at some point in the future, a dependency is removed, and thus removed from Bundle::BioPerl, a situation may arise where an older version of BioPerl is running with a recent version of Bundle::BioPerl and could have missing dependencies - not ideal but it is how things currently stand. The process of making the Bundle::BioPerl PPD would be simplified if these "optional" dependencies are separated from the "core" dependencies.
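As a rough sketch of that separation, assuming it were done with two hashes in Makefile.PL: the grouping below is purely hypothetical (the thread had not settled which modules, if any, count as "core"), and the WriteMakefile() call is left as a comment so the snippet runs on its own without writing a Makefile.

```perl
use strict;
use warnings;

# Hypothetical grouping for illustration only; module names are ones
# mentioned on the list, version 0 means "any version".
my %core_prereqs = (
    'IO::String' => 0,
);
my %optional_prereqs = (
    'Data::Stag::XMLWriter'   => 0,
    'HTTP::Request::Common'   => 0,
    'Spreadsheet::ParseExcel' => 0,
);

# PREREQ_PM takes a single hash reference, so the two groups are
# merged before being handed to MakeMaker.
my %prereq_pm = ( %core_prereqs, %optional_prereqs );

# In a real Makefile.PL this would be passed on, e.g.:
#   WriteMakefile( NAME => 'Bio', PREREQ_PM => \%prereq_pm );

# The optional group stays available on its own, for instance to
# generate the contents list of a bundle.
print "$_\n" for sort keys %optional_prereqs;
```

Whether "make ppd" would keep the two groups apart in the generated ppd file is not something this sketch addresses; it only keeps them apart in the source.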
If one of the following solutions is possible (I'm not sure if they are), it would be very useful: 1) Maintain 2 hashes in Makefile.PL that contain the "core" and "optional" dependencies. I'm unsure of the way dependencies are ordered during a "make ppd", but it may be possible to pass hash references of both to PREREQ_PM in WriteMakefile() and have the "optional" dependencies grouped separately from "core" dependencies in the ppd file - thus making it easy to strip them out into a Bundle::BioPerl ppd. 2) Again, maintain 2 hashes in Makefile.PL that contain the "core" and "optional" dependencies. Have some Makefile setup that allows the generation of a Bundle::BioPerl ppd separately from the main Bioperl ppd. Like I said, these are just some thoughts and I'm not sure if they are even viable options. Nath From chhalling at alumni.ls.berkeley.edu Sun Oct 22 19:45:33 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Sun, 22 Oct 2006 19:45:33 -0400 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl Message-ID: <453C029D.1070708@alumni.ls.berkeley.edu> I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 that prevent these modules from being installed: Data::Stag::Writer (listed as Data::Stag::writer) HTTP::Request::Common (listed as HTTP::Request::Common-) Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel) -- Conrad Halling chhalling at alumni.ls.berkeley.edu From cjfields at uiuc.edu Sun Oct 22 22:24:07 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 22 Oct 2006 21:24:07 -0500 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> Message-ID: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> Thanks for letting us know! Did PPM4 throw errors or just silently pass them over?
Chris On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote: > I have found three misspellings in Bundle::BioPerl 2.1.6 of 17- > Oct-2006 > that prevent these modules from being installed: > > Data::Stag::Writer (listed as Data::Stag::writer) > HTTP::Request::Common (listed as HTTP::Request::Common-) > Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel) > > -- > Conrad Halling > chhalling at alumni.ls.berkeley.edu > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Mon Oct 23 02:45:29 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 06:45:29 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> Message-ID: <453C6509.90005@sheffield.ac.uk> Chris Fields wrote: > Thanks for letting us know! Did PPM4 throw errors or just silently > pass them over? > > Chris > > On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote: > > I believe he is talking about the bundle on cpan and not the ppd. I will get this updated as soon as possible. Sendu/Chris - can you confirm to me which Bioperl modules are essential to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any reason for not putting *all* dependencies into the bundle? 
Nath From bix at sendu.me.uk Mon Oct 23 02:43:36 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 07:43:36 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> Message-ID: <453C6498.5@sendu.me.uk> Conrad Halling wrote: > I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 > that prevent these modules from being installed: > > Data::Stag::Writer (listed as Data::Stag::writer) This should be Data::Stag::XMLWriter > HTTP::Request::Common (listed as HTTP::Request::Common-) > Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel) From bix at sendu.me.uk Mon Oct 23 02:52:47 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 07:52:47 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C6509.90005@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> Message-ID: <453C66BF.1060008@sendu.me.uk> Nathan S. Haigh wrote: > Sendu/Chris - can you confirm to me which Bioperl modules are essential > to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any > reason for not putting *all* dependencies into the bundle? AFAIK, there are no essential external dependencies. Everything in %packages in Makefile.PL, for example, is optional. We had the discussion about making all the easy-to-install ones a forced requirement anyway (so that most things work out of the box), but perhaps we'll hold off on making such a change until after 1.5.2. From jyotikshah at gmail.com Mon Oct 23 03:10:43 2006 From: jyotikshah at gmail.com (Jyoti Shah) Date: Mon, 23 Oct 2006 00:10:43 -0700 Subject: [Bioperl-l] short motif searches Message-ID: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com> Hi, I am interested in searching motifs as small as 6 or 7 nucleotides in genomic databases. 
I need exact matches. Is there any bioperl module available which can help me do this? I tried WU BLAST with word size one, but I am getting warning messages such as "WARNING: the maximum achievable score of 7 in context 0 (frame +1) is less than the ungapped cutoff score S2 (=13). Exit code 0...". Any suggestions? Thanks in advance, Jyoti From bix at sendu.me.uk Mon Oct 23 03:55:40 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 08:55:40 +0100 Subject: [Bioperl-l] short motif searches In-Reply-To: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com> References: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com> Message-ID: <453C757C.1010408@sendu.me.uk> Jyoti Shah wrote: > Hi, > > I am interested in searching motifs as small as 6 or 7 nucleotides in > genomic databases. I need exact matches. Is there any bioperl module > available which can help me do this? At 6 or 7bp long doing a simple exact match I should point out you're going to get very many hits; are you sure this is an appropriate thing to do for your purposes? Assuming yes, you can use Bio::SeqIO, Bio::Index or Bio::DB:: to get your genomic sequences of interest, then simply use a normal perl regexp on the resulting $seq->seq strings. If your motifs are anything like transcription factor binding sites, and you have more information than just a single sequence string for the motif, investigate Bio::Matrix::PSM. From bix at sendu.me.uk Mon Oct 23 04:29:52 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 09:29:52 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C7648.8030004@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> Message-ID: <453C7D80.80207@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: >> Nathan S. 
Haigh wrote: >>> Sendu/Chris - can you confirm to me which Bioperl modules are essential >>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any >>> reason for not putting *all* dependencies into the bundle? >> AFAIK, there are no essential external dependencies. Everything in >> %packages in Makefile.PL, for example, is optional. >> >> We had the discussion about making all the easy-to-install ones a >> forced requirement anyway (so that most things work out of the box), >> but perhaps we'll hold off on making such a change until after 1.5.2. > > How are they forced? They're not. Right now they're optional. I'm suggesting we might change that in the future. If you're asking how we /would/ force them, probably by adding PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs successfully (or should!) without its optional dependencies given in PREREQ_PM because make test succeeds (because tests skip ok when the optional dependency isn't there). I don't really know how CPAN discovers dependencies and auto-installs them before a dependent module though. Anyone care to explain? From n.haigh at sheffield.ac.uk Mon Oct 23 06:09:12 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 10:09:12 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C7D80.80207@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> Message-ID: <453C94C8.5040900@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> Sendu Bala wrote: >>> Nathan S. Haigh wrote: >>>> Sendu/Chris - can you confirm to me which Bioperl modules are >>>> essential >>>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any >>>> reason for not putting *all* dependencies into the bundle? 
>>> AFAIK, there are no essential external dependencies. Everything in >>> %packages in Makefile.PL, for example, is optional. >>> >>> We had the discussion about making all the easy-to-install ones a >>> forced requirement anyway (so that most things work out of the box), >>> but perhaps we'll hold off on making such a change until after 1.5.2. > > >> How are they forced? > > They're not. Right now they're optional. I'm suggesting we might > change that in the future. > If you're asking how we /would/ force them, probably by adding > PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs > successfully (or should!) without its optional dependencies given in > PREREQ_PM because make test succeeds (because tests skip ok when the > optional dependency isn't there). > > I don't really know how CPAN discovers dependencies and auto-installs > them before a dependent module though. Anyone care to explain? I thought so! I misunderstood something earlier which confused me. Just to clarify for my own sanity's sake: 1) Currently all dependencies are optional. 2) All dependencies are in %packages 3) All these are passed to PREREQ_PM As far as CPAN discovering dependencies, here is a snip from the CPAN FAQ: --snip-- I installed a Bundle and had a couple of fails. When I retried, everything resolved nicely. Can this be fixed to work on first try? The reason for this is that CPAN does not know the dependencies of all modules when it starts out. To decide about the additional items to install, it just uses data found in the META.yml file or the generated Makefile. An undetected missing piece breaks the process. But it may well be that your Bundle installs some prerequisite later than some depending item and thus your second try is able to resolve everything. Please note, CPAN.pm does not know the dependency tree in advance and cannot sort the queue of things to install in a topologically correct order.
It resolves perfectly well IF all modules declare the prerequisites correctly with the PREREQ_PM attribute to MakeMaker or the |requires| stanza of Module::Build. For bundles which fail and you need to install often, it is recommended to sort the Bundle definition file manually. --snip-- Therefore, recent modifications to Makefile.PL should result in a fully operational Bioperl installation if installed via CPAN, although only Bioperl 1.4 is available via CPAN currently. It is possible to upload a developer release to CPAN which can only be downloaded via CPAN if specifically asked for; this would be good for 1.5.x: --snip-- How do I install a "DEVELOPER RELEASE" of a module? By default, CPAN will install the latest non-developer release of a module. If you want to install a dev release, you have to specify the partial path starting with the author id to the tarball you wish to install, like so: cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz Note that you can use the |ls| command to get this path listed. --snip-- HTH Nath From bix at sendu.me.uk Mon Oct 23 05:41:52 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 10:41:52 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C94C8.5040900@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> Message-ID: <453C8E60.7000105@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: > >> I don't really know how CPAN discovers dependencies and auto-installs >> them before a dependent module though. Anyone care to explain? > > I thought so! I misunderstood something earlier which confused me. Just > to clarify for my own sanity's sake: > > 1) Currently all dependencies are optional.
> 2) All dependencies are in %packages > 3) all these are passed to PREREQ_PM All correct. > As far as CPAN discovering dependencies, here is a snip from the CPAN FAQ's: > --snip-- > > I installed a Bundle and had a couple of fails. When I retried, > everything resolved nicely. Can this be fixed to work on first try? > > The reason for this is that CPAN does not know the dependencies of > all modules when it starts out. To decide about the additional items > to install, it just uses data found in the META.yml file or the > generated Makefile. An undetected missing piece breaks the process. > But it may well be that your Bundle installs some prerequisite later > than some depending item and thus your second try is able to resolve > everything. Please note, CPAN.pm does not know the dependency tree > in advance and cannot sort the queue of things to install in a > topologically correct order. It resolves perfectly well IF all > modules declare the prerequisites correctly with the PREREQ_PM > attribute to MakeMaker or the |requires| stanza of Module::Build. > For bundles which fail and you need to install often, it is > recommended to sort the Bundle definition file manually. > > --snip-- > > Therefore, recent modifications to Makefile.PL should result in a fully > operational Bioperl installation, if installed via CPAN. Right, thanks for that. > Although only Bioperl 1.4 is available via CPAN currently. It is possible to upload a > developer release to CPAN which can only be ownloaded via CPAN if > specifically asked for - would be good for 1.5.x.: > --snip-- > > How do I install a "DEVELOPER RELEASE" of a module? > > By default, CPAN will install the latest non-developer release of a > module. If you want to install a dev release, you have to specify > the partial path starting with the author id to the tarball you wish > to install, like so: > > cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz > > Note that you can use the |ls| command to get this path listed. 
> > --snip-- That's the user point of view - how does the developer actually tell CPAN that something is a developer release so that normal users don't automatically install it? From bix at sendu.me.uk Mon Oct 23 05:59:52 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 10:59:52 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C8E60.7000105@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> Message-ID: <453C9298.9000900@sendu.me.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> As far as CPAN discovering dependencies, here is a snip from the CPAN >> FAQ's: >> --snip-- >> >> I installed a Bundle and had a couple of fails. When I retried, >> everything resolved nicely. Can this be fixed to work on first try? >> >> The reason for this is that CPAN does not know the dependencies of >> all modules when it starts out. To decide about the additional items >> to install, it just uses data found in the META.yml file or the >> generated Makefile. An undetected missing piece breaks the process. >> But it may well be that your Bundle installs some prerequisite later >> than some depending item and thus your second try is able to resolve >> everything. Please note, CPAN.pm does not know the dependency tree >> in advance and cannot sort the queue of things to install in a >> topologically correct order. It resolves perfectly well IF all >> modules declare the prerequisites correctly with the PREREQ_PM >> attribute to MakeMaker or the |requires| stanza of Module::Build. >> For bundles which fail and you need to install often, it is >> recommended to sort the Bundle definition file manually. 
>> >> --snip-- >> >> Therefore, recent modifications to Makefile.PL should result in a fully >> operational Bioperl installation, if installed via CPAN. > > Right, thanks for that. Oh, so this effectively means that our 'optional' dependencies are installed for CPAN users, which matches up to my 'force the optional ones anyway' desire, leaving Bundle::BioPerl without any use. Makefile.PL could be altered again to remove from PREREQ_PM those modules the user didn't already have installed, thus CPAN would only install Bioperl itself and nothing optional. The user could then install Bundle::BioPerl if they wanted a quick way of getting all the optional stuff to work. I'm happy either way; what do other people think? From n.haigh at sheffield.ac.uk Mon Oct 23 07:22:17 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 11:22:17 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C9298.9000900@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> Message-ID: <453CA5E9.1060406@sheffield.ac.uk> Sendu Bala wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> As far as CPAN discovering dependencies, here is a snip from the >>> CPAN FAQ's: >>> --snip-- >>> >>> I installed a Bundle and had a couple of fails. When I retried, >>> everything resolved nicely. Can this be fixed to work on first try? >>> >>> The reason for this is that CPAN does not know the dependencies of >>> all modules when it starts out. To decide about the additional >>> items >>> to install, it just uses data found in the META.yml file or the >>> generated Makefile. An undetected missing piece breaks the process. 
>>> But it may well be that your Bundle installs some prerequisite >>> later >>> than some depending item and thus your second try is able to >>> resolve >>> everything. Please note, CPAN.pm does not know the dependency tree >>> in advance and cannot sort the queue of things to install in a >>> topologically correct order. It resolves perfectly well IF all >>> modules declare the prerequisites correctly with the PREREQ_PM >>> attribute to MakeMaker or the |requires| stanza of Module::Build. >>> For bundles which fail and you need to install often, it is >>> recommended to sort the Bundle definition file manually. >>> >>> --snip-- >>> >>> Therefore, recent modifications to Makefile.PL should result in a fully >>> operational Bioperl installation, if installed via CPAN. >> >> Right, thanks for that. > > Oh, so this effectively means that our 'optional' dependencies are > installed for CPAN users, which matches up to my 'force the optional > ones anyway' desire, leaving Bundle::BioPerl without any use. > > Makefile.PL could be altered again to remove from PREREQ_PM those > modules the user didn't already have installed, thus CPAN would only > install Bioperl itself and nothing optional. The user could then > install Bundle::BioPerl if they wanted a quick way of getting all the > optional stuff to work. > > I'm happy either way; what do other people think? From my point of view, removing them from PREREQ_PM makes building Bundle::BioPerl a bit of a pain :o( I prefer the way it is currently set up - most people have fast internet connections and gigabytes of hard drive space. Other than the reason "why install something I won't ever need", I don't see much point in maintaining Bundle::BioPerl and having "optional" dependencies. I think if there are any modules which are not going to be used by the majority of users, then this could be used as the rationale for moving them out of bioperl-core into another package?
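For reference, the alternative discussed above (listing in PREREQ_PM only the optional modules that are already present) might look roughly like the sketch below. The contents of %packages and the helper name installed_prereqs are invented for illustration and are not taken from the real Makefile.PL.

```perl
use strict;
use warnings;

# Hypothetical stand-in for %packages in Makefile.PL (module => minimum version).
my %packages = (
    'File::Spec'                => 0,   # core module, should be found
    'No::Such::Optional::Thing' => 0,   # deliberately missing
);

# Return only those entries whose module can be loaded right now.
sub installed_prereqs {
    my %wanted = @_;
    my %found;
    for my $module (sort keys %wanted) {
        (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
        $found{$module} = $wanted{$module} if eval { require $file; 1 };
    }
    return %found;
}

my %prereq_pm = installed_prereqs(%packages);
print "PREREQ_PM would contain: ", join(', ', sort keys %prereq_pm), "\n";

# In a Makefile.PL the filtered hash would then be handed over as
# WriteMakefile(..., PREREQ_PM => \%prereq_pm);
```

Since %packages itself is left untouched, the full list stays available as the reference when the Bundle is edited.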
Nath From n.haigh at sheffield.ac.uk Mon Oct 23 07:38:05 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 11:38:05 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C8E60.7000105@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> Message-ID: <453CA99D.9060009@sheffield.ac.uk> >> Although only Bioperl 1.4 is available via CPAN currently. It is >> possible to upload a >> developer release to CPAN which can only be downloaded via CPAN if >> specifically asked for - would be good for 1.5.x.: >> --snip-- >> >> How do I install a "DEVELOPER RELEASE" of a module? >> >> By default, CPAN will install the latest non-developer release of a >> module. If you want to install a dev release, you have to specify >> the partial path starting with the author id to the tarball you wish >> to install, like so: >> >> cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz >> >> Note that you can use the |ls| command to get this path listed. >> >> --snip-- > > That's the user point of view - how does the developer actually tell > CPAN that something is a developer release so that normal users don't > automatically install it? I found this: http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt It says that $VERSION should simply be changed from a naked number into a single-quoted number and this should be recognized by the CPAN indexer.
Nath From bix at sendu.me.uk Mon Oct 23 06:47:38 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 11:47:38 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> Message-ID: <453C9DCA.4020802@sendu.me.uk> Hilmar Lapp wrote: > On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote: > >> For example, I have made no effort to setup biosql-schema but I >> thought that maybe there would be a test that would detect this > > I'm afraid there isn't. Bioperl-db is meaningless without > biosql-schema. Can you suggest a way we might detect if biosql-schema has been installed prior to running the test suite, so we can give some meaningful error message? From bix at sendu.me.uk Mon Oct 23 08:43:30 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 13:43:30 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> Message-ID: <453CB8F2.7070703@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: > >> Makefile.PL could be altered again to remove from PREREQ_PM those >> modules the user didn't already have installed, thus CPAN would only >> install Bioperl itself and nothing optional. The user could then >> install Bundle::BioPerl if they wanted a quick way of getting all the >> optional stuff to work. >> >> I'm happy either way; what do other people think? 
> > From my point of view, removing them from PREREQ_PM means building the > Bundle::BioPerl a bit of a pain :o( Can I ask how you're generating Bundle::BioPerl? That is, how did the typos get in there? Is there a way to certainly avoid typos in the future? From n.haigh at sheffield.ac.uk Mon Oct 23 09:46:17 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 13:46:17 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CB8F2.7070703@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> <453CB8F2.7070703@sendu.me.uk> Message-ID: <453CC7A9.6090609@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> Sendu Bala wrote: >> >>> Makefile.PL could be altered again to remove from PREREQ_PM those >>> modules the user didn't already have installed, thus CPAN would only >>> install Bioperl itself and nothing optional. The user could then >>> install Bundle::BioPerl if they wanted a quick way of getting all the >>> optional stuff to work. >>> >>> I'm happy either way; what do other people think? > > >> From my point of view, removing them from PREREQ_PM means building the >> Bundle::BioPerl a bit of a pain :o( > > Can I ask how you're generating Bundle::BioPerl? That is, how did the > typos get in there? Is there a way to certainly avoid typos in the > future? I just modified the list by hand a while back :o( - I'm sure there must be a better way. 
From bix at sendu.me.uk Mon Oct 23 08:58:13 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 13:58:13 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CC7A9.6090609@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> <453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk> Message-ID: <453CBC65.2020202@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> Sendu Bala wrote: >>> >>>> Makefile.PL could be altered again to remove from PREREQ_PM those >>>> modules the user didn't already have installed, thus CPAN would only >>>> install Bioperl itself and nothing optional. The user could then >>>> install Bundle::BioPerl if they wanted a quick way of getting all the >>>> optional stuff to work. >>>> >>>> I'm happy either way; what do other people think? >>> >>> From my point of view, removing them from PREREQ_PM means building the >>> Bundle::BioPerl a bit of a pain :o( >> >> Can I ask how you're generating Bundle::BioPerl? That is, how did the >> typos get in there? Is there a way to certainly avoid typos in the >> future? > > I just modified the list by hand a while back :o( - I'm sure there must > be a better way. I'm not sure I understand why removing things from PREREQ_PM would be a problem for you then; the %packages hash would remain unchanged (ie. have everything) so you have something to refer to when manually editing the Bundle. http://www.cpan.org/misc/cpan-faq.html#How_make_bundle might be helpful? I didn't really pay too much attention to the advice - does it offer a typo-avoiding solution? 
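One way to avoid hand-typed module names, sketched here under assumptions: generate the CONTENTS section of the Bundle from the same hash that Makefile.PL uses. The %packages entries and the function name bundle_contents below are placeholders, not the real Bioperl list or an existing tool.

```perl
use strict;
use warnings;

# Hypothetical subset of %packages (module => minimum version); the real
# list lives in Makefile.PL and is not reproduced here.
my %packages = (
    'XML::Parser' => 0,
    'Graph'       => 0,
    'IO::String'  => 0,
);

# Build the "=head1 CONTENTS" POD section that CPAN.pm reads from a Bundle
# file: one module name per entry, optionally followed by a version.
sub bundle_contents {
    my %mods  = @_;
    my @lines = map { $mods{$_} ? "$_ $mods{$_}" : $_ } sort keys %mods;
    return join("\n\n", '=head1 CONTENTS', @lines) . "\n";
}

print bundle_contents(%packages);
```

The output could be pasted (or written) into the Bundle .pm file, so a misspelling would have to exist in %packages first, where 'perl Makefile.PL' is more likely to expose it.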
From n.haigh at sheffield.ac.uk Mon Oct 23 10:04:12 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 14:04:12 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CBC65.2020202@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> <453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk> <453CBC65.2020202@sendu.me.uk> Message-ID: <453CCBDC.6030904@sheffield.ac.uk> > I'm not sure I understand why removing things from PREREQ_PM would be > a problem for you then; the %packages hash would remain unchanged (ie. > have everything) so you have something to refer to when manually > editing the Bundle. > > http://www.cpan.org/misc/cpan-faq.html#How_make_bundle > might be helpful? I didn't really pay too much attention to the advice > - does it offer a typo-avoiding solution? It's helpful in producing the Bundle PPD as all the XML tags are present in the Bioperl PPD and they simply need to be copied over to a Bundle-BioPerl PPD file. Looks like manual editing of the relevant file is required for making a CPAN bundle. Unfortunately - no typo-avoiding solution. 
:o( From dhoworth at mrc-lmb.cam.ac.uk Mon Oct 23 08:46:29 2006 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Mon, 23 Oct 2006 13:46:29 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CA99D.9060009@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> Message-ID: <453CB9A5.2020409@mrc-lmb.cam.ac.uk> >> That's the user point of view - how does the developer actually tell >> CPAN that something is a developer release so that normal users don't >> automatically install it? > > I found this: > http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt > > Is says that $VERSION should simply be changed from a naked number into > a single quoted number and this should be recognized by the CPAN indexer. Cheers, Dave From hlapp at gmx.net Mon Oct 23 09:40:29 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 23 Oct 2006 09:40:29 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <453C9DCA.4020802@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> <453C9DCA.4020802@sendu.me.uk> Message-ID: <5C22B9C8-CEF0-457B-8565-793D56389A86@gmx.net> You would need a lot of information to make that determination (host, port, db driver, db name, user, password; i.e., the entire connection information, and there is no 'standard'). You might just ask a simple question in Makefile.PL as to whether biosql is installed or not, similar to the DB::GFF tests. 
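A minimal sketch of such a question, assuming the prompt() function from ExtUtils::MakeMaker is used; the wording of the question and the helper is_yes are made up for illustration and are not what the DB::GFF tests actually do.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker qw(prompt);

# Hypothetical helper: turn a free-form answer into a yes/no flag.
sub is_yes {
    my ($answer) = @_;
    return $answer =~ /^\s*y/i ? 1 : 0;
}

# prompt() falls back to the default answer when it is not running
# interactively or when PERL_MM_USE_DEFAULT is set, so an unattended
# CPAN install should not hang on the question.
my $answer = prompt('Is a BioSQL schema already installed and reachable? y/n', 'n');

if (is_yes($answer)) {
    print "Database tests will be attempted.\n";
}
else {
    print "Skipping database tests: install biosql-schema first.\n";
}
```

Defaulting to 'n' keeps the failure mode a clear message rather than a string of connection errors.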
-hilmar On Oct 23, 2006, at 6:47 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote: >> >>> For example, I have made no effort to setup biosql-schema but I >>> thought that maybe there would be a test that would detect this >> >> I'm afraid there isn't. Bioperl-db is meaningless without >> biosql-schema. > > Can you suggest a way we might detect if biosql-schema has been > installed prior to running the test suite, so we can give some > meaningful error message? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Mon Oct 23 09:59:23 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 14:59:23 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CB9A5.2020409@mrc-lmb.cam.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> Message-ID: <453CCABB.2060308@sendu.me.uk> Dave Howorth wrote: >>> That's the user point of view - how does the developer actually tell >>> CPAN that something is a developer release so that normal users don't >>> automatically install it? >> I found this: >> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >> >> Is says that $VERSION should simply be changed from a naked number into >> a single quoted number and this should be recognized by the CPAN indexer. > > Thanks for that. 
I guess from that the 1.5.2 version number should be: $VERSION = 1.05_02 And 1.6 would be $VERSION = 1.06 But will this cause a problem wrt 1.4? 1.4 has: $VERSION = 1.4; Is 1.4 lower than 1.06? Should we keep to a single digit version, so 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them version fifty and version sixty? 1.50_02, 1.60? From cjfields at uiuc.edu Mon Oct 23 10:12:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 09:12:16 -0500 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C9298.9000900@sendu.me.uk> Message-ID: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> ... > > Right, thanks for that. > > Oh, so this effectively means that our 'optional' dependencies are > installed for CPAN users, which matches up to my 'force the optional > ones anyway' desire, leaving Bundle::BioPerl without any use. > > Makefile.PL could be altered again to remove from PREREQ_PM those > modules the user didn't already have installed, thus CPAN would only > install Bioperl itself and nothing optional. The user could then install > Bundle::BioPerl if they wanted a quick way of getting all the optional > stuff to work. > > I'm happy either way; what do other people think? I think that we should have it so Bioperl installs as-is (no additional reqs) and have Bundle::BioPerl used as a convenient way to install all optional modules for full functionality. The catch is to make sure that any optional installations do not crash tests during a CPAN bioperl installation, otherwise they aren't considered optional by CPAN, and the install won't work without forcing it. Frankly, most users will find themselves wanting to install the Bundle anyway to get full functionality, so we could always 'strongly recommend' preceding the bioperl installation with a Bundle::Bioperl CPAN installation to avoid problems, at least for this release. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 23 10:23:04 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 09:23:04 -0500 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk> Message-ID: <002101c6f6ae$c14d7860$15327e82@pyrimidine> ... > >> Right, thanks for that. > > > > Oh, so this effectively means that our 'optional' dependencies are > > installed for CPAN users, which matches up to my 'force the optional > > ones anyway' desire, leaving Bundle::BioPerl without any use. > > > > Makefile.PL could be altered again to remove from PREREQ_PM those > > modules the user didn't already have installed, thus CPAN would only > > install Bioperl itself and nothing optional. The user could then > > install Bundle::BioPerl if they wanted a quick way of getting all the > > optional stuff to work. > > > > I'm happy either way; what do other people think? > >From my point of view, removing them from PREREQ_PM means building the > Bundle::BioPerl a bit of a pain :o( > > I prefer the way it is currently set up - most people have fast internet > connections and GB of harddrive space. Other than the reason "why > install something I won't ever need" I don't see much point maintaining > Bundle::BioPerl and having "optional" dependencies. I think if there are > any modules which are not going to be used by the majority of users, > then this could be used as the rationale for removing them from > bioperl-core into another package? > > Nath I think you'll likely find it much easier to maintain a Bundle package long-term and indicate that it should be installed along with bioperl, than to have users complain about a particular Bioperl module failing b/c a particular dependency wasn't installed. 
If we have the Bundle around in CPAN and in PPM for Win32 users, and indicate in the INSTALL docs and the wiki our preference that it be installed prior to or along with a Bioperl installation for beginners, we can mitigate most of those problems. Nip it in the bud, to quote a Mr. Barney Fife. My 2c Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 23 10:29:33 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 09:29:33 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CCABB.2060308@sendu.me.uk> Message-ID: <002201c6f6af$a91e4200$15327e82@pyrimidine> > Dave Howorth wrote: > >>> That's the user point of view - how does the developer actually tell > >>> CPAN that something is a developer release so that normal users don't > >>> automatically install it? > >> I found this: > >> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt > >> > >> Is says that $VERSION should simply be changed from a naked number into > >> a single quoted number and this should be recognized by the CPAN > indexer. > > > > 5.8.8/pod/perlmodstyle.pod#Version_numbering> > > Thanks for that. > > I guess from that the 1.5.2 version number should be: > > $VERSION = 1.05_02 > > And 1.6 would be > > $VERSION = 1.06 > > But will this cause a problem wrt 1.4? 1.4 has: > > $VERSION = 1.4; > > Is 1.4 lower than 1.06? Should we keep to a single digit version, so > 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them > version fifty and version sixty? 1.50_02, 1.60? Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax? It would be much simpler to use that. Simon Cozens wrote about this a while back: http://www.perl.com/pub/a/2000/04/whatsnew.html ... Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Mon Oct 23 10:41:24 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 15:41:24 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <002201c6f6af$a91e4200$15327e82@pyrimidine> References: <002201c6f6af$a91e4200$15327e82@pyrimidine> Message-ID: <453CD494.8070905@sendu.me.uk> Chris Fields wrote: >> Dave Howorth wrote: >>>>> That's the user point of view - how does the developer actually tell >>>>> CPAN that something is a developer release so that normal users don't >>>>> automatically install it? >>>> I found this: >>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >>>> >>>> Is says that $VERSION should simply be changed from a naked number into >>>> a single quoted number and this should be recognized by the CPAN >> indexer. >>> > 5.8.8/pod/perlmodstyle.pod#Version_numbering> >> >> Thanks for that. >> >> I guess from that the 1.5.2 version number should be: >> >> $VERSION = 1.05_02 >> >> And 1.6 would be >> >> $VERSION = 1.06 >> >> But will this cause a problem wrt 1.4? 1.4 has: >> >> $VERSION = 1.4; >> >> Is 1.4 lower than 1.06? Should we keep to a single digit version, so >> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them >> version fifty and version sixty? 1.50_02, 1.60? > > Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax? It would be > much simpler to use that. That does not present us with a way to have 1.5.2 marked as a developer release in CPAN. Also, see the discussion here: http://perldoc.perl.org/functions/require.html Since we require 5.6.1 the backwards-compatible issues maybe don't apply to us, but do these ideas work with modules, or just Perl itself? Is CPAN et al. happy with this form of versioning? /Something/ needs to be done about Bioperl versioning, because the current 1.4 or 1.5 is completely inadequate. 
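On the "is 1.4 lower than 1.06?" question, a quick check, assuming versions are compared as plain numbers (which is what a bare '$VERSION = 1.4;' amounts to). The variable names are only for illustration; this does not claim to reproduce exactly what the CPAN indexer does.

```perl
use strict;
use warnings;

my $old     = 1.4;    # current CPAN release
my $padded  = 1.06;   # the proposed "1.6" written with two digits
my $shorter = 1.6;    # keeping a single digit after the point

print "1.4 <=> 1.06 gives ", $old <=> $padded,  "\n";   # 1: 1.4 is the larger number
print "1.4 <=> 1.6  gives ", $old <=> $shorter, "\n";   # -1: 1.6 is larger

# A developer-release string keeps its underscore only while it stays a
# string; removing the underscore gives the plain number, the same value
# that the common '$VERSION = eval $VERSION;' idiom produces.
my $dev = '1.5_02';
(my $dev_numeric = $dev) =~ tr/_//d;
print "$dev compares as $dev_numeric\n";
print "1.4 <=> $dev gives ", $old <=> $dev_numeric, "\n";   # -1
```

So under numeric comparison 1.06 would look like a downgrade from 1.4, while 1.5_02 and 1.6 would sort after it.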
From bix at sendu.me.uk Mon Oct 23 10:51:25 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 15:51:25 +0100 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> Message-ID: <453CD6ED.5050507@sendu.me.uk> Chris Fields wrote: [option 1] >> Oh, so this effectively means that our 'optional' dependencies are >> installed for CPAN users, which matches up to my 'force the >> optional ones anyway' desire, leaving Bundle::BioPerl without any >> use. [option 2] >> Makefile.PL could be altered again to remove from PREREQ_PM those >> modules the user didn't already have installed, thus CPAN would >> only install Bioperl itself and nothing optional. The user could >> then install Bundle::BioPerl if they wanted a quick way of getting >> all the optional stuff to work. >> >> I'm happy either way; what do other people think? > > I think that we should have it so Bioperl installs as-is (no > additional reqs) and have Bundle::BioPerl used as a convenient way to > install all optional modules for full functionality. Note we're specifically considering a CPAN install here. If you download the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is still needed as a convenience if you want to install the optional external dependencies. > The catch is to make sure that any optional installations do not > crash tests during a CPAN bioperl installation, otherwise they aren't > considered optional by CPAN, and the install won't work without > forcing it. I'm pretty sure this isn't a problem, though it would be nice if someone could test it on a clean system: does 'make test' pass all ok with none of the optional modules installed? Anyway, to reiterate the question: Do we care if CPAN users get all the optional external dependencies installed for them automatically, or do we want to force them to install Bundle? 
The current situation is: CPAN users will get all optional external dependencies without using Bundle::BioPerl. Manual installers of bioperl (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to get full functionality. From n.haigh at sheffield.ac.uk Mon Oct 23 12:30:34 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 16:30:34 +0000 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CCABB.2060308@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> Message-ID: <453CEE2A.8000002@sheffield.ac.uk> Sendu Bala wrote: > Dave Howorth wrote: > >>>> That's the user point of view - how does the developer actually tell >>>> CPAN that something is a developer release so that normal users don't >>>> automatically install it? >>>> >>> I found this: >>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >>> >>> Is says that $VERSION should simply be changed from a naked number into >>> a single quoted number and this should be recognized by the CPAN indexer. >>> >> >> > > Thanks for that. > > I guess from that the 1.5.2 version number should be: > > $VERSION = 1.05_02 > > And 1.6 would be > > $VERSION = 1.06 > > But will this cause a problem wrt 1.4? 1.4 has: > > $VERSION = 1.4; > > Is 1.4 lower than 1.06? Should we keep to a single digit version, so > 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them > version fifty and version sixty? 1.50_02, 1.60? 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

I believe the link to the documentation above describes a common CPAN
versioning scheme as follows:

1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32

Therefore version 1.5 of Bioperl would be 1.50, and version 1.5.2 would
be better as 1.52. Then, to indicate that the 1.5 series is a developer
release, you append an underscore and at least 2 digits. This results in
the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be 1.52_01.
The only thing I'm unsure about is when the _01 gets incremented. I
suspect we would probably not increment this number, since each release
would be an increment of the minor release number, e.g. 1.52_01,
1.53_01, 1.54_01 etc.

Although I'm still not sure how this versioning would affect bioperl 1.4,
since 1.4 uses a non-standard versioning scheme :o(

As I understand it, the versioning of the Perl releases uses the x.y.z
scheme. But apparently CPAN modules should use the above versioning
scheme.

Nath

From cjfields at uiuc.edu Mon Oct 23 11:36:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 10:36:37 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CD6ED.5050507@sendu.me.uk>
Message-ID: <000c01c6f6b9$0781af40$15327e82@pyrimidine>

...
>
> Note we're specifically considering a CPAN install here. If you download
> the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is
> still needed as a convenience if you want to install the optional
> external dependencies.
>

Agreed. I don't think the Bundle is dispensable. For instance, it's very
easy for us to just tell beginners to install Bundle::BioPerl before
installing bioperl itself, as opposed to having them inundate the mail
list with requests on why some x.pl script didn't work, which could
simply be from lack of a required module.
> I'm pretty sure this isn't a problem, though it would be nice if someone > could test it on a clean system: does 'make test' pass all ok with none > of the optional modules installed? So far on WinXP everything passes; I ran a clean perl installation a while ago using nmake and tests passed. > Anyway, to reiterate the question: Do we care if CPAN users get all the > optional external dependencies installed for them automatically, or do > we want to force them to install Bundle? > > The current situation is: CPAN users will get all optional external > dependencies without using Bundle::BioPerl. Manual installers of bioperl > (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to > get full functionality. I don't think forcing is necessary, so a CPAN installation shouldn't force someone to install optional modules. Graph.pm, for instance has a few optional modules, and the tests which use those get skipped and pass so the installation proceeds w/o problems. We could do the same (any tests using those optional modules display the reason why they are skipped). I would strongly state in the INSTALL and INSTALL.WIN docs that (new) users should install Bundle::Bioperl before installing Bioperl core for full functionality. If you are an advanced user and know your way around CPAN/Perl, then you can install the various independent requirements depending on your particular requirements. Chris From n.haigh at sheffield.ac.uk Mon Oct 23 12:38:00 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Mon, 23 Oct 2006 16:38:00 +0000 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <453CD6ED.5050507@sendu.me.uk> References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> <453CD6ED.5050507@sendu.me.uk> Message-ID: <453CEFE8.4000704@sheffield.ac.uk> Sendu Bala wrote: > Chris Fields wrote: > > [option 1] > >>> Oh, so this effectively means that our 'optional' dependencies are >>> installed for CPAN users, which matches up to my 'force the >>> optional ones anyway' desire, leaving Bundle::BioPerl without any >>> use. >>> > > [option 2] > >>> Makefile.PL could be altered again to remove from PREREQ_PM those >>> modules the user didn't already have installed, thus CPAN would >>> only install Bioperl itself and nothing optional. The user could >>> then install Bundle::BioPerl if they wanted a quick way of getting >>> all the optional stuff to work. >>> >>> I'm happy either way; what do other people think? >>> >> I think that we should have it so Bioperl installs as-is (no >> additional reqs) and have Bundle::BioPerl used as a convenient way to >> install all optional modules for full functionality. >> > > Note we're specifically considering a CPAN install here. If you download > the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is > still needed as a convenience if you want to install the optional > external dependencies. > > > >> The catch is to make sure that any optional installations do not >> crash tests during a CPAN bioperl installation, otherwise they aren't >> considered optional by CPAN, and the install won't work without >> forcing it. >> > > I'm pretty sure this isn't a problem, though it would be nice if someone > could test it on a clean system: does 'make test' pass all ok with none > of the optional modules installed? > > I could definitely do this on WinXP and *possibly* on a Linux system. 
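On the question of whether the tests pass with none of the optional modules installed: a minimal sketch of how a test script can skip, rather than fail, when an optional module is absent. It uses Test::More's SKIP block; the module name Some::Optional::Module is a placeholder assumed not to be installed, and none of this is taken from an actual Bioperl test file.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# A test that needs only core functionality always runs.
ok(1 + 1 == 2, 'core test runs regardless of optional modules');

SKIP: {
    # Placeholder name for an optional dependency (assumed absent here).
    my $have_it = eval { require Some::Optional::Module; 1 };

    # skip() records the reason, counts 1 test as skipped, and leaves the block.
    skip 'Some::Optional::Module not installed', 1 unless $have_it;

    ok(Some::Optional::Module->can('new'), 'optional feature works');
}
```

The harness counts a skipped test as passing, so an install through CPAN would not need forcing just because the optional module is missing; the skip reason still shows up in the verbose test output.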
> Anyway, to reiterate the question: Do we care if CPAN users get all the
> optional external dependencies installed for them automatically, or do
> we want to force them to install Bundle?
>
>

I'd prefer that any dependencies, whether or not they are seen as vital
to the main functionality of Bioperl, are actually specified in
PREREQ_PM (as they currently are). A dependency is a dependency, is it
not? If a distinction is to be made based on whether the requiring
module is simply adding additional functionality to Bioperl-core, then
shouldn't it be moved out of core and into another package, as with the
run modules, if we are to have "optional" dependencies?

my 2p
Nath

> The current situation is: CPAN users will get all optional external
> dependencies without using Bundle::BioPerl. Manual installers of bioperl
> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
> get full functionality.

From cjfields at uiuc.edu Mon Oct 23 11:39:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 10:39:09 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CD494.8070905@sendu.me.uk>
Message-ID: <000d01c6f6b9$62033d80$15327e82@pyrimidine>

...
> That does not present us with a way to have 1.5.2 marked as a developer
> release in CPAN.
>
> Also, see the discussion here:
> http://perldoc.perl.org/functions/require.html
>
> Since we require 5.6.1 the backwards-compatible issues maybe don't apply
> to us, but do these ideas work with modules, or just Perl itself? Is
> CPAN et al. happy with this form of versioning?
>
> /Something/ needs to be done about Bioperl versioning, because the
> current 1.4 or 1.5 is completely inadequate.

I think using 'require Foo x.y.z' is applicable to modules as well.
There is something in Programming Perl about this, I just don't have it
on hand...
Not sure about CPAN, so we need to look into it. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Mon Oct 23 11:42:15 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 16:42:15 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CEE2A.8000002@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> Message-ID: <453CE2D7.5080608@sendu.me.uk> Nathan S. Haigh wrote: > I believe the link to the documentation above describes a common CPAN > versioning scheme as follows: > > 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32 > > Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would > be better as 1.52. Then to indicate that the 1.5 series is a developer > release, you append the underscore and at least 2 digits. Thus resulting > in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be > 1.52_01. The only thing i'm unsure about would be when does the _01 get > incremented? I suspect we would probably not increment this number since > each release would be an increment of the minor release number e.g. > 1.52_01, 1.53_01, 1.54_01 etc. > > Although I'm still not sure how this versioning would affect bioperl 1.4 > since 1.4 uses a non-standard versioning scheme :o( Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be treated higher than 1.4? Anyway, we can cross that bridge when we get there, but this seems appropriate now. Cheers, Sendu. 
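The ordering question raised in this thread (is 1.60 higher than 1.4, is 1.06?) can be checked directly. A minimal sketch, assuming the version numbers are stored unquoted, as plain numbers, the way $VERSION = 1.4 is; the helper name newer() is made up for illustration and is not part of Bioperl or of any CPAN tool:

```perl
use strict;
use warnings;

# Hypothetical helper: true if version $x is numerically greater than $y.
# Assumes both are plain decimal numbers, which is what an unquoted
# $VERSION such as 1.4 or 1.52_01 evaluates to.
sub newer {
    my ($x, $y) = @_;
    return $x > $y ? 1 : 0;
}

# Underscores in an unquoted numeric literal are ignored by the parser,
# so 1.52_01 is the number 1.5201 and 1.05_02 is 1.0502.
my %candidates = (
    '1.06 vs 1.4'    => newer(1.06,    1.4),
    '1.60 vs 1.4'    => newer(1.60,    1.4),
    '1.52_01 vs 1.4' => newer(1.52_01, 1.4),
    '1.05_02 vs 1.4' => newer(1.05_02, 1.4),
);

for my $case (sort keys %candidates) {
    printf "%-15s %s\n", $case, $candidates{$case} ? 'newer' : 'older';
}
```

Under this plain numeric comparison, 1.06 and 1.05_02 come out older than 1.4, while 1.60 and 1.52_01 come out newer. If $VERSION is quoted instead (e.g. '1.52_01') it is no longer a plain number, and how the CPAN tools order it would need to be tested separately.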
From bix at sendu.me.uk Mon Oct 23 11:59:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 16:59:01 +0100
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
References: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
Message-ID: <453CE6C5.6000108@sendu.me.uk>

Chris Fields wrote:
> ...
>> The current situation is: CPAN users will get all optional external
>> dependencies without using Bundle::BioPerl. Manual installers of bioperl
>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
>> get full functionality.
>
> I don't think forcing is necessary, so a CPAN installation shouldn't force
> someone to install optional modules. Graph.pm, for instance has a few
> optional modules, and the tests which use those get skipped and pass so the
> installation proceeds w/o problems. We could do the same (any tests using
> those optional modules display the reason why they are skipped).

I should clarify and say that that's what happens in Bioperl as well.
The 'forcing' that I talk about is simply what I assume will happen if
the user has CPAN set to automatically install dependencies. The user
could say 'no' to every question regarding the installation of
dependencies that CPAN discovers and Bioperl would still install fine.

So really the difference between the current situation and, say, the
situation when 1.5.1 was released, is that the CPAN user doesn't have to
use Bundle::BioPerl for full functionality anymore, but can still choose
not to install all the optional external modules.

The difference is the possible default behaviour. Those users that
auto-install dependencies get all the optional ones, whereas in the past
they would not have. I have to point out the benefit of this behaviour:
those people that don't care and just want it to work are more likely to
get an installation that does just work. People who know what they're
doing can still do what they want.
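To make [option 2] from earlier in the thread concrete: a rough sketch of trimming a prerequisite list down to the modules that already load on the system, before handing it to WriteMakefile as PREREQ_PM. The helper installed_only() and the module list are invented for illustration; this is not the code in Bioperl's Makefile.PL, and it does not check minimum versions.

```perl
use strict;
use warnings;

# Hypothetical helper: given a hash of module name => minimum version,
# keep only the modules that can already be loaded on this system.
# Versions are carried through unchanged and not checked.
sub installed_only {
    my (%prereqs) = @_;
    my %kept;
    for my $module (keys %prereqs) {
        # Accept only plain package names before building a file path.
        next unless $module =~ /\A\w+(?:::\w+)*\z/;
        (my $file = "$module.pm") =~ s{::}{/}g;
        $kept{$module} = $prereqs{$module} if eval { require $file; 1 };
    }
    return %kept;
}

# Illustrative list: two commonly present modules plus one name that is
# assumed not to exist anywhere.
my %optional = (
    'File::Spec'               => 0,
    'Data::Dumper'             => 0,
    'No::Such::Module::Surely' => 0,
);

my %prereq_pm = installed_only(%optional);
print "would pass to PREREQ_PM: ", join(', ', sort keys %prereq_pm), "\n";
```

With a filter of this kind CPAN would see no missing prerequisites and install nothing extra, leaving Bundle::BioPerl as the way to pull in the optional modules.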
Before we decide what to do I guess we need hard confirmation of how CPAN will actually behave with the current Makefile.PL. Any ideas how we can find out? It would also be good to have more options to break the current tie (Nathan is for keeping PREREQ_PM populated, Chris is for having it empty, I can go either way)... From dhoworth at mrc-lmb.cam.ac.uk Mon Oct 23 11:55:42 2006 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Mon, 23 Oct 2006 16:55:42 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CD494.8070905@sendu.me.uk> References: <002201c6f6af$a91e4200$15327e82@pyrimidine> <453CD494.8070905@sendu.me.uk> Message-ID: <453CE5FE.9070001@mrc-lmb.cam.ac.uk> Sendu Bala wrote: > Chris Fields wrote: >>> Dave Howorth wrote: >>>>>> That's the user point of view - how does the developer actually tell >>>>>> CPAN that something is a developer release so that normal users don't >>>>>> automatically install it? >>>>> I found this: >>>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >>>>> >>>>> Is says that $VERSION should simply be changed from a naked number into >>>>> a single quoted number and this should be recognized by the CPAN >>> indexer. >>>> >> 5.8.8/pod/perlmodstyle.pod#Version_numbering> >>> >>> Thanks for that. >>> >>> I guess from that the 1.5.2 version number should be: >>> >>> $VERSION = 1.05_02 I believe so - the underscore is key. Look at your favourite CPAN modules and see what they do. >>> And 1.6 would be >>> >>> $VERSION = 1.06 >>> >>> But will this cause a problem wrt 1.4? 1.4 has: I think it will cause a problem, yes. 1.4 > 1.06 As a workaround, you could remove 1.4 from CPAN and require everybody who installs from CPAN to uninstall it before installing 1.06. >>> $VERSION = 1.4; >>> >>> Is 1.4 lower than 1.06? Should we keep to a single digit version, so >>> 1.5_02 and 1.6? Does this really not work with CPAN? I think that would work but see at the end. 
>> Should we call them >>> version fifty and version sixty? 1.50_02, 1.60? Then you can count 1.50_02, 1.50_03, 1.52, 1.53_01 ... if you wish. >> Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax? It would be >> much simpler to use that. > > That does not present us with a way to have 1.5.2 marked as a developer > release in CPAN. > > Also, see the discussion here: > http://perldoc.perl.org/functions/require.html > > Since we require 5.6.1 the backwards-compatible issues maybe don't apply > to us, but do these ideas work with modules, or just Perl itself? Is > CPAN et al. happy with this form of versioning? I'm not an expert :( It's my understanding that there is an awful lot of flexibility in Perl module version numbering (as you might expect :) However, I believe there are some gotchas. So I would recommend (a) finding an expert and (b) trying an experiment! > /Something/ needs to be done about Bioperl versioning, because the > current 1.4 or 1.5 is completely inadequate. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From n.haigh at sheffield.ac.uk Mon Oct 23 13:37:13 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 17:37:13 +0000 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <453CE6C5.6000108@sendu.me.uk> References: <000c01c6f6b9$0781af40$15327e82@pyrimidine> <453CE6C5.6000108@sendu.me.uk> Message-ID: <453CFDC9.8030107@sheffield.ac.uk> Sendu Bala wrote: > Chris Fields wrote: > >> ... >> >>> The current situation is: CPAN users will get all optional external >>> dependencies without using Bundle::BioPerl. Manual installers of bioperl >>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to >>> get full functionality. >>> >> I don't think forcing is necessary, so a CPAN installation shouldn't force >> someone to install optional modules. 
Graph.pm, for instance has a few
>> optional modules, and the tests which use those get skipped and pass so the
>> installation proceeds w/o problems. We could do the same (any tests using
>> those optional modules display the reason why they are skipped).
>>
>
> I should clarify and say that that's what happens in Bioperl as well.
> The 'forcing' that I talk about is simply what I assume will happen if
> the user has CPAN set to automatically install dependencies. The user
> could say 'no' to every question regarding the installation of
> dependencies that CPAN discovers and Bioperl would still install fine.
>
> So really the difference between the current situation and, say, the
> situation when 1.5.1 was released, is that the CPAN user doesn't have to
> use Bundle::BioPerl for full functionality anymore, but can still choose
> not to install all the optional external modules.
>
>
--snip--

Obviously, we could maintain a Bundle::BioPerl which includes all
dependencies required for a fully functional Bioperl. I think the whole
idea of a Bundle is to provide a common environment for a particular
package. If, for example, someone chooses not to install the dependencies
through CPAN (in the current setup), they can easily go back and install
Bundle::BioPerl, and it would retrieve any missing dependencies for a
fully functional Bioperl-core.

Nath

From n.haigh at sheffield.ac.uk Mon Oct 23 14:06:16 2006
From: n.haigh at sheffield.ac.uk (Nathan S.
Haigh) Date: Mon, 23 Oct 2006 18:06:16 +0000 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CE2D7.5080608@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> Message-ID: <453D0498.8050206@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: > >> I believe the link to the documentation above describes a common CPAN >> versioning scheme as follows: >> >> 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32 >> >> Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would >> be better as 1.52. Then to indicate that the 1.5 series is a developer >> release, you append the underscore and at least 2 digits. Thus resulting >> in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be >> 1.52_01. The only thing i'm unsure about would be when does the _01 get >> incremented? I suspect we would probably not increment this number since >> each release would be an increment of the minor release number e.g. >> 1.52_01, 1.53_01, 1.54_01 etc. >> >> Although I'm still not sure how this versioning would affect bioperl 1.4 >> since 1.4 uses a non-standard versioning scheme :o( >> > > Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be > treated higher than 1.4? Anyway, we can cross that bridge when we get > there, but this seems appropriate now. > > > Cheers, > Sendu. 
Just tried the suggested:

perl -MExtUtils::MakeMaker -le 'print MM->parse_version(shift)' bioperl-1-5-2/Bio/Root/Version.pm

To see how it parses the various different version schemes, here are the
results:

1.5     -> 1.5
1.4     -> 1.4
1.60    -> 1.60
1.05_01 -> 1.0501
1.5_01  -> 1.501
1.50_01 -> 1.5001

Nath

From cjfields at uiuc.edu Mon Oct 23 13:15:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:15:44 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CE6C5.6000108@sendu.me.uk>
Message-ID: <002701c6f6c6$e2622c40$15327e82@pyrimidine>

...
> I should clarify and say that that's what happens in Bioperl as well.
> The 'forcing' that I talk about is simply what I assume will happen if
> the user has CPAN set to automatically install dependencies. The user
> could say 'no' to every question regarding the installation of
> dependencies that CPAN discovers and Bioperl would still install fine.
>
> So really the difference between the current situation and, say, the
> situation when 1.5.1 was released, is that the CPAN user doesn't have to
> use Bundle::BioPerl for full functionality anymore, but can still chose
> not to install all the optional external modules.
>
> The difference is the possible default behaviour. Those users that
> auto-install dependencies get all the optional ones, whereas in the past
> they would not have. I have to point out the benefit of this behaviour:
> those people that don't care and just want it to work are more likely to
> get an installation that does just work. People who know what they're
> doing can still do what they want.

OK with me. Any way we go about it, we have to assume that anyone who set
CPAN to automatically install dependencies would want this behavior.
> Before we decide what to do I guess we need hard confirmation of how
> CPAN will actually behave with the current Makefile.PL. Any ideas how we
> can find out?
>
> It would also be good to have more options to break the current tie
> (Nathan is for keeping PREREQ_PM populated, Chris is for having it
> empty, I can go either way)...

Frankly I'm for whatever is easiest for the end-user. I think we should
continue maintaining Bundle::BioPerl b/c of its convenience (easier for
us to say 'install Bundle::BioPerl' as opposed to 'install modules a b c
d e f g...').

I should note that Chris D. maintains Bundle::BioPerl via CPAN and can
easily add/remove modules as needed, so all that would be necessary prior
to a release is to make sure the various modules present in the Bundle
are up-to-date. The only difficulty would be updating the bundle PPM
version for Win32; I agree with Nathan that it would be nice if it were
easier to maintain. The PPD file generated using 'nmake ppd' needs
modifications, likely b/c it is still generated as PPM3-compatible
rather than PPM4-compatible.

I also think the idea of having the developer releases available via
CPAN is a good one, as long as they are marked as such (which you are
taking care of with versioning changes). It makes them a little more
official, even if they are interim developer releases.

Chris

From cjfields at uiuc.edu Mon Oct 23 13:19:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:19:08 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CFDC9.8030107@sheffield.ac.uk>
Message-ID: <002801c6f6c7$5a58ed60$15327e82@pyrimidine>

...
> > So really the difference between the current situation and, say, the
> > situation when 1.5.1 was released, is that the CPAN user doesn't have to
> > use Bundle::BioPerl for full functionality anymore, but can still chose
> > not to install all the optional external modules.
> >
> >
> --snip--
>
> Obviously, we could maintain a Bundle::BioPerl which includes all
> dependencies required for a fully functional Bioperl. I think the whole
> idea for a Bundle is to provide a common environment for a particular
> package. If for example, someone chooses not to install the dependencies
> through CPAN (in the current setup), that can easily go back and install
> Bundle::BioPerl and it would retrieve any missing dependencies for a
> fully functional Bioperl-core.
>
> Nath

Succinctly put; I would've spent five paragraphs describing that! Too
much coffee (from lab meetings...)

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From cjfields at uiuc.edu Mon Oct 23 13:26:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:26:57 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To:
Message-ID: <002c01c6f6c8$7163dd20$15327e82@pyrimidine>

Seth,

Did you try this with a clean, taxonomy-installed database? There may be
some junk left over from the previous test runs. I'm looking into it this
week; it may not make the developer release but we'll try to get it in.

BTW, the 03simpleseq.t test failures have to do with a call to gzip. I'll
look into a workaround for that.

Sendu has posted a Bio::Root::Root fix which does get rid of some bugs
but introduces others. One alternative which I found works is cygwin, but
there's a catch: DBD-mysql is hard to install. If it isn't one thing it's
another...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign _____ From: Seth Johnson [mailto:johnson.biotech at gmail.com] Sent: Monday, October 23, 2006 11:37 AM To: Chris Fields Cc: bioperl-l Subject: Re: Error retrieving sequence from BioSQL Chris, There's definite improvement: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Failed Test Stat Wstat Total Fail Failed List of Failed ---------------------------------------------------------------------------- --- t/02species.t 65 2 3.08% 63 65 t/03simpleseq.t 1 256 59 106 179.66% 7-59 t/04swiss.t 52 14 26.92% 25 27-34 38-42 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There's some weirdness going on during the 'swiss.t' test. It almost seems to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): ================================ not ok 25 # Test 25 got: '10097078' (t/04swiss.t at line 79) # Expected: '91309150' ok 26 not ok 27 # Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t at line 85) # Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' not ok 28 # Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' (t/04swiss.t at line 86) # Expected: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' not ok 29 # Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' (t/04swiss.t at line 87) # Expected: 'Cell 66 (2), 383-394 (1991)' not ok 30 # Test 30 got: (t/04swiss.t at line 88) # Expected: '91309150' not ok 31 # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t at line 85 fail #2) # Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis, J.E. and Leffers,H.' 
not ok 32 # Test 32 got: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' (t/04swiss.t at line 86 fail #2) # Expected: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' not ok 33 # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail #2) # Expected: 'Gene 134 (2), 283-287 (1993)' not ok 34 # Test 34 got: (t/04swiss.t at line 88 fail #2) # Expected: '94085792' ok 35 ok 36 ok 37 not ok 38 # Test 38 got: (t/04swiss.t at line 88 fail #3) # Expected: '94253723' not ok 39 # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4) # Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.' not ok 40 # Test 40 got: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' (t/04swiss.t at line 86 fail #4) # Expected: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' not ok 41 # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail #4) # Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' not ok 42 # Test 42 got: (t/04swiss.t at line 88 fail #4) # Expected: '99199225' ============================== On 10/20/06, Chris Fields < cjfields at uiuc.edu> wrote: Seth, Did you work out the problem here? There was a recent CVS update to OBDA tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests apparently left data from tests in the database, which caused problems with repeated test runs. 
Chris > > -----Original Message----- > > From: bioperl-l-bounces lists.open-bio.org [mailto: bioperl-l- > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > Sent: Saturday, September 30, 2006 6:35 PM > > To: Hilmar Lapp > > Cc: Chris Fields; Bioperl List > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > Here're complete test details: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > ... > > > FAILED tests 10-12 > > Failed 3/12 tests, 75.00% okay > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > -------------------------------------------------------------------------- > > ----- > > t\02species.t 65 2 3.08% 63 65 > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > t\16obda.t 12 3 25.00% 10-12 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From johnson.biotech at gmail.com Mon Oct 23 12:36:36 2006 From: johnson.biotech at gmail.com (Seth Johnson) Date: Mon, 23 Oct 2006 12:36:36 -0400 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: <000001c6f486$df508930$15327e82@pyrimidine> References: <000001c6f486$df508930$15327e82@pyrimidine> Message-ID: Chris, There's definite improvement: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Failed Test Stat Wstat Total Fail Failed List of Failed ------------------------------------------------------------------------------- t/02species.t 65 2 3.08% 63 65 t/03simpleseq.t 1 256 59 106 179.66% 7-59 t/04swiss.t 52 14 26.92% 25 27-34 38-42 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There's some weirdness going on during the 'swiss.t' test. 
It almost seems to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): ================================ not ok 25 # Test 25 got: '10097078' (t/04swiss.t at line 79) # Expected: '91309150' ok 26 not ok 27 # Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t at line 85) # Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' not ok 28 # Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' (t/04swiss.t at line 86) # Expected: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' not ok 29 # Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' (t/04swiss.t at line 87) # Expected: 'Cell 66 (2), 383-394 (1991)' not ok 30 # Test 30 got: (t/04swiss.t at line 88) # Expected: '91309150' not ok 31 # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t at line 85 fail #2) # Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' not ok 32 # Test 32 got: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' (t/04swiss.t at line 86 fail #2) # Expected: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' not ok 33 # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail #2) # Expected: 'Gene 134 (2), 283-287 (1993)' not ok 34 # Test 34 got: (t/04swiss.t at line 88 fail #2) # Expected: '94085792' ok 35 ok 36 ok 37 not ok 38 # Test 38 got: (t/04swiss.t at line 88 fail #3) # Expected: '94253723' not ok 39 # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4) # Expected: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' 
not ok 40 # Test 40 got: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' (t/04swiss.t at line 86 fail #4) # Expected: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' not ok 41 # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail #4) # Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' not ok 42 # Test 42 got: (t/04swiss.t at line 88 fail #4) # Expected: '99199225' ============================== On 10/20/06, Chris Fields wrote: > > > > Seth, > > Did you work out the problem here? There was a recent CVS update to OBDA > tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests > apparently left data from tests in the database, which caused problems > with > repeated test runs. > > Chris > > > > -----Original Message----- > > > From: bioperl-l-bounces lists.open-bio.org [mailto: bioperl-l- > > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > > Sent: Saturday, September 30, 2006 6:35 PM > > > To: Hilmar Lapp > > > Cc: Chris Fields; Bioperl List > > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > > > Here're complete test details: > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > ... 
> > > > > FAILED tests 10-12 > > > Failed 3/12 tests, 75.00% okay > > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > > > > -------------------------------------------------------------------------- > > > ----- > > > t\02species.t 65 2 3.08% 63 65 > > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > > t\16obda.t 12 3 25.00% 10-12 > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > From n.haigh at sheffield.ac.uk Mon Oct 23 16:08:00 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 20:08:00 +0000 Subject: [Bioperl-l] CPAN testing Service Message-ID: <453D2120.9010301@sheffield.ac.uk> We should also check the CPAN testing service (CPANTS) to see how "good" our package is for CPAN and try to increase the Kwalitee score. There only appears to be details for bioperl-1.2.3 for some reason: http://cpants.perl.org/dist/bioperl Nath From pabloivan at gmail.com Sun Oct 22 15:54:35 2006 From: pabloivan at gmail.com (Pablo Ivan) Date: Sun, 22 Oct 2006 16:54:35 -0300 Subject: [Bioperl-l] Bioperl installation under Windows Message-ID: Hello, I have been trying to install Bioperl 1.4 on a Windows XP system, but I didn't get too far; my perl installation was made using ActiveState 5.8.8build 816. I then tried the ppm method of searching for bioperl in the repositories and installing the core package 1.4. It says that the installation was made successfully, but the /Bio folder doesn't show up in /lib, and it's like nothing new was installed at all. I was wondering if using that version of ActiveState could be causing it, but the uninstall option for it isn't showing in Add/Remove, and I'm afraid just deleting the folders and installing version 5.6 of AS could somehow damage and make things worse. 
Or should I just forget about it and try using Cygwin? Thank you, Pablo. From cjfields at uiuc.edu Mon Oct 23 17:34:47 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 16:34:47 -0500 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: Message-ID: <000401c6f6eb$111df040$15327e82@pyrimidine> Don't know what that particular error is, but it looks ActivePerl-related (PPM generates HTML from the blib directory). You may need to run 'nmake clean' in between test cycles to get rid of old blib and other files. The carryover issue from old test runs was a definite problem. Brian fixed that in the bioperl-db CVS recently. Also, I tried Sendu's fixes from CVS head to Bio::Root::Root and they seem to fix the problems with Bio::Root::Root. The issue came down to a use of indirect syntax (a bad perl practice). There are other errors popping up related to Bio::Species, but these seem fixable at least. I committed a few changes to bioperl-db CVS to fix 03simpleseq.t test failures due to a lack of gzip on WinXP (I didn't see them b/c I had a copy of GNU gzip in my path). These should pass w/o problems now on WinXP. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign _____ From: Seth Johnson [mailto:johnson.biotech at gmail.com] Sent: Monday, October 23, 2006 4:22 PM To: Chris Fields Cc: bioperl-l Subject: Re: Error retrieving sequence from BioSQL Chris, I have not cleaned my test database yet. I'll purge it and redo the tests. This error keeps popping up in unexpected places while running nmake during installation: "Undefined subroutine &main::UpdateHTML_blib called at -e line 1. NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code '0xff'" Is there a way around it?? Seth On 10/23/06, Chris Fields wrote: Seth, Did you try this with a clean, taxonomy-installed database? There may be some junk left over from the previous test runs.
I'm looking into it this week; it may not make the developer release but we'll try to get it in. BTW, the 02sinmpleseq.t test failures have to do with a call to gzip. I'll look into a workaround for that. Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but introduces others. One alternative which I found works is cygwin, but there's a catch: DBD-mysql is hard to install. If it isn't one thing it's another... Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign _____ From: Seth Johnson [mailto:johnson.biotech at gmail.com] Sent: Monday, October 23, 2006 11:37 AM To: Chris Fields Cc: bioperl-l Subject: Re: Error retrieving sequence from BioSQL Chris, There's definite improvement: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Failed Test Stat Wstat Total Fail Failed List of Failed ---------------------------------------------------------------------------- --- t/02species.t 65 2 3.08% 63 65 t/03simpleseq.t 1 256 59 106 179.66% 7-59 t/04swiss.t 52 14 26.92% 25 27-34 38-42 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There's some weirdness going on during the 'swiss.t' test. It almost seems to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): ================================ not ok 25 # Test 25 got: '10097078' (t/04swiss.t at line 79) # Expected: '91309150' ok 26 not ok 27 # Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t at line 85) # Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' not ok 28 # Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' (t/04swiss.t at line 86) # Expected: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' not ok 29 # Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 
96 (7), 3572-3577 (1999)' (t/04swiss.t at line 87) # Expected: 'Cell 66 (2), 383-394 (1991)' not ok 30 # Test 30 got: (t/04swiss.t at line 88) # Expected: '91309150' not ok 31 # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t at line 85 fail #2) # Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis, J.E. and Leffers,H.' not ok 32 # Test 32 got: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' (t/04swiss.t at line 86 fail #2) # Expected: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' not ok 33 # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail #2) # Expected: 'Gene 134 (2), 283-287 (1993)' not ok 34 # Test 34 got: (t/04swiss.t at line 88 fail #2) # Expected: '94085792' ok 35 ok 36 ok 37 not ok 38 # Test 38 got: (t/04swiss.t at line 88 fail #3) # Expected: '94253723' not ok 39 # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4) # Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.' not ok 40 # Test 40 got: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' (t/04swiss.t at line 86 fail #4) # Expected: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' not ok 41 # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail #4) # Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' not ok 42 # Test 42 got: (t/04swiss.t at line 88 fail #4) # Expected: '99199225' ============================== On 10/20/06, Chris Fields < cjfields at uiuc.edu > wrote: Seth, Did you work out the problem here? There was a recent CVS update to OBDA tests (16obda.t) that fixed similar problems on Mac OS X. 
Old OBDA tests apparently left data from tests in the database, which caused problems with repeated test runs. Chris > > -----Original Message----- > > From: bioperl-l-bounces lists.open-bio.org [mailto: bioperl-l- > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > Sent: Saturday, September 30, 2006 6:35 PM > > To: Hilmar Lapp > > Cc: Chris Fields; Bioperl List > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > Here're complete test details: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > ... > > > FAILED tests 10-12 > > Failed 3/12 tests, 75.00% okay > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > -------------------------------------------------------------------------- > > ----- > > t\02species.t 65 2 3.08% 63 65 > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > t\16obda.t 12 3 25.00% 10-12 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- Best Regards, Seth Johnson Senior Bioinformatics Associate Ph: (202) 470-0900 Fx: (775) 251-0358 From cjfields at uiuc.edu Mon Oct 23 17:53:27 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 16:53:27 -0500 Subject: [Bioperl-l] Bioperl installation under Windows In-Reply-To: References: Message-ID: <9994CFF6-FCA1-4C7F-9A33-31765C6AE255@uiuc.edu> It won't install in Perl\lib, but in Perl\site\lib. Check there. We are working intently on the next developer release for BioPerl and plan on having several PPMs available, but we only are supporting ActivePerl 5.8.8.819. I would suggest that you upgrade your ActivePerl installation to that if possible since PPM has undergone major changes (they use PPM4 now, which has a GUI by default). Most repositories are now moving over to using PPM4 so you'll likely be seeing less PPM3-compatible packages being made. 
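A quick way to confirm where a PPM install actually put the modules is to search @INC for the module file. The following is only a minimal sketch using core Perl: find_module is a hypothetical helper name, and Bio::Root::Root is used as the probe simply because every Bioperl install should contain it.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: list every directory (default: @INC) that
# contains the .pm file for the given module name.
sub find_module {
    my ($module, @dirs) = @_;
    @dirs = @INC unless @dirs;

    # Bio::Root::Root -> ('Bio', 'Root', 'Root.pm')
    my @parts = split /::/, $module;
    $parts[-1] .= '.pm';

    return grep {
        my $dir = $_;
        # @INC may hold code references; skip anything that is not a path.
        !ref($dir) && -e File::Spec->catfile($dir, @parts);
    } @dirs;
}

my @hits = find_module('Bio::Root::Root');
if (@hits) {
    print "Bio::Root::Root found under: $_\n" for @hits;
}
else {
    print "Bio::Root::Root was not found in any \@INC directory\n";
}
```

On ActivePerl the directory reported would be expected to end in site\lib rather than lib, which is the distinction made above.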
Chris On Oct 22, 2006, at 2:54 PM, Pablo Ivan wrote: > Hello, > > I have been trying to install Bioperl 1.4 on a Windows XP system, > but I > didn't get too far; my perl installation was made using ActiveState > 5.8.8build 816. I then tried the ppm method of searching for bioperl > in the > repositories and installing the core package 1.4. It says that the > installation was made successfully, but the /Bio folder doesn't > show up in > /lib, and it's like nothing new was installed at all. I was > wondering if > using that version of ActiveState could be causing it, but the > uninstall > option for it isn't showing in Add/Remove, and I'm afraid just > deleting the > folders and installing version 5.6 of AS could somehow damage and make > things worse. Or should I just forget about it and try using Cygwin? > > Thank you, > > Pablo. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From johnson.biotech at gmail.com Mon Oct 23 17:22:13 2006 From: johnson.biotech at gmail.com (Seth Johnson) Date: Mon, 23 Oct 2006 17:22:13 -0400 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: <002c01c6f6c8$7163dd20$15327e82@pyrimidine> References: <002c01c6f6c8$7163dd20$15327e82@pyrimidine> Message-ID: Chris, I have not cleaned my test database yet. I'll purge it and redo the tests. This error keeps popping up in unexpected places while running nmake during installation: "Undefined subroutine &main::UpdateHTML_blib called at -e line 1. NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code '0xff'" Is there a way around it?? Seth On 10/23/06, Chris Fields wrote: > > Seth, > > Did you try this with a clean, taxonomy-installed database? There may be > some junk left over tfrom the previous test runs. 
> > I'm looking into it this week; it may not make the developer release but > we'll try to get it in. BTW, the 02sinmpleseq.t test failures have to do > with a call to gzip. I'll look into a workaround for that. > > Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but > introduces others. One alternative which I found works is cygwin, but > there's a catch: DBD-mysql is hard to install. If it isn't one thing it's > another... > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > ------------------------------ > > *From:* Seth Johnson [mailto:johnson.biotech at gmail.com] > *Sent:* Monday, October 23, 2006 11:37 AM > *To:* Chris Fields > *Cc:* bioperl-l > *Subject:* Re: Error retrieving sequence from BioSQL > > > > Chris, > > There's definite improvement: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Failed Test Stat Wstat Total Fail Failed List of Failed > ------------------------------------------------------------------------------- > > t/02species.t 65 2 3.08% 63 65 > t/03simpleseq.t 1 256 59 106 179.66% 7-59 > t/04swiss.t 52 14 26.92% 25 27-34 38-42 > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > There's some weirdness going on during the 'swiss.t' test. It almost > seems to me that expectations of some tests are swapped (27 & 39, 28 & 40, > 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): > ================================ > not ok 25 > # Test 25 got: '10097078' (t/04swiss.t at line 79) > # Expected: '91309150' > ok 26 > not ok 27 > # Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t > at line 85) > # Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' 
> not ok 28 > # Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic > mitochondrial matrix protein' (t/04swiss.t at line 86) > # Expected: 'Functional expression of cloned human splicing factor SF2: > homology to RNA-binding proteins, U1 70K, and Drosophila splicing > regulators' > not ok 29 > # Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' > (t/04swiss.t at line 87) > # Expected: 'Cell 66 (2), 383-394 (1991)' > not ok 30 > # Test 30 got: (t/04swiss.t at line 88) > # Expected: '91309150' > not ok 31 > # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' > (t/04swiss.t at line 85 fail #2) > # Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., > Celis, J.E. and Leffers,H.' > not ok 32 > # Test 32 got: 'Functional expression of cloned human splicing factor SF2: > homology to RNA-binding proteins, U1 70K, and Drosophila splicing > regulators' (t/04swiss.t at line 86 fail #2) > # Expected: 'Cloning and expression of a cDNA covering the complete > coding region of the P32 subunit of human pre-mRNA splicing factor SF2' > not ok 33 > # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail > #2) > # Expected: 'Gene 134 (2), 283-287 (1993)' > not ok 34 > # Test 34 got: (t/04swiss.t at line 88 fail #2) > # Expected: '94085792' > ok 35 > ok 36 > ok 37 > not ok 38 > # Test 38 got: (t/04swiss.t at line 88 fail #3) > # Expected: '94253723' > not ok 39 > # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., > Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4) > # Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.' 
> not ok 40 > # Test 40 got: 'Cloning and expression of a cDNA covering the complete > coding region of the P32 subunit of human pre-mRNA splicing factor SF2' > (t/04swiss.t at line 86 fail #4) > # Expected: 'Crystal structure of human p32, a doughnut-shaped acidic > mitochondrial matrix protein' > not ok 41 > # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail > #4) > # Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' > not ok 42 > # Test 42 got: (t/04swiss.t at line 88 fail #4) > # Expected: '99199225' > ============================== > > On 10/20/06, *Chris Fields* < cjfields at uiuc.edu> wrote: > > > > Seth, > > Did you work out the problem here? There was a recent CVS update to OBDA > tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests > apparently left data from tests in the database, which caused problems > with > repeated test runs. > > Chris > > > > -----Original Message----- > > > From: bioperl-l-bounces lists.open-bio.org [mailto: bioperl-l- > > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > > Sent: Saturday, September 30, 2006 6:35 PM > > > To: Hilmar Lapp > > > Cc: Chris Fields; Bioperl List > > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > > > Here're complete test details: > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > ... 
> > > > > FAILED tests 10-12 > > > Failed 3/12 tests, 75.00% okay > > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > > > > -------------------------------------------------------------------------- > > > ----- > > > t\02species.t 65 2 3.08% 63 65 > > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > > t\16obda.t 12 3 25.00% 10-12 > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > -- Best Regards, Seth Johnson Senior Bioinformatics Associate Ph: (202) 470-0900 Fx: (775) 251-0358 From chhalling at alumni.ls.berkeley.edu Mon Oct 23 21:02:24 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Mon, 23 Oct 2006 21:02:24 -0400 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C6509.90005@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> Message-ID: <453D6620.5020401@alumni.ls.berkeley.edu> Sorry, I should know better about giving all the details. This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a fresh compile) with Mac OS X 10.4.8. -- Conrad Nathan S. Haigh wrote: > Chris Fields wrote: > >> Thanks for letting us know! Did PPM4 throw errors or just silently >> pass them over? >> >> Chris >> >> On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote: >> >> >> > I believe he is talking about the bundle on cpan and not the ppd. I will > get this updated as soon as possible. > > Sendu/Chris - can you confirm to me which Bioperl modules are essential > to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any > reason for not putting *all* dependencies into the bundle? 
> > Nath > > > > > > -- Conrad Halling chhalling at alumni.ls.berkeley.edu From n.haigh at sheffield.ac.uk Tue Oct 24 03:05:53 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 24 Oct 2006 08:05:53 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453D6620.5020401@alumni.ls.berkeley.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453D6620.5020401@alumni.ls.berkeley.edu> Message-ID: <453DBB51.6010505@sheffield.ac.uk> Conrad Halling wrote: > Sorry, I should know better about giving all the details. > > This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a > fresh compile) with Mac OS X 10.4.8. > > -- Conrad > > My apologies Conrad, this was my bad! Are you in need of the corrections being made swiftly or can you wait until the Bioperl 1.5.2 release when I'll ensure the Bundle is updated correctly for that release? Cheers Nath From n.haigh at sheffield.ac.uk Tue Oct 24 05:57:25 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Tue, 24 Oct 2006 10:57:25 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CE2D7.5080608@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> Message-ID: <453DE385.8010700@sheffield.ac.uk> --snip-- > Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be > treated higher than 1.4? Anyway, we can cross that bridge when we get > there, but this seems appropriate now. > > > Cheers, > Sendu. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l >

I've just been having a think about this versioning. Does it work well, and is it intuitive, for versioning the official 1.5.2 developer release and also the 1.6 stable release? I'd like to put forward the following versioning scheme for consideration (most of it is the same as what we have now, but hopefully with some clarification):

major-version . minor-version sub-version _ developer-release-version RC-version

The sub-version represents bug fixes and possibly some minor feature enhancements with no API changes. The minor-version represents some significant feature enhancements/API changes/bug fixes. The major-version represents significant rewrites of Bioperl.

For an RC of a developer release the version would have _0x (where x = the RC number)
For a non-RC of a developer release the version would have _10
For an RC of a stable release the version would have _0x (where x = the RC number)
For a non-RC of a stable release the version would not have the underscore suffix

Therefore I would see the following $VERSION being applied:

1.5.2 RC1 = 1.52_01
1.5.2 RC2 = 1.52_02
1.5.2 RC3 = 1.52_03
1.5.2 = 1.52_10
1.6 RC1 = 1.60_01
1.6 RC2 = 1.60_02
1.6 = 1.60
1.6.1 RC1 = 1.61_01
1.6.1 = 1.61

This should satisfy the requirement of CPAN for having underscores in versions to indicate a developer release, which here is a Bioperl release with an odd minor version number or any RC, whether it be of a developer release or a stable release. This should mean that we could have the RCs on CPAN, but by default, CPAN would only install the latest "non developer release" (i.e. the last package without an underscore in the version).
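The rules above can be sketched as a small function. This is only an illustration of the proposal: proposed_version is a hypothetical helper, not anything in Bioperl, and it assumes single-digit minor and sub-versions and RC numbers from 1 to 9.

```perl
use strict;
use warnings;

# Hypothetical helper: build the $VERSION string for a release under
# the proposed scheme. Pass 0 (or nothing) as the RC number for a
# final, non-RC release.
sub proposed_version {
    my ($major, $minor, $sub, $rc) = @_;
    $sub = 0 unless defined $sub;

    my $base = "$major.$minor$sub";        # 1, 5, 2 -> "1.52"

    # Any release candidate gets _0x, developer or stable alike.
    return sprintf('%s_%02d', $base, $rc) if $rc;

    # Final releases: odd minor version = developer series, gets _10;
    # even minor version = stable series, gets no suffix.
    return $minor % 2 ? "${base}_10" : $base;
}

for my $case ([1, 5, 2, 1], [1, 5, 2, 0], [1, 6, 0, 2],
              [1, 6, 0, 0], [1, 6, 1, 1], [1, 6, 1, 0]) {
    my ($major, $minor, $sub, $rc) = @$case;
    my $label = "$major.$minor.$sub" . ($rc ? " RC$rc" : '');
    printf "%-10s = %s\n", $label, proposed_version(@$case);
}
```

The loop reproduces rows of the table above, so any disagreement between the prose rules and the listed $VERSION values would show up as a mismatch.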
If we are going ahead with the new $VERSION scheme (as it currently is in HEAD), we should, for the sake of clarity, try to talk about Bioperl 1.52 instead of Bioperl 1.5.2 and make an effort to sync the documentation with regards to this. Nath From bix at sendu.me.uk Tue Oct 24 06:19:05 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 11:19:05 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453DE385.8010700@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> Message-ID: <453DE899.4030603@sendu.me.uk> Nathan Haigh wrote: > > Therefore I would see the following $VERSION being applied: > 1.5.2 RC1 = 1.52_01 > 1.5.2 RC2 = 1.52_02 > 1.5.2 RC3 = 1.52_03 > 1.5.2 = 1.52_10 > 1.6 RC1 = 1.60_01 > 1.6 RC2 = 1.60_02 > 1.6 = 1.60 > 1.6.1 RC1 = 1.61_01 > 1.6.1 = 1.61 > > This should satisfy the requirement of CPAN for having underscores in > versions to indicate a developer release, which here is a Bioperl > release with an odd minor version number or any RC whether it be of a > developer release or a stable release. This should mean that we could > have the RC's on CPAN, but by default, CPAN would only install the > latest "non developer release" (i.e. the last package without an > underscore in the version). That all sounds good to me, except I worry about potential confusion if people look manually at the things available in CPAN, see 1.60_02 and think it is more recent than 1.60 and try to install it manually. 
Since $VERSION = 1.52_10; is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, final release version should be $VERSION = 1.6010. > If we are going ahead with the new $VERSION scheme (as it currently is > in HEAD), we should, for the sake of clarity, try to talk about Bioperl > 1.52 instead of Bioperl 1.5.2 and make an effort to sync the > documentation with regards to this. I might disagree with this though. I think perl people, and perhaps unix people in general, should be used to version numbers like '1.5.2', but then getting '1.52' from the code since such a number allows simple numerical comparisons while the former does not. The former is easier to read and understand. This is just how Perl itself behaves. Most users who wouldn't expect such a behaviour aren't going to be checking the version number programatically anyway. BTW. do we have someone with a CPAN account, or should I get one? From n.haigh at sheffield.ac.uk Tue Oct 24 07:37:12 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Tue, 24 Oct 2006 12:37:12 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453DE899.4030603@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> Message-ID: <453DFAE8.5050602@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: > >> Therefore I would see the following $VERSION being applied: >> 1.5.2 RC1 = 1.52_01 >> 1.5.2 RC2 = 1.52_02 >> 1.5.2 RC3 = 1.52_03 >> 1.5.2 = 1.52_10 >> 1.6 RC1 = 1.60_01 >> 1.6 RC2 = 1.60_02 >> 1.6 = 1.60 >> 1.6.1 RC1 = 
1.61_01 >> 1.6.1 = 1.61 >> >> This should satisfy the requirement of CPAN for having underscores in >> versions to indicate a developer release, which here is a Bioperl >> release with an odd minor version number or any RC whether it be of a >> developer release or a stable release. This should mean that we could >> have the RC's on CPAN, but by default, CPAN would only install the >> latest "non developer release" (i.e. the last package without an >> underscore in the version). >> > > That all sounds good to me, except I worry about potential confusion if > people look manually at the things available in CPAN, see 1.60_02 and > think it is more recent than 1.60 and try to install it manually. > > I'm not sure if this would be a problem. As far as I understand, CPAN treats packages with underscores in $VERSION as something distinctly different from the other releases (i.e. as developer releases). If you look at such a page, it is clearly evident that it is a developer release. For example, if you search on CPAN for the latest version of the CPAN module it shows 1.8802. If you go to that page: http://search.cpan.org/~andk/CPAN-1.8802/ there is also a link for the latest developer release, released 1 day after 1.8802 with a version of 1.88_57 (which would convert to 1.8857). This too appears to be later than 1.8802, but since it is dealt with as a developer release it doesn't seem to matter - CPAN will only deal with the stable (non-developer) releases by default, while the underscore versions remain available as a convenient way to get at developer releases. Although I'm thinking CPAN uses some hocus pocus with release dates too. > Since > $VERSION = 1.52_10; > is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, > final release version should be > $VERSION = 1.6010. > > > Because they are dealt with separately, I don't think this is an issue (see above).
>> If we are going ahead with the new $VERSION scheme (as it currently is >> in HEAD), we should, for the sake of clarity, try to talk about Bioperl >> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the >> documentation with regards to this. >> > > I might disagree with this though. I think perl people, and perhaps unix > people in general, should be used to version numbers like '1.5.2', but > then getting '1.52' from the code since such a number allows simple > numerical comparisons while the former does not. The former is easier to > read and understand. This is just how Perl itself behaves. > > Most users who wouldn't expect such a behaviour aren't going to be > checking the version number programatically anyway. > > > BTW. do we have someone with a CPAN account, or should I get one? > It says Ewan Birney is the author of Bioperl - I assume it must be possible to have multiple people have the permissions to update a single package. Nath From chhalling at alumni.ls.berkeley.edu Tue Oct 24 07:15:12 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Tue, 24 Oct 2006 07:15:12 -0400 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453DBB51.6010505@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453D6620.5020401@alumni.ls.berkeley.edu> <453DBB51.6010505@sheffield.ac.uk> Message-ID: <453DF5C0.3040104@alumni.ls.berkeley.edu> Nathan S. Haigh wrote: > Conrad Halling wrote: >> Sorry, I should know better about giving all the details. >> >> This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 >> (a fresh compile) with Mac OS X 10.4.8. >> >> -- Conrad > My apologies Conrad, this was my bad! Are you in need of the > corrections being made swiftly or can you wait until the Bioperl 1.5.2 > release when I'll ensure the Bundle is updated correctly for that > release? > > Cheers > Nath No, I'm fine. 
I used the cpan utility to load the three modules manually. -- Conrad -- Conrad Halling chhalling at alumni.ls.berkeley.edu From bix at sendu.me.uk Tue Oct 24 08:16:54 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 13:16:54 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453DFAE8.5050602@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> Message-ID: <453E0436.3050903@sendu.me.uk> Nathan Haigh wrote: > Sendu Bala wrote: > >> That all sounds good to me, except I worry about potential confusion if >> people look manually at the things available in CPAN, see 1.60_02 and >> think it is more recent than 1.60 and try to install it manually. > > I not sure if this would be a problem. As far as I understand, CPAN > treats these packages with underscores in $VERSION as something > distinctly different to the others releases (i.e. developer releases). > If you look at such a page, it is clearly evident that it is a > developers release. For example, if you search on CPAN for the latest > version of the CPAN module is shows 1.8802. if you go to that page: > http://search.cpan.org/~andk/CPAN-1.8802/ > There is also a link for the latest developer release, released 1 day > after 1.8802 with a version of 1.88_57 (which would convert to 1.8857). [snip] >> Since >> $VERSION = 1.52_10; >> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, >> final release version should be >> $VERSION = 1.6010. 
> > Because they are dealt with separately, I don't think this is an issue > (see above). If you don't notice the dates, or are doing numerical version number comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may not be automatic, but you can still choose to download the developer releases. Which means if we say to someone 'use Bioperl 1.6 or better' they may choose to get the latest version and think it is 1.6002 when in fact 1.60 was the more recent version. 1.6010 solves the problem, is consistent with your 1.52_10 suggestion, and doesn't cause any problems as far as I can see. >>> If we are going ahead with the new $VERSION scheme (as it currently is >>> in HEAD), we should, for the sake of clarity, try to talk about Bioperl >>> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the >>> documentation with regards to this. >>> >> I might disagree with this though. I think perl people, and perhaps unix >> people in general, should be used to version numbers like '1.5.2', but >> then getting '1.52' from the code since such a number allows simple >> numerical comparisons while the former does not. The former is easier to >> read and understand. This is just how Perl itself behaves. >> >> Most users who wouldn't expect such a behaviour aren't going to be >> checking the version number programmatically anyway. >> >> >> BTW. do we have someone with a CPAN account, or should I get one? >> > > It says Ewan Birney is the author of Bioperl - I assume it must be > possible to have multiple people have the permissions to update a single > package. How did you get Bundle::BioPerl updated? Did you just ask Chris Dagdigian to do it for you? Or do you have access to his account? I'll ask Ewan about it.
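The ordering problem described above is easy to reproduce. This is a minimal sketch that assumes versions are compared as plain numbers once the underscore is dropped; as_number is a hypothetical helper, not a CPAN or Bioperl function.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the underscore and treat the rest as a
# number, which matches how Perl reads a bare literal such as 1.60_02.
sub as_number {
    my ($version) = @_;
    (my $digits = $version) =~ tr/_//d;
    return $digits + 0;
}

my $rc2      = as_number('1.60_02');   # release candidate 2
my $plain    = as_number('1.60');      # final release with no suffix
my $suffixed = as_number('1.6010');    # final release with 10 appended

print "1.60_02 compares greater than 1.60\n"   if $rc2 > $plain;
print "1.6010 compares greater than 1.60_02\n" if $suffixed > $rc2;

# In a numeric literal the underscore is only a visual separator.
print "the literal 1.52_10 equals 1.5210\n"    if 1.52_10 == 1.5210;
```

Under that assumption a final release numbered 1.60 sorts below its own release candidates, while 1.6010 sorts above them, which is the argument being made for the 10 suffix.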
From n.haigh at sheffield.ac.uk Tue Oct 24 08:21:56 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Tue, 24 Oct 2006 13:21:56 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0436.3050903@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> <453E0436.3050903@sendu.me.uk> Message-ID: <453E0564.9030302@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: >> Sendu Bala wrote: >> >>> That all sounds good to me, except I worry about potential confusion >>> if people look manually at the things available in CPAN, see 1.60_02 >>> and think it is more recent than 1.60 and try to install it manually. >> >> I not sure if this would be a problem. As far as I understand, CPAN >> treats these packages with underscores in $VERSION as something >> distinctly different to the others releases (i.e. developer releases). >> If you look at such a page, it is clearly evident that it is a >> developers release. For example, if you search on CPAN for the latest >> version of the CPAN module is shows 1.8802. if you go to that page: >> http://search.cpan.org/~andk/CPAN-1.8802/ >> There is also a link for the latest developer release, released 1 day >> after 1.8802 with a version of 1.88_57 (which would convert to 1.8857). > > [snip] > >>> Since >>> $VERSION = 1.52_10; >>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before >>> release, final release version should be >>> $VERSION = 1.6010. 
>> >> Because they are dealt with separately, I don't think this is an issue >> (see above). > > If you don't notice the dates, or are doing numerical version number > comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may > not be automatic, but you can still chose to download the developer > releases. Which means if we say to someone 'use Bioperl 1.6 or better' > they may choose to get the latest version and think it is 1.6002 when > infact 1.60 was the more recent version. 1.6010 solves the problem, is > consistent with your 1.50_10 suggestion, and doesn't cause any > problems as far as I can see. > > I see - you mean for a non-RC release append 10 to the version number and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to the version. --snip-- > > How did you get Bundle::BioPerl updated? Did you just ask Chris > Dagdigian to do it for you? Or do you have access to his account? I'll > ask Ewan about it. I just asked Chris D. to do it for me :o) Nath From bix at sendu.me.uk Tue Oct 24 09:01:22 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 14:01:22 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0564.9030302@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> <453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk> Message-ID: <453E0EA2.6050306@sendu.me.uk> Nathan Haigh wrote: > I see - you mean for a non-RC release append 10 to the version number > and if it's a 
developer release i.e. 1.5, 1.7 series, you append _10 to > the version.

Precisely.

1.5.2 RC3 will have in Bio::Root::Version :

$VERSION = 1.52_03;
$VERSION = eval $VERSION; # $VERSION is 1.5203

1.5.2 final release would have:

$VERSION = 1.52_10;
$VERSION = eval $VERSION; # $VERSION is 1.5210

1.6.0 RC1 would have:

$VERSION = 1.60_01;
$VERSION = eval $VERSION; # $VERSION is 1.6001

1.6.0 final release would have:

$VERSION = 1.6010;

Nice thing about putting RCs up on CPAN is that I suppose we'd see the test results from cpantesters. The more test results the better :)

From n.haigh at sheffield.ac.uk Tue Oct 24 09:05:54 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Tue, 24 Oct 2006 14:05:54 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0EA2.6050306@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> <453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk> <453E0EA2.6050306@sendu.me.uk> Message-ID: <453E0FB2.4080002@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: >> I see - you mean for a non-RC release append 10 to the version number >> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to >> the version. > > Precisely.
> > 1.5.2 RC3 will have in Bio::Root::Version : > > $VERSION = 1.52_03; > $VERSION = eval $VERSION; # $VERSION is 1.5203 > > 1.5.2 final release would have: > > $VERSION = 1.52_10; > $VERSION = eval $VERSION; # $VERSION is 1.5210 > > 1.6.0 RC1 would have: > > $VERSION = 1.60_01; > $VERSION = eval $VERSION; # $VERSION is 1.6001 > > 1.6.0 final release would have: > > $VERSION = 1.6010; > > > Nice thing about putting RCs up on CPAN is that I suppose we'd see the > test results from cpantesters. The more test results the better :) Did you see the cpants site I sent earlier: http://cpants.perl.org/dist/bioperl But I'm not sure why 1.4 didn't make it in there instead of 1.2.3 From bix at sendu.me.uk Tue Oct 24 09:14:08 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 14:14:08 +0100 Subject: [Bioperl-l] CPAN testing Service In-Reply-To: <453D2120.9010301@sheffield.ac.uk> References: <453D2120.9010301@sheffield.ac.uk> Message-ID: <453E11A0.20304@sendu.me.uk> Nathan S. Haigh wrote: > We should also check the CPAN testing service (CPANTS) to see how "good" > our package is for CPAN and try to increase the Kwalitee score. There > only appears to be details for bioperl-1.2.3 for some reason: > http://cpants.perl.org/dist/bioperl Yes, but I think it will be pretty similar score this time round. We'll resolve the remaining issues for 1.6. From cjfields at uiuc.edu Tue Oct 24 10:24:44 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 24 Oct 2006 09:24:44 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0436.3050903@sendu.me.uk> Message-ID: <000501c6f778$279cee10$15327e82@pyrimidine> ... > >> Since > >> $VERSION = 1.52_10; > >> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, > >> final release version should be > >> $VERSION = 1.6010. > > > > Because they are dealt with separately, I don't think this is an issue > > (see above). 
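For reference, a minimal sketch of the underscore-and-eval idiom quoted above. The package name My::Version::Demo is a made-up stand-in for Bio::Root::Version, and the version is written here as a quoted string, which is the form commonly used so that the underscore stays visible in the source line. Treat it as an illustration, not as what is in HEAD.

```perl
use strict;
use warnings;

package My::Version::Demo;    # hypothetical stand-in for Bio::Root::Version

our $VERSION = '1.52_03';     # string form: the underscore marks a developer release
$VERSION = eval $VERSION;     # evaluates the literal 1.52_03, giving a plain number

package main;

print "numeric version: $My::Version::Demo::VERSION\n";
print "at least 1.52\n" if $My::Version::Demo::VERSION >= 1.52;
```

After the eval, callers can compare the version numerically without tripping over the underscore.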
> > If you don't notice the dates, or are doing numerical version number > comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may > not be automatic, but you can still choose to download the developer > releases. Which means if we say to someone 'use Bioperl 1.6 or better' > they may choose to get the latest version and think it is 1.6002 when > in fact 1.60 was the more recent version. 1.6010 solves the problem, is > consistent with your 1.50_10 suggestion, and doesn't cause any problems > as far as I can see. CPAN looks like it can handle 'x.y.z', at least for Pugs: http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/ From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': our $VERSION = 6.002013; That's also a very Perlish way to do it. And there are no developer versions of Pugs, since it is always under active development. We could try something like: our $VERSION = 1.005002_01; just to tag it as a developer release or release candidate, if that's what you want; I'm neutral on that point. I don't think it's necessary to post every RC to CPAN, though, unless you feel very strongly about it. It just seems like more hassle than it's worth, esp. since you've been releasing about one per week leading up to a final 1.5.2 (due soon). > >> I might disagree with this though. I think perl people, and perhaps unix > >> people in general, should be used to version numbers like '1.5.2', but > >> then getting '1.52' from the code since such a number allows simple > >> numerical comparisons while the former does not. The former is easier to > >> read and understand. This is just how Perl itself behaves. > >> > >> Most users who wouldn't expect such a behaviour aren't going to be > >> checking the version number programmatically anyway. > >> > >> > >> BTW. do we have someone with a CPAN account, or should I get one?
> >> > > > > It says Ewan Birney is the author of Bioperl - I assume it must be > > possible to have multiple people have the permissions to update a single > > package. As a quick response to the above, I would read 'rel. 1.5.2' as the second patched release of the second revision (here in a developer cycle) of the first major release. I would read 'rel 1.52' as the 52nd release of the major release (just can't quite make it to version 2, I guess). I don't think we can use the latter as it is just too confusing, especially since we've adopted the 'major.minor.patch' versioning quite early on. As for CPAN, I believe there is usually a person or group responsible for maintaining each distribution. As Ewan seems to be the point man, you'll have to ask him. I suppose it is possible to add more if needed > How did you get Bundle::BioPerl updated? Did you just ask Chris > Dagdigian to do it for you? Or do you have access to his account? I'll > ask Ewan about it. When I inquired about XML::Simple, I emailed Chris D. via his contact information from CPAN. He let me know that adding it would be pretty easy, so all you need to do is let him know about any errors/additions/deletions. I think his wiki page also has some contact info. Which reminds me, if anyone contacts him, could you make sure that XML::Simple is added? I can't remember if it has been. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 24 10:29:11 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 24 Oct 2006 09:29:11 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0FB2.4080002@sheffield.ac.uk> Message-ID: <000601c6f778$c639f0e0$15327e82@pyrimidine> > Sendu Bala wrote: > > Nathan Haigh wrote: > >> I see - you mean for a non-RC release append 10 to the version number > >> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to > >> the version. > > > > Precisely. 
> > > > 1.5.2 RC3 will have in Bio::Root::Version : > > > > $VERSION = 1.52_03; > > $VERSION = eval $VERSION; # $VERSION is 1.5203 > > > > 1.5.2 final release would have: > > > > $VERSION = 1.52_10; > > $VERSION = eval $VERSION; # $VERSION is 1.5210 > > > > 1.6.0 RC1 would have: > > > > $VERSION = 1.60_01; > > $VERSION = eval $VERSION; # $VERSION is 1.6001 > > > > 1.6.0 final release would have: > > > > $VERSION = 1.6010; > > > > > > Nice thing about putting RCs up on CPAN is that I suppose we'd see the > > test results from cpantesters. The more test results the better :) > Did you see the cpants site I sent earlier: > http://cpants.perl.org/dist/bioperl > > But I'm not sure why 1.4 didn't make it in there instead of 1.2.3 Yes, odd. Another thing to note is that CPAN also list two bugs related to bioperl 1.4. We may need to have some way of either redirecting users from there to bugzilla, or routinely checking the CPAN site. Otherwise we'll miss those. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From JK at novozymes.com Tue Oct 24 10:45:26 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 16:45:26 +0200 Subject: [Bioperl-l] Keeping references around in the objects? Message-ID: <934F95E71B6C9347A873C42AE3C196191299E011@NZT0004E.dknz.nzcorp.net> Hi All. When getting a Bio::Seq object back from a feature it would be really nice to have access to the old objects through the new object as: $featseq->feature()->parent_seq(); Would it be possible to keep the references around for (as an example) to be able to access the global information through the particular feature. Most of the annotation in the general header of a EMBL/Genbank-record also applies to the specific features. Jesper From JK at novozymes.com Tue Oct 24 10:28:22 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 16:28:22 +0200 Subject: [Bioperl-l] Subclassing Bio::Seq ? 
Extending Bio::Perl Message-ID: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> Hi. We're trying to "extend" bioperl in our own setup. We have some functions that we'd like to "always" have available on a Bio::Seq object. As an example, I'd like to have the sequence digest available on ->digest, which just returns a hex-encoded message digest of the sequence in the object. This is really convenient when trying to figure out whether we've got some computations stored in the cache for this particular sequence. Another example is that we have some fields we want to be mandatory in the objects, so adding additional checks in the constructor is necessary. Our approach has been to "subclass" Bio::Seq in a new object (Nz::Seq) and add the functionality there. This generally works fine (->translate() calls ->can_call_new() and instantiates the correct subclassed object). But the logic fails when the ->seq of a feature just instantiates a Bio::PrimarySeq without trying to get the subclassed object. So the question basically is: What is the preferred way of extending/subclassing BioPerl objects with our own methods? Jesper From bix at sendu.me.uk Tue Oct 24 11:26:19 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 16:26:19 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <000501c6f778$279cee10$15327e82@pyrimidine> References: <000501c6f778$279cee10$15327e82@pyrimidine> Message-ID: <453E309B.9090007@sendu.me.uk> Chris Fields wrote: > ... >>>> Since >>>> $VERSION = 1.52_10; >>>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, >>>> final release version should be >>>> $VERSION = 1.6010. >>> Because they are dealt with separately, I don't think this is an issue >>> (see above). >> If you don't notice the dates, or are doing numerical version number >> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may >> not be automatic, but you can still choose to download the developer >> releases.
Which means if we say to someone 'use Bioperl 1.6 or better' >> they may choose to get the latest version and think it is 1.6002 when >> infact 1.60 was the more recent version. 1.6010 solves the problem, is >> consistent with your 1.50_10 suggestion, and doesn't cause any problems >> as far as I can see. > > CPAN looks like it can handle 'x.y.z', at least for Pugs: > > http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/ 'handle'? I think it shows up as '6.2.13' simply because it was uploaded with the filename Perl6-Pugs-6.2.13.tar.gz As you point out, the code has the kind of $VERSION number we've been suggesting in this thread: > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': > > our $VERSION = 6.002013; > > That's also a very perlish-way to do it. And there are no developer > versions of Pugs, since it is always under active development. We could try > something like: > > our $VERSION = 1.005002_01; Yes, this was already like one of my suggestions (1.0502_01), but I brought up the concern that 1.05 might be < 1.4. So then we have a question: do we try and fumble a 1.4 compatible number by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no room for RC numbering, or 1.006000010 (1.6.0.10) - the first final release following some 1.006000_001 (1.6.0.01 == rc1) RCs? > just to tag it as a developer release or release candidate, if that's what > you want; I'm neutral to that point. I don't think it's necessary to post > every RC to CPAN, though, unless you feel very strongly about it. It just > seems like more hassle than it's worth, esp. since you've been releasing > about one per week leading up to a final 1.5.2 (due soon). I don't think it would be a hassle; on the contrary it would be very useful to know the CPAN distribution actually works. I'm very happy with the idea that a release candidate gets fully tested... 
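A small sketch of the three-digits-per-part convention discussed above (6.2.13 written as 6.002013). The helper name dotted_to_decimal is made up for illustration; it returns a string so that trailing zeros survive, and it deliberately includes 1.4 to show the compatibility concern.

```perl
use strict;
use warnings;

# Hypothetical helper: turn 'major.minor.patch' into the decimal form with
# three digits per part after the major number, e.g. '6.2.13' -> '6.002013'.
sub dotted_to_decimal {
    my ($dotted) = @_;
    my ( $major, @rest ) = split /\./, $dotted;
    return $major . '.' . join( '', map { sprintf '%03d', $_ } @rest );
}

for my $v (qw(6.2.13 1.5.2 1.6.0 1.4)) {
    printf "%-7s -> %s\n", $v, dotted_to_decimal($v);
}

# The concern raised in the thread: a module that already shipped with
# $VERSION = 1.4 compares as newer than a 1.6.0 numbered this way.
print "1.6.0 in the new scheme compares lower than the old 1.4\n"
    if dotted_to_decimal('1.6.0') < 1.4;
```

This is why a clean break (or removing 1.4 from CPAN) comes up as soon as the 1.006000 style is considered.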
From bix at sendu.me.uk Tue Oct 24 11:39:16 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 16:39:16 +0100 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> Message-ID: <453E33A4.5060004@sendu.me.uk> JK (Jesper Agerbo Krogh) wrote: > Hi. > > We're trying to "extend" bioperl in our own setup. We have some funtions > that we'd like to "allways" have available on a Bio::Seq-object. [snip] > So the question basically is: > What is the preferred way of extending/subclassing Bio-perl -objects > with our own methods? http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit From hlapp at gmx.net Tue Oct 24 12:24:09 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 24 Oct 2006 12:24:09 -0400 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> Message-ID: I think you've generally taken the right path, but see below. First off, object factories are used extensively already but not yet in each and every place where Bioperl creates an object internally. Achieving your goal may entail fixes to Bioperl to use a factory instead of a hard-coded module name. Also be on the lookout for factory() or seq_factory() methods for classes whose work entails creating sequence objects and that already give you control over the type to be created. The problem that hits you here though isn't one of determining the type of the object to be created, because the respective method doesn't create a sequence object. It only returns the sequence object that the feature has a reference to. 
The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your extension of the latter is that the Perl garbage collector can't deal with circular references. The way we've circumvented the problem with sequence (who hold references to their feature objects) and feature objects (who need to hold a reference to their sequence object) is to make Bio::Seq a wrapper around Bio::PrimarySeq (i.e., Bio::Seq implements Bio::PrimarySeqI by delegating all the Bio::PrimarySeqI methods to an instance of Bio::PrimarySeq, and then adds implementations of the Bio::SeqI methods), and then make feature objects only hold a reference to the 'base' Bio::PrimarySeq instance. This works because Bio::PrimarySeq doesn't hold features, only Bio::SeqI objects do. Having said all that, note that if all what you want to do is defining computations on Bio::Seq objects, as opposed to storing values for additional attributes, the best design approach is not to extend the class but to create a class with those computations as static methods (which would accept the seq object on which to compute as an argument; e.g., print $seqComputations->message_digest($seq)). -hlmar On Oct 24, 2006, at 10:28 AM, JK ((Jesper Agerbo Krogh)) wrote: > Hi. > > We're trying to "extend" bioperl in our own setup. We have some > funtions > > that we'd like to "allways" have available on a Bio::Seq-object. As an > example, > I'd like to have the sequence-digest available on ->digest that just > returns > A hex-encoded message-digest of the sequence in the object. This is > really comfortable > when trying to figure out wether we've got some computations stored in > the cache > for this particular sequence. > > Another example is that we have some fields we want to be mandatory in > the objects, > thus adding additional checks in the constructor is nessesary. > > Our approach has been to "subclass" Bio::Seq in a new object: > (Nz::Seq) > and add > the functionality there. 
This generally works fine (->translate() > calls > ->can_call_new() > and instantiates the correct subclassed object. > > But the logic fails when the ->seq of a feature just instantiates a > Bio::PrimarySeq > without trying to get the subclassed object. > > So the question basically is: > What is the preferred way of extending/subclassing Bio-perl -objects > with > our own methods? > > Jesper > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 24 12:45:25 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 24 Oct 2006 11:45:25 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E309B.9090007@sendu.me.uk> Message-ID: <000001c6f78b$d1c65a30$15327e82@pyrimidine> ... > > 'handle'? I think it shows up as '6.2.13' simply because it was uploaded > with the filename Perl6-Pugs-6.2.13.tar.gz Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is '6.002013'. So maybe we should follow a similar convention. Seems easier and less confusing to me, at least. > As you point out, the code has the kind of $VERSION number we've been > suggesting in this thread: > > > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': > > > > our $VERSION = 6.002013; > > > > That's also a very perlish-way to do it. And there are no developer > > versions of Pugs, since it is always under active development. We could > try > > something like: > > > > our $VERSION = 1.005002_01; > > Yes, this was already like one of my suggestions (1.0502_01), but I > brought up the concern that 1.05 might be < 1.4. 
> > So then we have a question: do we try and fumble a 1.4 compatible number > by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if > it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no > room for RC numbering, or 1.006000010 (1.6.0.10) - the first final > release following some 1.006000_001 (1.6.0.01 == rc1) RCs? I would go for the clean break if it follows perl/CPAN convention. '1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing. If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6 RC1, 1.6 RC2 etc then that would be consistent and perl-compatible. BTW, the reason I looked at Pugs was to see what some of the Perl6 developers were using. Who knows; they'll probably change it! ... > I don't think it would be a hassle; on the contrary it would be very > useful to know the CPAN distribution actually works. I'm very happy with > the idea that a release candidate gets fully tested... So you obviously feel strongly about it! ;> I don't have a problem as long as we stick with doing this from now on (i.e. have a consistent versioning scheme, release policy, CPAN release policy, etc). Would be nice for Jason/Brian/Hilmar to chime in as to the reasoning behind the older versioning scheme. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From JK at novozymes.com Tue Oct 24 13:59:10 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 19:59:10 +0200 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n et> Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net> > > I think you've generally taken the right path, but see below. > > First off, object factories are used extensively already but not yet > in each and every place where Bioperl creates an object internally. 
> Achieving your goal may entail fixes to Bioperl to use a factory > instead of a hard-coded module name. Also be on the lookout for > factory() or seq_factory() methods for classes whose work entails > creating sequence objects and that already give you control over the > type to be created. Can you elaborate/describe this a bit more? > The problem that hits you here though isn't one of determining the > type of the object to be created, because the respective method > doesn't create a sequence object. It only returns the sequence object > that the feature has a reference to. That is what Data::Dumper told me, but another thing I would like to change is to get a RichSeq object returned every time from Bio::Seq, adding in the stuff that always seems appropriate. > The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your > extension of the latter is that the Perl garbage collector can't deal > with circular references. Doesn't Scalar::Util::weaken solve that? > Having said all that, note that if all what you want to do is > defining computations on Bio::Seq objects, as opposed to storing > values for additional attributes, the best design approach is not to > extend the class but to create a class with those computations as > static methods (which would accept the seq object on which to compute > as an argument; e.g., print $seqComputations->message_digest($seq)). I could, but there is some functionality that, by design, I would like to have available on every sequence in the system. This way I would end up coding the functionality for getting the message_digest in every place where I need the value (which would be quite often in this application), whereas by design it belongs in Bio::Seq. Jesper From JK at novozymes.com Tue Oct 24 13:59:19 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 19:59:19 +0200 Subject: [Bioperl-l] Subclassing Bio::Seq ?
Extending Bio::Perl References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> <453E33A4.5060004@sendu.me.uk> Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FD@NZT0004E.dknz.nzcorp.net> > JK (Jesper Agerbo Krogh) wrote: > > Hi. > > > > We're trying to "extend" bioperl in our own setup. We have some functions > > that we'd like to "always" have available on a Bio::Seq-object. > [snip] > > So the question basically is: > > What is the preferred way of extending/subclassing Bio-perl -objects > > with our own methods? > > http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit That is definitely a way of extending BioPerl, thanks. Jesper From hlapp at gmx.net Tue Oct 24 14:57:02 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 24 Oct 2006 14:57:02 -0400 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net> References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net> Message-ID: On Oct 24, 2006, at 1:59 PM, JK ((Jesper Agerbo Krogh)) wrote: >> >> I think you've generally taken the right path, but see below. >> >> First off, object factories are used extensively already but not yet >> in each and every place where Bioperl creates an object internally. >> Achieving your goal may entail fixes to Bioperl to use a factory >> instead of a hard-coded module name. Also be on the lookout for >> factory() or seq_factory() methods for classes whose work entails >> creating sequence objects and that already give you control over the >> type to be created. > > Can you elaborate/describe this a bit more? See for example the POD of Bio::SeqIO (sorry, the method is called sequence_factory()). > >> The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your >> extension of the latter is that the Perl garbage collector can't deal >> with circular references.
> > Doesn't Scalar::Util::weaken solve that? You're welcome to test and try. It should be a simple change in Bio::Seq::add_SeqFeature(). You will see that it is this method and not the feature object that makes sure the wrapped primarySeq gets passed as sequence reference. Just change that to creating a new reference to the sequence object and make it a weak reference before passing it to the feature object. (The feature object has no requirement (or knowledge) that the referenced sequence object is a PrimarySeq.) > >> Having said all that, note that if all what you want to do is >> defining computations on Bio::Seq objects, as opposed to storing >> values for additional attributes, the best design approach is not to >> extend the class but to create a class with those computations as >> static methods (which would accept the seq object on which to compute >> as an argument; e.g., print $seqComputations->message_digest($seq)). > > I could but there are some functionality that I'd by design would > like to > have available on every sequence in the system. This way I would > end up > coding the functionality for getting the message_digest every place > that > I needed to get the value (which would be quite often in this > application), > whereas it by design belongs into the Bio::Seq-stuff. I'm not following you why this would make any difference (it would be $seq->message_digest() compared to $seqCompute->message_digest ($seq)), unless what you are saying is that you would like to cache the result of the computation. 
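A rough sketch of the helper-class approach described above, so the two call styles can be compared. Demo::Seq and Demo::SeqCompute are made-up names; Demo::Seq is only a stand-in for any object with a seq() method (such as a Bio::Seq), and the choice of MD5 over the upper-cased residues is an assumption, not something specified in this thread.

```perl
use strict;
use warnings;
use Digest::MD5 ();

# Hypothetical stand-in for a Bio::Seq-like object: anything with a seq() method.
package Demo::Seq;

sub new {
    my ( $class, %args ) = @_;
    return bless { seq => $args{-seq} }, $class;
}

sub seq { return $_[0]{seq} }

# Hypothetical helper class: computations live here as class methods and take
# the sequence object as an argument, so no subclass of the sequence is needed.
package Demo::SeqCompute;

sub message_digest {
    my ( $class, $seq ) = @_;
    # Hex-encoded MD5 of the upper-cased sequence string (an assumed convention).
    return Digest::MD5::md5_hex( uc $seq->seq );
}

package main;

my $seq = Demo::Seq->new( -seq => 'atggcgtaa' );
print Demo::SeqCompute->message_digest($seq), "\n";
```

Because the helper only relies on seq(), it works the same whether it is handed a Bio::Seq, the Bio::PrimarySeq a feature returns, or a local subclass. Caching the digest per object would be the case where storing state on the sequence itself starts to matter.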
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Wed Oct 25 06:36:27 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 25 Oct 2006 11:36:27 +0100 Subject: [Bioperl-l] Lagan environment variable Message-ID: <453F3E2B.2040309@sendu.me.uk> Notification to say I'm changing the environmental variable that Bio::Tools::Run::Alignment::Lagan expects to define the location of the lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the default variable that the lagan installation and scripts themselves look for. I hope this isn't too much of a burden, but it seems like the sensible approach to getting Bio::Tools::Run::Alignment::Lagan to actually work. Thank you, Sendu. From n.haigh at sheffield.ac.uk Wed Oct 25 09:07:47 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 25 Oct 2006 13:07:47 +0000 Subject: [Bioperl-l] Lagan environment variable In-Reply-To: <453F3E2B.2040309@sendu.me.uk> References: <453F3E2B.2040309@sendu.me.uk> Message-ID: <453F61A3.4090904@sheffield.ac.uk> Sendu Bala wrote: > Notification to say I'm changing the environmental variable that > Bio::Tools::Run::Alignment::Lagan expects to define the location of the > lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the > default variable that the lagan installation and scripts themselves look > for. > > I hope this isn't too much of a burden, but it seems like the sensible > approach to getting Bio::Tools::Run::Alignment::Lagan to actually work. > > > Thank you, > Sendu. > Wouldn't it make more sense to change the test?
That is what I've just done for t/Genscan.t It seemed to fit in with the ENV variable syntax that other modules in Bioperl-run used. Nath -- > A: Yes. >> Q: Are you sure? >> >>> A: Because it reverses the logical flow of conversation. >>> >>>> Q: Why is top posting frowned upon? >>>> Get Thunderbird From bix at sendu.me.uk Wed Oct 25 08:12:00 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 25 Oct 2006 13:12:00 +0100 Subject: [Bioperl-l] Lagan environment variable In-Reply-To: <453F61A3.4090904@sheffield.ac.uk> References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk> Message-ID: <453F5490.7060808@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: >> Notification to say I'm changing the environmental variable that >> Bio::Tools::Run::Alignment::Lagan expects to define the location of the >> lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the >> default variable that the lagan installation and scripts themselves look >> for. >> >> I hope this isn't too much of a burden, but it seems like the sensible >> approach to getting Bio::Tools::Run::Alignment::Lagan to actually work. > > Woudn't it make more sense to change the test? That is what I've just > done for t/Genscan.t For Genscan.t, the test script looked at the wrong environment variable. Here I'm talking about lagan itself (the thing you get from http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with Bioperl) needing the environment variable LAGAN_DIR to be set in order to work. Since you need to set LAGAN_DIR to make lagan work, it makes sense that the Bioperl front-end to lagan also use the same variable. From n.haigh at sheffield.ac.uk Wed Oct 25 09:16:16 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Wed, 25 Oct 2006 13:16:16 +0000 Subject: [Bioperl-l] Lagan environment variable In-Reply-To: <453F5490.7060808@sendu.me.uk> References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk> <453F5490.7060808@sendu.me.uk> Message-ID: <453F63A0.7040609@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> Sendu Bala wrote: >>> Notification to say I'm changing the environmental variable that >>> Bio::Tools::Run::Alignment::Lagan expects to define the location of >>> the lagan executables from LAGANDIR to LAGAN_DIR, since the latter >>> is the default variable that the lagan installation and scripts >>> themselves look for. >>> >>> I hope this isn't too much of a burden, but it seems like the >>> sensible approach to getting Bio::Tools::Run::Alignment::Lagan to >>> actually work. >> >> Woudn't it make more sense to change the test? That is what I've just >> done for t/Genscan.t > > For Genscan.t, the test script looked at the wrong environment variable. > > Here I'm talking about lagan itself (the thing you get from > http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with > Bioperl) needing the environment variable LAGAN_DIR to be set in order > to work. > > Since you need to set LAGAN_DIR to make lagan work, it makes sense > that the Bioperl front-end to lagan also use the same variable. > Ah, OK! :-[ teach me for speak up about something I know nothing about! :-) FYI, I've been busy this morning installing as much Bioperl-run external software as I could (those that have tests). Will be posting results shorty. 
Nath

From massimo.ubaldi at gmail.com Wed Oct 25 10:28:52 2006
From: massimo.ubaldi at gmail.com (Massimo Ubaldi)
Date: Wed, 25 Oct 2006 16:28:52 +0200
Subject: [Bioperl-l] blastxml format
Message-ID: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>

Hi,

I'm using the script below to parse the output of a blastn search run with
multiple query sequences. I got the output from the BLAST web interface,
asking for XML-formatted output.
Everything works fine except that I cannot print the name of each input
sequence (see below). That is, with $result->query_description I get only
the name of the first sequence. In fact, this is defined by the
<BlastOutput_query-def> tag. What I really want is to extract the name that
is defined by the <Iteration_query-def> tag.
I searched the bioperl mailing list and other sources but did not find
anything that solves this.
Can somebody help me?
Thanks a lot,
Massimo


This is an example of the output I got:

MRDNA_probe
46.1    PREDICTED: Danio rerio similar to mineralocorticoid receptor form B (LOC562171), mRNA    68354945    XM_685568
81.8    Danio rerio VDR-B mRNA, partial cds    68132043    DQ017633
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA    68420187    XM_684078

This is what I'd like to get:

MRDNA_probe
46.1    PREDICTED: Danio rerio similar to mineralocorticoid receptor form B (LOC562171), mRNA    68354945    XM_685568
VDRacterm_probe
81.8    Danio rerio VDR-B mRNA, partial cds    68132043    DQ017633
ARalpcterm_probe
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA    68420187    XM_684078

This is the script:

#!/usr/bin/perl
use strict;
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast',
                           -file   => 'Blastn_danio.bls');
open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, stopped";
my $result = $in->next_result;
print OUTFILE $result->algorithm, "\n";
print OUTFILE $result->database_name, "\n";

print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers",
    "\t", "GenBank Accession", "\n";

while($result = $in->next_result ) {
    print OUTFILE $result->query_description, "\n";
    while( my $hit = $result->next_hit ) {
        while( my $hsp = $hit->next_hsp ) {

            my $acc=$hit->name;
            my $description= $hit->description;

            $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/;

            print OUTFILE
                $hit->raw_score, "\t",      # Score
                $hit->description, "\t",    # Description
                $1, "\t", $2, "\n";
        }
    }
}

From cjfields at uiuc.edu Wed Oct 25 11:04:14 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Oct 2006 10:04:14 -0500
Subject: [Bioperl-l] blastxml format
In-Reply-To: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>
Message-ID: <000301c6f846$d6227760$15327e82@pyrimidine>

Iterations (which are related to PSIBLAST) aren't currently handled in
blastxml, which is why the <Iteration_query-def> tag isn't being parsed.
I'll give it a look, but I don't think it will be properly fixed anytime
soon, since we're gearing up for a developer release and are sorting out
various bugs in relation to that.

In the meantime, you could always try changing the relevant tag in the
%MAPPING hash in your local copy of Bio::SearchIO::blastxml from
'BlastOutput_query-def' to 'Iteration_query-def', which may do the trick for
you. I'm a bit reluctant to change this in CVS, as it would be better to add
this in when iterations are handled properly by blastxml, and I'm not sure
all BLAST XML varieties have the <Iteration_query-def> tag.

If you want, you can add this to the bioperl bugzilla as an enhancement
request to remind us:

http://bugzilla.open-bio.org/

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Massimo Ubaldi
> Sent: Wednesday, October 25, 2006 9:29 AM
> To: bioperl-l List
> Subject: [Bioperl-l] blastxml format
>
> Hi
> I'm using the script below to parse a blastn output to multiple sequences
> I got the output from the blast web interface asking for xml formatted
> output.
> Everything work fine except that I cannot print the name of each input > sequence (see below). > That is, using the line (see below) $result->query_description I got just > the name of the first sequence. Infact this is defined by the > tag. > What I really want is to extract the name that is defined by the > tag. > Now I digged out the bioperl mailing list and other sources but I did not > find anything to solve this. > Can somebody help me? > Thanks alot > Massimo > > > This is an example of ouput I got > > MRDNA_probe > 46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor form > B > (LOC562171), mRNA 68354945 XM_685568 > 81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633 > PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA > 68420187 XM_684078 > > This what I'd like to get > MRDNA_probe > 46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor form > B > (LOC562171), mRNA 68354945 XM_685568 > VDRacterm_probe > 81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633 > ARalpcterm_probe > PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA > 68420187 XM_684078 > > This is the script > #!/usr/bin/perl > use strict; > use Bio::SearchIO; > my $in = new Bio::SearchIO(-format => 'blast', > -file => 'Blastn_danio.bls'); > open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, > stopped"; > my $result = $in->next_result; > print OUTFILE $result->algorithm, "\n"; > print OUTFILE $result->database_name, "\n"; > > print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers", > "\t", "GenBank Accession", "\n"; > > while($result = $in->next_result ) { > print OUTFILE $result->query_description, "\n"; > while( my $hit = $result->next_hit ) { > while( my $hsp = $hit->next_hsp ) { > > my $acc=$hit->name; > my $description= $hit->description; > > $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/; > > print OUTFILE > > $hit->raw_score, "\t", # Score > $hit->description, "\t", # Description > > $1, "\t", 
$2, "\n"; > } > } > } > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From massimo.ubaldi at gmail.com Wed Oct 25 11:20:49 2006 From: massimo.ubaldi at gmail.com (Massimo Ubaldi) Date: Wed, 25 Oct 2006 17:20:49 +0200 Subject: [Bioperl-l] blastxml format In-Reply-To: <000301c6f846$d6227760$15327e82@pyrimidine> References: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com> <000301c6f846$d6227760$15327e82@pyrimidine> Message-ID: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com> Thanks for the reply. I've already tried this but I got exactly the same results as before. What other can I try? Massimo On 10/25/06, Chris Fields wrote: > > Iterations (which are related to PSIBLAST) aren't currently handled in > blastxml, which is why the tag isn't being parsed. I'll give it a look > but > I don't think it will be properly fixed anytime soon, since we're gearing > up > for a developer release and are sorting out various bugs in relation to > that. > > In the meantime, you could always try changing the relevant tag in the > %MAPPING hash in your local copy of Bio::SearchIO::blastxml from > 'BlastOutput_query-def' to 'Iteration_query-def', which may do the trick > for > you. I'm a bit reluctant to change this in CVS as it would be better to > add > this in when iterations are handled properly by blastxml, and I'm not sure > all BLAST XML varieties have the tag. > > If you want you can add this to the bioperl bugzilla as an enhancement > request to remind us: > > http://bugzilla.open-bio.org/ > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. 
of Biochemistry > University of Illinois Urbana-Champaign > > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > bounces at lists.open-bio.org] On Behalf Of Massimo Ubaldi > > Sent: Wednesday, October 25, 2006 9:29 AM > > To: bioperl-l List > > Subject: [Bioperl-l] blastxml format > > > > Hi > > I'm using the script below to parse a blastn output to multiple > sequences > > I got the output from the blast web interface asking for xml formatted > > output. > > Everything work fine except that I cannot print the name of each input > > sequence (see below). > > That is, using the line (see below) $result->query_description I got > just > > the name of the first sequence. Infact this is defined by the > > tag. > > What I really want is to extract the name that is defined by the > > tag. > > Now I digged out the bioperl mailing list and other sources but I did > not > > find anything to solve this. > > Can somebody help me? > > Thanks alot > > Massimo > > > > > > This is an example of ouput I got > > > > MRDNA_probe > > 46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor > form > > B > > (LOC562171), mRNA 68354945 XM_685568 > > 81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633 > > PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA > > 68420187 XM_684078 > > > > This what I'd like to get > > MRDNA_probe > > 46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor > form > > B > > (LOC562171), mRNA 68354945 XM_685568 > > VDRacterm_probe > > 81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633 > > ARalpcterm_probe > > PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA > > 68420187 XM_684078 > > > > This is the script > > #!/usr/bin/perl > > use strict; > > use Bio::SearchIO; > > my $in = new Bio::SearchIO(-format => 'blast', > > -file => 'Blastn_danio.bls'); > > open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, > > stopped"; > > my 
$result = $in->next_result; > > print OUTFILE $result->algorithm, "\n"; > > print OUTFILE $result->database_name, "\n"; > > > > print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers", > > "\t", "GenBank Accession", "\n"; > > > > while($result = $in->next_result ) { > > print OUTFILE $result->query_description, "\n"; > > while( my $hit = $result->next_hit ) { > > while( my $hsp = $hit->next_hsp ) { > > > > my $acc=$hit->name; > > my $description= $hit->description; > > > > $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/; > > > > print OUTFILE > > > > $hit->raw_score, "\t", # Score > > $hit->description, "\t", # Description > > > > $1, "\t", $2, "\n"; > > } > > } > > } > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at uiuc.edu Wed Oct 25 12:56:46 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 25 Oct 2006 11:56:46 -0500 Subject: [Bioperl-l] blastxml format In-Reply-To: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com> Message-ID: <000001c6f856$8ee44bc0$15327e82@pyrimidine> > Thanks for the reply. I've already tried this but I got exactly the same > > results as before. > What other can I try? > Massimo If you don't mind me asking, what version of perl and Bioperl are you using, and what version of BLAST is used? I want to point out there are a number of problems with your script, now I have had a chance to look at it. 1) You have the SearchIO format set to 'blast'. It should be 'blastxml' if you are parsing XML format. 2) Every time you call next_result() you iterate through each BLAST report. 
In effect, you're doing something like this:

    my $result = $in->next_result();
    # ... do something here (in first BLAST report)
    while ($result = $in->next_result()) {   # moves on to the second BLAST report
        # more stuff here (in second BLAST report, if there is one)
    }

so the loop never visits the hits of the first report. I don't know if that's
intentional, but it's something to point out.

3) You also use raw_score(), which doesn't return a value for me (this may
be related to the bioperl version, which is why I asked above). If you use
$hit->bits() or $hit->significance() you get the bit score or the hit
E-value, respectively.

4) Also, I didn't see a difference between the two XML tags when using BLAST
2.2.15 output (WebBLAST at NCBI), which makes sense since they should
originate from the same query sequence anyway. This could be related to the
BLAST version.

Here's my version of your script, using WinXP and bioperl-live (CVS):

    use strict;
    use warnings;
    use Bio::SearchIO;

    my $file = shift @ARGV;
    my $in = Bio::SearchIO->new(-format => 'blastxml',
                                -file   => $file);
    # 'or', not '||': '||' binds to the filename string, so die would never run
    open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, stopped";
    while (my $result = $in->next_result) {
        print OUTFILE $result->algorithm, "\n";
        print OUTFILE $result->database_name, "\n";
        print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers",
            "\t", "GenBank Accession", "\n";
        print OUTFILE $result->query_description, "\n";
        while (my $hit = $result->next_hit) {
            while (my $hsp = $hit->next_hsp) {
                my $acc         = $hit->name;
                my $description = $hit->description;
                # only print when the hit name really has the gi|...|db|acc.ver form
                if ($acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/) {
                    print OUTFILE
                        $hit->bits, "\t",           # Score
                        $hit->description, "\t",    # Description
                        $1, "\t", $2, "\n";
                }
            }
        }
    }
    close OUTFILE;

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

...

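The "if" around the regex match in the script above matters because $1 and $2 keep whatever they held before when a match fails, so an unchecked match can print the previous hit's identifiers. A minimal sketch in plain Perl (no BioPerl needed); parse_gi_accession() is a hypothetical helper, and the hit names are illustrative strings built from the gi numbers and accessions in this thread, with the "ref"/"gb" tags and ".1" versions assumed.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the gi number and the accession (without its
# version suffix) out of an NCBI-style hit name. Returns an empty list when
# the name does not match, so callers never see stale $1/$2 values.
sub parse_gi_accession {
    my ($name) = @_;
    return unless defined $name;
    if ($name =~ /gi\|(\d+)\|\w+\|(\w+)\.\d+/) {
        return ($1, $2);
    }
    return;
}

# Illustrative names; the last one deliberately has no gi|...| structure.
for my $name ('gi|68354945|ref|XM_685568.1|',
              'gi|68132043|gb|DQ017633.1|',
              'local_seq_1') {
    my ($gi, $acc) = parse_gi_accession($name);
    if (defined $gi) {
        print join("\t", $gi, $acc), "\n";
    }
    else {
        print "no gi/accession in '$name'\n";
    }
}
```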
From n.haigh at sheffield.ac.uk Thu Oct 26 04:47:27 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 09:47:27 +0100
Subject: [Bioperl-l] More extensive Bioperl-run 1.5.2RC2 tests
Message-ID: <4540761F.6010904@sheffield.ac.uk>

Oops, I posted this to the Biojava list the other day by mistake!

I have recently installed some more software for which there are
bioperl-run tests, and run the test suite with the several versions of each
program that I could find. I've added info to
http://www.bioperl.org/wiki/Release_1.5.2#bioperl-run. Where any of the
versions I tested had failures, I've noted them together with the versions
that were OK (if any).

There may be another 6 or so programs I'm trying to get hold of to run
further tests - I'll update when I get them.

Nath

From n.haigh at sheffield.ac.uk Thu Oct 26 05:14:07 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 10:14:07 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
Message-ID: <45407C5F.40104@sheffield.ac.uk>

I'm thinking that it's not wise to test for things like
overall_percentage_identity etc. in alignments that are generated by
external software like T-Coffee, Clustalw etc. Changes to software
algorithms/efficiency, bug fixes etc. may well alter the quality of the
alignment produced in different versions and thus affect the value
returned by such methods. Therefore, I think these methods should only
be tested on alignments loaded directly from t/data.

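One way to read the suggestion above: keep the alignment itself fixed, so the expected number cannot drift with the aligner version. A minimal sketch in plain Perl with Test::More follows. The three-sequence alignment and the percent_identical_columns() helper are made up for illustration; the helper is a simplified stand-in for, not a reimplementation of, overall_percentage_identity, and in a real test the rows would come from a file under t/data.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Made-up, already-aligned sequences standing in for a file under t/data.
my %aln = (
    seq1 => 'MKT-LLA',
    seq2 => 'MKTALLA',
    seq3 => 'MRT-LLA',
);

# Hypothetical helper: percentage of columns in which every sequence
# carries the same non-gap residue.
sub percent_identical_columns {
    my @rows = @_;
    my $len  = length $rows[0];
    my $same = 0;
    for my $i (0 .. $len - 1) {
        my %seen;
        $seen{ substr($_, $i, 1) } = 1 for @rows;
        my ($only) = keys %seen;
        $same++ if keys(%seen) == 1 && $only ne '-';
    }
    return 100 * $same / $len;
}

my @rows = map { $aln{$_} } sort keys %aln;

# Structural check: every row has the same aligned length.
is(scalar(grep { length($_) == length($rows[0]) } @rows), 3,
   'all rows have the same aligned length');

# Value check: safe to hard-code because the input alignment never changes.
is(sprintf('%.1f', percent_identical_columns(@rows)), '71.4',
   'identity is stable for a fixed alignment');
```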
Nath From bix at sendu.me.uk Thu Oct 26 05:48:37 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 26 Oct 2006 10:48:37 +0100 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45407C5F.40104@sheffield.ac.uk> References: <45407C5F.40104@sheffield.ac.uk> Message-ID: <45408475.30903@sendu.me.uk> Nathan Haigh wrote: > I'm thinking that it's not wise to test for things like > overall_percentage_identity etc in alignments that are generated by > external software like T-Coffee, Clustalw etc. Changes to software > algorithms/efficiency, bug fixes etc may well alter the quality of the > alignment produced in different versions and thus affect the value > returned by such methods. Therefore, I think these methods should only > be tested from alignments loaded directly from t/data. Did you discover some specific problem cases? From n.haigh at sheffield.ac.uk Thu Oct 26 06:04:54 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 26 Oct 2006 11:04:54 +0100 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45408475.30903@sendu.me.uk> References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> Message-ID: <45408846.1050001@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: >> I'm thinking that it's not wise to test for things like >> overall_percentage_identity etc in alignments that are generated by >> external software like T-Coffee, Clustalw etc. Changes to software >> algorithms/efficiency, bug fixes etc may well alter the quality of the >> alignment produced in different versions and thus affect the value >> returned by such methods. Therefore, I think these methods should only >> be tested from alignments loaded directly from t/data. > > Did you discover some specific problem cases? My messages seem to be taking a while to come through, but, yes. 
It may be due to the software changing default parameters, but it makes testing the output for specific details pretty difficult and inconsistent. For example, running T-Coffee, the following command from t/TCoffee.t results in slightly different alignment: $aln = $factory->run('-type' => 'profile', '-profile' => $aln1, '-seq' => Bio::Root::IO->catfile("t","data","cysprot1b.fa")); Of particular note, is the gaps on the last line of the sequences. In 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in CATH_RAT/1-333 ------mwtalpllcagawllsagat----------aeltvnaiek------------fh ftswmkqhqktyss-reyshrlqvfannwrkiqahn----qrnhtfkmglnqfsdmsfae ikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqgacgscwtfs ttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqafeyilynk gimgedsypyigkngqckfnpekavafvknvv-nitlndeaamveavalynpvsfafevt -edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivknswgsnwgnn gyfliergk-nm---cglaacasypipqv >CATL_HUMAN/1-333 --------------------------------mnptlilaafclgiasatltfdhsleaq wtkwkamhnrlygmnee-gwrravweknmkmielhnqeyregkhsftmamnafgdmtsee frqvmngfqnrkpr----kgkvfqeplfyeaprsvdwrekg-yvtpvknqgqcgscwafs atgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdyafqyvqdng gldseesypyeateesckynpkysvandtgfv-dip-kqekalmkavatvgpisvaidag hesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvknswgeewgmg gyvkmakdrrnh---cgiasaasyptv-- >CATL_RAT/1-334 --------------------------------mtpllllavlclgtalatpkfdqtfnaq whqwksthrrlygtnee-ewrravweknmrmiqlhngeysngkhgftmemnafgdmtnee frqivngyrhqkhk----kgrlfqeplmlqipktvdwrekg-cvtpvknqgqcgscwafs asgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfafqyikeng gldseesypyeakdgsckyraeyavandtgfv-dip-qqekalmkavatvgpisvamdas hpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvknswgkewgmd gyikiakdrnnh---cglataasypivn- >PAPA_CARPA/1-345 mamipsiskllfvaiclfvymglsfg-------------dfsivgysqndltsterliql feswmlkhnkiyknidekiyrfeifkdnlkyidetn----kknnsywlglnvfadmsnde fkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgscgscwafs 
avvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsalqlvaqy- gihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysian-qpvsvvleaa gkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yiliknswgtgwgen gyirikrgtgnsygvcglytssfypvkn- >ALEU_HORVU/1-362 maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtrhalr farfavrygksyesaaevrrrfrifsesleevrstn----rkglpyrlginrfsdmswee fqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqahcgscwtfs ttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqafeyikyng gidteesypykgvngvchykaenaavqvldsv-nitlnaedelknavglvrpvsvafqvi -dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywliknswgadwgdn gyfkmemgk-nm---caiatcasypvvaa >CATH_HUMAN/1-335 ------mwatlpllcagawllg--------vpvcgaaelsvnslek------------fh fkswmskhrktys-teeyhhrlqtfasnwrkinahn----ngnhtfkmalnqfsdmsfae ikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqgacgscwtfs ttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqafeyilynk gimgedtypyqgkdgyckfqpgkaigfvkdva-nitiydeeamveavalynpvsfafevt -qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivknswgpqwgmn gyfliergk-nm---cglaacasypiplv >CYS1_DICDI/1-343 -----mkvillfvlavftvfvs---------------srgippeeq------------sq flefqdkfnkkys-heeylerfeifksnlgkieelnliainhkadtkfgvnkfadlssde fknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgqcgscwsfs ttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpnaynyiikng giqtessypytaetgtqcnfnsanigakisnf-tmipknetvmagyivstgplaiaadav -e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivknswgadwgeq gyiylrrgk-nt---cgvsnfvstsii-- While T-Coffee <4.45 returned: >CATH_RAT/1-333 ----------mwtalpllcagawllsagat----------aeltvnaiek---------- --fhftswmkqhqktyss-reyshrlqvfannwrkiqahn----q----rnhtfkmglnq fsdmsfaeikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqga cgscwtfsttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqa feyilynkgimgedsypyigkngqckfnpekavafvknvvn-itlndeaamveavalynp vsfafevt-edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivkns wgsnwgnngyfliergkn----mcglaacasypipqv >PAPA_CARPA/1-345 mamipsiskllfvaiclfvymglsfgdfsivgysqndltsterliqlfeswml------- 
-------------khnkiyknidekiyrf-----eifkdnlkyidetnkknnsywlglnv fadmsndefkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgs cgscwafsavvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsa lq-lvaqygihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysia-nqp vsvvleaagkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yilikns wgtgwgengyirikrgtgnsygvcglytssfypvkn- >CATL_HUMAN/1-333 -----------------------------------------mnptlilaafclgiasatl tfdhsleaqwtkwkamhnrlygmneegwrravweknmkmielhnqeyregkhsftmamna fgdmtseefrqvmngfqnrkprkgkvfqeplf----yeaprsvdwrekg-yvtpvknqgq cgscwafsatgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdya fqyvqdnggldseesypyeateesckynpkysvandtgfvd--ipkqekalmkavatvgp isvaidaghesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvkns wgeewgmggyvkmakdrrnh---cgiasaasyptv-- >CATL_RAT/1-334 -----------------------------------------mtpllllavlclgtalatp kfdqtfnaqwhqwksthrrlygtneeewrravweknmrmiqlhngeysngkhgftmemna fgdmtneefrqivngyrhqkhkkgrlfqeplm----lqipktvdwrekg-cvtpvknqgq cgscwafsasgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfa fqyikenggldseesypyeakdgsckyraeyavandtgfvd--ipqqekalmkavatvgp isvamdashpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvkns wgkewgmdgyikiakdrnnh---cglataasypivn- >ALEU_HORVU/1-362 ----maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtr halrfarfavrygksyesaaevrrrfrifsesleevrstn----r----kglpyrlginr fsdmsweefqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqah cgscwtfsttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqa feyikynggidteesypykgvngvchykaenaavqvldsvn-itlnaedelknavglvrp vsvafqvi-dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywlikns wgadwgdngyfkmemgkn----mcaiatcasypvvaa >CATH_HUMAN/1-335 ----------mwatlpllcagawllg--------vpvcgaaelsvnslek---------- --fhfkswmskhrktys-teeyhhrlqtfasnwrkinahn----n----gnhtfkmalnq fsdmsfaeikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqga cgscwtfsttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqa feyilynkgimgedtypyqgkdgyckfqpgkaigfvkdvan-itiydeeamveavalynp vsfafevt-qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivkns 
wgpqwgmngyfliergkn----mcglaacasypiplv >CYS1_DICDI/1-343 ---------mkvillfvlavftvfvs---------------srgippeeq---------- --sqflefqdkfnkkys-heeylerfeifksnlgkieelnliain----hkadtkfgvnk fadlssdefknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgq cgscwsfsttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpna ynyiiknggiqtessypytaetgtqcnfnsanigakisnft-mipknetvmagyivstgp laiaadav-e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivkns wgadwgeqgyiylrrgkn----tcgvsnfvstsii-- From sanges at biogem.it Thu Oct 26 06:26:36 2006 From: sanges at biogem.it (Remo Sanges) Date: Thu, 26 Oct 2006 11:26:36 +0100 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45408846.1050001@sheffield.ac.uk> References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> <45408846.1050001@sheffield.ac.uk> Message-ID: <45408D5C.1000305@biogem.it> Nathan Haigh wrote: > Sendu Bala wrote: > >> Nathan Haigh wrote: >> >>> I'm thinking that it's not wise to test for things like >>> overall_percentage_identity etc in alignments that are generated by >>> external software like T-Coffee, Clustalw etc. Changes to software >>> algorithms/efficiency, bug fixes etc may well alter the quality of the >>> alignment produced in different versions and thus affect the value >>> returned by such methods. Therefore, I think these methods should only >>> be tested from alignments loaded directly from t/data. >>> >> Did you discover some specific problem cases? >> > My messages seem to be taking a while to come through, but, yes. It may > be due to the software changing default parameters, but it makes testing > the output for specific details pretty difficult and inconsistent. 
For > example, running T-Coffee, the following command from t/TCoffee.t > results in slightly different alignment: > $aln = $factory->run('-type' => 'profile', > '-profile' => $aln1, > '-seq' => > Bio::Root::IO->catfile("t","data","cysprot1b.fa")); > > Of particular note, is the gaps on the last line of the sequences. In > 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in > I'm not a T-coffee user but usually you can come across these problems when you use different scoring parameters when align sequences. Could it be possible that they have simply changed the default parameters for gap penalties and that kind of stuff? It is possible to set them? If so you can just run the test by defining the scores in the param hash without using the default. HTH Remo From n.haigh at sheffield.ac.uk Thu Oct 26 06:33:55 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 26 Oct 2006 11:33:55 +0100 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45408D5C.1000305@biogem.it> References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> <45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it> Message-ID: <45408F13.9020209@sheffield.ac.uk> Remo Sanges wrote: > Nathan Haigh wrote: >> Sendu Bala wrote: >> >>> Nathan Haigh wrote: >>> >>>> I'm thinking that it's not wise to test for things like >>>> overall_percentage_identity etc in alignments that are generated by >>>> external software like T-Coffee, Clustalw etc. Changes to software >>>> algorithms/efficiency, bug fixes etc may well alter the quality of the >>>> alignment produced in different versions and thus affect the value >>>> returned by such methods. Therefore, I think these methods should only >>>> be tested from alignments loaded directly from t/data. >>>> >>> Did you discover some specific problem cases? >>> >> My messages seem to be taking a while to come through, but, yes. 
It may >> be due to the software changing default parameters, but it makes testing >> the output for specific details pretty difficult and inconsistent. For >> example, running T-Coffee, the following command from t/TCoffee.t >> results in slightly different alignment: >> $aln = $factory->run('-type' => 'profile', >> '-profile' => $aln1, >> '-seq' => >> Bio::Root::IO->catfile("t","data","cysprot1b.fa")); >> >> Of particular note, is the gaps on the last line of the sequences. In >> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in >> > > I'm not a T-coffee user but usually you can come across > these problems when you use different scoring parameters > when align sequences. > > Could it be possible that they have simply changed the > default parameters for gap penalties and that kind of > stuff? It is possible to set them? > > If so you can just run the test by defining > the scores in the param hash without using the default. > > HTH > > Remo That is true, but it depends on the whether the wrapper is complete enough to be able to set all the parameters provided by the software. Nath From n.haigh at sheffield.ac.uk Thu Oct 26 12:13:03 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 26 Oct 2006 17:13:03 +0100 Subject: [Bioperl-l] Bio::Restriction::Enzyme Message-ID: <4540DE8F.7070501@sheffield.ac.uk> I'm in the middle of writing some code that uses Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using Bioperl from HEAD. I seem to find that $enzyme->is_palindromic always seems to return true. Can anyone verify this? If needs be, I can send some code. 
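For reference, "palindromic" for a restriction site usually means that the site equals its own reverse complement. A minimal plain-Perl sketch of that check, which can be handy for sanity-checking expectations before filing a bug: is_palindromic_site() is a hypothetical helper, not the Bio::Restriction::Enzyme implementation, and it only handles unambiguous A/C/G/T sites.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the recognition site reads the same on both
# strands, i.e. it equals its own reverse complement.
sub is_palindromic_site {
    my ($site) = @_;
    my $upper   = uc $site;
    my $revcomp = scalar reverse $upper;   # reverse the string ...
    $revcomp =~ tr/ACGT/TGCA/;             # ... then complement each base
    return $upper eq $revcomp ? 1 : 0;
}

# GAATTC (EcoRI) and GGATCC (BamHI) are palindromic; GGTCTC (BsaI) is not.
for my $site (qw(GAATTC GGATCC GGTCTC)) {
    printf "%s\t%s\n", $site,
        is_palindromic_site($site) ? 'palindromic' : 'not palindromic';
}
```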
Thanks
Nathan

From bosborne11 at verizon.net Thu Oct 26 12:37:06 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 26 Oct 2006 12:37:06 -0400
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk>
Message-ID:

Nathan,

Perhaps because most restriction sites are palindromes. Anyway, I added
tests for palindromic() and is_palindromic() where the site is not a
palindrome, and these tests pass (t/RestrictionAnalysis.t).

Brian O.


On 10/26/06 12:13 PM, "Nathan Haigh" wrote:

> I'm in the middle of writing some code that uses
> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
> Bioperl from HEAD.
>
> I seem to find that $enzyme->is_palindromic always seems to return true.
> Can anyone verify this? If needs be, I can send some code.
>
> Thanks
> Nathan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From n.haigh at sheffield.ac.uk Thu Oct 26 12:49:48 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 17:49:48 +0100
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To:
References:
Message-ID: <4540E72C.5020800@sheffield.ac.uk>

Brian Osborne wrote:
> Nathan,
>
> Perhaps because most restriction sites are palindromes. Anyway, I added
> tests for palindromic() and is_palindromic() where the site is not a
> palindrome, these tests pass (t/RestrictionAnalyis.t).
>
> Brian O.
> > > On 10/26/06 12:13 PM, "Nathan Haigh" wrote: > > >> I'm in the middle of writing some code that uses >> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >> Bioperl from HEAD. >> >> I seem to find that $enzyme->is_palindromic always seems to return true. >> Can anyone verify this? If needs be, I can send some code. >> >> Thanks >> Nathan >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > Ok, thanks - nice to know :-) From cjfields at uiuc.edu Thu Oct 26 12:58:34 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 26 Oct 2006 11:58:34 -0500 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk> Message-ID: <001301c6f91f$f9611770$15327e82@pyrimidine> > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Nathan Haigh > Sent: Thursday, October 26, 2006 11:13 AM > To: Bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Bio::Restriction::Enzyme > > I'm in the middle of writing some code that uses > Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using > Bioperl from HEAD. > > I seem to find that $enzyme->is_palindromic always seems to return true. > Can anyone verify this? If needs be, I can send some code. > > Thanks > Nathan You should file a bug report if you have found a test case where this method isn't working as it should, especially if Brian's tests pass and you're still getting the wrong results. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Thu Oct 26 12:57:32 2006 From: jason at bioperl.org (Jason Stajich) Date: Thu, 26 Oct 2006 09:57:32 -0700 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45408F13.9020209@sheffield.ac.uk> References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> <45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it> <45408F13.9020209@sheffield.ac.uk> Message-ID: Nathan - I agree - the values tend to change with different versions of the applications unfortunately. It would make sense to just test that you get out sequences that are in valid alignment format and perhaps have as many ending sequences as you started with. The more restrictive tests probably aren't reliable with mixing and matching versions. One thing we do for PAML is condition tests on the version used - but of course when a new version comes out we have to add more stuff to the tests (or just have some code that skips those tests). -jason On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote: > Remo Sanges wrote: >> Nathan Haigh wrote: >>> Sendu Bala wrote: >>> >>>> Nathan Haigh wrote: >>>> >>>>> I'm thinking that it's not wise to test for things like >>>>> overall_percentage_identity etc in alignments that are >>>>> generated by >>>>> external software like T-Coffee, Clustalw etc. Changes to software >>>>> algorithms/efficiency, bug fixes etc may well alter the quality >>>>> of the >>>>> alignment produced in different versions and thus affect the value >>>>> returned by such methods. Therefore, I think these methods >>>>> should only >>>>> be tested from alignments loaded directly from t/data. >>>>> >>>> Did you discover some specific problem cases? >>>> >>> My messages seem to be taking a while to come through, but, yes. 
>>> It may >>> be due to the software changing default parameters, but it makes >>> testing >>> the output for specific details pretty difficult and >>> inconsistent. For >>> example, running T-Coffee, the following command from t/TCoffee.t >>> results in slightly different alignment: >>> $aln = $factory->run('-type' => 'profile', >>> '-profile' => $aln1, >>> '-seq' => >>> Bio::Root::IO->catfile("t","data","cysprot1b.fa")); >>> >>> Of particular note, is the gaps on the last line of the >>> sequences. In >>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in >>> >> >> I'm not a T-coffee user but usually you can come across >> these problems when you use different scoring parameters >> when align sequences. >> >> Could it be possible that they have simply changed the >> default parameters for gap penalties and that kind of >> stuff? It is possible to set them? >> >> If so you can just run the test by defining >> the scores in the param hash without using the default. >> >> HTH >> >> Remo > That is true, but it depends on the whether the wrapper is complete > enough to be able to set all the parameters provided by the software. > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From cjfields at uiuc.edu Thu Oct 26 18:01:08 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 26 Oct 2006 17:01:08 -0500 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: Message-ID: <000301c6f94a$3e2a3f10$15327e82@pyrimidine> I have been running into similar issues with EUtilities tests. 
Since the data on the server is constantly updated I have to try and future-proof the tests so they don't constantly fail. I have been using Test::More and like/unlike or cmp_ok to get around some of those 'fuzzy data' issues. If some methods consistently return a particular type of value, such as an integer, you could use: like($foo->get_value, qr{^\d+$}, 'value test'); #integer or similar. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > Nathan - > > I agree - the values tend to change with different versions of the > applications unfortunately. It would make sense to just test that > you get out sequences that are in valid alignment format and perhaps > have as many ending sequences as you started with. The more > restrictive tests probably aren't reliable with mixing and matching > versions. > > One thing we do for PAML is condition tests on the version used - but > of course when a new version comes out we have to add more stuff to > the tests (or just have some code that skips those tests). > > -jason > On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote: > > > Remo Sanges wrote: > >> Nathan Haigh wrote: > >>> Sendu Bala wrote: > >>> > >>>> Nathan Haigh wrote: > >>>> > >>>>> I'm thinking that it's not wise to test for things like > >>>>> overall_percentage_identity etc in alignments that are > >>>>> generated by > >>>>> external software like T-Coffee, Clustalw etc. Changes to software > >>>>> algorithms/efficiency, bug fixes etc may well alter the quality > >>>>> of the > >>>>> alignment produced in different versions and thus affect the value > >>>>> returned by such methods. Therefore, I think these methods > >>>>> should only > >>>>> be tested from alignments loaded directly from t/data. > >>>>> > >>>> Did you discover some specific problem cases? > >>>> > >>> My messages seem to be taking a while to come through, but, yes.
> >>> It may > >>> be due to the software changing default parameters, but it makes > >>> testing > >>> the output for specific details pretty difficult and > >>> inconsistent. For > >>> example, running T-Coffee, the following command from t/TCoffee.t > >>> results in slightly different alignment: > >>> $aln = $factory->run('-type' => 'profile', > >>> '-profile' => $aln1, > >>> '-seq' => > >>> Bio::Root::IO->catfile("t","data","cysprot1b.fa")); > >>> > >>> Of particular note, is the gaps on the last line of the > >>> sequences. In > >>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in > >>> >>> > >> I'm not a T-coffee user but usually you can come across > >> these problems when you use different scoring parameters > >> when align sequences. > >> > >> Could it be possible that they have simply changed the > >> default parameters for gap penalties and that kind of > >> stuff? It is possible to set them? > >> > >> If so you can just run the test by defining > >> the scores in the param hash without using the default. > >> > >> HTH > >> > >> Remo > > That is true, but it depends on the whether the wrapper is complete > > enough to be able to set all the parameters provided by the software. 
> > > > Nath > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich, PhD > Miller Research Fellow > University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From gbazykin at Princeton.EDU Thu Oct 26 18:49:56 2006 From: gbazykin at Princeton.EDU (Georgii A Bazykin) Date: Thu, 26 Oct 2006 18:49:56 -0400 Subject: [Bioperl-l] about PAML running within bioperl In-Reply-To: <001901c6dbcf$9af4de50$0915020a@zchou> References: <001901c6dbcf$9af4de50$0915020a@zchou> Message-ID: <185431468.20061026184956@princeton.edu> I just had the exact same problem, which was also (as in Caleb Davis's case) was solved by switching to PAML 3.14 from 3.15. ------------------------------ Tuesday, September 19, 2006, 5:40:07 AM, you wrote: > Hello, every one, > I use code in the PAML HOWTO (running PAML fom within Bioperl) on > my Linux OS. And I set ENV as described by instructions. At the > beginning, it seems that ClustalW run smoothly. However, when the > programme run to call method "get_MLmatrix", somethign happened. The > following information was listed as follows: (What reason or How to solve these problems?) > ........ > Sequences (2:3) Aligned. Score: 87 > Sequences (2:4) Aligned. Score: 88 > Sequences (2:5) Aligned. Score: 87 > Sequences (2:6) Aligned. Score: 87 > Sequences (2:7) Aligned. Score: 87 > Sequences (2:8) Aligned. Score: 87 > Sequences (3:4) Aligned. Score: 93 > Sequences (3:5) Aligned. Score: 93 > Sequences (3:6) Aligned. Score: 93 > Sequences (3:7) Aligned. Score: 92 > Sequences (3:8) Aligned. Score: 92 > Sequences (4:5) Aligned. 
Score: 99 > Sequences (4:6) Aligned. Score: 99 > Sequences (4:7) Aligned. Score: 98 > Sequences (4:8) Aligned. Score: 98 > Sequences (5:6) Aligned. Score: 100 > Sequences (5:7) Aligned. Score: 99 > Sequences (5:8) Aligned. Score: 99 > Sequences (6:7) Aligned. Score: 99 > Sequences (6:8) Aligned. Score: 99 > Sequences (7:8) Aligned. Score: 100 > Guide tree file created: > [/home/zchou/TMPDIR/8QEqLivAKY/JU833u8OTP.dnd] > Start of Multiple Alignment > There are 7 groups > Aligning... > Group 1: Sequences: 2 Score:5875 > Group 2: Sequences: 2 Score:5877 > Group 3: Sequences: 4 Score:5864 > Group 4: Sequences: 5 Score:5537 > Group 5: Sequences: 6 Score:5727 > Group 6: Sequences: 7 Score:5608 > Group 7: Sequences: 8 Score:5607 > Alignment Score 43650 > GCG-Alignment file created > [/home/zchou/TMPDIR/8QEqLivAKY/CussPD56rZ] > aligned aa sequences were: Bio::SimpleAlign=HASH(0x87b93f4) > Can't call method "get_MLmatrix" on an undefined value at > originalpaml.pl line 57, line 332. > Zhuocheng Hou > Department of Animal Genetics and Breeding > China Agricultural University From himanshu.ardawatia at bccs.uib.no Thu Oct 26 21:54:36 2006 From: himanshu.ardawatia at bccs.uib.no (Himanshu Ardawatia) Date: Fri, 27 Oct 2006 03:54:36 +0200 Subject: [Bioperl-l] Query on tree bootstrap values Message-ID: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> Hi, 2 questions : 1. I have a phylogenetic tree and I wish to set (or modify or query) bootstrap values for all internal nodes. How do I do that using BioPerl ? 2. I tried the example script attached below for general purpose for the example newick tree with bootstrap values (also attached below) and It gives strange results even for branch length. It shows Parent ID as 0.71 which actually is the bootstrap value for the last ancestral node for human and chimp and It shows the Child node ID as 'Human' ! Am I missing something in the tree formatting ? Results also attached below. 
Also how to extract / modify/ add bootstrap values in this tree ?

Thanks
Himanshu

EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
#################################
(
('Chimp' : 0.052,
'Human' : 0.042) 0.71 : 0.007,
'Gorilla' : 0.060,
('Gibbon' : 0.124,
'Orangutan' : 0.0971) 1 : 0.038
);
#################################

EXAMPLE SCRIPT:
#################################
#!/usr/bin/perl -w
use Bio::Seq;
# use Bio::TreeIO;
use Bio::Tree::TreeI;

# get a Tree::NodeI somehow
# like from a TreeIO
use Bio::TreeIO;

# read in a clustalw NJ in phylip/newick format
my $treeio = new Bio::TreeIO(-format => 'newick',
                             -file   => 'example_newick_tree.newick');
my $tree = $treeio->next_tree; # we'll assume it worked for demo purposes
                               # you might want to test that it was defined
my $rootnode = $tree->get_root_node;

# process just the next generation
foreach my $node ( $rootnode->each_Descendent() ) {
    print "branch len is ", $node->branch_length, "\n";
}

# process all the children
my $example_leaf_node;
foreach my $node ( $rootnode->get_Descendents() ) {
    if( $node->is_Leaf ) {
        print "node is a leaf ... ";
        # for example use below
        $example_leaf_node = $node unless defined $example_leaf_node;
    }
    print "branch len is ", $node->branch_length, "\n";
}

# The ancestor() method points to the parent of a node
# A node can only have one parent
my $parent = $example_leaf_node->ancestor;

# parent won't likely have an description because it is an internal node
# but child will because it is a leaf
print "Parent id: ", $parent->id," child id: ", $example_leaf_node->id, "\n";
##########################################

RESULTS:
branch len is 0.007
branch len is 0.060
branch len is 0.038
node is a leaf ... branch len is 0.042
node is a leaf ... branch len is 0.052
branch len is 0.007
node is a leaf ... branch len is 0.060
node is a leaf ... branch len is 0.0971
node is a leaf ...
branch len is 0.124 branch len is 0.038 Parent id: _0.71_ child id: ___'Human'__ From n.haigh at sheffield.ac.uk Fri Oct 27 04:42:23 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 08:42:23 +0000 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: References: Message-ID: <4541C66F.1020404@sheffield.ac.uk> Hi Brian, I wonder if i'm using is_prototype() correctly as I don't seem to get any returning true: my $enz_coll = Bio::Restriction::EnzymeCollection->new(); my $prototype = 0; foreach my $enz ($enz_coll->each_enzyme) { $prototype++ if $enz->is_prototype; } print "$prototype have unique recognition sites\n"; prints: 0 have unique recognition sites Thanks Nath Brian Osborne wrote: > Nathan, > > Perhaps because most restriction sites are palindromes. Anyway, I added > tests for palindromic() and is_palindromic() where the site is not a > palindrome, these tests pass (t/RestrictionAnalyis.t). > > Brian O. > > > On 10/26/06 12:13 PM, "Nathan Haigh" wrote: > > >> I'm in the middle of writing some code that uses >> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >> Bioperl from HEAD. >> >> I seem to find that $enzyme->is_palindromic always seems to return true. >> Can anyone verify this? If needs be, I can send some code. >> >> Thanks >> Nathan >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > -- > A: Yes. >> Q: Are you sure? >> >>> A: Because it reverses the logical flow of conversation. >>> >>>> Q: Why is top posting frowned upon? >>>> Get Thunderbird From n.haigh at sheffield.ac.uk Fri Oct 27 04:47:21 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Fri, 27 Oct 2006 08:47:21 +0000 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <001301c6f91f$f9611770$15327e82@pyrimidine> References: <001301c6f91f$f9611770$15327e82@pyrimidine> Message-ID: <4541C799.4090507@sheffield.ac.uk> Chris Fields wrote: >> -----Original Message----- >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >> bounces at lists.open-bio.org] On Behalf Of Nathan Haigh >> Sent: Thursday, October 26, 2006 11:13 AM >> To: Bioperl-l at lists.open-bio.org >> Subject: [Bioperl-l] Bio::Restriction::Enzyme >> >> I'm in the middle of writing some code that uses >> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >> Bioperl from HEAD. >> >> I seem to find that $enzyme->is_palindromic always seems to return true. >> Can anyone verify this? If needs be, I can send some code. >> >> Thanks >> Nathan >> > > You should file a bug report if you have found a test case where this method > isn't working as it should, especially if Brian's tests pass and you're > still getting the wrong results. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > I was doing some filtering of the default set of enzymes and happened to removed the 2 that are not palindromic before I used is_palindromic(). Thus, I didn't see any that were not palindromic - if that makes sense! Since I know very little about restriction enzymes, I'll trust that these are correct :-) and I'm getting the correct results. Thanks Nath From n.haigh at sheffield.ac.uk Fri Oct 27 05:04:40 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Fri, 27 Oct 2006 09:04:40 +0000 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <000301c6f94a$3e2a3f10$15327e82@pyrimidine> References: <000301c6f94a$3e2a3f10$15327e82@pyrimidine> Message-ID: <4541CBA8.10006@sheffield.ac.uk> Chris Fields wrote: > I have been running into similar issues with EUtilities tests. Since the > data on the server is constantly updated I have to try and future-proof the > tests so they don't constantly fail. > > I have been using Test::More and like/unlike or cmp_ok to get around some of > those 'fuzzy data' issues. If some methods consistently return a particular > type of value, such as an integer, you could use: > > like($foo->get_value, qr{^\d+$}, 'value test'); #integer > > or similar. > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > >> Nathan - >> >> I agree - the values tend to change with different versions of the >> applications unfortunately. It would make sense to just test that >> you get out sequences that are in valid alignment format and perhaps >> have as many ending sequences as you started with. The more >> restrictive tests probably aren't reliable with mixing and matching >> versions. >> >> One thing we do for PAML is condition tests on the version used - but >> of course when a new version comes out we have to add more stuff to >> the tests (or just have some code that skips those tests). >> >> -jason >> On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote: >> >> I think it makes sense to test that data of the expected type was returned by the external resource but not to test the specifics of what was returned. If specifics are tested we are then in the realm of testing whether we believe the data returned by the external resource or not.
We should assume that the domain experts for these resources know what they are doing - in some cases this might not be true :-) but I think we should stick to testing that the objects created hold the expected type of data. I like what Chris had to say (above) but wonder whether tests would/should be tested for in the module itself - i.e. testing that a stored value is an integer and warn/throw if not? Nath From bix at sendu.me.uk Fri Oct 27 05:08:18 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 27 Oct 2006 10:08:18 +0100 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> Message-ID: <4541CC82.2040705@sendu.me.uk> Himanshu Ardawatia wrote: > Hi, > > 2 questions : > > 1. I have a phylogenetic tree and I wish to set (or modify or query) > bootstrap values for all internal nodes. How do I do that using BioPerl ? Does bootstrap() not do what you need? > 2. I tried the example script attached below for general purpose for the > example newick tree with bootstrap values (also attached below) and It gives > strange results even for branch length. It shows Parent ID as 0.71 which > actually is the bootstrap value for the last ancestral node for human and > chimp and It shows the Child node ID as 'Human' ! Am I missing something in > the tree formatting ? Results also attached below. Also how to extract / > modify/ add bootstrap values in this tree ? [snip] > EXAMPLE TREE (Newick with bootstrap values and branch lengths) : > ################################# > ( > ('Chimp' : 0.052, > 'Human' : 0.042) 0.71 : 0.007, > 'Gorilla' : 0.060, > ('Gibbon' : 0.124, > 'Orangutan' : 0.0971) 1 : 0.038 > ); > ################################# Are you sure this is in the correct format? 
For example, with the tree: ( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, 'Gorilla':0.060, ('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038); and your script (with a print "--\n" between the two printing loops for clarity) I get... > ########################################## > > RESULTS: > branch len is 0.007 > branch len is 0.060 > branch len is 0.038 > node is a leaf ... branch len is 0.042 > node is a leaf ... branch len is 0.052 > branch len is 0.007 > node is a leaf ... branch len is 0.060 > node is a leaf ... branch len is 0.0971 > node is a leaf ... branch len is 0.124 > branch len is 0.038 > Parent id: _0.71_ child id: ___'Human'__ ... branch len is 0.007 branch len is 0.060 branch len is 0.038 -- branch len is 0.007 node is a leaf ... branch len is 0.052 node is a leaf ... branch len is 0.042 node is a leaf ... branch len is 0.060 branch len is 0.038 node is a leaf ... branch len is 0.124 node is a leaf ... branch len is 0.0971 Parent id: 'Human_Chimp_Ancestor' child id: 'Chimp' This seems reasonable to me. What were you expecting? From n.haigh at sheffield.ac.uk Fri Oct 27 07:36:10 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 11:36:10 +0000 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <4541CC82.2040705@sendu.me.uk> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> Message-ID: <4541EF2A.4050600@sheffield.ac.uk> Sendu Bala wrote: > Himanshu Ardawatia wrote: > >> Hi, >> >> 2 questions : >> >> 1. I have a phylogenetic tree and I wish to set (or modify or query) >> bootstrap values for all internal nodes. How do I do that using BioPerl ? >> > > Does bootstrap() not do what you need? > > > >> 2. I tried the example script attached below for general purpose for the >> example newick tree with bootstrap values (also attached below) and It gives >> strange results even for branch length. 
It shows Parent ID as 0.71 which >> actually is the bootstrap value for the last ancestral node for human and >> chimp and It shows the Child node ID as 'Human' ! Am I missing something in >> the tree formatting ? Results also attached below. Also how to extract / >> modify/ add bootstrap values in this tree ? >> > [snip] > >> EXAMPLE TREE (Newick with bootstrap values and branch lengths) : >> ################################# >> ( >> ('Chimp' : 0.052, >> 'Human' : 0.042) 0.71 : 0.007, >> 'Gorilla' : 0.060, >> ('Gibbon' : 0.124, >> 'Orangutan' : 0.0971) 1 : 0.038 >> ); >> ################################# >> > > Are you sure this is in the correct format? > He/she may have a tree that already contains bootstrap values output from another program. If this is so, which program did you use? I haven't reminded myself of the formats, but you should look up the Newick format and whether it is possible to store bootstraps in it. In addition, you should also look up the NHX format. > For example, with the tree: > ( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, > 'Gorilla':0.060, > ('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038); > > This tree does not contain any bootstrap values - only branch lengths. Sorry I can't be much more help at the moment - if I get a spare 10 mins I'll have a closer look. Nath From bix at sendu.me.uk Fri Oct 27 07:16:08 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 27 Oct 2006 12:16:08 +0100 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <4541EF2A.4050600@sheffield.ac.uk> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> <4541EF2A.4050600@sheffield.ac.uk> Message-ID: <4541EA78.3050404@sendu.me.uk> Nathan S.
Haigh wrote: > Sendu Bala wrote: >> Himanshu Ardawatia wrote: >>> >>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) : >>> ################################# >>> ( >>> ('Chimp' : 0.052, >>> 'Human' : 0.042) 0.71 : 0.007, >>> 'Gorilla' : 0.060, >>> ('Gibbon' : 0.124, >>> 'Orangutan' : 0.0971) 1 : 0.038 >>> ); >>> ################################# >>> >> Are you sure this is in the correct format? >> > > He/she may have a tree that already contains bootstrap values output > from another program. If this is so, which program did you use? Without > reminding myself of the formats, you should lookup newick format and > whether it is possible to store bootstraps in it. In addition you should > also look up the nhx format. Ah, well from a brief google it seemed like some software does store bootstrap values for internal nodes as the node ids when outputting in Newick format. I don't think Bioperl should be able to tell the difference between a normal id and a bootstrap value, so you'll have to detect that yourself and manually use bootstrap() when you get an id that looks like a number. Or should Bioperl be making this assumption for you? Is that a safe thing to do? Maybe as an option only? From n.haigh at sheffield.ac.uk Fri Oct 27 08:24:49 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 12:24:49 +0000 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <4541EA78.3050404@sendu.me.uk> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> <4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk> Message-ID: <4541FA91.3040505@sheffield.ac.uk> --snip-- > > Ah, well from a brief google it seemed like some software does store > bootstrap values for internal nodes as the node ids when outputting in > Newick format.
I don't think Bioperl should be able to tell the > difference between a normal id and a bootstrap value, so you'll have > to detect that yourself and manually use bootstrap() when you get an > id that looks like a number. If I remember rightly, in programs like Clustal you can specify where bootstrap values are stored - node or branch. I can't remember which is the default way, but TreeView can only see bootstraps if they are stored using the "non-default" setting. This "could" be the same issue here. > > Or should Bioperl be making this assumption for you? Is that a safe > thing to do? Maybe as an option only? I don't know without a closer look - I'd also need to look at the newick format definition as to whether this is an "extension" to the format or if something is just flouting the newick rules. Nath From n.haigh at sheffield.ac.uk Fri Oct 27 08:59:51 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 12:59:51 +0000 Subject: [Bioperl-l] Caching sequences Message-ID: <454202C7.1040701@sheffield.ac.uk> I have a script that is capable of downloading sequences from GenBank based on GI numbers. I retrieve them in FASTA format in order to save bandwidth, but I'd like to take this one step further and cache the sequences in case the user wants to rerun the script using some of the GIs they used previously. Does anyone have any guidance on how best to do this? Cheers Nath From bix at sendu.me.uk Fri Oct 27 08:35:13 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 27 Oct 2006 13:35:13 +0100 Subject: [Bioperl-l] Caching sequences In-Reply-To: <454202C7.1040701@sheffield.ac.uk> References: <454202C7.1040701@sheffield.ac.uk> Message-ID: <4541FD01.6090803@sendu.me.uk> Nathan S. Haigh wrote: > I have a script that is capable of downloading sequences from GenBank > based on GI numbers.
I retrieve them if fasta format in order to save > bandwidth, but I'd like to take this one step further and cache the > sequences in case the user want to rerun the script using some of the > GI's they used previously. > > Does anyone have any guidance on how best to do this? You'd probably write the sequences out in some suitable format and access them via Bio::Index Or, I'm sure bioperl-db excels at this kind of thing, but is a little more involved if this is only a simple situation. From bosborne11 at verizon.net Fri Oct 27 09:09:30 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 27 Oct 2006 09:09:30 -0400 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <4541C66F.1020404@sheffield.ac.uk> Message-ID: Nathan, I don't know how this is supposed to work, there would be different ways to make is_prototype true. One way would be to make the enzyme with the first occurrence of a given restriction site the prototype (and the next enzymes with the same site are isoschizomers). Or, one could wait until one site had appeared twice, with 2 different enzymes, then make the first the prototype, etc. I would have done it the first way myself but I took a quick look at IO/withrefm.pm and it looks like it's doing it the second way. That means one can read an enzyme file and end up with no duplicated restriction sites, or prototypes and isoschizomers. Brian O. On 10/27/06 4:42 AM, "Nathan S. Haigh" wrote: > Hi Brian, > > I wonder if i'm using is_prototype() correctly as I don't seem to get > any returning true: > > my $enz_coll = Bio::Restriction::EnzymeCollection->new(); > my $prototype = 0; > foreach my $enz ($enz_coll->each_enzyme) { > $prototype++ if $enz->is_prototype; > } > print "$prototype have unique recognition sites\n"; > > prints: > 0 have unique recognition sites > > Thanks > Nath > > Brian Osborne wrote: >> Nathan, >> >> Perhaps because most restriction sites are palindromes. 
Anyway, I added >> tests for palindromic() and is_palindromic() where the site is not a >> palindrome, these tests pass (t/RestrictionAnalyis.t). >> >> Brian O. >> >> >> On 10/26/06 12:13 PM, "Nathan Haigh" wrote: >> >> >>> I'm in the middle of writing some code that uses >>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >>> Bioperl from HEAD. >>> >>> I seem to find that $enzyme->is_palindromic always seems to return true. >>> Can anyone verify this? If needs be, I can send some code. >>> >>> Thanks >>> Nathan >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> >> > From n.haigh at sheffield.ac.uk Fri Oct 27 10:19:02 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 14:19:02 +0000 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: References: Message-ID: <45421556.9060300@sheffield.ac.uk> Brian Osborne wrote: > Nathan, > > I don't know how this is supposed to work, there would be different ways to > make is_prototype true. One way would be to make the enzyme with the first > occurrence of a given restriction site the prototype (and the next enzymes > with the same site are isoschizomers). Or, one could wait until one site had > appeared twice, with 2 different enzymes, then make the first the prototype, > etc. I would have done it the first way myself but I took a quick look at > IO/withrefm.pm and it looks like it's doing it the second way. That means > one can read an enzyme file and end up with no duplicated restriction sites, > or prototypes and isoschizomers. > > Brian O. > > Hmm, I'd have done it the first way also. Doing it the second way would mean you only ended up with something as a prototype if there were multiple enzymes with the same restriction site - is that correct biologically? 
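The two ways of assigning prototypes that Brian describes can be compared side by side. This is a minimal sketch in plain Perl under stated assumptions: the enzyme names EnzA, EnzB and EnzC are hypothetical, the input is a simple list of name/site pairs in file order, and neither subroutine is taken from Bio::Restriction::IO::withrefm.

```perl
use strict;
use warnings;

# Hypothetical input: [ enzyme name, recognition site ] in file order.
my @enzymes = (
    [ EnzA => 'GAATTC' ],
    [ EnzB => 'GGATCC' ],
    [ EnzC => 'GAATTC' ],    # shares its site with EnzA
);

# First way: the first enzyme seen for a site is its prototype,
# whether or not any other enzyme shares that site.
sub prototypes_first_seen {
    my @list = @_;
    my ( %seen, @proto );
    for my $e (@list) {
        my ( $name, $site ) = @$e;
        push @proto, $name unless $seen{$site}++;
    }
    return @proto;
}

# Second way: a prototype is only named once at least two different
# enzymes share the site; the first of them becomes the prototype.
sub prototypes_when_shared {
    my @list = @_;
    my ( %first, %count );
    for my $e (@list) {
        my ( $name, $site ) = @$e;
        $first{$site} = $name unless exists $first{$site};
        $count{$site}++;
    }
    return map  { $_->[0] }
           grep { $count{ $_->[1] } > 1 && $first{ $_->[1] } eq $_->[0] }
           @list;
}

print "first seen:  @{[ prototypes_first_seen(@enzymes) ]}\n";
print "when shared: @{[ prototypes_when_shared(@enzymes) ]}\n";
```

Under the second policy, a collection in which no recognition site occurs twice yields no prototypes at all, which would be consistent with a count of 0.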
Nath

From n.haigh at sheffield.ac.uk Fri Oct 27 10:23:20 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 14:23:20 +0000
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
Message-ID: <45421658.5000103@sheffield.ac.uk>

As you may be aware by now, I'm working with Bio::Restriction::Analysis and friends.

I'm doing restriction analysis on large sequences - chromosomes. I need to identify an appropriate enzyme based on the total length of fragments that are of a certain size (e.g. 100 - 500 bp). However, the amount of memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I have the following code (bottom) which downloads 2 thaliana chromosomes (mito and chloro - so pretty small), runs an analysis and then loops through the fragments for all enzymes in the default collection.

My memory usage just keeps on climbing and none seems to get freed up even when a $ra goes out of scope (when I start dealing with the next sequence). Is this a memory leak of some sort, or is there a way to free up memory as I go? I'd appreciate any help/advice on how to reduce the amount of memory being consumed as I'd like to use all the thaliana chromosomes (not just mito and chloro), which at the moment probably won't work.
Cheers
Nath

use strict;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012, 26556996 );

my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
    print "Getting GI: $gi\n";
    push @seq_objs, $db->get_Seq_by_id($gi);
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
    my $tot_size = 0;
    print "Processing ", $seq->primary_id,"\n";
    my $ra = Bio::Restriction::Analysis->new(
        -seq=>$seq,
        -enzymes=>$enz_Coll,
    );

    my @all_enzymes = $ra->cutters->each_enzyme;
    print " Calc total length of fragments in range: $min_fragment_size - $max_fragment_size\n";
    foreach my $enzyme ( @all_enzymes ) {
        # fragments() is a real memory hog
        foreach my $frag ($ra->fragments($enzyme)) {
            next if $min_fragment_size && (length($frag) < $min_fragment_size);
            next if $max_fragment_size && (length($frag) > $max_fragment_size);
            $tot_size += length $frag;
        }
        # do something based on value of $tot_size
        #print " ", $enzyme->name, " total = $tot_size\n";
    }
    print "DONE\n";
}

From avilella at gmail.com Fri Oct 27 09:39:41 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:39:41 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
In-Reply-To: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
References: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
Message-ID: <358f4d650610270639q14870a6erae2e3c4e9063105d@mail.gmail.com>

I respond to myself:

I think I found the way:

my $tree = $treeio->next_tree;
my $total_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    # skip nodes without a branch length (typically the root)
    next unless defined $node->branch_length;
    $total_branch_length += $node->branch_length;
}
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless (defined($branch_length));
    $node->branch_length($branch_length/$total_branch_length);
}
# sanity check: the scaled lengths should now sum to 1
my $new_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    next unless defined $node->branch_length;
    $new_branch_length += $node->branch_length;
}

On 10/27/06, Albert Vilella wrote:
> Hi all,
>
> I am in need of a method that would scale the different branch lengths
> of a tree so that after the scaling they all sum up to exactly 1.
>
> Any pointers? Has anyone done that before?
>
> Thanks in advance,
>
> Albert.
>

From cjfields at uiuc.edu Fri Oct 27 10:35:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 09:35:35 -0500
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <4541CBA8.10006@sheffield.ac.uk>
Message-ID: <001501c6f9d5$2e33e120$15327e82@pyrimidine>

...
> I think it makes sense to test that data of the expected type was
> returned by the external resource but not to test the specifics of what
> was returned. If specifics are tested we are then in the realm of testing
> whether we believe the data returned by the external resource or not. We
> should assume that the domain experts for these resources know what they
> are doing - in some cases this might not be true :-) but I think we
> should stick to testing that the objects created hold the expected type
> of data.
>
> I like what Chris had to say (above) but wonder whether tests
> would/should be tested for in the module itself - i.e. testing that a
> stored value is an integer and warn/throw if not?
>
> Nath

Yeah, sorry about the top post (stupid Outlook always sticks the sig at the top of the page!).

Testing in the module would be best but can be tricky for the very same reasons that writing tests entail, even more so. For instance, for NCBI esummary data, I parse the data in a very generic way in order to have access to as much data as possible. For tests, I have to assume that NCBI will always return a particular type of value (string, integer, date). I can test for each of those with a regex in the module fairly simply and throw/warn, as you indicate.
However, if they decide to add new data with a data tag other than the ones I test for in the module (i.e. String, Integer, Date), I suddenly have warns/throws showing up and cluttering/clobbering the code for perfectly valid data. If these are caught in tests instead and the tests fail, it's no big loss. The actual module still works, even if the tests are failing based on a new unknown value being returned.

For me, failed tests are sort of a warning light to let me know that something has changed, but it doesn't necessarily mean a module doesn't work. I generally use throw/warn for something truly catastrophic, like no response from the server or an error in the XML, which affects downstream methods.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From cjfields at uiuc.edu Fri Oct 27 11:09:36 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 10:09:36 -0500
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <454202C7.1040701@sheffield.ac.uk>
Message-ID: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>

> I have a script that is capable of downloading sequences from GenBank
> based on GI numbers. I retrieve them in fasta format in order to save
> bandwidth, but I'd like to take this one step further and cache the
> sequences in case the user wants to rerun the script using some of the
> GI's they used previously.
>
> Does anyone have any guidance on how best to do this?
>
> Cheers
> Nath

There is Bio::DB::InMemoryCache, which is really an interface but appears to have several methods defined; you could look for modules which implement it. Sendu's suggestions of the Bio::Index modules and bioperl-db are also good starting points.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Oct 27 11:21:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 10:21:49 -0500 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <45421556.9060300@sheffield.ac.uk> Message-ID: <001701c6f9db$9f90d160$15327e82@pyrimidine> > Brian Osborne wrote: > > Nathan, > > > > I don't know how this is supposed to work, there would be different ways > to > > make is_prototype true. One way would be to make the enzyme with the > first > > occurrence of a given restriction site the prototype (and the next > enzymes > > with the same site are isoschizomers). Or, one could wait until one site > had > > appeared twice, with 2 different enzymes, then make the first the > prototype, > > etc. I would have done it the first way myself but I took a quick look > at > > IO/withrefm.pm and it looks like it's doing it the second way. That > means > > one can read an enzyme file and end up with no duplicated restriction > sites, > > or prototypes and isoschizomers. > > > > Brian O. > > > > > Hmm, I'd have done it the first way also. Doing it the second way would > mean you only ended up with something as a prototype if there were > multiple enzymes with the same restriction site - is that correct > biologically? > > Nath I had a look at all the Restriction::IO modules a while back; most need serious updating! It just hasn't been a top priority unfortunately. I think the prototype issue may depend on the IO format and whether or not one is defined explicitly in the file being parsed or is just chosen based on what Brian said (order in the file, similar cutting site). By the strictest definition (and cheating by looking at the Fermentas web site), the prototype is supposed to be the first enzyme discovered which cleaves a unique sequence, so it may not be the first enzyme found in the file. Isoschizomers are those discovered to cleave the same sequence subsequent to the prototype. 
Neoschizomers cleave the same sequence as a prototype but at a different site. So this calls into question whether the prototype should be defined at all unless it is specifically indicated in the file. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Fri Oct 27 12:47:53 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 16:47:53 +0000 Subject: [Bioperl-l] Caching sequences In-Reply-To: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com> References: <454202C7.1040701@sheffield.ac.uk> <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com> Message-ID: <45423839.9040503@sheffield.ac.uk> Jason Stajich wrote: > Bio::DB::FileCache does one better and lets you cache the data in a > persistent file. Not sure this index is shareable among users though > - bioperl-db is a better soln when that is desired. Thanks I'll have a look into it. No need for being sharable among users - not unless the script becomes heavily used. Thanks Nath From cjfields at uiuc.edu Fri Oct 27 12:15:00 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 11:15:00 -0500 Subject: [Bioperl-l] StandAloneFasta.t bioperl-run tests Message-ID: <000101c6f9e3$0e5e95d0$15327e82@pyrimidine> Nathan, The test fails you posted on the wiki seem to indicate that using the wrapper works but the order of the returned hits is off. Does the order of the returned hits match the actual FASTA report order? If it does then the tests need to be fixed in a way to make it more flexible, to account for some data 'fuzziness' due to variations in output based on different versions. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Fri Oct 27 12:50:54 2006 From: jason at bioperl.org (Jason Stajich) Date: Fri, 27 Oct 2006 09:50:54 -0700 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <4541EA78.3050404@sendu.me.uk> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> <4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk> Message-ID: <1230E110-01AB-4D4E-842F-20B939555299@bioperl.org> I've answered to this effect this multiple times in the past on the mailing list. newick format does not distinguish between internal ids and bootstrap values (or whatever else you want to attach there). Different programs have different conventions. when both values are present and encoded so that we can parse out the bootstrap like this: [BOOTSTRAP] the parser grabs it out. If you know all the internal ids are boostraps you can just copy the values over manually very simply for my $node ( grep { ! $_->is_Leaf } $tree->get_nodes ) { # get all the internal nodes $node->bootstrap($node->id) if defined $node->id && length($node- >id); # copy id to boostrap $node->id(''); # set internal id to empty } If someone can make this clearer on a wiki page that would be great. On Oct 27, 2006, at 4:16 AM, Sendu Bala wrote: > Nathan S. Haigh wrote: >> Sendu Bala wrote: >>> Himanshu Ardawatia wrote: >>>> >>>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) : >>>> ################################# >>>> ( >>>> ('Chimp' : 0.052, >>>> 'Human' : 0.042) 0.71 : 0.007, >>>> 'Gorilla' : 0.060, >>>> ('Gibbon' : 0.124, >>>> 'Orangutan' : 0.0971) 1 : 0.038 >>>> ); >>>> ################################# >>>> >>> Are you sure this is in the correct format? >>> >> >> He/she may have a tree that already contains bootstrap values output >> from another program. If this is so, which program did you use? 
>> Without reminding myself of the formats, you should look up newick format and
>> whether it is possible to store bootstraps in it. In addition you should
>> also look up the nhx format.
>
> Ah, well from a brief google it seemed like some software does store
> bootstrap values for internal nodes as the node ids when outputting in
> Newick format. I don't think Bioperl should be able to tell the
> difference between a normal id and a bootstrap value, so you'll have to
> detect that yourself and manually use bootstrap() when you get an id
> that looks like a number.
>
> Or should Bioperl be making this assumption for you? Is that a safe
> thing to do? Maybe as an option only?
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html

From avilella at gmail.com Fri Oct 27 09:23:07 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:23:07 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
Message-ID: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>

Hi all,

I am in need of a method that would scale the different branch lengths of a tree so that after the scaling they all sum up to exactly 1.

Any pointers? Has anyone done that before?

Thanks in advance,

Albert.
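
A minimal sketch of the scaling being asked about, written in plain Perl so that it runs without BioPerl installed. The helper name scale_to_unit_sum is made up for illustration and is not a BioPerl method. It assumes that some nodes (typically the root) have no branch length and should simply be skipped. With a real tree one would presumably gather the values via $node->branch_length over $tree->get_nodes and write the scaled values back with the same accessor.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): rescale branch lengths so that
# the defined ones sum to 1. Undefined entries, such as a root node without
# a branch length, are passed through unchanged as undef.
sub scale_to_unit_sum {
    my @lengths = @_;

    # First pass: total of all defined branch lengths
    my $total = 0;
    foreach my $len (@lengths) {
        $total += $len if defined $len;
    }
    die "Total branch length is zero; nothing to scale\n" if $total == 0;

    # Second pass: divide each defined length by the total
    my @scaled;
    foreach my $len (@lengths) {
        push @scaled, (defined($len) ? $len / $total : undef);
    }
    return @scaled;
}

# Toy input: four branch lengths plus an undefined root
my @scaled = scale_to_unit_sum(1, 1, 2, undef, 4);
print join(", ", map { defined($_) ? $_ : "undef" } @scaled), "\n";
# prints: 0.125, 0.125, 0.25, undef, 0.5
```

Because of floating-point rounding, the scaled values may sum to something very slightly different from 1 for arbitrary inputs, so any check on the result is probably best done with a small tolerance rather than with ==.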
From cjfields at uiuc.edu Fri Oct 27 14:34:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 13:34:57 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
Message-ID: <000001c6f9f6$9ab12710$15327e82@pyrimidine>

I am working on refactoring the AlignIO::stockholm parser to get it reading and writing Pfam/Rfam alignments, and noticed that many alignments have EMBL-like annotations attached, which pertain to the entire alignment:

# STOCKHOLM 1.0
#=GF ID ykkC-yxkD
#=GF AC RF00442
#=GF DE ykkC-yxkD element
#=GF AU Moxon SJ
#=GF GA 20.0
#=GF NC 0.1
#=GF TC 59.4
#=GF SE Barrick JE, Breaker RR
#=GF SS Predicted; Barrick JE, Breaker RR
#=GF TP Cis-reg; riboswitch;
#=GF BM cmbuild CM SEED
#=GF BM cmsearch -W 175 CM SEQDB
#=GF RN [1]
#=GF RM 15096624
#=GF RT New RNA motifs suggest an expanded scope for riboswitches in
#=GF RT bacterial genetic control.
#=GF RA Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee
#=GF RA M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR;
#=GF RL Proc Natl Acad Sci U S A 2004;101:6421-6426.
#=GF CC This family represents the bacterial ykkC/yxkD element. The function of
#=GF CC this family is unclear although it has been suggested that it may function
#=GF CC to switch on efflux pumps and detoxification systems in response to harmful
#=GF CC environmental molecules [1]. The Thermoanaerobacter tengcongensis sequence
#=GF CC EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that the two
#=GF CC riboswitches may work in conjunction to regulate the the upstream gene
#=GF CC which codes for Swiss:Q8RC62, a member of Pfam:PF00860 (Personal obs. Moxon
#=GF CC SJ).
#=GF SQ 16

SimpleAlign, as implemented, seemingly doesn't have a way to store this information.

I'll work on getting the core alignment IO working, but would there be any interest in having a way to store annotations in Bio::SimpleAlign? I'm guessing the methods would be similar to the various Bio::Seq Annotation methods.
Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Fri Oct 27 16:23:46 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 27 Oct 2006 16:23:46 -0400 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <000001c6f9f6$9ab12710$15327e82@pyrimidine> References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> Message-ID: You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose this is what you meant by the 'various Bio::Seq Annotation methods' too.) Just to make sure I'm not misunderstanding, I suppose the annotation pertains to the entire alignment? -hilmar On Oct 27, 2006, at 2:34 PM, Chris Fields wrote: > I am working an refactoring the AlignIO::stockholm parser to get it > reading > and writing Pfam/Rfam alignments, and noticed that many alignments > have > EMBL-like annotations attached, which pertain to the entire alignment: > > # STOCKHOLM 1.0 > #=GF ID ykkC-yxkD > #=GF AC RF00442 > #=GF DE ykkC-yxkD element > #=GF AU Moxon SJ > #=GF GA 20.0 > #=GF NC 0.1 > #=GF TC 59.4 > #=GF SE Barrick JE, Breaker RR > #=GF SS Predicted; Barrick JE, Breaker RR > #=GF TP Cis-reg; riboswitch; > #=GF BM cmbuild CM SEED > #=GF BM cmsearch -W 175 CM SEQDB > #=GF RN [1] > #=GF RM 15096624 > #=GF RT New RNA motifs suggest an expanded scope for > riboswitches in > #=GF RT bacterial genetic control. > #=GF RA Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, > Collins J, > Lee > #=GF RA M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR; > #=GF RL Proc Natl Acad Sci U S A 2004;101:6421-6426. > #=GF CC This family represents the bacterial ykkC/yxkD element. The > function of > #=GF CC this family is unclear although it has been suggested > that it may > function > #=GF CC to switch on efflux pumps and detoxification systems in > response > to harmful > #=GF CC environmental molecules [1]. 
The Thermoanaerobacter > tengcongensis > sequence > #=GF CC EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that > the two > #=GF CC riboswitches may work in conjunction to regulate the the > upstream > gene > #=GF CC which codes for Swiss:Q8RC62, a member of Pfam:PF00860 > (Personal > obs. Moxon > #=GF CC SJ). > #=GF SQ 16 > > SimpleAlign, as implemented, seemingly doesn't have a way to store > this > information. > > I'll work on getting the core alignment IO working, but would there > be any > interest in having a way to store annotations in Bio::SimpleAlign? > I'm > guessing the methods would be similar to the various Bio::Seq > Annotation > methods. > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Oct 27 16:38:17 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 15:38:17 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: Message-ID: <000001c6fa07$d8659990$15327e82@pyrimidine> Hilmar Lapp wrote: > You could make SimpleAlign be a Bio::AnnotationHolderI. (I > suppose this is what you meant by the 'various Bio::Seq Annotation > methods' too.) > > Just to make sure I'm not misunderstanding, I suppose the > annotation pertains to the entire alignment? > > -hilmar ... Yes, that's correct. I would probably use Bio::Seq::Meta for the sequence-specific markup lines. I would have to add another new method to deal with non-sequence-based consensus data (like sec. structure) for now. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Fri Oct 27 11:38:05 2006 From: jason at bioperl.org (Jason Stajich) Date: Fri, 27 Oct 2006 08:38:05 -0700 Subject: [Bioperl-l] Caching sequences In-Reply-To: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> References: <454202C7.1040701@sheffield.ac.uk> <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> Message-ID: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com> Bio::DB::FileCache does one better and lets you cache the data in a persistent file. Not sure this index is shareable among users though - bioperl-db is a better soln when that is desired. -jason On 10/27/06, Chris Fields wrote: > > > I have a script that is capable of downloading sequences from GenBank > > based on GI numbers. I retrieve them if fasta format in order to save > > bandwidth, but I'd like to take this one step further and cache the > > sequences in case the user want to rerun the script using some of the > > GI's they used previously. > > > > Does anyone have any guidance on how best to do this? > > > > Cheers > > Nath > > There is Bio::DB::InMemoryCache, which is really an interface but appears > to > have several methods defined; you could look for modules which implement > it. > Sendu's suggestion of the Bio::Index modules and bioperl-db are also good > starting points. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. 
of Biochemistry > University of Illinois Urbana-Champaign > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich jason at bioperl.org http://www.duke.edu/~jes12/ From cjfields at uiuc.edu Fri Oct 27 21:57:58 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 20:57:58 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> Message-ID: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu> On Oct 27, 2006, at 3:23 PM, Hilmar Lapp wrote: > You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose > this is what you meant by the 'various Bio::Seq Annotation methods' > too.) > > Just to make sure I'm not misunderstanding, I suppose the annotation > pertains to the entire alignment? > > -hilmar BTW, was that supposed to be Bio::AnnotatableI, or Bio::AnnotationHolderI? The latter isn't present in CVS HEAD. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From eric.ross at neuro.utah.edu Sat Oct 28 17:24:30 2006 From: eric.ross at neuro.utah.edu (Eric Ross) Date: Sat, 28 Oct 2006 15:24:30 -0600 Subject: [Bioperl-l] PAML References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object. I am able to extract other data from the report, but there seems to be a conflict in the documentation. One doc implies that there should be a get_posteriors method. (It's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object. 
I have been trying various methods, in the event I'm just "confused", but I've had no luck, thus far. Anyone have suggestions?

code:

----begin code-------
#!/usr/bin/perl -w

use strict;

use Bio::Tools::Phylo::PAML;
my $parser = new Bio::Tools::Phylo::PAML
    (-file => "mlc");
my $result = $parser->next_result;
my @posteriors = $result->get_posteriors();

print "@posteriors";

exit(0);

---------end code-------------

---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab

From avilella at gmail.com Sun Oct 29 05:52:04 2006
From: avilella at gmail.com (Albert Vilella)
Date: Sun, 29 Oct 2006 10:52:04 +0000
Subject: [Bioperl-l] PAML
In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>

I don't know if this method is implemented. I can't grep-find it. Maybe it's simply not there yet, but was planned when the documentation was written.

On 10/28/06, Eric Ross wrote:
> I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>
> I am able to extract other data from the report, but there seems to be a conflict in the documentation. One doc implies that there should be a get_posteriors method. (It's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object.
>
>
> I have been trying various methods, in the event I'm just "confused", but I've had no luck, thus far. Anyone have suggestions?
>
>
> code:
>
> ----begin code-------
> #!/usr/bin/perl -w
>
> use strict;
>
>
> use Bio::Tools::Phylo::PAML;
> my $parser = new Bio::Tools::Phylo::PAML
> (-file => "mlc");
> my $result = $parser->next_result;
> my @posteriors = $result->get_posteriors();
>
> print "@posteriors";
>
> exit(0);
>
> ---------end code-------------
>
>
>
> ---------------
> Eric Ross
> Computer Analyst II
> ejr at neuro.utah.edu
> Howard Hughes Medical Institute
> University of Utah
> Sánchez Lab
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From cjfields at uiuc.edu Sun Oct 29 09:23:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Oct 2006 08:23:45 -0600
Subject: [Bioperl-l] PAML
In-Reply-To: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
Message-ID: <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>

Does the data show up in the object using Data::Dumper? This should be filed as a bug since the docs imply the method exists. This could be written up fairly quickly if one had test data and a script to work with (hint hint...)

Chris

On Oct 29, 2006, at 4:52 AM, Albert Vilella wrote:

> I don't know if this method is implemented. I can't grep-find it.
> Maybe it's simply not there yet, but was planned when the
> documentation was written.
>
> On 10/28/06, Eric Ross wrote:
>> I am trying to extract the "Naive Empirical Bayes (NEB)
>> probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>>
>> I am able to extract other data from the report, but there seems
>> to be a conflict in the documentation. One doc implies that there
>> should be a get_posteriors method. (It's used as an example in the
>> Bio::Tools::Phylo::PAML doc), but the method does not appear to
>> exist in the Bio::Tools::Phylo::PAML::Result object.
>>
>>
>> I have been trying various methods, in the event I'm just
>> "confused", but I've had no luck, thus far. Anyone have suggestions?
>>
>>
>> code:
>>
>> ----begin code-------
>> #!/usr/bin/perl -w
>>
>> use strict;
>>
>>
>> use Bio::Tools::Phylo::PAML;
>> my $parser = new Bio::Tools::Phylo::PAML
>> (-file => "mlc");
>> my $result = $parser->next_result;
>> my @posteriors = $result->get_posteriors();
>>
>> print "@posteriors";
>>
>> exit(0);
>>
>> ---------end code-------------
>>
>>
>>
>> ---------------
>> Eric Ross
>> Computer Analyst II
>> ejr at neuro.utah.edu
>> Howard Hughes Medical Institute
>> University of Utah
>> Sánchez Lab
>>
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From eric.ross at neuro.utah.edu Sun Oct 29 12:06:54 2006
From: eric.ross at neuro.utah.edu (Eric Ross)
Date: Sun, 29 Oct 2006 10:06:54 -0700
Subject: [Bioperl-l] PAML
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>
Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu>

Thanks for all the help.

I've been looking at the code for the PAML rst parser. It's a bit tricky.
We have written a parser specific for our needs, but it looks to be a pretty complicated matter to make it generic.

The output of PAML can vary a lot depending upon your options and this section can be repeated multiple times. I'm sure someone with a good grasp of the potential output of PAML could come up with something, but I'll admit to being at a loss.

---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab

-----Original Message-----
From: Chris Fields [mailto:cjfields at uiuc.edu]
Sent: Sun 2006-10-29 7:23 AM
To: Albert Vilella
Cc: Eric Ross; Bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] PAML

Does the data show up in the object using Data::Dumper? This should be filed as a bug since the docs imply the method exists. This could be written up fairly quickly if one had test data and a script to work with (hint hint...)

Chris

On Oct 29, 2006, at 4:52 AM, Albert Vilella wrote:

> I don't know if this method is implemented. I can't grep-find it.
> Maybe it's simply not there yet, but was planned when the
> documentation was written.
>
> On 10/28/06, Eric Ross wrote:
>> I am trying to extract the "Naive Empirical Bayes (NEB)
>> probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>>
>> I am able to extract other data from the report, but there seems
>> to be a conflict in the documentation. One doc implies that there
>> should be a get_posteriors method. (It's used as an example in the
>> Bio::Tools::Phylo::PAML doc), but the method does not appear to
>> exist in the Bio::Tools::Phylo::PAML::Result object.
>>
>>
>> I have been trying various methods, in the event I'm just
>> "confused", but I've had no luck, thus far. Anyone have suggestions?
>>
>>
>> code:
>>
>> ----begin code-------
>> #!/usr/bin/perl -w
>>
>> use strict;
>>
>>
>> use Bio::Tools::Phylo::PAML;
>> my $parser = new Bio::Tools::Phylo::PAML
>> (-file => "mlc");
>> my $result = $parser->next_result;
>> my @posteriors = $result->get_posteriors();
>>
>> print "@posteriors";
>>
>> exit(0);
>>
>> ---------end code-------------
>>
>>
>>
>> ---------------
>> Eric Ross
>> Computer Analyst II
>> ejr at neuro.utah.edu
>> Howard Hughes Medical Institute
>> University of Utah
>> Sánchez Lab
>>
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From n.haigh at sheffield.ac.uk Sun Oct 29 12:43:20 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 29 Oct 2006 17:43:20 +0000
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
In-Reply-To: <45421658.5000103@sheffield.ac.uk>
References: <45421658.5000103@sheffield.ac.uk>
Message-ID: <4544E838.7090400@sheffield.ac.uk>

Sorry for the repeat post but I haven't had a response. Just wondered if anyone had any idea about this?

Thanks
Nath

Nathan S. Haigh wrote:
> As you may be aware by now, I'm working with Bio::Restriction::Analysis
> and friends.
>
> I'm doing restriction analysis on large sequences - chromosomes. I need
> to identify an appropriate enzyme based on the total length of fragments
> that are of a certain size (e.g. 100 - 500 bp). However, the amount of
> memory used by Bio::Restriction::Analysis::fragments() is prohibitive.
I > have the following code (bottom) which downloads 2 thaliana chromosomes > (mito and chloro - so pretty small) and runs an analysis and then loops > through the fragments for all enzymes in the default collection. > > My memory usage just keep on climbing and none seems to get freed up > even when a $ra goes out of scope (start dealing with the next > sequence). Is this a memory leak of some sort, is there a way to free up > memory as I go? I'd appreciate any help/advice on how to reduce the > amount of memory being consumed as I'd like to use all the thaliana > chromosomes (not just mito and chloro), which at the moment probably > won't work. > > Cheers > Nath > > use strict; > use Bio::DB::GenBank; > use Bio::Restriction::Analysis; > use Bio::Restriction::EnzymeCollection; > > my @seq_objs; > my @gis = ( 7525012, 26556996 ); > > my $db = Bio::DB::GenBank->new(-format => "fasta"); > foreach my $gi (@gis) { > print "Getting GI: $gi\n"; > push @seq_objs, $db->get_Seq_by_id($gi) > } > > my $min_fragment_size = 100; > my $max_fragment_size = 500; > my $enz_Coll = Bio::Restriction::EnzymeCollection->new(); > > foreach my $seq (@seq_objs) { > my $tot_size = 0; > print "Processing ", $seq->primary_id,"\n"; > my $ra = Bio::Restriction::Analysis->new( > -seq=>$seq, > -enzymes=>$enz_Coll, > ); > > my @all_enzymes = $ra->cutters->each_enzyme; > print " Calc total length of fragments in range: $min_fragment_size - > $max_fragment_size\n"; > foreach my $enzyme ( @all_enzymes ) { > # fragments() is a real memory hog > foreach my $frag ($ra->fragments($enzyme)) { > next if $min_fragment_size && (length $frag < $min_fragment_size); > next if $max_fragment_size && (length $frag > $max_fragment_size); > $tot_size += length $frag; > } > # do something based on value of $tot_size > #print " ", $enzyme->name, " total = $tot_size\n"; > } > print "DONE\n"; > } > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > 
http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Sun Oct 29 13:09:54 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 29 Oct 2006 12:09:54 -0600 Subject: [Bioperl-l] PAML In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu> References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu> Message-ID: On Oct 29, 2006, at 11:06 AM, Eric Ross wrote: > Thanks for all the help. > > I've been looking at the code for the PAML rst parser. It's a bit > tricky. > > We have written a parser specific for our needs, but it looks to be > a pretty complicated matter to make it generic. > > The output of PAML can vary a lot depending upon your options and > this section can be repeated multiple times. I'm sure someone with > a good grasp of the potential output of PAML could come up with > something, but I'll admit to being at a loss. Eric, I planned on looking at ways to integrate the protein-based PAML programs but I'm working on a different area at the moment. I agree it may be hard to adequately genericize parsing/methods to accomplish this, but if you have any ideas feel free to post them. Again, I would suggest adding any proposed enhancements or bugs to Bugzilla: http://bugzilla.open-bio.org/ Suggestions or bug reports on the list sometimes get lost in the shuffle, esp. since we're planning on a new developer release soon. Chris Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Oct 29 13:16:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 29 Oct 2006 12:16:37 -0600 Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage In-Reply-To: <4544E838.7090400@sheffield.ac.uk> References: <45421658.5000103@sheffield.ac.uk> <4544E838.7090400@sheffield.ac.uk> Message-ID: <6D9EAA04-199C-4BDD-AA60-4833BC1CE250@uiuc.edu> On Oct 29, 2006, at 11:43 AM, Nathan S. Haigh wrote: > Sorry for the repeat post but I haven't had a response. Just > wondered if > anyone had any idea about this? > > Thanks > Nath ... I think Warnock applies here. Likely no one is really sure, hence they aren't answering. It probably bears investigating by submitting and tracking as a bug. My guess is something isn't garbage-collected properly (i.e. there are circular references present), leading to a memory leak. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From chhalling at alumni.ls.berkeley.edu Sun Oct 29 14:16:36 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Sun, 29 Oct 2006 14:16:36 -0500 Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage In-Reply-To: <4544E838.7090400@sheffield.ac.uk> References: <45421658.5000103@sheffield.ac.uk> <4544E838.7090400@sheffield.ac.uk> Message-ID: <4544FE14.7030701@alumni.ls.berkeley.edu> Nathan S. Haigh wrote: > Sorry for the repeat post but I haven't had a response. Just wondered if > anyone had any idea about this? > > Thanks > Nath > > Nathan S. Haigh wrote: > >> As you may be aware by now, i'm working with Bio::Restriction::Analysis >> and friends. >> >> I'm doing restriction analysis on large sequences - chromosomes. I need >> to identify an appropriate enzyme based on the total length of fragments >> that are of a certain size (e.g. 100 - 500 bp). 
However, the amount of >> memory used by Bio::Restriction::Analysis::fragments() is prohibative. I >> have the following code (bottom) which downloads 2 thaliana chromosomes >> (mito and chloro - so pretty small) and runs an analysis and then loops >> through the fragments for all enzymes in the default collection. >> >> My memory usage just keep on climbing and none seems to get freed up >> even when a $ra goes out of scope (start dealing with the next >> sequence). Is this a memory leak of some sort, is there a way to free up >> memory as I go? I'd appreciate any help/advice on how to reduce the >> amount of memory being consumed as I'd like to use all the thaliana >> chromosomes (not just mito and chloro), which at the moment probably >> won't work. >> >> Cheers >> Nath >> >> use strict; >> use Bio::DB::GenBank; >> use Bio::Restriction::Analysis; >> use Bio::Restriction::EnzymeCollection; >> >> my @seq_objs; >> my @gis = ( 7525012, 26556996 ); >> >> my $db = Bio::DB::GenBank->new(-format => "fasta"); >> foreach my $gi (@gis) { >> print "Getting GI: $gi\n"; >> push @seq_objs, $db->get_Seq_by_id($gi) >> } >> >> my $min_fragment_size = 100; >> my $max_fragment_size = 500; >> my $enz_Coll = Bio::Restriction::EnzymeCollection->new(); >> >> foreach my $seq (@seq_objs) { >> my $tot_size = 0; >> print "Processing ", $seq->primary_id,"\n"; >> my $ra = Bio::Restriction::Analysis->new( >> -seq=>$seq, >> -enzymes=>$enz_Coll, >> ); >> >> my @all_enzymes = $ra->cutters->each_enzyme; >> print " Calc total length of fragments in range: $min_fragment_size - >> $max_fragment_size\n"; >> foreach my $enzyme ( @all_enzymes ) { >> # fragments() is a real memory hog >> foreach my $frag ($ra->fragments($enzyme)) { >> next if $min_fragment_size && (length $frag < $min_fragment_size); >> next if $max_fragment_size && (length $frag > $max_fragment_size); >> $tot_size += length $frag; >> } >> # do something based on value of $tot_size >> #print " ", $enzyme->name, " total = $tot_size\n"; 
>>   }
>>   print "DONE\n";
>> }
>>
>>

Try this code, which creates a new Bio::Restriction::Analysis object for each digest. On my PowerBook, this doesn't use more than 13 MB of memory.

Reading the code for Bio::Restriction::Analysis reveals that the fragments() method calls the cut() method. The documentation for the cut method states:

  Note: cut doesn't now re-initialize everything before figuring out
  cuts. This is so that you can do multiple digests, or add more data
  or whatever. You'll have to use new to reset everything.

This means there is no memory leak; it's just that the Bio::Restriction::Analysis object is retaining cut information for each enzyme, which takes a lot of memory.

use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012, 26556996 );

# Fetch the sequences once, up front.
my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
    print "Getting GI: $gi\n";
    push @seq_objs, $db->get_Seq_by_id($gi);
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
    print "Processing ", $seq->primary_id, "\n";
    foreach my $enzyme ( $enz_Coll->each_enzyme() ) {
        # A new analysis object per enzyme: the cut data of earlier
        # digests is dropped when $ra goes out of scope.
        my $ra = Bio::Restriction::Analysis->new(
            -seq     => $seq,
            -enzymes => $enzyme
        );
        my $tot_size = 0;
        print "  Calc total length of fragments in range: $min_fragment_size -"
            . " $max_fragment_size\n";
        foreach my $frag ($ra->fragments($enzyme)) {
            # Skip fragments outside the size window.
            next if $min_fragment_size && (length($frag) < $min_fragment_size);
            next if $max_fragment_size && (length($frag) > $max_fragment_size);
            $tot_size += length $frag;
        }
        # do something based on value of $tot_size
        print "  ", $enzyme->name, " total = $tot_size\n";
    }
    print "DONE\n";
}

--
Conrad Halling
chhalling at alumni.ls.berkeley.edu

From n.haigh at sheffield.ac.uk Mon Oct 30 03:51:49 2006
From: n.haigh at sheffield.ac.uk (Nathan S.
Haigh) Date: Mon, 30 Oct 2006 08:51:49 +0000 Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id() Message-ID: <4545BD25.3030107@sheffield.ac.uk> In my script I retrieve sequences from GenBank in FASTA format by GI numbers and optionally store the sequence in a cache using Bio::DB::Fasta. On subsequent runs of the script, the cache is first checked for the GI and returns the sequence if it is found or the sequence is obtained from GenBank as above. I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have returned a Bio::Seq object but rather it returns a Bio::PrimarySeq object which is defined within the Bio::DB::Fasta file. This is annoying, since $seq_obj in my script would be either a Bio::Seq if it was obtained from GenBank or a Bio::PrimarySeq if obtained from the cache and calling primary_id() on it doesn't do the expected thing with Bio::PrimarySeq: ID: Bio::PrimarySeq::Fasta=HASH(0x89b4508) Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object? Nath From yuhki at ncifcrf.gov Mon Oct 30 08:57:35 2006 From: yuhki at ncifcrf.gov (Naoya Yuhki) Date: Mon, 30 Oct 2006 08:57:35 -0500 Subject: [Bioperl-l] bptutorial.pl 0 Message-ID: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov> Hello, I run perl bptutorial.pl 0 and I got the following error. -------------------- WARNING --------------------- MSG: id (ROA1_HUMAN) does not exist --------------------------------------------------- Can't call method "display_id" on an undefined value at bptutorial.pl line 3945. other tests all worked. I thank any suggestions from you. NAOYA YUHKI. From cjfields at uiuc.edu Mon Oct 30 12:42:21 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 30 Oct 2006 11:42:21 -0600 Subject: [Bioperl-l] bptutorial.pl 0 In-Reply-To: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov> Message-ID: <000601c6fc4a$c3e43450$15327e82@pyrimidine> > Hello, > I run > > perl bptutorial.pl 0 > > and I got the following error. 
>
> -------------------- WARNING ---------------------
> MSG: id (ROA1_HUMAN) does not exist
> ---------------------------------------------------
> Can't call method "display_id" on an undefined value at bptutorial.pl
> line 3945.
>
> other tests all worked.
>
> I thank any suggestions from you.
>
> NAOYA YUHKI.

What version of Bioperl are you running? As a warning, the bptutorial.pl script has been removed from CVS and will not be included in future versions of Bioperl. It can be found on the bioperl wiki instead:

http://www.bioperl.org/wiki/Bptutorial

chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Mon Oct 30 13:08:15 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 30 Oct 2006 10:08:15 -0800
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <29F47393-D134-4093-8751-E948BF521843@bioperl.org>

Bio::PrimarySeq makes sense because Fasta databases only provide sequences without features. But you are actually getting a Bio::PrimarySeq::Fasta object, which is a proxy object, since the module won't pull a whole sequence into memory unless seq() is requested.

The problem is really why you are getting something useless set for primary_id. What do you want it to be - the GI number? You'll need to explicitly set it because DB::Fasta has no concept of GI numbers encoded in the header line.

AFAIK you also cannot set the primary_id to a value of your liking because this is a proxy object.

The best bet is to create a Bio::Seq object out of one of these and set the primary_id and display_id to values that you can compute from the display_id. At least that has been my strategy when using this - maybe someone wants to code something new into the object itself.

-jason

On Oct 30, 2006, at 12:51 AM, Nathan S.
Haigh wrote: > In my script I retrieve sequences from GenBank in FASTA format by GI > numbers and optionally store the sequence in a cache using > Bio::DB::Fasta. On subsequent runs of the script, the cache is first > checked for the GI and returns the sequence if it is found or the > sequence is obtained from GenBank as above. > > I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have > returned a Bio::Seq object but rather it returns a Bio::PrimarySeq > object which is defined within the Bio::DB::Fasta file. This is > annoying, since $seq_obj in my script would be either a Bio::Seq if it > was obtained from GenBank or a Bio::PrimarySeq if obtained from the > cache and calling primary_id() on it doesn't do the expected thing > with > Bio::PrimarySeq: > ID: Bio::PrimarySeq::Fasta=HASH(0x89b4508) > > Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object? > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From golharam at umdnj.edu Mon Oct 30 15:11:51 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Mon, 30 Oct 2006 15:11:51 -0500 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? Message-ID: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> I'm trying to parse some blast output w/o actually creating the output file. Instead, I'm capturing the output in a variable and would like to use IO::String to represent the file: $_ = `megablast -d somedatabase -i somesequence -D 2`; my $blast_file = new IO::String($_); my $searchio = new Bio::SearchIO(-format => 'blast', -fh => $blast_file); my $results = $searchio->next_result; my $hit = $results->next_hit; if (! 
defined($hit)) { warn "No BLAST hit for $accession on chr $chr for Seq/$orth_id/$organism\n\n"; return; } Now, when Bio::SearchIO tries to read the output line by line, instead it reads the entire output as 1 line. If I provide the output in a file and use: my $searchio = new Bio::SearchIO(-format => 'blast', -file => '/tmp/somefile.blast'); This works...so is it possible to use IO::String to provide Bio::SearchIO with BLAST output? Ryan From golharam at umdnj.edu Mon Oct 30 15:54:29 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Mon, 30 Oct 2006 15:54:29 -0500 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com> Message-ID: <00e801c6fc65$9849aee0$e6028a0a@GOLHARMOBILE1> Thanks. How are you getting the output? system()? BTW- I'm using v1.5.1... > -----Original Message----- > From: Bernd Web [mailto:bernd.web at gmail.com] > Sent: Monday, October 30, 2006 3:45 PM > To: golharam at umdnj.edu > Cc: bioperl-l > Subject: Re: [Bioperl-l] Is it possible to parse BLAST output > using IO:String? > > > Hi Ryan, > > I parse blastn output using IO::String w/o problems: > > my $stringfh = new IO::String($input); > my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh); > > however this is input does not come via backticks. > > > bernd > > On 10/30/06, Ryan Golhar wrote: > > I'm trying to parse some blast output w/o actually creating > the output > > file. Instead, I'm capturing the output in a variable and > would like > > to use IO::String to represent the file: > > > > $_ = `megablast -d somedatabase -i somesequence -D 2`; > > my $blast_file = new IO::String($_); > > my $searchio = new Bio::SearchIO(-format => 'blast', -fh => > > $blast_file); > > my $results = $searchio->next_result; > > my $hit = $results->next_hit; > > if (! 
defined($hit)) { > > warn "No BLAST hit for $accession on chr $chr for > > Seq/$orth_id/$organism\n\n"; > > return; > > } > > > > Now, when Bio::SearchIO tries to read the output line by > line, instead > > it reads the entire output as 1 line. > > > > If I provide the output in a file and use: > > > > my $searchio = new Bio::SearchIO(-format => > 'blast', -file => > > '/tmp/somefile.blast'); > > > > This works...so is it possible to use IO::String to provide > > Bio::SearchIO with BLAST output? > > > > Ryan > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > From bix at sendu.me.uk Mon Oct 30 16:27:58 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 30 Oct 2006 21:27:58 +0000 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> Message-ID: <45466E5E.9000504@sendu.me.uk> Ryan Golhar wrote: > I'm trying to parse some blast output w/o actually creating the output > file. Instead, I'm capturing the output in a variable and would like to > use IO::String to represent the file: > > $_ = `megablast -d somedatabase -i somesequence -D 2`; > my $blast_file = new IO::String($_); > my $searchio = new Bio::SearchIO(-format => 'blast', -fh => > $blast_file); > my $results = $searchio->next_result; > my $hit = $results->next_hit; > if (! defined($hit)) { > warn "No BLAST hit for $accession on chr $chr for > Seq/$orth_id/$organism\n\n"; > return; > } > > Now, when Bio::SearchIO tries to read the output line by line, instead > it reads the entire output as 1 line. > > If I provide the output in a file and use: > > my $searchio = new Bio::SearchIO(-format => 'blast', -file => > '/tmp/somefile.blast'); > > This works...so is it possible to use IO::String to provide > Bio::SearchIO with BLAST output? 
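For reference, here is a minimal sketch of two ways to get a line-oriented filehandle for a command's output without writing a temporary file: a piped open, and an in-memory filehandle opened on a lexical copy of an already captured string. The command is a stand-in (the running perl printing three lines) so that the sketch works without BLAST installed; the variable names are illustrative assumptions, and whether either approach cures the one-line read described above is not something this sketch demonstrates.

```perl
use strict;
use warnings;

# Stand-in for the real command line (an assumption, so the sketch runs
# without BLAST): the current perl prints three lines.
my @cmd = ($^X, '-e', 'print "line $_\n" for 1 .. 3');

# Option 1: list-form piped open. The handle yields the command's output
# line by line; no temporary file and no shell are involved.
open(my $pipe, '-|', @cmd) or die "Cannot start @cmd: $!";
my @from_pipe = <$pipe>;
close($pipe) or die "Command failed (status $?): $!";

# Option 2: the output is already in a string. Keep it in its own lexical
# (rather than in $_) and open an in-memory filehandle on it.
my $captured = join '', @from_pipe;
open(my $mem, '<', \$captured) or die "Cannot open in-memory handle: $!";
my @from_mem = <$mem>;
close($mem);

printf "pipe: %d lines, in-memory: %d lines\n",
    scalar @from_pipe, scalar @from_mem;

# With BioPerl installed, instead of reading the lines here, either handle
# could be given to the parser, e.g.
#   Bio::SearchIO->new(-format => 'blast', -fh => $pipe);
```

The piped open streams the output as it is produced, while the in-memory handle only helps when the whole output has already been captured.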
Why must it be IO::String? Why not just open() your megablast and provide $searchio the real filehandle? It would be faster that way as well.

Read the docs for ` (the backtick operator). Your usage above is inappropriate.

From golharam at umdnj.edu Mon Oct 30 16:54:45 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 16:54:45 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String?
In-Reply-To:
Message-ID: <00f901c6fc6e$03916460$e6028a0a@GOLHARMOBILE1>

Hmmm. Yes, I suppose I could. I did it with the backtick because I based my code off of the "To and From a String" section of the SeqIO HOWTO...

-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich
Sent: Monday, October 30, 2006 4:44 PM
To: Sendu Bala
Cc: golharam at umdnj.edu; 'bioperl-l'
Subject: Re: [Bioperl-l] Is it possible to parse BLAST output using IO:String?

right - can't you just do:

my $fh;
open($fh, "megablast -d ... | ") || die $!;
my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh);

On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote:

Ryan Golhar wrote:

I'm trying to parse some blast output w/o actually creating the output file. Instead, I'm capturing the output in a variable and would like to use IO::String to represent the file:

$_ = `megablast -d somedatabase -i somesequence -D 2`;
my $blast_file = new IO::String($_);
my $searchio = new Bio::SearchIO(-format => 'blast', -fh => $blast_file);
my $results = $searchio->next_result;
my $hit = $results->next_hit;
if (! defined($hit)) {
    warn "No BLAST hit for $accession on chr $chr for Seq/$orth_id/$organism\n\n";
    return;
}

Now, when Bio::SearchIO tries to read the output line by line, instead it reads the entire output as 1 line.

If I provide the output in a file and use:

my $searchio = new Bio::SearchIO(-format => 'blast', -file => '/tmp/somefile.blast');

This works...so is it possible to use IO::String to provide Bio::SearchIO with BLAST output?

Why must it be IO::String?
Why not just open() your megablast and provide $searchio the real filehandle? It would be faster that way as well. Read the docs for `. Your usage above is inappropriate. _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From bernd.web at gmail.com Mon Oct 30 15:44:31 2006 From: bernd.web at gmail.com (Bernd Web) Date: Mon, 30 Oct 2006 21:44:31 +0100 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> Message-ID: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com> Hi Ryan, I parse blastn output using IO::String w/o problems: my $stringfh = new IO::String($input); my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh); however this is input does not come via backticks. bernd On 10/30/06, Ryan Golhar wrote: > I'm trying to parse some blast output w/o actually creating the output > file. Instead, I'm capturing the output in a variable and would like to > use IO::String to represent the file: > > $_ = `megablast -d somedatabase -i somesequence -D 2`; > my $blast_file = new IO::String($_); > my $searchio = new Bio::SearchIO(-format => 'blast', -fh => > $blast_file); > my $results = $searchio->next_result; > my $hit = $results->next_hit; > if (! defined($hit)) { > warn "No BLAST hit for $accession on chr $chr for > Seq/$orth_id/$organism\n\n"; > return; > } > > Now, when Bio::SearchIO tries to read the output line by line, instead > it reads the entire output as 1 line. 
> > If I provide the output in a file and use: > > my $searchio = new Bio::SearchIO(-format => 'blast', -file => > '/tmp/somefile.blast'); > > This works...so is it possible to use IO::String to provide > Bio::SearchIO with BLAST output? > > Ryan > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From jason at bioperl.org Mon Oct 30 16:44:18 2006 From: jason at bioperl.org (Jason Stajich) Date: Mon, 30 Oct 2006 13:44:18 -0800 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <45466E5E.9000504@sendu.me.uk> References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> <45466E5E.9000504@sendu.me.uk> Message-ID: right - can't you just do: my $fh; open($fh, "megablast -d ... | ") || die $!; my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh); On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote: > Ryan Golhar wrote: >> I'm trying to parse some blast output w/o actually creating the >> output >> file. Instead, I'm capturing the output in a variable and would >> like to >> use IO::String to represent the file: >> >> $_ = `megablast -d somedatabase -i somesequence -D 2`; >> my $blast_file = new IO::String($_); >> my $searchio = new Bio::SearchIO(-format => 'blast', -fh => >> $blast_file); >> my $results = $searchio->next_result; >> my $hit = $results->next_hit; >> if (! defined($hit)) { >> warn "No BLAST hit for $accession on chr $chr for >> Seq/$orth_id/$organism\n\n"; >> return; >> } >> >> Now, when Bio::SearchIO tries to read the output line by line, >> instead >> it reads the entire output as 1 line. >> >> If I provide the output in a file and use: >> >> my $searchio = new Bio::SearchIO(-format => 'blast', -file => >> '/tmp/somefile.blast'); >> >> This works...so is it possible to use IO::String to provide >> Bio::SearchIO with BLAST output? > > Why must it be IO::String? 
Why not just open() your megablast and > provide $searchio the real filehandle? It would be faster that way > as well. > > Read the docs for `. Your usage above is inappropriate. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From lstein at cshl.edu Mon Oct 30 13:59:29 2006 From: lstein at cshl.edu (Lincoln Stein) Date: Mon, 30 Oct 2006 13:59:29 -0500 Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase Message-ID: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com> Hi All, I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not to validate. I have committed a new version to live and to the release candidate branch. I hope it isn't too late to get this into the release. Lincoln -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From huangyi1 at hkusua.hku.hk Tue Oct 31 00:46:20 2006 From: huangyi1 at hkusua.hku.hk (Huang Yi) Date: Tue, 31 Oct 2006 13:46:20 +0800 Subject: [Bioperl-l] bioperl1.5 and GD2.35 Message-ID: <200610310546.k9V5kQGT010481@hkusua.hku.hk> Hi, I just installed bioperl 1.4 from CPAN to my Gentoo linux computer. But the installation was failed. I had to install by force. However, the GD module couldn't be installed for some unknown reasons. I therefore use "emerge" tool of Gentoo to get bioperl and GD again. They are fine. The version of bioperl became upgrade to1.5 and GD was 2.35. 
However, when I tested it by using the program in the HOWTO wiki page (http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me:

Can't locate object method "png" via package "GD::Image" at /usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line 799, <> line 9.

On my other computer, bioperl1.4 and GD2.34 work fine. I therefore want to remove the CPAN bioperl from the system and re-install it, but it seems to be impossible.

Would you please give me some advice on how to get my GD and bioperl to work?

Thanks!

Huang Yi

From bix at sendu.me.uk Tue Oct 31 03:20:21 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 31 Oct 2006 08:20:21 +0000
Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase
In-Reply-To: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>
References: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>
Message-ID: <45470745.1050605@sendu.me.uk>

Lincoln Stein wrote:
> Hi All,
>
> I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not
> to validate. I have committed a new version to live and to the release
> candidate branch. I hope it isn't too late to get this into the release.

It isn't too late, thank you.

From avilella at gmail.com Tue Oct 31 08:54:39 2006
From: avilella at gmail.com (Albert Vilella)
Date: Tue, 31 Oct 2006 13:54:39 +0000
Subject: [Bioperl-l] catfile and catdir
Message-ID: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>

Hi,

I was testing the bioperl-run/t/PAML.t and stumbled upon this catdir/catfile error:

Can't locate object method "catdir" via package "Bio::Root::IO" at /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line 113.
BEGIN failed--compilation aborted at /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line 143.
Compilation failed in require at t/PAML.t line 64.
BEGIN failed--compilation aborted at t/PAML.t line 64.

Should we be using File::Spec for catdir and catfile instead of Root::IO?

Cheers,

Albert.
From Kevin.M.Brown at asu.edu Tue Oct 31 10:34:34 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Tue, 31 Oct 2006 08:34:34 -0700 Subject: [Bioperl-l] bioperl1.5 and GD2.35 Message-ID: <1A4207F8295607498283FE9E93B775B4023B5F3C@EX02.asurite.ad.asu.edu> Not really a Bioperl issue per se, but sounds like when you had Gentoo emerge GD it didn't include libpng and so didn't build the needed parts to create PNG type graphics. > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Huang Yi > Sent: Monday, October 30, 2006 10:46 PM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] bioperl1.5 and GD2.35 > > Hi, > > > > I just installed bioperl 1.4 from CPAN to my Gentoo linux > computer. But the > installation was failed. I had to install by force. > > > > However, the GD module couldn't be installed for some unknown reasons. > > > > I therefore use "emerge" tool of Gentoo to get bioperl and GD > again. They > are fine. The version of bioperl became upgrade to1.5 and GD was 2.35. > > > > However, when I tested it by using the program in HOWTO wiki page > (http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me: > > > > Can't locate object method "png" via package "GD::Image" at > /usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line > 799, <> line 9. > > > > In my other computer, bioperl1.4 and GD2.34 work fine. I > therefore want to > remove the CPAN bioperl from the system and re-install it, > but it seems to > be impossible. > > > > Would you please give me some advices on how to let my GD and > bioperl work. > > > > Thanks! 
>
>
>
> Huang Yi
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From hlapp at gmx.net Tue Oct 31 11:21:40 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 11:21:40 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
Message-ID: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>

On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:

> BTW, was that supposed to be Bio::AnnotatableI, or
> Bio::AnnotationHolderI?

Sorry, the former. I guess I got confused with FeatureHolders. Too bad Featureable isn't an English word.

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From hlapp at gmx.net Tue Oct 31 12:01:44 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:01:44 -0500
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net>

The only thing I would add to Jason's reply is that it is easy to do

if (! $seq->isa("Bio::SeqI")) {
    my $bioseq = Bio::Seq->new();
    $bioseq->primary_seq($seq);
    $seq = $bioseq;
}

and from that point on all your objects are Bio::SeqI compliant regardless of whether they were obtained that way or not.

Aside from that I wonder why there isn't a -primary_seq option in Bio::Seq::new - this would shorten the above into a (more perl'ish) single line:

$seq = Bio::Seq->new(-primary_seq=>$seq) unless $seq->isa("Bio::SeqI");

Any takers to add that capability?

-hilmar

On Oct 30, 2006, at 3:51 AM, Nathan S.
Haigh wrote: > In my script I retrieve sequences from GenBank in FASTA format by GI > numbers and optionally store the sequence in a cache using > Bio::DB::Fasta. On subsequent runs of the script, the cache is first > checked for the GI and returns the sequence if it is found or the > sequence is obtained from GenBank as above. > > I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have > returned a Bio::Seq object but rather it returns a Bio::PrimarySeq > object which is defined within the Bio::DB::Fasta file. This is > annoying, since $seq_obj in my script would be either a Bio::Seq if it > was obtained from GenBank or a Bio::PrimarySeq if obtained from the > cache and calling primary_id() on it doesn't do the expected thing > with > Bio::PrimarySeq: > ID: Bio::PrimarySeq::Fasta=HASH(0x89b4508) > > Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object? > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 31 12:08:56 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 11:08:56 -0600 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net> Message-ID: <001401c6fd0f$4239aa50$15327e82@pyrimidine> >> BTW, was that supposed to be Bio::AnnotatableI, or >> Bio::AnnotationHolderI? > > Sorry, the former. I guess I got confused with > FeatureHolders. Too bad Featureable isn't an English word. > > -hilmar Having SimpleAlign be AnnotatableI shouldn't be too much of a burden, since the only additional implemented method is annotation(). So, I think all the various Stockholm tags can be placed somewhere. 
A bit OT: were we planning on getting rid of the various *_tag_* methods in AnnotatableI at some point? I'm a bit confused as to why they were added. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Tue Oct 31 12:09:26 2006 From: jason at bioperl.org (Jason Stajich) Date: Tue, 31 Oct 2006 09:09:26 -0800 Subject: [Bioperl-l] catfile and catdir In-Reply-To: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com> References: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com> Message-ID: <1AD4DB38-E08D-4E47-8A59-6539068474CB@bioperl.org> Yep. Unless we want this to also exist in Root::IO and delegate to File::Spec. -jason On Oct 31, 2006, at 5:54 AM, Albert Vilella wrote: > Hi, > > I was testing the bioperl-run/t/PAML.t and stumbled upon this > catdir/catfile error: > > Can't locate object method "catdir" via package "Bio::Root::IO" at > /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line > 113. > BEGIN failed--compilation aborted at > /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line > 143. > Compilation failed in require at t/PAML.t line 64. > BEGIN failed--compilation aborted at t/PAML.t line 64. > > Should we be using File::Spec for catdir and catfile instead of > Root::IO? > > Cheers, > > Albert.
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From jason at bioperl.org Tue Oct 31 12:10:51 2006 From: jason at bioperl.org (Jason Stajich) Date: Tue, 31 Oct 2006 09:10:51 -0800 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net> References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu> <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net> Message-ID: <65F92B54-33FD-4D8F-90B7-49E2697CDBA2@bioperl.org> It just needs to have an annotation collection - so it would be Bio::AnnotatableI On Oct 31, 2006, at 8:21 AM, Hilmar Lapp wrote: > > On Oct 27, 2006, at 9:57 PM, Chris Fields wrote: > >> BTW, was that supposed to be Bio::AnnotatableI, or >> Bio::AnnotationHolderI? > > Sorry, the former. I guess I got confused with FeatureHolders. Too > bad Featureable isn't an English word.
> > -hilmar > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From hlapp at gmx.net Tue Oct 31 12:44:58 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 31 Oct 2006 12:44:58 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: References: Message-ID: Well isn't this a result of conflating some of the SeqFeatureI methods into the annotation collection? If I'm not mistaken on this then those methods were introduced in 1.5.0 and hence can go away without deprecation. -hilmar On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote: > Chris, > > I don't think the intent was to remove the methods, rather we'd > just call > deprecated(). Example from AnnotatableI:
>
> sub remove_tag {
>     my ($self, @args) = @_;
>
>     #uncomment in 1.6
>     #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');
>
>     return $self->annotation->remove_Annotations(@args);
> }
>
> With regards to "why", I can't reconstruct the entire rationale > myself but I > can say that the newer names make more sense. Take that example > above - its > function is to remove entire Annotations not just to remove tags, so > remove_Annotations is a better name. > > Brian O. > > > On 10/31/06 1:08 PM, "Chris Fields" wrote: > >> A bit OT: were we planning on getting rid of the various *_tag_* >> methods in >> AnnotatableI at some point? I'm a bit confused as to why they >> were added.
> > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bosborne11 at verizon.net Tue Oct 31 11:37:01 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 31 Oct 2006 12:37:01 -0400 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <001401c6fd0f$4239aa50$15327e82@pyrimidine> Message-ID: Chris, I don't think the intent was to remove the methods, rather we'd just call deprecated(). Example from AnnotatableI:

    sub remove_tag {
        my ($self, @args) = @_;

        #uncomment in 1.6
        #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');

        return $self->annotation->remove_Annotations(@args);
    }

With regards to "why", I can't reconstruct the entire rationale myself but I can say that the newer names make more sense. Take that example above - its function is to remove entire Annotations not just to remove tags, so remove_Annotations is a better name. Brian O. On 10/31/06 1:08 PM, "Chris Fields" wrote: > A bit OT: were we planning on getting rid of the various *_tag_* methods in > AnnotatableI at some point? I'm a bit confused as to why they were added. From cjfields at uiuc.edu Tue Oct 31 13:44:02 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 12:44:02 -0600 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: Message-ID: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine> Hilmar Lapp wrote: > Well isn't this a result of conflating some of the > SeqFeatureI methods into the annotation collection? > > If I'm not mistaken on this then those methods were > introduced in 1.5.0 and hence can go away without deprecation. > > -hilmar > > On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote: > >> Chris, >> >> I don't think the intent was to remove the methods, rather we'd just >> call deprecated().
Example from AnnotatableI:
>>
>> sub remove_tag {
>>     my ($self, @args) = @_;
>>
>>     #uncomment in 1.6
>>     #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');
>>
>>     return $self->annotation->remove_Annotations(@args);
>> }
>>
>> With regards to "why", I can't reconstruct the entire rationale >> myself but I can say that the newer names make more sense. Take that >> example above - its function is to remove entire Annotations not >> just to remove tags, so remove_Annotations is a better name. >> >> Brian O. >> >> >> On 10/31/06 1:08 PM, "Chris Fields" wrote: >> >>> A bit OT: were we planning on getting rid of the various *_tag_* >>> methods in AnnotatableI at some point? I'm a bit confused as to why >>> they were added. Sorry Brian, what I meant was, based on CVS history, the various *tag* methods in AnnotatableI were added all at once, with deprecations already present in the commit. So the methods weren't there to begin with, then added only to be deprecated later? Hence the confusion... I think Hilmar's right; the CVS history indicates these were added just prior to rel. 1.5 by Allen and seem to be related to SeqFeatureI. I'm sure the intent was good, but they contradict methods in the Feature/Annotation HOWTO on retrieving Annotation objects via the Annotation::Collection object. I think that agrees with your point about the various Annotation* method names being the more appropriate ones. Does everybody agree we should just remove them? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept.
of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 31 13:53:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 12:53:16 -0600 Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id() In-Reply-To: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net> Message-ID: <000001c6fd1d$d4359c80$15327e82@pyrimidine> > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Tuesday, October 31, 2006 11:02 AM > To: n.haigh at sheffield.ac.uk > Cc: Bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id() > > The only thing I would add to Jason's reply is that it is easy to do > > if (! $seq->isa("Bio::SeqI")) { > my $bioseq = Bio::Seq->new(); > $bioseq->primary_seq($seq); > $seq = $bioseq; > } > > and from that point on all your objects are Bio::SeqI > compliant regardless of whether they were obtained that way or not. > > Aside from that I wonder why there isn't a -primary_seq > option in Bio::Seq::new - this would shorten the above into a > (more perl'ish) single line: > > $seq = Bio::Seq->new(-primary_seq=>$seq) unless > $seq->isa("Bio::SeqI"); > > Anyone takers to add that capability? > > -hilmar Sounds good to me! Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From nhansen at nhgri.nih.gov Tue Oct 31 14:51:23 2006 From: nhansen at nhgri.nih.gov (Nancy Hansen) Date: Tue, 31 Oct 2006 14:51:23 -0500 (EST) Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling Message-ID: Hello, As sequencing centers begin to deposit trace data from "Medical Sequencing" projects into the public archives, there is now the need to "anonymize" sequence trace files by removing embedded information which might be used to identify the individual who was the original source of the DNA being sequenced. 
I was hoping I might be able to use Bio::SeqIO to manipulate the comments contained in an SCF-formatted trace file, but I'm finding that Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information. Since SCF is a widely-accepted standard for trace files, would it be reasonable to include fields like "scf_comments" and "scf_header" in a Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them? Likewise, it would be great if write_seq could pull these values right from a SequenceTrace object rather than requiring them as arguments. I'd be happy to help in this effort if necessary. Thanks, --Nancy ************************************* Nancy F. Hansen, PhD nhansen at nhgri.nih.gov Bioinformatics Group NIH Intramural Sequencing Center (NISC) 5625 Fishers Lane Rockville, MD 20852 Phone: (301) 435-1560 Fax: (301) 435-6170 From lincoln.stein at gmail.com Tue Oct 31 15:24:17 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 31 Oct 2006 15:24:17 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <000001c6f78b$d1c65a30$15327e82@pyrimidine> References: <453E309B.9090007@sendu.me.uk> <000001c6f78b$d1c65a30$15327e82@pyrimidine> Message-ID: <6dce9a0b0610311224x79256b29sf102eb5c35865caf@mail.gmail.com> Are you going to go ahead with 1.52_XX ? If so, I will code GBrowse to look for 1.52 or higher. Lincoln On 10/24/06, Chris Fields wrote: > > .. > > > > 'handle'? I think it shows up as '6.2.13' simply because it was uploaded > > with the filename Perl6-Pugs-6.2.13.tar.gz > > Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is > '6.002013'. So maybe we should follow a similar convention. Seems easier > and less confusing to me, at least. > > > As you point out, the code has the kind of $VERSION number we've been > > suggesting in this thread: > > > > > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': > > > > > > our $VERSION = 6.002013; > > > > > > That's also a very perlish-way to do it. 
And there are no developer > > > versions of Pugs, since it is always under active development. We > could > > try > > > something like: > > > > > > our $VERSION = 1.005002_01; > > > > Yes, this was already like one of my suggestions (1.0502_01), but I > > brought up the concern that 1.05 might be < 1.4. > > > > So then we have a question: do we try and fumble a 1.4 compatible number > > by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if > > it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no > > room for RC numbering, or 1.006000010 (1.6.0.10) - the first final > > release following some 1.006000_001 (1.6.0.01 == rc1) RCs? > > I would go for the clean break if it follows perl/CPAN convention. > '1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing. > > If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6 > RC1, 1.6 RC2 etc then that would be consistent and perl-compatible. > > BTW, the reason I looked at Pugs was to see what some of the Perl6 > developers were using. Who knows; they'll probably change it! > > .. > > > I don't think it would be a hassle; on the contrary it would be very > > useful to know the CPAN distribution actually works. I'm very happy with > > the idea that a release candidate gets fully tested... > > So you obviously feel strongly about it! ;> > > I don't have a problem as long as we stick with doing this from now on ( > i.e. > have a consistent versioning scheme, release policy, CPAN release policy, > etc). Would be nice for Jason/Brian/Hilmar to chime in as to the > reasoning > behind the older versioning scheme. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From hlapp at gmx.net Tue Oct 31 16:53:58 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 31 Oct 2006 16:53:58 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine> References: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine> Message-ID: On Oct 31, 2006, at 1:44 PM, Chris Fields wrote: > Does everybody agree we should just remove them? I wish you could but I'm afraid that would break stuff? Otherwise why were they added in the first place? I thought Bio::SeqFeature::Annotated needs them maybe? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 31 17:41:17 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 16:41:17 -0600 Subject: [Bioperl-l] AnnotatableI tag methods, was Rfam/Pfam annotations and SimpleAlign In-Reply-To: Message-ID: <000001c6fd3d$ae37c240$15327e82@pyrimidine> > On Oct 31, 2006, at 1:44 PM, Chris Fields wrote: > > > Does everybody agree we should just remove them? > > I wish you could but I'm afraid that would break stuff? > Otherwise why were they added in the first place? I thought > Bio::SeqFeature::Annotated needs them maybe? > > -hilmar > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== Yep, removing them clobbers a ton of tests, including anything that requires SeqIO::FTHelper. Looks like SeqFeature::Generic and a few others use them. 
I could understand if these were meant to be permanent methods, but why add these in if they were to be deprecated in 1.6? Something that was meant to be a transition but wasn't finished? That seems to be indicated in the commented out lines for all the *tag* methods:

    #uncomment in 1.6
    #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');

Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From lincoln.stein at gmail.com Tue Oct 31 18:18:07 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 31 Oct 2006 18:18:07 -0500 Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning In-Reply-To: References: Message-ID: <6dce9a0b0610311518l3bec852q5d04a9b488621377@mail.gmail.com> Hi Keith, The current Bio/DB/GFF/Util/Binning.pm file just contains the hierarchical binning system that I implemented some time ago. Where is the R-tree system that you describe? How much of an improvement did the R-tree scheme give over the hierarchical scheme? FYI the GFF3 implementation uses a different binning scheme in which there is a fixed-size bin. Every time a feature overlaps a bin, it creates a new row in a table. So big features will have multiple rows and little features that fit inside a bin will have only one row. The query for this is simpler and seems to give the same relative speedup as the hierarchical binning system. I'd really like to get these queries to go as fast as possible and would love to work with you on this if you're interested. Lincoln On 10/19/06, Keith Player wrote: > > I know that there may be some changes resulting from new GFF3 > implementations, > but thought I would see if the following is useful anyway. > > I implemented the R-tree binning schema as used by > Bio::DB::GFF::Util::Binning > and as mentioned in this article: > > I tested the following query on a normal table (no binning), but it > assumes > that you know the longest range in the table.
So for example with a table > of > human genes, where the longest gene we know of is around 2.4Mb.
>
> SELECT COUNT(*) AS count FROM groups g WHERE g.start > max(0,[start-2.4Mb]) AND g.start < [end] AND g.end > [start] AND g.chromosome = '1'
>
> so for 100Mb:101Mb
>
> SELECT COUNT(*) AS count FROM groups g WHERE g.start > 97600000 AND g.start < 101000000 AND g.end > 100000000 AND g.chromosome = '1'
>
> where [start] and [end] define the region of interest. This query > outperforms > the R-Tree implementation on all tests that I have performed (for lengths > of > 200bp to 10Mb across a whole chromosome). Could this be of some practical > use? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From bosborne11 at verizon.net Tue Oct 31 21:31:49 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 31 Oct 2006 22:31:49 -0400 Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling In-Reply-To: Message-ID: Nancy, It looks like a good place to start would be the get_header() and _get_header() methods in Bio::SeqIO::scf. If you read t/scf.t you can see that the author, at some point, wanted get_header() to return meaningful information, but stepping through the test shows it returning a lot of UNDEF. Now I don't know if this is due to the method or the source SCF file, but you might be able to get these methods to work yourself. But to answer your questions, yes, it certainly sounds reasonable that these values would be extracted by Bio::SeqIO::scf. Brian O.
On 10/31/06 3:51 PM, "Nancy Hansen" wrote: > > Hello, > > As sequencing centers begin to deposit trace data from "Medical > Sequencing" projects into the public archives, there is now the need to > "anonymize" sequence trace files by removing embedded information which > might be used to identify the individual who was the original source of > the DNA being sequenced. > > I was hoping I might be able to use Bio::SeqIO to manipulate the > comments contained in an SCF-formatted trace file, but I'm finding that > Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information. > Since SCF is a widely-accepted standard for trace files, would it be > reasonable to include fields like "scf_comments" and "scf_header" in a > Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them? > Likewise, it would be great if write_seq could pull these values right > from a SequenceTrace object rather than requiring them as arguments. > > I'd be happy to help in this effort if necessary. > > Thanks, > --Nancy > > ************************************* > Nancy F. 
Hansen, PhD nhansen at nhgri.nih.gov > Bioinformatics Group > NIH Intramural Sequencing Center (NISC) > 5625 Fishers Lane > Rockville, MD 20852 > Phone: (301) 435-1560 Fax: (301) 435-6170 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Sun Oct 1 17:05:25 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 1 Oct 2006 12:05:25 -0500 Subject: [Bioperl-l] Tests involving remote databases In-Reply-To: <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net> References: <000001c6e3e6$81630010$15327e82@pyrimidine> <6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net> <79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu> <451E3707.4090400@sendu.me.uk> <0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu> <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net> Message-ID: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu> On Sep 30, 2006, at 4:43 PM, Hilmar Lapp wrote: > > On Sep 30, 2006, at 10:57 AM, Chris Fields wrote: > >> There should be a failed test to let us know of the problem. As >> currently set up, the XEMBL server failure doesn't show up in >> Test::Harness test summaries. Biblio_biofetch.t had the similar >> problems before Brian's fixes. > > Just keep in mind that you may not want somebody's CPAN installation > to fail (or require a 'forced' install) just because some server > happens to be down for maintenance. > > -hilmar I don't think this would be a problem unless users specifically set BIOPERLDEBUG to 1, which is something most people don't bother with before installation (and probably not something we should promote for normal installation anyway). So, for CPAN installation we would suggest that BIOPERLDEBUG be 0 or not set at all, and outline the reasons why. The idea is to retain current behavior (remote DB access will not be run unless BIOPERLDEBUG is set to 1) and apply it to all tests requiring such access. 
Otherwise, just those tests are skipped (and not the rest of the tests, which occurs currently). If BIOPERLDEBUG is set, the next tests would check the URL, which passes/fails (based on the specific value of $@), and runs/skips tests based on the mere presence of $@, which indicates some URL issue. You can do this with Test::More, but I'm not sure this can be done with Test.pm or Test::Simple. The current behavior just skips all tests based on a single failed URL. Then, Test::Harness, as currently set, shows skipped tests as passed. The last run I posted previously where XEMBL_DB.t remote DB tests failed, I also ran all tests (make test) and get this, which doesn't tell us that the remote URL failed: ----------------------------------------- ... t/WABA.......................ok t/XEMBL_DB...................ok t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ztr.t tests ok All tests successful, 5 subtests skipped. ----------------------------------------- Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Oct 1 17:17:24 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 1 Oct 2006 12:17:24 -0500 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: References: <7A592EAB-A869-4A6C-BFA8-F73F3DFD8F5B@gmx.net> <09FB1EB0-2C1C-4FCF-8339-E78556EFEFF2@uiuc.edu> <8D75FE6D-C02D-4A86-93FA-B7256050AF11@uiuc.edu> <40155903-555A-4662-BCCE-38E5E3784118@uiuc.edu> <54E79A5F-5446-4D8E-AD26-B70894048D60@gmx.net> <1D69005A-DF0E-4F37-93FE-7577A32CC625@gmx.net> Message-ID: The '-w' flag on the shebang line is the source of those errors. I never set it anymore on Windows due to this; I just use the 'use warnings' pragma. If you use 'perl -I. t/test.t' you can normally get around the '-w' assumed by using 'make test'. 
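[Editorial sketch, not part of the original message.] The point above about '-w' versus the 'use warnings' pragma can be illustrated with core Perl alone. The '-w' switch sets the global $^W, so it also turns warnings on inside loaded modules that never asked for them, while 'use warnings' is lexically scoped and individual categories such as 'redefine' can be switched off per block. The sub name display_name below is only a stand-in chosen for illustration; whether this removes the specific warnings seen under ActivePerl is as reported in the thread, not something this sketch verifies.

```perl
#!/usr/bin/perl
# Note: no -w on the shebang line; warnings are enabled lexically instead.
use strict;
use warnings;

# Collect warnings instead of printing them, so the effect is easy to inspect.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Hypothetical sub standing in for a module method that ends up defined twice.
sub display_name { return 'first' }

{
    # Inside this block the 'redefine' category is switched off,
    # so replacing the sub is silent.
    no warnings 'redefine';
    *display_name = sub { return 'second' };
}
my $quiet = scalar @warnings;

# The same kind of redefinition with the category in force is reported.
*display_name = sub { return 'third' };
my $noisy = scalar(@warnings) - $quiet;

print "warnings while suppressed: $quiet\n";
print "warnings while enabled:    $noisy\n";
print "last warning: $warnings[-1]" if @warnings;
```

Running a test file directly (for example with 'perl -I. t/test.t', as suggested above) keeps whatever warning scope the file itself declares.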
I will try running tests on bioperl-db and bioperl tomorrow on WinXP to confirm these. Chris On Sep 30, 2006, at 6:10 PM, Seth Johnson wrote: > How do I get rid of all of the warnings for "redefined subroutines" > during > the test?? It clutters the output and I can't see the errors. > > On 9/30/06, Hilmar Lapp wrote: >> >> It doesn't shed more light but it does raise an alert flag. All tests >> are supposed to pass. The fact that they don't means the problems you >> are seeing have nothing to do with your specific data or script. >> >> First off - can anyone else confirm those errors using the latest >> Bioperl-db and Bioperl? >> >> Second - Seth could you run those tests individually, e.g., using >> >> $ make test test_02species TEST_VERBOSE=1 >> >> and similarly for the other tests that have failures and post the >> output. Let's start with 02species and 03simpleseq. >> >> -hilmar >> >> On Sep 30, 2006, at 5:44 PM, Seth Johnson wrote: >> >>> There are errors during the test. Here's their summary: >>> ____________________________ >>> Failed Test Stat Wstat Total Fail Failed List of Failed >>> ------------------------------------------------------------- >>> t\02species.t 65 2 3.08% 63 65 >>> t\03simpleseq.t 1 256 59 106 179.66% 7-59 >>> t\04swiss.t 52 14 26.92% 25 27-34 38-42 >>> t\12ontology.t 2 512 738 1471 199.32% 3-738 >>> t\16obda.t 12 3 25.00% 10-12 >>> ____________________________ >>> >>> May be that can shed some light on the problem?!?! >>> >>> On 9/29/06, Hilmar Lapp < hlapp at gmx.net> wrote:This may in fact be >>> a knock-on effect of the fixes? >>> >>> Seth, did you run the test suite that comes with bioperl-db, and did >>> you get any errors? >>> >>> -hilmar >>> >>> On Sep 28, 2006, at 2:26 PM, Chris Fields wrote: >>> >>>> Seth, >>>> >>>> The organism issue is a bug and has been reported, though I thought >>>> it was fixed. 
>>>> >>>> The lack of the date and the version is a bit odd, but there have >>>> been a lot of changes lately to bioperl-live (core bioperl in CVS), >>>> and a few to bioperl-db. How old is your bioperl and bioperl-db >>>> installation. Hilmar, any additional thoughts? >>>> >>>> Chris >>>> >>>> On Sep 28, 2006, at 11:10 AM, Seth Johnson wrote: >>>> >>>>> Thank you. That takes care of that, however, I do have another >>>>> gripe. When >>>>> running my script, quoted before, with "my $out = >>>>> Bio::SeqIO->newFh('-format' => 'genbank');", I have several key >>>>> pieces of >>>>> information missing. The most important one is the version >>>>> number. There's >>>>> also a date missing, and source organism name is corrupted. >>>>> Here's what I >>>>> get: >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> LOCUS NM_014580 2145 bp dna linear >>>>> UNK >>>>> DEFINITION Homo sapiens solute carrier family 2, (facilitated >>>>> glucose >>>>> transporter) member 8 (SLC2A8), mRNA. >>>>> ACCESSION NM_014580 >>>>> SOURCE sapiens. >>>>> ORGANISM sapiens >>>>> Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; >>>>> Bilateria; >>>>> Coelomata; Deuterostomia; Chordata; Craniata; >>> Vertebrata; >>>>> Gnathostomata; Teleostomi; Euteleostomi; >>>>> Sarcopterygii; >>>>> Tetrapoda; >>>>> Amniota; Mammalia; Theria; Eutheria; Euarchontoglires; >>>>> Primates; >>>>> Haplorrhini; Simiiformes; Catarrhini; Hominoidea; >>>>> Hominidae; >>>>> Homo/Pan/Gorilla group; Homo. >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> All of the missing information is stored in BioSQL and >>>>> theoretically should >>>>> be in the outpu. Here's how NCBI genbank file looks: >>>>> >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> LOCUS NM_014580 2145 bp mRNA linear >>>>> PRI 17-OCT-2005 >>>>> DEFINITION Homo sapiens solute carrier family 2, (facilitated >>>>> glucose >>>>> transporter) member 8 (SLC2A8), mRNA. 
>>>>> ACCESSION NM_014580 >>>>> VERSION NM_014580.3 GI:51870928 >>>>> KEYWORDS . >>>>> SOURCE Homo sapiens (human) >>>>> ORGANISM Homo sapiens >>>>> >>>>> Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; >>>>> Euteleostomi; >>>>> Mammalia; Eutheria; Euarchontoglires; Primates; >>>>> Haplorrhini; >>>>> Catarrhini; Hominidae; Homo. >>>>> >>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>> >>>>> >>>>> On 9/28/06, Chris Fields wrote: >>>>>> >>>>>> Those are from the excessively paranoid '-w' flag on the shebang >>>>>> line. If you remove the flag but add the 'use warnings' pragma >>> the >>>>>> 'subroutine x redefined' warnings go away. This, BTW, is one >>> of the >>>>>> quirks of the ActivePerl distribution; other OSs don't have the >>> same >>>>>> problem. >>>>>> >>>>>> The 'solution' described on that page is actually a workaround, >>>>>> not a >>>>>> bugfix. It causes problems with stack traces with error handling >>>>>> but >>>>>> seems harmless beyond that. I haven't been able to find a >>>>>> satisfactory fix which works on all OS's. >>>>>> >>>>>> Chris >>>>>> >>>>>> >>>>>> On Sep 28, 2006, at 10:42 AM, Seth Johnson wrote: >>>>>> >>>>>>> This is under Windows, but using ActiveState Komodo 3.5 and >>>>>>> their >>>>>>> latest Perl for Windows and latest BioPerl & BioPerl-db from >>>>>>> CVS. >>>>>>> >>>>>>> I actually just stumbled upon a solution. It's described in the >>>>>>> "Installing Bioperl on Windows" by adding a comma after >>> $class: in >>>>>>> Bio::Root::Root throw() subroutine. Thanks for hinting me about >>>>>>> what I run it on. >>>>>>> >>>>>>> The code works now, BUT it spews whole bunch of warnings about >>>>>>> "Subroutine .... redefined": >>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\BioEntry >>>>>>> .pm line 88. >>>>>>> Subroutine object_id redefined at c:/Perl/site/lib/Bio >>> \BioEntry.pm >>>>>>> line 128. 
>>>>>>> Subroutine version redefined at c:/Perl/site/lib/Bio\BioEntry.pm >>>>>>> line 150. >>>>>>> Subroutine authority redefined at c:/Perl/site/lib/Bio >>> \BioEntry.pm >>>>>>> line 171. >>>>>>> Subroutine namespace redefined at c:/Perl/site/lib/Bio >>> \BioEntry.pm >>>>>>> line 192. >>>>>>> Subroutine display_name redefined at c:/Perl/site/lib/Bio >>>>>>> \BioEntry.pm line 217. >>>>>>> Subroutine description redefined at c:/Perl/site/lib/Bio >>>>>>> \BioEntry.pm line 241. >>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\Root.pm >>> line >>>>>>> 201. >>>>>>> Subroutine verbose redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \Root.pm >>>>>>> line 234. >>>>>>> Subroutine _register_for_cleanup redefined at c:/Perl/site/lib/ >>> Bio >>>>>>> \Root\Root.pm line 246. >>>>>>> Subroutine _unregister_for_cleanup redefined at c:/Perl/site/ >>>>>>> lib/ >>>>>>> Bio >>>>>>> \Root\Root.pm line 256. >>>>>>> Subroutine _cleanup_methods redefined at c:/Perl/site/lib/Bio >>> \Root >>>>>>> \Root.pm line 263. >>>>>>> Subroutine throw redefined at c:/Perl/site/lib/Bio\Root\Root.pm >>>>>>> line 316. >>>>>>> Subroutine debug redefined at c:/Perl/site/lib/Bio\Root\Root.pm >>>>>>> line 379. >>>>>>> Subroutine _load_module redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \Root.pm line 398. >>>>>>> Subroutine DESTROY redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \Root.pm >>>>>>> line 426. >>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\RootI.pm >>> line >>>>>>> 117. >>>>>>> Subroutine _initialize redefined at c:/Perl/site/lib/Bio\Root >>>>>>> \RootI.pm line 128. >>>>>>> ... >>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>>>> >>>>>>> >>>>>>> On 9/28/06, Chris Fields wrote: I had >>> problems >>>>>>> with bioperl-db on native WinXP (not cygwin), but I >>>>>>> did manage to get it running in cygwin with some effort. The >>> issue >>>>>>> on native WinXP was related to Bio::Root::Root::throw(), though. 
>>>>>>> >>>>>>> There is a bug and workaround filed on Bugzilla, but I haven't >>>>>>> worked >>>>>>> on it in a while (and the workaround has some problems as >>> well). I >>>>>>> may try running it again to see what happens. >>>>>>> >>>>>>> http://bugzilla.open-bio.org/show_bug.cgi?id=1938 >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On Sep 28, 2006, at 9:04 AM, Hilmar Lapp wrote: >>>>>>> >>>>>>>> Very odd. This is under Windows, presumably using Cygwin? >>>>>>>> >>>>>>>> The method Bio::Root::Root::throw() clearly exists, and >>>>>>>> PersistentObject inherits from it. The exception it was >>> trying to >>>>>>>> throw has nothing to do with failure or success to find the >>>>>>>> database >>>>>>>> row (actually it did succeed since otherwise it wouldn't >>> construct >>>>>>>> the object) but with dynamically loading a class, presumably >>>>>>>> Bio::DB::Persistent::Seq. >>>>>>>> >>>>>>>> Are you using the 1.5.x release of bioperl? >>>>>>>> >>>>>>>> Does anyone on the list have any experience with these sorts of >>>>>>>> things on Windows? >>>>>>>> >>>>>>>> (Seth, I've moved this thread to the bioperl list, since >>>>>>>> this is >>>>>>> what >>>>>>>> the problem is about.) >>>>>>>> >>>>>>>> -hilmar >>>>>>>> >>>>>>>> On Sep 27, 2006, at 1:39 PM, Seth Johnson wrote: >>>>>>>> >>>>>>>>> Hello guys, >>>>>>>>> >>>>>>>>> I successfully populated the biosql database, thanks to you. >>>>>>>>> Now, >>>>>>>>> I'm >>>>>>>>> trying to retrieve a sequence from it following the example >>> from >>>>>>>>> BOSC2003 >>>>>>>>> slides and ran into uninformative error (at least to me it >>>>>>>>> doesn't >>>>>>>>> mean >>>>>>>>> anyting). I suspect that I'm missing something and hope you >>> can >>>>>>>>> point me in >>>>>>>>> the right direction. 
Here's my source code: >>>>>>>>> >>>>>>> >>> ------------------------------------------------------------------- >>>>>>> -- >>>>>>>>> - >>>>>>>>> --- >>>>>>>>> #!/usr/bin/perl -w >>>>>>>>> use strict; >>>>>>>>> use warnings; >>>>>>>>> >>>>>>>>> use Bio::Seq; >>>>>>>>> use Bio::Seq::SeqFactory; >>>>>>>>> use Bio::DB::SimpleDBContext; >>>>>>>>> use Bio::DB::BioDB; >>>>>>>>> >>>>>>>>> my $dbc = Bio::DB::SimpleDBContext->new( >>>>>>>>> -driver => 'mysql', >>>>>>>>> -dbname => 'BioSQL_1', >>>>>>>>> -host => ' 192.168.1.3', >>>>>>>>> -user => 'xxxxx', >>>>>>>>> -pass => 'xxxxxx' >>>>>>>>> ); >>>>>>>>> >>>>>>>>> my $db = Bio::DB::BioDB->new(-database => 'biosql', >>>>>>>>> -dbcontext => $dbc); >>>>>>>>> >>>>>>>>> my $seq = Bio::Seq->new(-accession_number => 'NM_014580', - >>>>>>>>> namespace => >>>>>>>>> 'refseq_H_sapiens'); >>>>>>>>> my $seqfact = Bio::Seq::SeqFactory->new(-type => 'Bio::Seq'); >>>>>>>>> my $adp = $db->get_object_adaptor($seq); >>>>>>>>> my $dbseq = $adp->find_by_unique_key($seq, -obj_factory => >>>>>>> $seqfact); >>>>>>>>> >>>>>>>>> my $out = Bio::SeqIO->newFh('-format' => 'EMBL'); >>>>>>>>> print $out $dbseq; >>>>>>>>> >>>>>>>>> exit; >>>>>>>>> >>> ----------------------------------------------------------------- >>>>>>>>> >>>>>>>>> Just when the "find_by_unique_key" function is executed I >>> get the >>>>>>>>> following >>>>>>>>> error: >>>>>>>>> >>>>>>>>> ================================ >>>>>>>>> Undefined subroutine &Bio::Root::Root::throw called at >>>>>>>>> c:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm line >>> 199. >>>>>>>>> ================================ >>>>>>>>> >>>>>>>>> The sequence does exist in the database. I checked that. Any >>>>>>>>> ideas??? 
>>>>>>>>> >>>>>>>>> -- >>>>>>>>> Best Regards, >>>>>>>>> >>>>>>>>> >>>>>>>>> Seth Johnson >>>>>>>>> Senior Bioinformatics Associate >>>>>>>>> _______________________________________________ >>>>>>>>> BioSQL-l mailing list >>>>>>>>> BioSQL-l at lists.open-bio.org >>>>>>>>> http://lists.open-bio.org/mailman/listinfo/biosql-l >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> =========================================================== >>>>>>>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>>>>>>> =========================================================== >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Bioperl-l mailing list >>>>>>>> Bioperl-l at lists.open-bio.org >>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>>>>> >>>>>>> Christopher Fields >>>>>>> Postdoctoral Researcher >>>>>>> Lab of Dr. Robert Switzer >>>>>>> Dept of Biochemistry >>>>>>> University of Illinois Urbana-Champaign >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Best Regards, >>>>>>> >>>>>>> >>>>>>> Seth Johnson >>>>>>> Senior Bioinformatics Associate >>>>>>> >>>>>>> Ph: (202) 470-0900 >>>>>>> Fx: (775) 251-0358 >>>>>> >>>>>> Christopher Fields >>>>>> Postdoctoral Researcher >>>>>> Lab of Dr. Robert Switzer >>>>>> Dept of Biochemistry >>>>>> University of Illinois Urbana-Champaign >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Best Regards, >>>>> >>>>> >>>>> Seth Johnson >>>>> Senior Bioinformatics Associate >>>>> >>>>> Ph: (202) 470-0900 >>>>> Fx: (775) 251-0358 >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> Christopher Fields >>>> Postdoctoral Researcher >>>> Lab of Dr. 
Robert Switzer >>>> Dept of Biochemistry >>>> University of Illinois Urbana-Champaign >>>> >>>> >>>> >>> >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >>> >>> >>> >>> -- >>> Best Regards, >>> >>> >>> Seth Johnson >>> Senior Bioinformatics Associate >>> >>> Ph: (202) 470-0900 >>> Fx: (775) 251-0358 >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> >> > > > -- > Best Regards, > > > Seth Johnson > Senior Bioinformatics Associate > > Ph: (202) 470-0900 > Fx: (775) 251-0358 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From osborne1 at optonline.net Sun Oct 1 21:49:47 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Sun, 01 Oct 2006 17:49:47 -0400 Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset In-Reply-To: <20061001183214.GB12075@iucha.net> Message-ID: Florin, This is fixed in CVS now. What had happened is that the DIP file had some minimal protein (node) entries where the only id available was DIP's internal identifier. Not ideal to have to use these as accessions but there's no other choice. Thank you for the note, and in the future write to bioperl-l since there may be others who are interested in hearing about what you've encountered. Brian O. On 10/1/06 2:32 PM, "Florin Iucha" wrote: > Hello, > > I have downloaded a CVS snapshot [1] of your module, bioperl-network, and > I am using it to read the 20060402 edition release of the DIP [2] dataset. 
> > Starting with the simple program you show in the man page: > > my $io = Bio::Network::IO->new(-format => 'psi', > -file => $ARGV[0]); > > my $network = $io->next_network; > > I get 772 instances of: > > Use of uninitialized value in string eq at > /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 326. > > I don't know if it is just an annoyance or something bad, so you might > want to take a look at it. > > Thank you for your work, > florin > > [1] http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-network/ > [2] http://dip.doe-mbi.ucla.edu/ From osborne1 at optonline.net Sun Oct 1 21:56:39 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Sun, 01 Oct 2006 17:56:39 -0400 Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset In-Reply-To: <20061001211844.GC12075@iucha.net> Message-ID: Florin, I'm not seeing any segmentation fault using the same file you're using as input (dip20060402.mif). I'm assuming you don't see this error when you use smaller files as input, like those in the t/data directory. When I watch the script in top I see Perl using about 135Mb (RSIZE) right before the script exits. How much memory do you use? Thank you for the note, and in the future write to bioperl-l since there may be others who are interested in hearing about what you've encountered. Brian O. On 10/1/06 5:18 PM, "Florin Iucha" wrote: > On Sun, Oct 01, 2006 at 01:32:14PM -0500, Florin Iucha wrote: >> I have downloaded a CVS snapshot [1] of your module, bioperl-network, and >> I am using it to read the 20060402 edition release of the DIP [2] dataset. > > Using the attached script, I am getting a segmentation fault at the > end, right after printing "That's all, Folks!" Maybe some cleanup is > going off in a wrong direction. 
> > florin From florin at iucha.net Mon Oct 2 00:24:03 2006 From: florin at iucha.net (Florin Iucha) Date: Sun, 1 Oct 2006 19:24:03 -0500 Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset In-Reply-To: References: <20061001211844.GC12075@iucha.net> Message-ID: <20061002002403.GD12075@iucha.net> On Sun, Oct 01, 2006 at 05:56:39PM -0400, Brian Osborne wrote: > I'm not seeing any segmentation fault using the same file you're using as > input (dip20060402.mif). I'm assuming you don't see this error when you use > smaller files as input, like those in the t/data directory. The t/data files are fine. Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the MINT [1] database does not produce the crash. It has a new warning, however: Can't call method "text" on an undefined value at /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290. > When I watch the script in top I see Perl using about 135Mb (RSIZE) right > before the script exits. How much memory do you use? "ps ux" tells me VSZ = 272788 and RSZ = 254992. This is on x86-64 with 64 bit perl. The box has 2 GB of physical memory so these numbers don't seem to be a concern. > Thank you for the note, and in the future write to bioperl-l since there may > be others who are interested in hearing about what you've encountered. Do'h! You have the list address loud and clear in three places, but I got your contact info from the AUTHORS. Will use the proper channel from now on! Thanks, florin [1] ftp://mint.bio.uniroma2.it/pub/release/psi1/ -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: From cjfields at uiuc.edu Mon Oct 2 04:35:22 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 1 Oct 2006 23:35:22 -0500 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: Message-ID: <000001c6e5dc$2eceabe0$15327e82@pyrimidine> Seth, What version of MySQL and perl are you using? I'm using MySQL 5.0.18 (but am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819. I ran into a few problems with bioperl-db tests which were unrelated to the ones below, but I'm wondering if it is a difference in MySQL versions. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Seth Johnson > Sent: Saturday, September 30, 2006 6:35 PM > To: Hilmar Lapp > Cc: Chris Fields; Bioperl List > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > Here're complete test details: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ... > FAILED tests 10-12 > Failed 3/12 tests, 75.00% okay > Failed Test Stat Wstat Total Fail Failed List of Failed > -------------------------------------------------------------------------- > ----- > t\02species.t 65 2 3.08% 63 65 > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > t\12ontology.t 2 512 738 1471 199.32% 3-738 > t\16obda.t 12 3 25.00% 10-12 > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From torsten.seemann at infotech.monash.edu.au Mon Oct 2 06:06:50 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Mon, 02 Oct 2006 16:06:50 +1000 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net> References: <451C8ED8.2060003@infotech.monash.edu.au> <451CC40D.2030401@sendu.me.uk> <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net> Message-ID: <4520AC7A.1050009@infotech.monash.edu.au> >>> I have removed all use/@ISA Bio::Root::Object references from >>> bioperl-live, except for those in Bio::Root::* itself: >> So I'd say they're both relics that can be removed. In fact I was >> planning on getting rid off all references to both of these modules >> before you did, so thanks! :) > I think they can go. It's probably a pre-1.0 deprecation that somehow > was never followed through on. Today I did a fresh CVS checkout of bioperl-live, and deleted the following modules and tests, and all tests passed with BIOPERLDEBUG=0 * Bio::Root::Err * Bio::Root::Global * Bio::Root::IOManager * Bio::Root::Object * Bio::Root::Storable * Bio::Root::Utilities # may be used by third parties? * Bio::Root::Vector * Bio::Root::Xref * t/Root-Utilities.t # need to keep if we keep Utilities.pm * t/RootStorable.t Should we schedule for deprecation, or deprecate immediately as Hilmar suggested they were meant to be deprecated long ago ? 
-- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From bix at sendu.me.uk Mon Oct 2 09:40:02 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 02 Oct 2006 10:40:02 +0100 Subject: [Bioperl-l] Tests involving remote databases In-Reply-To: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu> References: <000001c6e3e6$81630010$15327e82@pyrimidine> <6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net> <79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu> <451E3707.4090400@sendu.me.uk> <0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu> <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net> <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu> Message-ID: <4520DE72.4000603@sendu.me.uk> Chris Fields wrote: > > The idea is to retain current behavior (remote DB access will not be > run unless BIOPERLDEBUG is set to 1) and apply it to all tests > requiring such access. Otherwise, just those tests are skipped (and > not the rest of the tests, which occurs currently). If BIOPERLDEBUG > is set, the next tests would check the URL, which passes/fails (based > on the specific value of $@), and runs/skips tests based on the mere > presence of $@, which indicates some URL issue. You can do this with > Test::More, but I'm not sure this can be done with Test.pm or > Test::Simple. Firstly, BIOPERLDEBUG should not be abused; it should be used only when you want to see extra debugging messages. There should be another variable that you can set to choose if network-requiring tests are run, and it should also be a configurable choice when you run perl Makefile.PL. (But changing this isn't going to happen for 1.5.2) When the server problem is ambiguous we should not fail the test. Just make the skip message visible and pass all ok... > The current behavior just skips all tests based on a single failed > URL. Then, Test::Harness, as currently set, shows skipped tests as > passed. 
The last run I posted previously where XEMBL_DB.t remote DB > tests failed, I also ran all tests (make test) and get this, which > doesn't tell us that the remote URL failed: > > ----------------------------------------- > > ... > t/WABA.......................ok > t/XEMBL_DB...................ok > t/ztr........................Bio::SeqIO::staden::read of bioperl-ext > is not installed or is installed incorrectly - skipping ztr.t tests > ok > All tests successful, 5 subtests skipped. All you have to do to make it visible is start the skip message with the word 'Skip': skip('Skip server may be down',1); ... t/WABA.......................ok t/XEMBL_DB...................ok 1/9 skipped: server may be down t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ztr.t tests t/ztr........................ok It's nicer when using Test::More. From bix at sendu.me.uk Mon Oct 2 09:55:27 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 02 Oct 2006 10:55:27 +0100 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au> References: <451C8ED8.2060003@infotech.monash.edu.au> <451CC40D.2030401@sendu.me.uk> <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net> <4520AC7A.1050009@infotech.monash.edu.au> Message-ID: <4520E20F.6040406@sendu.me.uk> Torsten Seemann wrote: > >>> I have removed all use/@ISA Bio::Root::Object references from > >>> bioperl-live, except for those in Bio::Root::* itself: > > >> So I'd say they're both relics that can be removed. In fact I was > >> planning on getting rid off all references to both of these modules > >> before you did, so thanks! :) > >> I think they can go. It's probably a pre-1.0 deprecation that somehow >> was never followed through on.
> > Today I did a fresh CVS checkout of bioperl-live, and deleted the > following modules and tests, and all tests passed with BIOPERLDEBUG=0 > > * Bio::Root::Err > * Bio::Root::Global > * Bio::Root::IOManager > * Bio::Root::Object > * Bio::Root::Storable > * Bio::Root::Utilities # may be used by third parties? > * Bio::Root::Vector > * Bio::Root::Xref > * t/Root-Utilities.t # need to keep if we keep Utilities.pm > * t/RootStorable.t > > Should we schedule for deprecation, or deprecate immediately as Hilmar > suggested they were meant to be deprecated long ago ? I'm happy to get rid of them all straight away. Does anyone object? From florin at iucha.net Mon Oct 2 01:40:07 2006 From: florin at iucha.net (Florin Iucha) Date: Sun, 1 Oct 2006 20:40:07 -0500 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 Message-ID: <20061002014007.GG12075@iucha.net> Hello, I am trying to install bioperl-network from CVS. I found this to require bioperl from CVS, which requires bioperl-ext from CVS. I have compiled and installed io_lib 1.10.1. 
After running "perl Makefile.PL; make test" in bioperl-ext I see a lot sources being compiled, then: cc -c -I./libs -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -DVERSION=\"1.5.1\" -DXS_VERSION=\"1.5.1\" -fPIC "-I/usr/lib/perl/5.8/CORE" -DPOSIX -DNOERROR Align.c Running Mkbootstrap for Bio::Ext::Align () chmod 644 Align.bs rm -f ../blib/arch/auto/Bio/Ext/Align/Align.so cc -shared -L/usr/local/lib Align.o -o ../blib/arch/auto/Bio/Ext/Align/Align.so libs/libsw.a \ -lm \ /usr/bin/ld: libs/libsw.a(aln.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC libs/libsw.a: could not read symbols: Bad value collect2: ld returned 1 exit status make[1]: *** [../blib/arch/auto/Bio/Ext/Align/Align.so] Error 1 make[1]: Leaving directory `/scratch/dmbio/tools/bioperl-ext/Bio/Ext/Align' make: *** [subdirs] Error 2 This is on a Debian AMD64 box: florin at zeus $ gcc -v Using built-in specs. 
Target: x86_64-linux-gnu Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release x86_64-linux-gnu Thread model: posix gcc version 4.1.2 20060901 (prerelease) (Debian 4.1.1-13) florin at zeus $ perl -V Summary of my perl5 (revision 5 version 8 subversion 8) configuration: Platform: osname=linux, osvers=2.6.16-1-vserver-amd64-k8, archname=x86_64-linux-gnu-thread-multi uname='linux excelsior 2.6.16-1-vserver-amd64-k8 #2 smp tue apr 4 03:40:49 utc 2006 x86_64 gnulinux ' config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.8 -Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=define use64bitall=define uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2', cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe 
-I/usr/local/include' ccversion='', gccversion='4.1.2 20060729 (prerelease) (Debian 4.1.1-10)', gccosandvers='' intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16 ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='cc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt perllibs=-ldl -lm -lpthread -lc -lcrypt libc=/lib/libc-2.3.6.so, so=so, useshrplib=true, libperl=libperl.so.5.8.8 gnulibc_version='2.3.6' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_64_BIT_ALL USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES USE_PERLIO USE_REENTRANT_API The compiler command line for aln.o is lacking -fPIC: cc -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DPOSIX -DNOERROR -c -o aln.o aln.c Adding -fPIC to the CCFLAGS variable in Bio/Ext/Align/Makefile and Makefile seems to take build further, but it fails with a similar error in Bio/SeqIO/staden/_Inline/build/Bio/SeqIO/staden/read. That Makefile seems to be regenerated every time I run 'make test' in the top level directory. 
The error in ../staden/read is: rm -f blib/arch/auto/Bio/SeqIO/staden/read/read.so cc -shared -L/usr/local/lib read.o -o blib/arch/auto/Bio/SeqIO/staden/read/read.so \ -L/usr/local/lib -lread -lz \ /usr/bin/ld: /usr/local/lib/libread.a(libread_a-Read.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC /usr/local/lib/libread.a: could not read symbols: Bad value collect2: ld returned 1 exit status make: *** [blib/arch/auto/Bio/SeqIO/staden/read/read.so] Error 1 So, the questions appear to be: - should "-fPIC" be appended to CFLAGS in the generated Makefiles? - is there anything wrong with io_lib flags? - has anybody built bioperl-ext on AMD64? I can help with debugging or testing if given a gentle nudge in the right direction, but I have little experience with the interactions between perl and static libraries on 64 bit. Thanks, florin -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra From bix at sendu.me.uk Mon Oct 2 10:52:47 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 02 Oct 2006 11:52:47 +0100 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 In-Reply-To: <20061002014007.GG12075@iucha.net> References: <20061002014007.GG12075@iucha.net> Message-ID: <4520EF7F.40908@sendu.me.uk> Florin Iucha wrote: > Hello, > > I am trying to install bioperl-network from CVS. I found this to > require bioperl from CVS, which requires bioperl-ext from CVS. I can't help with the compile problems you encountered (other than to say I also have problems under AMD64), but from where did you get the idea that bioperl (live/core) requires bioperl-ext? It doesn't, though recent changes to Makefile.PL may give that impression...
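To illustrate what "optional" means here, a rough sketch of how a test can treat a bioperl-ext module as optional and skip only the subtests that need it. Bio::Ext::Align is the module from the build log above; the helper name have_module is made up for this example and is not part of bioperl-live, and the ok(1, ...) calls are placeholders for real checks.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical helper (not a BioPerl function): returns 1 if the named
# module can be loaded, 0 otherwise.
sub have_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Bio::Ext::Align -> Bio/Ext/Align.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Placeholder for a pure-Perl check that never needs bioperl-ext.
ok(1, 'pure-Perl functionality tested unconditionally');

SKIP: {
    # Only the subtests that need the compiled extension are skipped.
    skip('Bio::Ext::Align (bioperl-ext) not installed', 1)
        unless have_module('Bio::Ext::Align');
    ok(1, 'compiled alignment code is available');
}
```

If the extension is missing, the harness reports one skipped subtest and the rest of the file still runs.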
From cjfields at uiuc.edu Mon Oct 2 12:26:57 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 07:26:57 -0500 Subject: [Bioperl-l] Tests involving remote databases In-Reply-To: <4520DE72.4000603@sendu.me.uk> References: <000001c6e3e6$81630010$15327e82@pyrimidine> <6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net> <79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu> <451E3707.4090400@sendu.me.uk> <0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu> <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net> <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu> <4520DE72.4000603@sendu.me.uk> Message-ID: On Oct 2, 2006, at 4:40 AM, Sendu Bala wrote: > Chris Fields wrote: >> >> The idea is to retain current behavior (remote DB access will not be >> run unless BIOPERLDEBUG is set to 1) and apply it to all tests >> requiring such access. Otherwise, just those tests are skipped (and >> not the rest of the tests, which occurs currently). If BIOPERLDEBUG >> is set, the next tests would check the URL, which passes/fails (based >> on the specific value of $@), and runs/skips tests based on the mere >> presence of $@, which indicates some URL issue. You can do this with >> Test::More, but I'm not sure this can be done with Test.pm or >> Test::Simple. > > Firstly, BIOPERLDEBUG should not be abused; it should be used only > when > you want to see extra debugging messages. There should be another > variable that you can set to choose if network-requiring tests are > run, > and it should also be a configurable choice when you run perl > Makefile.PL. > > (But changing this isn't going to happen for 1.5.2) > > When the server problem is ambiguous we should not fail the test. Just > make the skip message visible and pass all ok... I agree, as well as with your assessment of BIOPERLDEBUG (which I alluded to in a previous post). Torsten suggested creating a new env. variable for network tests. It's obvious this won't be done before 1.5.2, but we can make plans towards the next release. 
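As a rough sketch of what that could look like with Test::More: the variable name BIOPERL_NETWORK_TESTS and the helper probe_server are made up here for illustration (nothing like them exists in bioperl-live yet), and the simulated failure stands in for a real remote request.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical switch for network-requiring tests, kept separate from
# BIOPERLDEBUG so that variable only controls debugging output.
my $run_network = $ENV{BIOPERL_NETWORK_TESTS} ? 1 : 0;

# Hypothetical probe: runs the supplied code ref and returns an empty
# string on success, or the (chomped) error text if it died.
sub probe_server {
    my ($fetch) = @_;
    return '' if eval { $fetch->(); 1 };
    my $error = $@ || 'unknown error';
    chomp $error;
    return $error;
}

# Placeholder for a local test that always runs.
ok(1, 'local test runs regardless of network settings');

SKIP: {
    # Default (e.g. a CPAN install): remote tests are skipped, not failed.
    skip('network tests not requested', 2) unless $run_network;

    # Simulated failure; a real test would attempt the remote query here.
    my $error = probe_server(sub { die "connection refused\n" });
    skip("server may be down: $error", 2) if $error;

    ok(1, 'remote query returned a sequence');
    ok(1, 'remote sequence has the expected length');
}
```

Only the two remote subtests are skipped in either case, so a server outage shows up as a visible skip reason rather than hiding the rest of the test file.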
>> The current behavior just skips all tests based on a single failed >> URL. Then, Test::Harness, as currently set, shows skipped tests as >> passed. The last run I posted previously where XEMBL_DB.t remote DB >> tests failed, I also ran all tests (make test) and get this, which >> doesn't tell us that the remote URL failed: >> >> ----------------------------------------- >> >> ... >> t/WABA.......................ok >> t/XEMBL_DB...................ok >> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext >> is not installed or is installed incorrectly - skipping ztr.t tests >> ok >> All tests successful, 5 subtests skipped. > > All you have to do to make it visible is start the skip message > with the > work 'Skip': > > skip('Skip server may be down',1); > > ... > t/WABA.......................ok > > t/XEMBL_DB...................ok > > 1/9 skipped: server may be down > t/ztr........................Bio::SeqIO::staden::read of bioperl- > ext is > not installed or is installed incorrectly - skipping ztr.t tests > t/ztr........................ok > > > It's nicer when using Test::More. Okay, if Test::Harness picks that up it would be okay. We could use skip blocks to skip subsets of tests that require remote access (like SeqFeature.t) as opposed to skipping all tests. I think we want to avoid promoting running tests with BIOPERLDEBUG (or similar) upon installation for everyday installation anyway (such as from CPAN, which Hilmar points out). It's not something everybody installing a new BioPerl should be running unless they run into problems. Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From florin at iucha.net Mon Oct 2 12:15:06 2006 From: florin at iucha.net (Florin Iucha) Date: Mon, 2 Oct 2006 07:15:06 -0500 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 In-Reply-To: <4520EF7F.40908@sendu.me.uk> References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk> Message-ID: <20061002121506.GB14409@iucha.net> On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote: > Florin Iucha wrote: > > I am trying to install bioperl-network from CVS. I found this to > > require bioperl from CVS, which requires bioperl-ext from CVS. > > I can't help with the compile problems you encountered (other than to > say I also have problems under AMD64), but from where did you get the > idea that bioperl (live/core) requires bioperl-ext? It doesn't, though > recent changes to Makefile.PL may give that impression... The tests for bioperl-live mention in some places that 'this test has been skipped since $foo is not available' and I found the 'foos' in bioperl-ext. florin -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra From bix at sendu.me.uk Mon Oct 2 14:05:11 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 02 Oct 2006 15:05:11 +0100 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 In-Reply-To: <20061002121506.GB14409@iucha.net> References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk> <20061002121506.GB14409@iucha.net> Message-ID: <45211C97.2060800@sendu.me.uk> Florin Iucha wrote: > On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote: >> Florin Iucha wrote: >>> I am trying to install bioperl-network from CVS.
I found this to >>> require bioperl from CVS, which requires bioperl-ext from CVS. >> I can't help with the compile problems you encountered (other than to >> say I also have problems under AMD64), but from where did you get the >> idea that bioperl (live/core) requires bioperl-ext? It doesn't, though >> recent changes to Makefile.PL may give that impression... > > Running the tests for bioperl-live mention in some places that 'this > test has been skipped since $foo is not available' and I found the > 'foos' in bioperl-ext. Right, yes. The idea is, you'd only need to install bioperl-ext if you wanted to use the modules that the complaining tests test. So if none of the things that were skipped matter to you, don't install ext. I guess this needs to be clarified in documentation somewhere. From cjfields at uiuc.edu Mon Oct 2 14:13:56 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 09:13:56 -0500 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au> Message-ID: <001801c6e62d$02c883d0$15327e82@pyrimidine> > >>> I have removed all use/@ISA Bio::Root::Object references from > >>> bioperl-live, except for those in Bio::Root::* itself: > > >> So I'd say they're both relics that can be removed. In fact I was > >> planning on getting rid off all references to both of these modules > >> before you did, so thanks! :) > > > I think they can go. It's probably a pre-1.0 deprecation that somehow > > was never followed through on. > > Today I did a fresh CVS checkout of bioperl-live, and deleted the > following modules and tests, and all tests passed with BIOPERLDEBUG=0 > > * Bio::Root::Err > * Bio::Root::Global > * Bio::Root::IOManager > * Bio::Root::Object > * Bio::Root::Storable > * Bio::Root::Utilities # may be used by third parties? 
> * Bio::Root::Vector > * Bio::Root::Xref > * t/Root-Utilities.t # need to keep if we keep Utilities.pm > * t/RootStorable.t > > Should we schedule for deprecation, or deprecate immediately as Hilmar > suggested they were meant to be deprecated long ago ? I vote for quick deprecation; I had also noticed that these were superfluous and added them as possible deprecations to the wiki page. However, we need to be careful about that 'third-party use' caveat you have for Bio::Root::Utilities; there's another one with Bio::Root::Storable and Ensembl: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2924/focus=2924 and it seems to have it's users: http://thread.gmane.org/gmane.comp.lang.perl.bio.general/8242/focus=8242 The others (including Bio::Root::Utilities) haven't had any major threads on the mail lists in a very long time. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -- > Dr Torsten Seemann http://www.vicbioinformatics.com > Victorian Bioinformatics Consortium, Monash University, Australia > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Mon Oct 2 14:16:31 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 09:16:31 -0500 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-exton AMD64 In-Reply-To: <20061002121506.GB14409@iucha.net> Message-ID: <001901c6e62d$5c4fac80$15327e82@pyrimidine> They're not absolutely necessary; the tests are skipped w/o failure because bioperl-ext is optional. These are only necessary if you want the ability to read sequence trace files. BTW, you might have a rough time on trying to install bioperl-ext depending on your platform. Note the following bug report: http://bugzilla.open-bio.org/show_bug.cgi?id=2074 Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Florin Iucha > Sent: Monday, October 02, 2006 7:15 AM > To: bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Failure to compile the CVS snapshot of bioperl- > exton AMD64 > > On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote: > > Florin Iucha wrote: > > > I am trying to install bioperl-network from CVS. I found this to > > > require bioperl from CVS, which requires bioperl-ext from CVS. > > > > I can't help with the compile problems you encountered (other than to > > say I also have problems under AMD64), but from where did you get the > > idea that bioperl (live/core) requires bioperl-ext? It doesn't, though > > recent changes to Makefile.PL may give that impression... > > Running the tests for bioperl-live mention in some places that 'this > test has been skipped since $foo is not available' and I found the > 'foos' in bioperl-ext. > > florin > > -- > If we wish to count lines of code, we should not regard them as lines > produced but as lines spent. -- Edsger Dijkstra From osborne1 at optonline.net Mon Oct 2 14:14:13 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Mon, 02 Oct 2006 10:14:13 -0400 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: <4520E20F.6040406@sendu.me.uk> Message-ID: Sendu, No objection but someone should check the scripts in examples/root to make sure that they are not used there. Brian O. On 10/2/06 5:55 AM, "Sendu Bala" wrote: > Torsten Seemann wrote: >>>>> I have removed all use/@ISA Bio::Root::Object references from >>>>> bioperl-live, except for those in Bio::Root::* itself: >> >>>> So I'd say they're both relics that can be removed. In fact I was >>>> planning on getting rid off all references to both of these modules >>>> before you did, so thanks! :) >> >>> I think they can go. 
It's probably a pre-1.0 deprecation that somehow >>> was never followed through on. >> >> Today I did a fresh CVS checkout of bioperl-live, and deleted the >> following modules and tests, and all tests passed with BIOPERLDEBUG=0 >> >> * Bio::Root::Err >> * Bio::Root::Global >> * Bio::Root::IOManager >> * Bio::Root::Object >> * Bio::Root::Storable >> * Bio::Root::Utilities # may be used by third parties? >> * Bio::Root::Vector >> * Bio::Root::Xref >> * t/Root-Utilities.t # need to keep if we keep Utilities.pm >> * t/RootStorable.t >> >> Should we schedule for deprecation, or deprecate immediately as Hilmar >> suggested they were meant to be deprecated long ago ? > > I'm happy to get rid of them all straight away. Does anyone object? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From johnson.biotech at gmail.com Mon Oct 2 14:21:50 2006 From: johnson.biotech at gmail.com (Seth Johnson) Date: Mon, 2 Oct 2006 10:21:50 -0400 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: <000001c6e5dc$2eceabe0$15327e82@pyrimidine> References: <000001c6e5dc$2eceabe0$15327e82@pyrimidine> Message-ID: I'm using MySQL 5.0.19 and Perl v5.8.7 [MSWin32-x86-multi-thread] On 10/2/06, Chris Fields wrote: > > Seth, > > What version of MySQL and perl are you using? I'm using MySQL 5.0.18 (but > am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819. > > I ran into a few problems with bioperl-db tests which were unrelated the > ones below, but I'm wondering if it is a difference in MySQL versions. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. 
of Biochemistry > University of Illinois Urbana-Champaign > > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > bounces at lists.open-bio.org] On Behalf Of Seth Johnson > > Sent: Saturday, September 30, 2006 6:35 PM > > To: Hilmar Lapp > > Cc: Chris Fields; Bioperl List > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > Here're complete test details: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > ... > > > FAILED tests 10-12 > > Failed 3/12 tests, 75.00% okay > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > -------------------------------------------------------------------------- > > ----- > > t\02species.t 65 2 3.08% 63 65 > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > t\16obda.t 12 3 25.00% 10-12 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- Best Regards, Seth Johnson Senior Bioinformatics Associate Ph: (202) 470-0900 Fx: (775) 251-0358 From osborne1 at optonline.net Mon Oct 2 14:08:50 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Mon, 02 Oct 2006 10:08:50 -0400 Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on AMD64 In-Reply-To: <20061002014007.GG12075@iucha.net> Message-ID: Florian, Minor correction here, the Bioperl package does not require bioperl-ext. However we see there is a problem compiling bioperl-ext... Brian O. On 10/1/06 9:40 PM, "Florin Iucha" wrote: > I am trying to install bioperl-network from CVS. I found this to > require bioperl from CVS, which requires bioperl-ext from CVS. From JK at novozymes.com Mon Oct 2 14:05:34 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Mon, 2 Oct 2006 16:05:34 +0200 Subject: [Bioperl-l] Blast parser. 
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net> Hi. I've tried to use the blast-parser but I cannot get the original alignment out of the parser. Is it possible to get that out of the Bio::Search::HSP::BlastHSP in some way. It seems quite odd to get a clustalw alignment out when it isn't that type of alignment people are used to get from blast. Thanks Jesper From cjfields at uiuc.edu Mon Oct 2 14:36:31 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 09:36:31 -0500 Subject: [Bioperl-l] Do we need Bio::Root::Object anymore? In-Reply-To: Message-ID: <001d01c6e630$27792fb0$15327e82@pyrimidine> > Sendu, > > No objection but someone should check the scripts in examples/root to make > sure that they are not used there. > > Brian O. I suppose it's also possible that the other bioperl distributions (like bioperl-run) could use them as well. If they do we can take care of them as they pop up. These are really old and haven't been revised in a long time. The only one I worry about is Bio::Root::Storable b/c of Ensembl. Does anyone know where Will Spooner is? He's the maintainer for Bio::Root::Storable. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 2 15:01:44 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 10:01:44 -0500 Subject: [Bioperl-l] Blast parser. In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net> Message-ID: <000001c6e633$ad0a6ce0$15327e82@pyrimidine> The alignment that you get should come from GenericHSP, not BLASTHSP. Either way, the HSP alignment that is retrieved using $hsp->get_aln() should be a Bio::SimpleAlign object. You can then output that to the proper AlignIO format using an AlignIO stream object or use the Bio::SimpleAlign methods for further analysis. 
    use Bio::AlignIO;

    # $hsp is an HSP object obtained from the parsed BLAST report
    my $aln    = $hsp->get_aln();
    my $alnout = Bio::AlignIO->new(-format => 'msf', -fh => \*STDOUT);
    $alnout->write_aln($aln);

Quick note: not all AlignIO formats have write_aln() support at this time,
but most do.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of JK (Jesper Agerbo Krogh)
> Sent: Monday, October 02, 2006 9:06 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Blast parser.
>
> Hi.
>
> I've tried to use the blast-parser but I cannot get the original alignment
> out of the parser. Is it possible to get that out of the
> Bio::Search::HSP::BlastHSP in some way. It seems quite odd to get a
> clustalw alignment out when it isn't that type of alignment people are
> used to get from blast.
>
> Thanks
>
> Jesper

From whs at ebi.ac.uk Mon Oct 2 16:00:19 2006
From: whs at ebi.ac.uk (Will Spooner)
Date: Mon, 2 Oct 2006 17:00:19 +0100 (BST)
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <001d01c6e630$27792fb0$15327e82@pyrimidine>
References: <001d01c6e630$27792fb0$15327e82@pyrimidine>
Message-ID:

On Mon, 2 Oct 2006, Chris Fields wrote:

>> Sendu,
>>
>> No objection but someone should check the scripts in examples/root to make
>> sure that they are not used there.
>>
>> Brian O.
>
> I suppose it's also possible that the other bioperl distributions (like
> bioperl-run) could use them as well.
>
> If they do we can take care of them as they pop up. These are really old
> and haven't been revised in a long time.
>
> The only one I worry about is Bio::Root::Storable b/c of Ensembl. Does
> anyone know where Will Spooner is?
> He's the maintainer for Bio::Root::Storable.

Hi Chris,

I'm still lurking...

If the tests for Bio::Root::Storable still pass (I assume that they do),
then the module is working as advertised.

The idea behind Storable is very simple; object instances of any
inheriting class can be serialised/retrieved from disk. BioPerl objects
will probably not want this functionality by default, but it is trivial to
implement if needed.

Will

From cjfields at uiuc.edu Mon Oct 2 17:58:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 12:58:15 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To:
Message-ID: <000601c6e64c$5746f990$15327e82@pyrimidine>

> On Mon, 2 Oct 2006, Chris Fields wrote:
>
> >> Sendu,
> >>
> >> No objection but someone should check the scripts in examples/root to
> >> make sure that they are not used there.
> >>
> >> Brian O.
> >
> > I suppose it's also possible that the other bioperl distributions (like
> > bioperl-run) could use them as well.
> >
> > If they do we can take care of them as they pop up. These are really
> > old and haven't been revised in a long time.
> >
> > The only one I worry about is Bio::Root::Storable b/c of Ensembl. Does
> > anyone know where Will Spooner is? He's the maintainer for
> > Bio::Root::Storable.
> >
>
> Hi Chris,
>
> I'm still lurking...
>
> If the tests for Bio::Root::Storable still pass (I assume that they do),
> then the module is working as advertised.
>
> The idea behind Storable is very simple; object instances of any
> inheriting class can be serialised/retrieved from disk. BioPerl objects
> will probably not want this functionality by default, but it is trivial to
> implement if needed.
>
> Will

Okay, nice to know you're listening in! Based on that we should keep it
in. The rest that Torsten mentioned could probably be removed right away.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign From osborne1 at optonline.net Mon Oct 2 17:59:58 2006 From: osborne1 at optonline.net (Brian Osborne) Date: Mon, 02 Oct 2006 13:59:58 -0400 Subject: [Bioperl-l] bioperl-network warnings when loading the DIP dataset In-Reply-To: <20061002002403.GD12075@iucha.net> Message-ID: Florin, OK, this is fixed in CVS now. The problem is that there's some variability in how the PSI MI "standard" is used. In this case there was a species that was not given a value for its scientific name ("fullName"), I had to use common name in its place. Fortunately there's an NCBI taxon id behind all this. Thanks again, Brian O. On 10/1/06 8:24 PM, "Florin Iucha" wrote: > Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the > MINT [1] database does not produce the crash. It has a new warning, however: > > Can't call method "text" on an undefined value at > /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290. From mmacho at gmail.com Mon Oct 2 17:43:13 2006 From: mmacho at gmail.com (ende) Date: Mon, 2 Oct 2006 19:43:13 +0200 Subject: [Bioperl-l] Variable scope Message-ID: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> Hi this may be a typical perl topic and then out of this list center topic. My apologize for any inconvenience. It is a annoying problem that is making me waste lot of time. I have a package with its new object, etc... and constants in it like: #----- use constant False => 0; use constant True => 1; our %CLRFG = ( PLASMIDO => RED, POLY_A => GREEN, RESTR_SITES => BLUE, CONECTORS => MAGENTA, CONTAMINANTS => CYAN, ); our %CLRBG = ( PLASMIDO => "", POLY_A => "", RESTR_SITES => "", CONECTORS => "", CONTAMINANTS => "", ); #------ this constants are include with require "h.pl" from the main package file. I use this module from the mail command line driver to test it "using" it. 
In the command line driver I can use with no gripe the constants False and
True directly, for example "return True", etc. without any reference to the
origin of that constant.

But, with respect to the variables (I would like they also were
constants.. but how?), %CLRFG and %CLRBG, I can't find the way of
referring to those in the module. Finally I have desisted and _copy_
the definitions where I have needed it (in the sub where I print ANSI
terminal colouring seqs...). I don't find how to refer to those
variables out of the module.

I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Any help?

--
Juan Falgueras
Profesor del Depto. de Lenguajes y Ciencias de la Computación
Universidad de Málaga

From cjfields at uiuc.edu Mon Oct 2 20:52:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 15:52:11 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To:
Message-ID: <000001c6e664$a25538d0$15327e82@pyrimidine>

I have updated the Deprecation page with the Bio::Root::* modules that we
plan on deprecating (note that I have them being removed for rel. 1.5.2).
I have left out Bio::Root::Storable for now based on Will's response.

http://www.bioperl.org/wiki/Deprecated_modules

I'll update the DEPRECATED doc in CVS as well. There is a tentative
schedule for when warnings are added for modules before they are removed.

In relation to the recent trend for house-cleaning, I noticed that the
Bio::Tools::BP* BLAST-related modules are all still present but haven't
been modified or had deprecation warnings added. BPlite, like the others,
was marked for deprecation around rel 1.5 since the functionality is
present in Bio::SearchIO. Judging by the mail list, no one has used these
in quite a while, and everyone has been redirected to use Bio::SearchIO
instead. Based on that I have added warnings in CVS for deprecation to
BPlite and the related modules BPpsilite and BPbl2seq.
Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Brian Osborne > Sent: Monday, October 02, 2006 9:14 AM > To: Sendu Bala; bioperl-l > Subject: Re: [Bioperl-l] Do we need Bio::Root::Object anymore? > > Sendu, > > No objection but someone should check the scripts in examples/root to make > sure that they are not used there. > > Brian O. > > > On 10/2/06 5:55 AM, "Sendu Bala" wrote: > > > Torsten Seemann wrote: > >>>>> I have removed all use/@ISA Bio::Root::Object references from > >>>>> bioperl-live, except for those in Bio::Root::* itself: > >> > >>>> So I'd say they're both relics that can be removed. In fact I was > >>>> planning on getting rid off all references to both of these modules > >>>> before you did, so thanks! :) > >> > >>> I think they can go. It's probably a pre-1.0 deprecation that somehow > >>> was never followed through on. > >> > >> Today I did a fresh CVS checkout of bioperl-live, and deleted the > >> following modules and tests, and all tests passed with BIOPERLDEBUG=0 > >> > >> * Bio::Root::Err > >> * Bio::Root::Global > >> * Bio::Root::IOManager > >> * Bio::Root::Object > >> * Bio::Root::Storable > >> * Bio::Root::Utilities # may be used by third parties? > >> * Bio::Root::Vector > >> * Bio::Root::Xref > >> * t/Root-Utilities.t # need to keep if we keep Utilities.pm > >> * t/RootStorable.t > >> > >> Should we schedule for deprecation, or deprecate immediately as Hilmar > >> suggested they were meant to be deprecated long ago ? > > > > I'm happy to get rid of them all straight away. Does anyone object? 
> > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From florin at iucha.net Mon Oct 2 20:47:01 2006 From: florin at iucha.net (Florin Iucha) Date: Mon, 2 Oct 2006 15:47:01 -0500 Subject: [Bioperl-l] Variable scope In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com> Message-ID: <20061002204701.GG14409@iucha.net> On Mon, Oct 02, 2006 at 07:43:13PM +0200, ende wrote: > It is a annoying problem that is making me waste lot of time. > > I have a package with its new object, etc... and constants in it like: > > #----- > use constant False => 0; > use constant True => 1; > > our %CLRFG = ( > PLASMIDO => RED, > POLY_A => GREEN, > RESTR_SITES => BLUE, > CONECTORS => MAGENTA, > CONTAMINANTS => CYAN, > ); > > our %CLRBG = ( > PLASMIDO => "", > POLY_A => "", > RESTR_SITES => "", > CONECTORS => "", > CONTAMINANTS => "", > ); > #------ > > this constants are include with require "h.pl" from the main package > file. > > I use this module from the mail command line driver to test it > "using" it. In the command line driver I can use with no gripe the > constants False and True directly, for example "return True", etc > without any reference to the origin of that constant. It is possible you get them from somewhere else. > But, with respect to the variables (I would like they also were > constants.. but how?), %CLRFG and %CLRBG I can't find the way of > refering those int the module. Finally I have desisted and _copy_ > the definitions where I have needed it (in the sub were I print Ansi > terminal colouring seqs...). I don't find how to refer those > variables out of the module. 
>
> I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Did you actually declare a package name in "h.pl"? Is there any reason
you don't call the file ".pm" and load it with "use"?

I have attached a small example of importing that works.

florin

--
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent. -- Edsger Dijkstra
-------------- next part --------------
A non-text attachment was scrubbed...
Name: one.pm
Type: text/x-perl
Size: 118 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: two.pl
Type: text/x-perl
Size: 69 bytes
Desc: not available
URL:

From Kevin.M.Brown at asu.edu Mon Oct 2 23:44:50 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 2 Oct 2006 16:44:50 -0700
Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module
Message-ID: <1A4207F8295607498283FE9E93B775B4021960CD@EX02.asurite.ad.asu.edu>

Well, for anyone that wants to know, I found a way to capture the output
of ClustalW to get at things like the score.

Copy STDOUT to another handle:

    open(OUTCOPY, ">&STDOUT") or die "Couldn't dup STDOUT: $!";

Change where STDOUT goes:

    open(STDOUT, ">log.test") or die "Couldn't open log.test: $!";

Run the alignment and its output will be captured by the STDOUT
redirection:

    $aln = $factory->align(\@seq);

Restore STDOUT to its normal location for the rest of the script:

    close STDOUT;
    open(STDOUT, ">&OUTCOPY");

I guess I can understand why most of this is just dropped by the
ClustalW.pm module since there doesn't seem to be a way to hold it all in
a SimpleAlign object.
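
For anyone wanting to reuse the STDOUT redirection described above, here
is a minimal, self-contained sketch of the same dup/redirect/restore idea
wrapped in a subroutine. It is only an illustration under stated
assumptions: capture_stdout is a hypothetical helper (not part of BioPerl
or bioperl-run), a child perl one-liner stands in for the call to
$factory->align(\@seq) so that the sketch runs without ClustalW, and the
"Score: 42" line is a made-up output format, not what ClustalW prints.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Handle;

# Hypothetical helper (not part of BioPerl): run $code with STDOUT sent to a
# temporary file, restore STDOUT afterwards, and return what was written.
sub capture_stdout {
    my ($code) = @_;
    my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
    close $tmp_fh;

    STDOUT->flush;    # don't let earlier buffered output leak into the log
    open(my $saved, '>&', \*STDOUT) or die "Couldn't dup STDOUT: $!";
    open(STDOUT, '>', $tmp_name)    or die "Couldn't redirect STDOUT: $!";

    # Run the caller's code; remember any error so STDOUT is restored first.
    my $ok  = eval { $code->(); 1 };
    my $err = $@;

    close(STDOUT);
    open(STDOUT, '>&', $saved) or die "Couldn't restore STDOUT: $!";
    close($saved);
    die $err unless $ok;

    open(my $in, '<', $tmp_name) or die "Couldn't read $tmp_name: $!";
    local $/;         # slurp mode: read the whole file at once
    my $text = <$in>;
    close($in);
    return $text;
}

# Stand-in for $factory->align(\@seq): a child process printing to STDOUT.
# The "Score: 42" text is invented for this sketch.
my $log = capture_stdout(sub {
    system($^X, '-e', 'print "Score: 42\n"') == 0
        or die "child process failed: $?";
});

my ($score) = $log =~ /Score:\s*(\d+)/;
print "captured score: $score\n";
```

Because the redirection happens at the file-descriptor level, output from
external programs started while STDOUT is redirected lands in the file as
well, which is what makes the approach work for a wrapped executable. The
regular expression would need to be adapted to whatever the real program
prints.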
> -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Kevin Brown > Sent: Thursday, September 28, 2006 2:48 PM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module > > I've gotten a very simple script to run using bioperl that creates an > alignment using clustalw of two sequences. I see that clustal outputs > to stdout information like the score, but I don't see any way to store > that or retrieve that from the alignment object that is > returned (unless > I'm just blind). What follows is my very basic script which used code > found in the Wiki. > > print $aln->score() spits out an error about using an uninitialized > value. > > > #!/usr/bin/perl -w > > use strict; > use Bio::SeqIO; > use Bio::Perl; > use Bio::AlignIO; > use Getopt::Long qw(:config no_ignore_case bundling pass_through); > use POSIX; > use Bio::Tools::Run::Alignment::Clustalw; > > my $fileName = ""; # filename(s) to be parsed for > information > my $output_dir = ""; > my $format = 'fasta'; # default format for SeqIO module > > GetOptions( > 'file=s' => \$fileName, > 'output=s' => \$output_dir, > ); > > # Parse the input file for the needed information > # SeqIO supports several normal formats including , and > > > my @files = split(/\|/, $fileName); > my @seq_array; > > my $stream_out = > Bio::AlignIO->new(-file => '>test.msf', -format => 'msf', -flush => > 0); > > foreach my $fileName (@files) > { > my $file = Bio::SeqIO->new(-format => $format, -file => > $fileName); > my $seq; > while ($seq = $file->next_seq()) > { > push(@seq_array, $seq); > } > } > > my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM'); > my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params); > my $ktuple = 3; > $factory->ktuple($ktuple); # change the parameter before executing > # where @seq_array is an array of {{PM|Bio::Seq}} objects > > open my $out, ">seq.txt"; > > for (my $i = 
> 1 ; $i <= $#seq_array ; $i++)
> {
>     my @seq = ($seq_array[0], $seq_array[$i]);
>     my $aln = $factory->align(\@seq);
>     $stream_out->write_aln($aln);
>     print $aln->score;
>     for my $seq ($aln->each_seq) {
>         print $out $seq->display_id() ."\t". $seq->seq()."\n";
>     }
> }
>

From bix at sendu.me.uk Mon Oct 2 23:48:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 00:48:34 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
Message-ID: <4521A552.60301@sendu.me.uk>

Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll
upload tar.gz files when I have access to the server, then reply here
with links.

In the meantime, see http://www.bioperl.org/wiki/Release_1.5.2 for
instructions on getting and testing this RC.

Developers:
Make sure you're in the AUTHORS file in all 4 packages, as appropriate.

Users:
Even though 1.5.2 is a 'developer' release, we consider it the most
stable and capable version of Bioperl, and recommend that you use
it in all but the most critical production environments. Please
try it out and let us know of any problems or difficulties you run
into.

Thank you,
Sendu.

From lincoln.stein at gmail.com Mon Oct 2 21:53:38 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 2 Oct 2006 21:53:38 +0000
Subject: [Bioperl-l] Variable scope
In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
Message-ID: <6dce9a0b0610021453va2132c7u73747b9253211a66@mail.gmail.com>

Hi,

Read the documentation for Exporter. It is much better to formally export
constants, variables and functions and to import them with "use" than to
use "require". Also be sure that you understand how namespaces and modules
work.

This is not a BioPerl topic and should have been directed to a general
Perl discussion list, such as Perl Monks.
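
To make the advice above concrete, here is a minimal sketch of exporting
constants and a hash from a module and importing them with "use". It is
an illustration only and makes several assumptions: the package name
Seq::Colors is hypothetical, the colours are stored as plain strings
rather than the RED/GREEN constants of the original post, and the module
is inlined in a BEGIN block purely so the sketch runs as a single file
(normally it would live in Seq/Colors.pm). Locking the hash with
Hash::Util is one possible way to get read-only behaviour; nobody in the
thread proposed it.

```perl
use strict;
use warnings;

# Hypothetical module, normally saved as Seq/Colors.pm. It is inlined in a
# BEGIN block only so that this sketch runs as one file.
BEGIN {
    package Seq::Colors;
    use Exporter 'import';
    use Hash::Util qw(lock_hash);
    use constant { False => 0, True => 1 };

    # Names that callers may ask for in their "use" line.
    our @EXPORT_OK = qw(False True %CLRFG);

    our %CLRFG = (
        PLASMIDO     => 'red',
        POLY_A       => 'green',
        RESTR_SITES  => 'blue',
        CONECTORS    => 'magenta',
        CONTAMINANTS => 'cyan',
    );
    lock_hash(%CLRFG);    # keys and values become read-only

    # Tell "use"/"require" that the module is already loaded.
    $INC{'Seq/Colors.pm'} = __FILE__;
}

package main;
use Seq::Colors qw(True %CLRFG);

print "POLY_A is drawn in $CLRFG{POLY_A}\n";
print "same hash via its full name: $Seq::Colors::CLRFG{POLY_A}\n";
print "True is ", True, "\n";

# Any attempt to change the locked hash dies, so trap it with eval.
my $changed = eval { $CLRFG{POLY_A} = 'pink'; 1 };
print "hash is read-only\n" unless $changed;
```

The fully qualified form %Seq::Colors::CLRFG works only because the hash
is declared with "our" inside a named package; a file pulled in with
require and no package statement puts its "our" variables in main
instead.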
Lincoln On 10/2/06, ende wrote: > > > Hi > > this may be a typical perl topic and then out of this list center > topic. My apologize for any inconvenience. > > It is a annoying problem that is making me waste lot of time. > > I have a package with its new object, etc... and constants in it like: > > #----- > use constant False => 0; > use constant True => 1; > > our %CLRFG = ( > PLASMIDO => RED, > POLY_A => GREEN, > RESTR_SITES => BLUE, > CONECTORS => MAGENTA, > CONTAMINANTS => CYAN, > ); > > our %CLRBG = ( > PLASMIDO => "", > POLY_A => "", > RESTR_SITES => "", > CONECTORS => "", > CONTAMINANTS => "", > ); > #------ > > this constants are include with require "h.pl" from the main package > file. > > I use this module from the mail command line driver to test it > "using" it. In the command line driver I can use with no gripe the > constants False and True directly, for example "return True", etc > without any reference to the origin of that constant. > > But, with respect to the variables (I would like they also were > constants.. but how?), %CLRFG and %CLRBG I can't find the way of > refering those int the module. Finally I have desisted and _copy_ > the definitions where I have needed it (in the sub were I print Ansi > terminal colouring seqs...). I don't find how to refer those > variables out of the module. > > I have tried %modulename::CLRFG, for example, but Perl gives me errors. > > Any help? > > > > > -- > Juan Falgueras > Profesor del Depto. de Lenguajes y Ciencias de la Computaci?n > Universidad de M?laga > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From florin at iucha.net Tue Oct 3 02:30:31 2006 From: florin at iucha.net (Florin Iucha) Date: Mon, 2 Oct 2006 21:30:31 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <20061003023031.GI14409@iucha.net> On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. > > In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. [I won't create a wiki account just to report this.] Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG not set. Lots of warnings about missing packages and all, but this looks interesting: Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. Otherwise: Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay. The failed test is: t/ESEfinder..................dubious Test returned status 255 (wstat 65280, 0xff00) DIED. FAILED test 15 florin -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra From cjfields at uiuc.edu Tue Oct 3 03:50:47 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 22:50:47 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu> So far all tests pass on Mac OS X. I'll add this to the release page. 
This RC will throw warnings for four tests I didn't remove in time (BPlite.t, BPpsilite, BPbl2seq.t, and RestrictionEnzyme.t), which correspond to their namesake deprecated Bio::Tools modules. These are no longer in CVS HEAD so should be gone by the next RC, and the relevant modules marked for deprecation. I can verify the Bio::DB::SeqFeature.t warning on Mac OS X that Florin reported, but ESEFinder.t works fine: t/BioDBSeqFeature............Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. ok .... I'll report WinXP tests tomorrow on the wiki. Chris On Oct 2, 2006, at 6:48 PM, Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. > > In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > Make sure you're in the AUTHORS file in all 4 packages, as > appropriate. > > Users: > Even though 1.5.2 is a 'developer' release, we consider it the most > stable and capable version of Bioperl, and recommend that you use > it in all but the most critical production environments. Please > try it out and let us know of any problems or difficulties you run > into. > > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 3 03:54:29 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 2 Oct 2006 22:54:29 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <20061003023031.GI14409@iucha.net> References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> Message-ID: > [I won't create a wiki account just to report this.] 
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG > not set. Lots of warnings about missing packages and all, but this > looks interesting: > > Argument "+" isn't numeric in numeric lt (<) at > Bio/DB/SeqFeature/Segment.pm line 423. This is verified on Mac OS X. > Otherwise: > > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, > 99.99% okay. > > The failed test is: > > t/ESEfinder..................dubious > Test returned status 255 (wstat 65280, 0xff00) > DIED. FAILED test 15 What do you get when you run that set of tests using 'perl -I. -w t/ESEfinder.t'? The bad status code is odd and could be a remote server issue. Chris > > florin > > -- > If we wish to count lines of code, we should not regard them as lines > produced but as lines spent. -- Edsger Dijkstra > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From torsten.seemann at infotech.monash.edu.au Tue Oct 3 04:30:06 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Tue, 03 Oct 2006 14:30:06 +1000 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm Message-ID: <4521E74E.1040404@infotech.monash.edu.au> My understanding is that all Bioperl-compliant classes should inherit from Bio::Root::Root, not Bio::Root::RootI. Additionally, if functions such as throw() or _rearrange() are to be used without a class instance reference, they are to be used as class methods via Bio::Root::Root, not Bio::Root::RootI. Is this correct?
My naive audit of bioperl-live CVS brought up the following statistics: # Root.pm /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l 26 /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l 346 # RootI.pm /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l 9 /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l 79 My guess would be that all RootI should be changed to plain Root ? Any help appreciated, -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From jason at bioperl.org Tue Oct 3 06:03:17 2006 From: jason at bioperl.org (Jason Stajich) Date: Mon, 2 Oct 2006 23:03:17 -0700 Subject: [Bioperl-l] t/ESEFinder.t fixed on branch Message-ID: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org> Looks like good work everyone. All tests pass for me on OSX perl 5.8.6. w and w/o BIOPERLDEBUG=1 with RC1 except for the t/ESEFinder problem which I've fixed. It skipped too few tests when BIOPERLDEBUG=0. Don't forget to merge branch changes back to head for this test when it is done. I don't want to muddy water so I'm holding off migrating the changes to main trunk as the files is substantially different (I presume pre-Test::More adoption?). -jason From bix at sendu.me.uk Tue Oct 3 07:28:48 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 08:28:48 +0100 Subject: [Bioperl-l] t/ESEFinder.t fixed on branch In-Reply-To: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org> References: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org> Message-ID: <45221130.2060405@sendu.me.uk> Jason Stajich wrote: > Looks like good work everyone. > > All tests pass for me on OSX perl 5.8.6. w and w/o BIOPERLDEBUG=1 > with RC1 except for the t/ESEFinder problem which I've fixed. > > It skipped too few tests when BIOPERLDEBUG=0. > > Don't forget to merge branch changes back to head for this test when > it is done. 
I don't want to muddy water so I'm holding off > migrating the changes to main trunk as the files is substantially > different (I presume pre-Test::More adoption?). Actually, it was the same until Torsten made his own (different) fixes to HEAD but not to branch. It was my mistake and I've corrected in yet a third way, and now branch and HEAD match. No harm done :) From bix at sendu.me.uk Tue Oct 3 07:31:10 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 08:31:10 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu> References: <4521A552.60301@sendu.me.uk> <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu> Message-ID: <452211BE.6080107@sendu.me.uk> Chris Fields wrote: > So far all tests pass on Mac OS X. I'll add this to the release page. > > This RC will throw warnings for four tests I didn't remove in time > (BPlite.t, BPpsilite, BPbl2seq.t, and RestrictionEnzyme.t), which > correspond to their namesake deprecated Bio::Tools modules. These > are no longer in CVS HEAD so should be gone by the next RC, and the > relevant modules marked for deprecation. Thanks Chris. Sorry I missed these. From bix at sendu.me.uk Tue Oct 3 07:32:08 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 08:32:08 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <20061003023031.GI14409@iucha.net> References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> Message-ID: <452211F8.8040104@sendu.me.uk> Florin Iucha wrote: > On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote: >> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll >> upload tar.gz files when I have access to the server, then reply here >> with links. >> >> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for >> instructions on getting and testing this RC. > > [I won't create a wiki account just to report this.] 
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG > not set. Lots of warnings about missing packages and all, but this > looks interesting: > > Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. > > Otherwise: > > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay. > > The failed test is: > > t/ESEfinder..................dubious > Test returned status 255 (wstat 65280, 0xff00) > DIED. FAILED test 15 Thanks for your feedback Florin. The ESEfinder fail will be fixed in the next RC. From bix at sendu.me.uk Tue Oct 3 08:29:37 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 03 Oct 2006 09:29:37 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <45221F71.40206@sendu.me.uk> Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. 
Live/core: http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.gz http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.bz2 http://bioperl.org/DIST/bioperl-1.5.2-RC1.zip Run: http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.gz http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.bz2 http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.zip DB: http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.gz http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.bz2 http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.zip Network: http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.gz http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.bz2 http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.zip Md5 checksums are in: http://bioperl.org/DIST/SIGNATURES.md5 From jason at bioperl.org Tue Oct 3 06:11:30 2006 From: jason at bioperl.org (Jason Stajich) Date: Mon, 2 Oct 2006 23:11:30 -0700 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm Message-ID: <87F9B64E-8BDA-464B-814D-3F117AA646A1@bioperl.org> I only briefly saw your question - but RootI is for interfaces, Root.pm is for instantiated objects. From florin at iucha.net Tue Oct 3 11:39:12 2006 From: florin at iucha.net (Florin Iucha) Date: Tue, 3 Oct 2006 06:39:12 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> Message-ID: <20061003113912.GJ14409@iucha.net> On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote: > >Otherwise: > > > > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, > >99.99% okay. > > > >The failed test is: > > > > t/ESEfinder..................dubious > > Test returned status 255 (wstat 65280, 0xff00) > > DIED. FAILED test 15 $ perl -I. 
-w t/ESEfinder.t 1..15 ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder; ok 2 - use Data::Dumper; ok 3 - use Bio::PrimarySeq; ok 4 - use Bio::Seq; ok 5 ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test # Looks like you planned 15 tests but only ran 14. $ grep Id t/ESEfinder.t # $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $ florin -- If we wish to count lines of code, we should not regard them as lines produced but as lines spent. -- Edsger Dijkstra From hlapp at gmx.net Tue Oct 3 12:27:46 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 3 Oct 2006 08:27:46 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au> References: <4521E74E.1040404@infotech.monash.edu.au> Message-ID: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net> The interface classes (those ending in 'I') should actually inherit from RootI, not Root. In reality this recommendation is more theoretical than it makes that much of a difference I think. The motivation is that interface classes should not determine the actual implementation of a class (hash ref, array ref, whatever), and since Root.pm contains lots of implementation using a hash ref that decision will basically have been made. 
On the contrary though, RootI contains implementation too, although I'm not sure it would prescribe the object implementation as opposed to merely implementing static methods (like throw(), warn(), etc). That would need to be checked. -hilmar On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote: > My understanding is that all Bioperl-compliant classes should inherit > from Bio::Root::Root, not Bio::Root::RootI. > > Additionally, if functions such as throw() or _rearrange() are to be > used without a class instance reference, they are to be used as class > methods via Bio::Root::Root, not Bio::Root::RootI. > > Is this correct? > > My naive audit of bioperl-live CVS brought up the following > statistics: > > # Root.pm > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > 26 > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l > 346 > > # RootI.pm > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > 9 > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l > 79 > > My guess would be that all RootI should be changed to plain Root ? 
> > Any help appreciated, > > -- > Dr Torsten Seemann http://www.vicbioinformatics.com > Victorian Bioinformatics Consortium, Monash University, Australia > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 3 12:33:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 3 Oct 2006 07:33:37 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <20061003113912.GJ14409@iucha.net> References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> <20061003113912.GJ14409@iucha.net> Message-ID: <44724E16-74CD-4778-B04F-529475B47E37@uiuc.edu> Florin, Looks like this is fixed and should be working in the next release. Chris On Oct 3, 2006, at 6:39 AM, Florin Iucha wrote: > On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote: >>> Otherwise: >>> >>> Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, >>> 99.99% okay. >>> >>> The failed test is: >>> >>> t/ESEfinder..................dubious >>> Test returned status 255 (wstat 65280, 0xff00) >>> DIED. FAILED test 15 > > $ perl -I. 
-w t/ESEfinder.t > 1..15 > ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder; > ok 2 - use Data::Dumper; > ok 3 - use Bio::PrimarySeq; > ok 4 - use Bio::Seq; > ok 5 > ok 6 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 7 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 8 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 9 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 10 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 11 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 12 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 13 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > ok 14 # skip Skipping tests which require remote servers, set > BIOPERLDEBUG=1 to test > # Looks like you planned 15 tests but only ran 14. > $ grep Id t/ESEfinder.t > # $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $ > > florin > > -- > If we wish to count lines of code, we should not regard them as lines > produced but as lines spent. -- Edsger Dijkstra > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 3 14:29:51 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 3 Oct 2006 09:29:51 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net> Message-ID: <002101c6e6f8$67b4ae10$15327e82@pyrimidine> > The interface classes (those ending in 'I') should actually inherit > from RootI, not Root. 
> > In reality this recommendation is more theoretical than it makes that > much of a difference I think. The motivation is that interface > classes should not determine the actual implementation of a class > (hash ref, array ref, whatever), and since Root.pm contains lots of > implementation using a hash ref that decision will basically have > been made. > > On the contrary though, RootI contains implementation too, although > I'm not sure it would prescribe the object implementation as opposed > to merely implementing static methods (like throw(), warn(), etc). > That would need to be checked. > > -hilmar The constructor in Bio::Root::RootI lets one know that its use is deprecated, so you shouldn't have any cases of 'our qw(Bio::Root::RootI)'; there should be some way of inheriting Root directly or indirectly. I would say that any direct use of RootI is not good practice, though. For the current implementation we should only inherit Bio::Root::Root, which implements RootI. Is there any reason to shut off the warning with BIOPERLDEBUG?

From RootI:

    sub new {
        my $class = shift;
        my @args = @_;
        unless ( $ENV{'BIOPERLDEBUG'} ) {
            carp("Use of new in Bio::Root::RootI is deprecated. Please use Bio::Root::Root instead");
        }
        eval "require Bio::Root::Root";
        return Bio::Root::Root->new(@args);
    }

Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > > On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote: > > > My understanding is that all Bioperl-compliant classes should inherit > > from Bio::Root::Root, not Bio::Root::RootI. > > > > Additionally, if functions such as throw() or _rearrange() are to be > > used without a class instance reference, they are to be used as class > > methods via Bio::Root::Root, not Bio::Root::RootI. > > > > Is this correct?
> > > > My naive audit of bioperl-live CVS brought up the following > > statistics: > > > > # Root.pm > > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > > 26 > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l > > 346 > > > > # RootI.pm > > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > > 9 > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l > > 79 > > > > My guess would be that all RootI should be changed to plain Root ? > > > > Any help appreciated, > > > > -- > > Dr Torsten Seemann http://www.vicbioinformatics.com > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From slenk at emich.edu Tue Oct 3 17:31:47 2006 From: slenk at emich.edu (Stephen Gordon Lenk) Date: Tue, 03 Oct 2006 13:31:47 -0400 Subject: [Bioperl-l] Perl 6 has 'roles' - may be cleanly applicable to the Root/RootI issue Message-ID: <5147da5514e402.514e4025147da5@emich.edu> I looked at the Perl6 site, there is an RFC on interfaces: http://dev.perl.org/perl6/rfc/265.html Roles seem to be the Perl 6 answer to the Root/RootI issue in Bioperl. Maybe it is too early to suggest this. http://dev.perl.org/perl6/doc/design/apo/A12.html: The primary role of a class is to manage instances, that is, objects. So a class must worry about object creation and destruction, and everything that happens in between. 
Classes have a secondary role as units of software reuse, in that they can be inherited from or delegated to. However, because this is a secondary role, and because of weaknesses in models of inheritance, composition, and delegation, Perl 6 will split out the notion of software reuse into a separate class-like entity called a "role". Roles are an abstraction mechanism for use by classes that don't care about the secondary aspects of software reuse, or that (looking at it the other way) care so much about it that they want to encapsulate any decisions about implementation, composition, delegation, and maybe even inheritance. Sounds fancy, but just think of them as includes of partial classes, with some safety checks. Roles don't manage objects. They manage interfaces and other abstract behavior (like default implementations), and they help classes manage objects. As such, a role may only be composed into a class or into another role, never inherited from or delegated to. That's what classes are for. From slenk at emich.edu Tue Oct 3 16:45:15 2006 From: slenk at emich.edu (Stephen Gordon Lenk) Date: Tue, 03 Oct 2006 12:45:15 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm Message-ID: <5120d6a511f5a7.511f5a75120d6a@emich.edu> The separation of interface and implementation is generally regarded as a good idea. Right now the Bioperl community is doing this as part of the implementation of Bioperl. I suggest that this is an example of something which you might want to have as part of the Perl implementation. If Perl 6 (or even Perl 5) does not have this as a core part of the language or as a standard package (reusable by all in a common fashion), you may want to suggest to the Perl implementers that a way for interface/implementation distinctions be made part of the core language. My 2 cents, as you people are the experts on your own code. 
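For readers following the Root/RootI discussion, here is a minimal Perl 5 sketch of the interface/implementation split being talked about in this thread. It is only an illustration under stated assumptions: the package names My::ShapeI and My::Square are made up for the example and are not part of Bioperl, and plain Carp is used instead of any Bioperl helper. The interface package declares a method and refuses to run it; the concrete class decides how the object is stored and overrides the method.

```perl
use strict;
use warnings;

# Hypothetical interface package (not a Bioperl module): it declares
# area() but leaves it unimplemented, and makes no decision about
# how objects are stored (hash ref, array ref, ...).
package My::ShapeI;
use Carp qw(croak);

sub area {
    my $self = shift;
    croak ref($self) . " does not implement area()";
}

# Shared behaviour written only in terms of the interface methods.
sub describe {
    my $self = shift;
    return sprintf '%s with area %.2f', ref($self), $self->area;
}

# Hypothetical implementation: picks a hash ref and overrides area().
package My::Square;
our @ISA = ('My::ShapeI');

sub new {
    my ( $class, %args ) = @_;
    return bless { side => $args{side} }, $class;
}

sub area {
    my $self = shift;
    return $self->{side}**2;
}

package main;

my $sq = My::Square->new( side => 3 );
print $sq->describe, "\n";    # prints "My::Square with area 9.00"
```

A class that inherits from the interface but forgets to override area() fails loudly at call time, which is roughly the abstract-base-class behaviour described in the thread.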
----- Original Message ----- From: Chris Fields Date: Tuesday, October 3, 2006 10:29 am Subject: Re: [Bioperl-l] Use of Root.pm versus RootI.pm > > The interface classes (those ending in 'I') should actually inherit > > from RootI, not Root. > > > > In reality this recommendation is more theoretical than it makes > that> much of a difference I think. The motivation is that interface > > classes should not determine the actual implementation of a class > > (hash ref, array ref, whatever), and since Root.pm contains lots of > > implementation using a hash ref that decision will basically have > > been made. > > > > On the contrary though, RootI contains implementation too, although > > I'm not sure it would prescribe the object implementation as opposed > > to merely implementing static methods (like throw(), warn(), etc). > > That would need to be checked. > > > > -hilmar > > The constructor in Bio::Root::RootI lets one know that its use is > deprecated, so you shouldn't have any cases of 'our > qw(Bio::Root::RootI)';there should be some way of inheriting Root > directly or indirectly. I would > say that any direct use of RootI is not good practice, though. > For the > current implementation we should only inherit Bio::Root::Root, which > implements RootI. > > Is there any reason to shut off the warning with BIOPERLDEBUG? > > >From RootI: > > sub new { > my $class = shift; > my @args = @_; > unless ( $ENV{'BIOPERLDEBUG'} ) { > carp("Use of new in Bio::Root::RootI is deprecated. Please use > Bio::Root::Root instead"); > } > eval "require Bio::Root::Root"; > return Bio::Root::Root->new(@args); > } > > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > > > > > On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote: > > > > > My understanding is that all Bioperl-compliant classes should > inherit> > from Bio::Root::Root, not Bio::Root::RootI. 
> > > > > > Additionally, if functions such as throw() or _rearrange() are > to be > > > used without a class instance reference, they are to be used > as class > > > methods via Bio::Root::Root, not Bio::Root::RootI. > > > > > > Is this correct? > > > > > > My naive audit of bioperl-live CVS brought up the following > > > statistics: > > > > > > # Root.pm > > > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > > > 26 > > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | > wc -l > > > 346 > > > > > > # RootI.pm > > > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > > > 9 > > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | > wc -l > > > 79 > > > > > > My guess would be that all RootI should be changed to plain > Root ? > > > > > > Any help appreciated, > > > > > > -- > > > Dr Torsten Seemann http://www.vicbioinformatics.com > > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Tue Oct 3 17:49:35 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 3 Oct 2006 12:49:35 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <5120d6a511f5a7.511f5a75120d6a@emich.edu> Message-ID: 
<000001c6e714$4c2cbb80$15327e82@pyrimidine> Perl6 already has added flexibility for separation of implementation/interface (I believe they are called roles). http://dev.perl.org/perl6/doc/design/syn/S12.html To tell the truth, I'm not sure about Perl 5, except the way the Bioperl devs have up the distinction between interface and implementation. However, I find the way we use interfaces is very simple (set up interface with some/all methods as unimplemented, use the module as an abstract base class, then override the unimplemented methods). It works for me. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: Stephen Gordon Lenk [mailto:slenk at emich.edu] > Sent: Tuesday, October 03, 2006 11:45 AM > To: Chris Fields > Cc: 'Hilmar Lapp'; 'Torsten Seemann'; 'bioperl-l' > Subject: Re: [Bioperl-l] Use of Root.pm versus RootI.pm > > The separation of interface and implementation is generally > regarded as a good idea. Right now the Bioperl community is > doing this as part of the implementation of Bioperl. I suggest > that this is an example of something which you might want to > have as part of the Perl implementation. If Perl 6 (or even > Perl 5) does not have this as a core part of the language or > as a standard package (reusable by all in a common fashion), > you may want to suggest to the Perl implementers that a way > for interface/implementation distinctions be made part of the > core language. My 2 cents, as you people are the experts on > your own code. > > > ----- Original Message ----- > From: Chris Fields > Date: Tuesday, October 3, 2006 10:29 am > Subject: Re: [Bioperl-l] Use of Root.pm versus RootI.pm > > > > The interface classes (those ending in 'I') should actually inherit > > > from RootI, not Root. > > > > > > In reality this recommendation is more theoretical than it makes > > that> much of a difference I think. 
The motivation is that interface > > > classes should not determine the actual implementation of a class > > > (hash ref, array ref, whatever), and since Root.pm contains lots of > > > implementation using a hash ref that decision will basically have > > > been made. > > > > > > On the contrary though, RootI contains implementation too, although > > > I'm not sure it would prescribe the object implementation as > opposed > > > to merely implementing static methods (like throw(), warn(), etc). > > > That would need to be checked. > > > > > > -hilmar > > > > The constructor in Bio::Root::RootI lets one know that its use is > > deprecated, so you shouldn't have any cases of 'our > > qw(Bio::Root::RootI)';there should be some way of inheriting Root > > directly or indirectly. I would > > say that any direct use of RootI is not good practice, though. > > For the > > current implementation we should only inherit Bio::Root::Root, which > > implements RootI. > > > > Is there any reason to shut off the warning with BIOPERLDEBUG? > > > > >From RootI: > > > > sub new { > > my $class = shift; > > my @args = @_; > > unless ( $ENV{'BIOPERLDEBUG'} ) { > > carp("Use of new in Bio::Root::RootI is deprecated. Please use > > Bio::Root::Root instead"); > > } > > eval "require Bio::Root::Root"; > > return Bio::Root::Root->new(@args); > > } > > > > > > Christopher Fields > > Postdoctoral Researcher - Switzer Lab > > Dept. of Biochemistry > > University of Illinois Urbana-Champaign > > > > > > > > > > > > On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote: > > > > > > > My understanding is that all Bioperl-compliant classes should > > inherit> > from Bio::Root::Root, not Bio::Root::RootI. > > > > > > > > Additionally, if functions such as throw() or _rearrange() are > > to be > > > > used without a class instance reference, they are to be used > > as class > > > > methods via Bio::Root::Root, not Bio::Root::RootI. > > > > > > > > Is this correct? 
> > > > > > > > My naive audit of bioperl-live CVS brought up the following > > > > statistics: > > > > > > > > # Root.pm > > > > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l > > > > 26 > > > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | > > wc -l > > > > 346 > > > > > > > > # RootI.pm > > > > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l > > > > 9 > > > > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | > > wc -l > > > > 79 > > > > > > > > My guess would be that all RootI should be changed to plain > > Root ? > > > > > > > > Any help appreciated, > > > > > > > > -- > > > > Dr Torsten Seemann http://www.vicbioinformatics.com > > > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > > > > > _______________________________________________ > > > > Bioperl-l mailing list > > > > Bioperl-l at lists.open-bio.org > > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > > > > > -- > > > =========================================================== > > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > > =========================================================== > > > > > > > > > > > > > > > > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cmlapid at up.edu.ph Wed Oct 4 02:06:06 2006 From: cmlapid at up.edu.ph (Carlo Lapid) Date: Wed, 4 Oct 2006 10:06:06 +0800 Subject: [Bioperl-l] genbank mirror Message-ID: Hi, I'm trying to set up a local mirror of a large part of the Genbank database. 
For users to access the local database, I need to create a web-based search tool, much like Entrez of NCBI, or SRS of EBI, that can parse the Genbank flat files I've downloaded based on a query entered by the user. I'm trying to use Bioperl to create this from scratch, but I'm having a very hard time, especially since I want the user to have reasonable flexibility in customizing his search. The best that I've been able to accomplish is a search function that retrieves genbank sequence objects based on their primary IDs or accession numbers, by using the fetch method of the Bio::Index::GenBank module. But this doesn't help users who don't know the exact IDs for the sequences they want. Can anybody suggest a way to use Bioperl to search for an ordinary word or phrase, like "16S gene", which could be matched against the description field, or the entire genbank entry? (Alternatively, is there some other freely available tool or software that can do this?) I've been scouring the Bioperl documentation, but I couldn't find anything. I just need to be pointed in the right direction. What I thought was a relatively simple problem has been driving me crazy for days; if anybody has any suggestions I would really, really appreciate it. From torsten.seemann at infotech.monash.edu.au Wed Oct 4 02:58:03 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Wed, 04 Oct 2006 12:58:03 +1000 Subject: [Bioperl-l] genbank mirror In-Reply-To: References: Message-ID: <4523233B.7030505@infotech.monash.edu.au> > I'm trying to set up a local mirror of a large part of the Genbank database. > For users to access the local database, I need to create a web-based search > tool, much like Entrez of NCBI, or SRS of EBI, that can parse the Genbank > flat files I've downloaded based on a query entered by the user. Have you considered bioperl-db / BioSQL?
http://www.bioperl.org/wiki/BioPerl_db
http://lists.open-bio.org/pipermail/biosql-l/

--
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia

From osborne1 at optonline.net Wed Oct 4 03:16:20 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Tue, 03 Oct 2006 23:16:20 -0400
Subject: [Bioperl-l] genbank mirror
In-Reply-To:
Message-ID:

Carlo,

You might want to look at the Bio::DB::Query::GenBank module:

http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database

However, this works through NCBI's own eutils API, so setting it up to query a local mirror may be very difficult.

Brian O.

On 10/3/06 10:06 PM, "Carlo Lapid" wrote:

> Hi,
>
> I'm trying to set up a local mirror of a large part of the Genbank database.
> For users to access the local database, I need to create a web-based search
> tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank
> flat files I've downloaded based on a query entered by the user.
>
> I'm trying to use Bioperl to create this from scratch, but I'm having a very
> hard time, especially since I want the user to have reasonable flexibility
> in customizing his search. The best that I've been able to accomplish is a
> search function that retrieves genbank sequence objects based on their
> primary IDs or accession numbers; by using the fetch method of the
> Bio::Index::GenBank module. But this doesn't help users who don't know the
> exact IDs for the sequences they want.
>
> Can anybody suggest a way to use Bioperl to search for an ordinary word or
> phrase, like "16S gene", which could be matched against the description
> field, or the entire genbank entry? (Alternatively, is there some other
> freely available tool or software that can do this?) I've been scouring the
> Bioperl documentation, but I couldn't find anything. I just need to be
> pointed in the right direction.
What I thought was a relatively simple
> problem has been driving me crazy for days; if anybody has any suggestions I
> would really, really appreciate it.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From osborne1 at optonline.net Wed Oct 4 03:28:06 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Tue, 03 Oct 2006 23:28:06 -0400
Subject: [Bioperl-l] genbank mirror
In-Reply-To: <4523233B.7030505@infotech.monash.edu.au>
Message-ID:

Torsten and Carlo,

Right. For some simple examples of using Bio::DB::Query::BioQuery to query a BioSQL db, take a look at Bio::DB::BioSQL::OBDA.

You may also want to take a look at NCBI's eutils API; it's quite powerful but not local. Or the ENSEMBL API: people have set up their own local ENSEMBL dbs. There's an example of this API here:

http://www.bioperl.org/wiki/Getting_Genomic_Sequences

Brian O.

On 10/3/06 10:58 PM, "Torsten Seemann" wrote:

>> I'm trying to set up a local mirror of a large part of the Genbank database.
>> For users to access the local database, I need to create a web-based search
>> tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank
>> flat files I've downloaded based on a query entered by the user.
>
> Have you considered bioperl-db / BioSQL?
>
> http://www.bioperl.org/wiki/BioPerl_db
> http://lists.open-bio.org/pipermail/biosql-l/

From torsten.seemann at infotech.monash.edu.au Wed Oct 4 05:21:24 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 04 Oct 2006 15:21:24 +1000
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
Message-ID: <452344D4.8070908@infotech.monash.edu.au>

Hi all,

Now that we have Perl 5.6.1 as a minimum, the following modules are standard: File::Spec, File::Temp, File::Path

Bio::Root::IO has functions catfile(), tempdir(), tempfile(), rmtree() which currently dispatch to the File:: version, or try to emulate it.
We don't need to emulate anymore. Jason Stajich suggested in a previous post that they should be deprecated, and that users should use the File:: functions directly.

I have an uncommitted simplified version of Bio::Root::IO which does this, and "all tests pass". The functions currently (silently) dispatch directly to their native counterparts.

The only tricky function is tempfile() which is *mostly* like File::Temp::tempfile(), but does some voodoo of converting (TEMPLATE=>'xxx') to the non-hash first parameter of the File:: version, so I'm hesitant to commit. It may do other magic - Hilmar?

Comments?

--
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia

From gianluca.debellis at itb.cnr.it Wed Oct 4 09:25:26 2006
From: gianluca.debellis at itb.cnr.it (Gianluca De Bellis)
Date: Wed, 04 Oct 2006 11:25:26 +0200
Subject: [Bioperl-l] Bioperl under WinXP
Message-ID: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>

I'm trying to use Bioperl under WinXP-SP2 (novice)

Bioperl has just been downloaded (v 1.2.3)

Even the simplest program with a single command (use Bio::Perl;) ends up in an error of the Perl interpreter with these details

AppName: perl.exe AppVer: 5.8.8.819 ModName: win32.dll

ModVer: 0.0.0.0 Offset: 00003294

Coming from the Windows reporting system

Where is the problem?

Thanks in advance

From epsteinj at mail.nih.gov Wed Oct 4 11:25:57 2006
From: epsteinj at mail.nih.gov (Epstein, Jonathan A (NIH/NICHD) [E])
Date: Wed, 4 Oct 2006 07:25:57 -0400
Subject: [Bioperl-l] genbank mirror
References:
Message-ID: <42504F69898FE546B3F0238C9BD03275532603@NIHCESMLBX7.nih.gov>

There's SeqHound:
http://seqhound.blueprint.org/report.html

We set this up locally, and it's probably the most comprehensive free solution out there, but it's non-trivial to set up.
Also, since the Blueprint&BIND have lost most of their funding, I'm not sure how long you can count on SeqHound to remain operational (although for now it is being updated). Jonathan -----Original Message----- From: Carlo Lapid [mailto:cmlapid at up.edu.ph] Sent: Tue 10/3/2006 10:06 PM To: bioperl-l at bioperl.org Subject: [Bioperl-l] genbank mirror Hi, I'm trying to set up a local mirror of a large part of the Genbank database. For users to access the local database, I need to create a web-based search tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank flat files I've downloaded based on a query entered by the user. I'm trying to use Bioperl to create this from scratch, but I'm having a very hard time, especially since I want the user to have reasonable flexibility in customizing his search. The best that I've been able to accomplish is a search function that retrieves genbank sequence objects based on their primary IDs or accession numbers; by using the fetch method of the Bio::Index::GenBank module. But this doesn't help users who don't know the exact IDs for the sequences they want. Can anybody suggest a way to use Bioperl to search for an ordinary word or phrase, like "16S gene", which could be matched against the description field, or the entire genbank entry? (Alternatively, is there some other freely available tool or software that can do this?) I've been scouring the Bioperl documentation, but I couldn't find anything. I just need to be pointed in the right direction. What I thought was a relatively simple problem has been driving me crazy for days; if anybody has any suggestions I would really, really appreciate it. 
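
For the keyword search described in the original message, one low-tech possibility is to scan the flat files directly. The following is only a minimal plain-Perl sketch, not a BioPerl API and not a substitute for an indexed solution such as BioSQL or SeqHound. The helper name accession_if_match and the script name in the usage comment are made up for illustration, and it assumes standard GenBank flat files with Unix line endings in which each record ends with a "//" line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not a BioPerl function: given the text of one GenBank
# record, return its accession if the DEFINITION field contains $phrase
# (case-insensitive, literal match); otherwise return nothing.
sub accession_if_match {
    my ($record, $phrase) = @_;

    # DEFINITION may wrap onto indented continuation lines, so capture
    # everything up to the next line that starts in column one.
    my ($definition) = $record =~ /^DEFINITION\s+(.*?)(?=^\S)/ms
        or return;
    $definition =~ s/\s+/ /g;    # join wrapped lines into a single string

    my ($accession) = $record =~ /^ACCESSION\s+(\S+)/m
        or return;

    return $definition =~ /\Q$phrase\E/i ? $accession : ();
}

# Usage (file names are placeholders):
#   perl gbgrep.pl "16S" part1.gb part2.gb
if (@ARGV >= 2) {
    my $phrase = shift @ARGV;
    local $/ = "//\n";    # read one GenBank record per iteration
    while (my $record = <>) {
        my $accession = accession_if_match($record, $phrase);
        print "$accession\n" if defined $accession;
    }
}
```

The accessions it prints could then be handed to the fetch method of Bio::Index::GenBank mentioned in the original message. A linear scan like this will be slow on a large mirror, which is the main argument for loading the data into a database instead.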
_______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Wed Oct 4 13:19:45 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 04 Oct 2006 14:19:45 +0100 Subject: [Bioperl-l] Bioperl under WinXP In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> References: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> Message-ID: <4523B4F1.3010305@sendu.me.uk> Gianluca De Bellis wrote: > I'm trying to use Bioperl under WinXP-SP2 (novice) > > Bioperl has been just downloaded (v 1.2.3) > > Even the simplest program with a single command (use Bio::Perl;) ends up in > an error of the Perl interpreter with these details > > AppName: perl.exe AppVer: 5.8.8.819 ModName: win32.dll > > ModVer: 0.0.0.0 Offset: 00003294 > > Coming from the windos reporting system > > Where is the problem? Hard to say. Do non-bioperl scripts work? Make sure to follow the Bioperl installation instructions carefully: http://bioperl.org/wiki/Installing_Bioperl_on_Windows And make sure to install at least version 1.4. 1.2.3 is ancient and effectively unsupported. From cjfields at uiuc.edu Wed Oct 4 14:03:34 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 4 Oct 2006 09:03:34 -0500 Subject: [Bioperl-l] Bioperl under WinXP In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB> Message-ID: <000601c6e7bd$e22ad190$15327e82@pyrimidine> If you're using PPM, you can install a (much) newer version of BioPerl from here: http://www.gmod.org/ggb/ppm/ Add that as one of your repositories in PPM4 (seeing that you are using ActivePerl 5.8.8.819), then search for bioperl. The version should be 1.512. In a few weeks we'll be releasing a new developer release. A WinXP PPM is expected, as well as a bundled package to install all prerequisites. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Gianluca De Bellis > Sent: Wednesday, October 04, 2006 4:25 AM > To: bioperl-l at bioperl.org > Subject: [Bioperl-l] Bioperl under WinXP > > I'm trying to use Bioperl under WinXP-SP2 (novice) > > Bioperl has been just downloaded (v 1.2.3) > > Even the simplest program with a single command (use Bio::Perl;) ends up > in > an error of the Perl interpreter with these details > > AppName: perl.exe AppVer: 5.8.8.819 ModName: win32.dll > > ModVer: 0.0.0.0 Offset: 00003294 > > Coming from the windos reporting system > > Where is the problem? > > > > Thanks in advance > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From hlapp at gmx.net Wed Oct 4 14:25:23 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 4 Oct 2006 10:25:23 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <002101c6e6f8$67b4ae10$15327e82@pyrimidine> References: <002101c6e6f8$67b4ae10$15327e82@pyrimidine> Message-ID: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net> On Oct 3, 2006, at 10:29 AM, Chris Fields wrote: > The constructor in Bio::Root::RootI lets one know that its use is > deprecated, so you shouldn't have any cases of 'our qw > (Bio::Root::RootI)'; Don't confuse the constructor with the inheritance tree. Interface classes should never be instantiated, hence the constructor, consistent with the documentation, should never get executed. > there should be some way of inheriting Root directly or > indirectly. I would > say that any direct use of RootI is not good practice, though. I don't know what you mean by 'directly' or 'indirectly' but inheritance from interfaces, and interfaces extending (inheriting from) other interfaces, is certainly standard practice. 
I'm not sure at all why it would be a bad one. > For the current implementation we should only inherit > Bio::Root::Root, which > implements RootI. For the implementation classes, yes. For the interface classes, no. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Wed Oct 4 14:43:54 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 4 Oct 2006 10:43:54 -0400 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <452344D4.8070908@infotech.monash.edu.au> References: <452344D4.8070908@infotech.monash.edu.au> Message-ID: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> On Oct 4, 2006, at 1:21 AM, Torsten Seemann wrote: > Bio::Root::IO has functions catfile(), tmpdir(), tmpfile(), rmtree() > which currently dispatch to the File:: version, or try to emulate > it. We > don't need to emulate anymore. Jason Stajich suggested in a previous > post that they should be deprecated, and that users should use > directly > the File:: functions themselves. I don't think there's a need to deprecate - if the methods just plain delegate to whatever File:: module is appropriate their implementation (supposedly) will become very simple and hence won't pose a maintenance burden anymore. One can still recommend for all new scripts or modules or code written to use the File:: modules directly, just I'm not sure there's a need to tell users that they should start changing their existing stuff. > > I have an uncommitted simplified version of Bio::Root::IO which does > this, and "all tests pass". The functions currently (silently) > dispatch > directly to their native counterparts. > > The only tricky function is tempfile() which is *mostly* like > File::Temp::tempfile(), but does some voodoo of converting > (TEMPLATE=>'xxx') to the non-hash first parameter of the File:: > version, > so I'm hesitant to commit. 
It may do other magic - Hilmar?

Not that I would know of. If the tests pass (without having to change them!) I'd give it a try.

-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From cjfields at uiuc.edu Wed Oct 4 15:35:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 4 Oct 2006 10:35:16 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net>
Message-ID: <001901c6e7ca$b12fd5b0$15327e82@pyrimidine>

...
> Don't confuse the constructor with the inheritance tree.
>
> Interface classes should never be instantiated, hence the
> constructor, consistent with the documentation, should never get
> executed.

I know that interfaces shouldn't be instantiated. I had noticed there are cases of 'our qw (Bio::Root::RootI)' where it is completely acceptable to inherit the interface. Makes sense to me now.

> > there should be some way of inheriting Root directly or
> > indirectly. I would
> > say that any direct use of RootI is not good practice, though.
>
> I don't know what you mean by 'directly' or 'indirectly' but
> inheritance from interfaces, and interfaces extending (inheriting
> from) other interfaces, is certainly standard practice. I'm not sure
> at all why it would be a bad one.

I was talking specifically about inheriting RootI, and not about all Bioperl interfaces in general. I completely understand the use of interface/implementation in Bioperl. However, I missed one small fact until yesterday (of course AFTER I posted my reply), which was that interfaces may inherit RootI directly. My oops.

I had understood that, in general, any Bioperl implementation should not inherit the RootI interface directly (they should inherit Root, since that implements RootI). The 'constructor' present in RootI is essentially to make sure that no one inherits from the wrong class.
Probably a bad use of the terms 'direct' and 'indirect', so maybe I didn't get that across very well. What I meant was that all classes inherit Root in some way, either 'directly' (as the direct parent class) or 'indirectly' (through the inheritance tree). Probably comes from being primarily a molecular microbiologist and not a computer scientist. OT, but it would be nice to have an updated class diagram to sort out the inheritance hierarchy a bit easier. In the meantime, the Deobfuscator does help quite a bit. > > For the current implementation we should only inherit > > Bio::Root::Root, which > > implements RootI. > > For the implementation classes, yes. For the interface classes, no. I agree (see above). That's the one small bit about interfaces I missed along the way. Makes sense; they use throw_not_implemented(), which is a RootI method. > -hilmar Chris From pmiguel at purdue.edu Wed Oct 4 19:38:51 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Wed, 04 Oct 2006 15:38:51 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: <4521A552.60301@sendu.me.uk> References: <4521A552.60301@sendu.me.uk> Message-ID: <45240DCB.2080204@purdue.edu> Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll > upload tar.gz files when I have access to the server, then reply here > with links. > > In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > Make sure you're in the AUTHORS file in all 4 packages, as > appropriate. > > Users: > Even though 1.5.2 is a 'developer' release, we consider it the most > stable and capable version of Bioperl, and recommend that you use > it in all but the most critical production environments. Please > try it out and let us know of any problems or difficulties you run > into. > > > Thank you, > Sendu. > I didn't see any tests done under solaris, so I asked our sys admin to do the install on one of our machines. 
Just another data point: He installed this release candidate on a Sun E450 box running solaris. uname -a gives: SunOS descartes 5.10 Generic_118833-18 sun4u sparc SUNW,Ultra-4 perl -v gives: This is perl, v5.8.8 built for sun4-solaris (etc.) $ time make test PERL_DL_NONLAZY=1 /usr/local/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t t/AAChange...................ok t/AAReverseMutate............ok t/abi........................Bio::SeqIO::staden::read from bioperl-ext is not installed or is installed incorrectly - skipping abi.t tests t/abi........................ok t/ace........................ok t/AlignIO....................ok t/AlignStats.................ok t/AlignUtil..................ok t/alignUtilities.............ok t/Allele.....................ok t/Alphabet...................ok t/Annotation.................ok t/AnnotationAdaptor..........ok t/asciitree..................ok t/Assembly...................ok 1/19 skipped: t/Biblio.....................ok t/Biblio_biofetch............ok t/Biblio_eutils..............ok t/BiblioReferences...........ok t/BioDBGFF...................ok t/BioDBSeqFeature............ok 1/46Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423. t/BioDBSeqFeature............ok t/BioDBSeqFeature_BDB........ok t/BioDBSeqFeature_mysql......ok 3/46prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) 
statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT sequence,offset FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname= ? AND offset >= ? AND offset <= ? ORDER BY offset ) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 prepare_cached(SELECT sequence,offset FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname= ? AND offset >= ? AND offset <= ? ORDER BY offset ) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 t/BioDBSeqFeature_mysql......ok t/BioFetch_DB................ok t/BioGraphics................ok t/BlastIndex.................ok 1/13 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BlastIndex.................ok t/BPbl2seq................... 
-------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPbl2seq...................ok 1/108 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPbl2seq...................ok t/BPlite.....................ok 1/97 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPlite.....................ok 52/97 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPlite.....................ok 
88/97 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead STACK Bio::Tools::BPlite::new /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/Tools/BPlite.pm:197 STACK toplevel t/BPlite.t:127 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPlite.....................ok t/BPpsilite.................. -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPpsilite..................ok 4/11 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead --------------------------------------------------- t/BPpsilite..................ok t/bsml_sax...................ok t/Chain......................ok t/chaosxml...................ok t/cigarstring................ok t/ClusterIO..................ok t/Coalescent.................ok t/CodonTable.................ok t/Compatible.................ok t/consed.....................ok t/CoordinateGraph............ok t/CoordinateMapper...........ok t/Correlate..................ok t/ctf........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ctf.t tests t/ctf........................ok t/CytoMap....................ok t/DB.........................skipped all skipped: Skipping all tests since they require network access, set BIOPERLDEBUG=1 to test t/DBCUTG.....................ok 11/34 skipped: Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test t/DBFasta....................ok t/DNAMutation................ok t/Domcut.....................ok t/ECnumber...................ok t/ELM........................ok 1/13 -------------------- WARNING 
--------------------- MSG: sleeping for 1 seconds --------------------------------------------------- t/ELM........................ok t/embl.......................ok t/EMBL_DB....................ok t/EMBOSS_Tools...............ok t/EncodedSeq.................ok t/entrezgene.................ok 491/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok 695/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok 723/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok 824/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467. t/entrezgene.................ok t/ePCR.......................ok t/ESEfinder..................ok 1/15# Looks like you planned 15 tests but only ran 14. t/ESEfinder..................dubious Test returned status 255 (wstat 65280, 0xff00) DIED. 
FAILED test 15 Failed 1/15 tests, 93.33% okay (less 9 skipped tests: 5 okay, 33.33%) t/est2genome.................ok t/EUtilities.................skipped all skipped: Set BIOPERLDEBUG=1 to run tests t/Exception..................ok t/Exonerate..................ok t/exp........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping exp.t tests t/exp........................ok t/fasta......................ok t/FeatureIO..................ok 7/33 -------------------- WARNING --------------------- MSG: '##feature-ontology' directive handling not yet implemented --------------------------------------------------- -------------------- WARNING --------------------- MSG: '##attribute-ontology' directive handling not yet implemented --------------------------------------------------- -------------------- WARNING --------------------- MSG: '##source-ontology' directive handling not yet implemented --------------------------------------------------- t/FeatureIO..................ok t/flat.......................ok t/FootPrinter................ok t/game.......................ok t/GbrowseGFF.................ok t/gcg........................ok t/GDB........................ok t/Gel........................ok t/genbank....................ok t/GeneCoordinateMapper.......ok t/Geneid.....................ok t/Genewise...................ok 2/51 skipped: t/Genomewise.................ok t/Genpred....................ok t/GFF........................ok t/GOR4.......................ok t/GOterm.....................ok t/GraphAdaptor...............ok t/GuessSeqFormat.............ok t/hmmer......................ok t/hmmer_pull.................ok t/HNN........................ok t/HtSNP......................ok t/Index......................ok t/InstanceSite...............ok t/interpro...................ok t/InterProParser.............ok t/IUPAC......................ok t/kegg.......................ok t/largefasta.................ok 
t/LargeLocatableSeq..........ok t/largepseq..................ok t/lasergene..................ok t/LinkageMap.................ok t/LiveSeq....................ok t/LocatableSeq...............ok t/Location...................ok t/LocationFactory............ok t/LocusLink..................ok t/lucy.......................ok t/Map........................ok t/MapIO......................ok t/masta......................ok t/Matrix.....................ok t/Measure....................ok t/MeSH.......................ok t/metafasta..................ok t/MetaSeq....................ok t/MicrosatelliteMarker.......ok t/MiniMIMentry...............ok t/MitoProt...................ok t/Molphy.....................ok t/MultiFile..................ok t/multiple_fasta.............ok t/Mutation...................ok t/Mutator....................ok t/NetPhos....................ok 10/14 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test t/Node.......................ok t/obo_parser.................ok t/OddCodes...................ok t/OMIMentry..................ok t/OMIMentryAllelicVariant....ok t/OMIMparser.................ok t/Ontology...................ok t/OntologyEngine.............ok t/OntologyStore..............ok t/PAML.......................ok t/Perl.......................ok t/phd........................ok t/Phenotype..................ok t/PhylipDist.................ok t/PhysicalMap................ok t/pICalculator...............ok t/Pictogram..................ok t/pir........................ok t/pln........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping pln.t tests t/pln........................ok t/PopGen.....................ok 2/89 skipped: t/PopGenSims.................ok t/primaryqual................ok t/PrimarySeq.................ok t/primedseq..................ok t/Primer.....................ok t/primer3....................ok t/Promoterwise...............ok t/ProtDist...................ok 
t/protgraph..................ok t/ProtMatrix.................ok t/ProtPsm....................ok t/Pseudowise.................ok t/psm........................ok t/QRNA.......................ok t/qual.......................ok t/RandDistFunctions..........ok t/RandomTreeFactory..........ok t/Range......................ok t/RangeI.....................ok t/raw........................ok t/RefSeq.....................ok t/Registry...................ok t/Relationship...............ok t/RelationshipType...........ok t/RemoteBlast................ok 11/13 skipped: to avoid timeout t/RepeatMasker...............ok t/RestrictionAnalysis........ok t/RestrictionEnzyme..........ok 1/14 -------------------- WARNING --------------------- MSG: Use of Bio::Tools::RestrictionEnzyme is deprecatedUse Bio::Restriction classes instead --------------------------------------------------- t/RestrictionEnzyme..........ok t/RestrictionIO..............ok t/RNAChange..................ok t/rnamotif...................ok t/RootI......................ok t/RootIO.....................ok 2/27 skipped: various reasons t/RootStorable...............ok t/Scansite...................ok t/scf........................ok t/SearchDist.................ok t/SearchIO...................ok t/Seg........................ok t/Seq........................ok t/seq_quality................ok t/SeqAnalysisParser..........ok t/SeqBuilder.................ok t/SeqDiff....................ok t/SeqFeatCollection..........ok t/SeqFeature.................ok t/seqfeaturePrimer...........ok t/SeqHound_DB................ok 4/14Writing into 'shoundlog' log file. 
t/SeqHound_DB................ok t/SeqIO......................ok t/SeqPattern.................ok t/seqread_fail...............ok t/SeqStats...................ok t/SequenceFamily.............ok t/sequencetrace..............ok t/SeqUtils...................ok t/SeqVersion.................ok t/seqwithquality.............ok t/SeqWords...................ok t/Sigcleave..................ok t/Signalp....................ok t/Sim4.......................ok t/SimilarityPair.............ok t/SimpleAlign................ok t/simpleGOparser.............ok t/singlet....................ok t/sirna......................ok t/SiteMatrix.................ok t/SNP........................ok t/Sopma......................ok t/Species....................ok 5/20 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test t/Spidey.....................ok t/splicedseq.................ok t/StandAloneBlast............ok t/StructIO...................ok t/Structure..................ok t/swiss......................ok t/Symbol.....................ok t/tab........................ok t/table......................ok t/TagHaplotype...............ok t/Taxonomy...................ok 44/98 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test t/TaxonTree..................ok t/Tempfile...................ok t/Term.......................ok t/tigrxml....................ok t/tinyseq....................ok t/Tmhmm......................ok t/Tools......................ok t/Tree.......................ok t/TreeBuild..................ok t/TreeIO.....................ok t/trim.......................ok t/tRNAscanSE.................ok t/UCSCParsers................ok t/Unflattener................ok t/Unflattener2...............ok t/UniGene....................ok t/Variation_IO...............ok t/WABA.......................ok t/XEMBL_DB...................ok 1/9 skipped: server may be down t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is 
installed incorrectly - skipping ztr.t tests
t/ztr........................ok
Failed Test   Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/ESEfinder.t  255 65280    15    2  13.33%  15
2 tests and 98 subtests skipped.
Failed 1/240 test scripts, 99.58% okay. 1/11910 subtests failed, 99.99% okay.
*** Error code 29
make: Fatal error: Command failed for target `test_dynamic'

real    13m10.064s
user    11m14.891s
sys     0m45.417s

$ TEST_VERBOSE=1 perl t/ESEfinder.t
1..15
ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder;
ok 2 - use Data::Dumper;
ok 3 - use Bio::PrimarySeq;
ok 4 - use Bio::Seq;
ok 5
ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
# Looks like you planned 15 tests but only ran 14.

From bix at sendu.me.uk Thu Oct 5 07:19:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 08:19:39 +0100
Subject: [Bioperl-l] EUtilities term handling
Message-ID: <4524B20B.5010703@sendu.me.uk>

This is actually a general question and not limited to EUtilities. As I
see it, EUtilities lets you do queries in Bioperl that you can do on a
website. The question is, should a Bioperl module always work with
queries that the website it is a front-end to works with?
So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is essentially a frontend onto: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term= With a web-browser you can complete that url by supplying a term. For example, the term 'BRCA2+9606[taxid]' works and returns results: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid] If you supply the exact same term to EUtilities::esearch like so: my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => "gene", -term "BRCA2+9606[taxid]"); The search fails. From my 'user' perspective this is highly unexpected. Chris (the author) and I both understand /why/ it fails, but Chris doesn't think it is a bug, or at least something than can/should be changed. What do other people think? At the very least, if something unexpected happens, I'd suggest making a note of it in the POD somewhere. Eg. "Do not use + in term strings, even though they might work on the website". Chris: what is the disadvantage of always submitting '+' as '+' to the server? From bix at sendu.me.uk Thu Oct 5 07:24:45 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 08:24:45 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4524B20B.5010703@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> Message-ID: <4524B33D.9070607@sendu.me.uk> Sendu Bala wrote: > > With a web-browser you can complete that url by supplying a term. For > example, the term 'BRCA2+9606[taxid]' works and returns results: > > http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid] > > > If you supply the exact same term to EUtilities::esearch like so: > > my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => > "gene", -term "BRCA2+9606[taxid]"); *cough* my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => "gene", -term => "BRCA2+9606[taxid]"); > The search fails. 
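
Why the '+' term behaves differently can be seen in a minimal sketch of form-style URL encoding. This is not Bioperl or URI code: form_encode is a hypothetical helper written only for illustration, and it assumes the module encodes the term the way the thread later describes (spaces become '+', a literal '+' becomes %2B). ASCII input is assumed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl or the URI module): form-encode
# one query-string value the way an HTML form submission would.
# Assumes plain ASCII input.
sub form_encode {
    my ($value) = @_;
    # Percent-encode everything except unreserved characters and spaces.
    $value =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord($1))/ge;
    # Spaces are then written as '+'.
    $value =~ s/ /+/g;
    return $value;
}

# Compare a term containing a literal '+' with terms using a space or AND.
for my $term ('BRCA2+9606[taxid]', 'BRCA2 9606[taxid]', 'BRCA2 AND 9606[taxid]') {
    printf "%-24s -> term=%s\n", $term, form_encode($term);
}
```

Under these assumptions, only the space-separated (or explicit AND) term reaches the server with a '+' separator; the term typed with '+' is sent as %2B, i.e. as a literal plus sign.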
From m.weimer at dkfz-heidelberg.de Thu Oct 5 12:15:53 2006 From: m.weimer at dkfz-heidelberg.de (Marc Weimer) Date: Thu, 05 Oct 2006 14:15:53 +0200 Subject: [Bioperl-l] Bio::DB::SwissProt Error Message-ID: <1160050554.18691.11.camel@localhost> When running -------------------------------------------------------------- #! /usr/bin/perl -w use strict; use Bio::DB::SwissProt; my $db_obj = new Bio::DB::SwissProt(-verbose=>1); my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); ------------------------------------------------------------- using Bioperl 1.4-1 I get the error message --------------------------------------------------------------------------------- request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch Content-Length: 45 Content-Type: application/x-www-form-urlencoded format=swissprot&db=swall&style=raw&id=P43780 ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: swissprot stream with no ID. Not swissprot in my book STACK: Error::throw STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328 STACK Bio::SeqIO::swiss::next_seq /usr/share/perl5/Bio/SeqIO/swiss.pm:179 STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/share/perl5/Bio/DB/WebDBSeqI.pm:187 STACK: ./putativeGele.pl:8 ----------------------------------------------------------- -------------------------------------------------------------------------------- Any suggestions? Thanks, Marc From bix at sendu.me.uk Thu Oct 5 13:21:23 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 14:21:23 +0100 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <1160050554.18691.11.camel@localhost> References: <1160050554.18691.11.camel@localhost> Message-ID: <452506D3.5050501@sendu.me.uk> Marc Weimer wrote: [snip] > my $db_obj = new Bio::DB::SwissProt(-verbose=>1); > > my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); [snip] > using Bioperl 1.4-1 I get the error message [snip] > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: swissprot stream with no ID. 
Not swissprot in my book [snip] > Any suggestions? It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most recent official release), but 1.5.2 does (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS (http://bioperl.org/wiki/Getting_BioPerl#CVS). From m.weimer at dkfz-heidelberg.de Thu Oct 5 13:35:06 2006 From: m.weimer at dkfz-heidelberg.de (Marc Weimer) Date: Thu, 05 Oct 2006 15:35:06 +0200 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <452506D3.5050501@sendu.me.uk> References: <1160050554.18691.11.camel@localhost> <452506D3.5050501@sendu.me.uk> Message-ID: <1160055306.18691.14.camel@localhost> Works fine with 1.5.2 Thanks, Marc > Marc Weimer wrote: > [snip] > > my $db_obj = new Bio::DB::SwissProt(-verbose=>1); > > > > my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); > [snip] > > using Bioperl 1.4-1 I get the error message > [snip] > > ------------- EXCEPTION: Bio::Root::Exception ------------- > > MSG: swissprot stream with no ID. Not swissprot in my book > [snip] > > Any suggestions? > > It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most > recent official release), but 1.5.2 does > (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS > (http://bioperl.org/wiki/Getting_BioPerl#CVS). -- ######################################## Dr. Marc Weimer German Cancer Research Center Central Unit Biostatistics Im Neuenheimer Feld 280 D-69120 Heidelberg Phone: +49 (0) 6221/42-2387 Fax: +49 (0) 6221/42-2397 ######################################## From hlapp at gmx.net Thu Oct 5 13:55:58 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 09:55:58 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4524B20B.5010703@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> Message-ID: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote: > This is actually a general question and not limited to EUtilities. 
> As I > see it EUtiltiies lets you do queries in Bioperl that you can do on a > website. The question is, should a Bioperl module always work with > queries that the website it is a front-end to works with? I think yes, but stick to this definition. Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez website it will actually not work. Hence, it should be no surprise that it doesn't work either using Bio::DB::EUtilities. The URL you are using to make your point is much more an example for using a web-service (SOAP, REST, or not) than it is for using a website. Using the web-service URL with a space in place of the '+' works, but yields a different result (just searches for BRCA2), so if tested for correct result the test fails. I.e., you don't expect an input form on a website to accept URL- encoded input. Instead, you expect it to do any URL-encoding for you that needs to be done. Conversely, if you are using a URL to retrieve stuff using e.g. wget or curl, it is clear that you will need to do URL encoding yourself unless there is a command line option that lets you instruct the querying program to do so. I would be careful with mangling the two definitions into one, resulting in a module that needs to serve two masters. You could consider providing an option though that lets you turn off the URL encoding on demand. Aside from that, one of the advantages of having the service wrapped in Bioperl is in fact that you can have it accept a wider variety of parameters that the actual service would allow you to have, e.g., arrays, hashes, or whatever seems appropriate. My $0.02. 
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Thu Oct 5 14:08:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 15:08:01 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> Message-ID: <452511C1.5020709@sendu.me.uk> Hilmar Lapp wrote: > > On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote: > >> This is actually a general question and not limited to EUtilities. As I >> see it EUtiltiies lets you do queries in Bioperl that you can do on a >> website. The question is, should a Bioperl module always work with >> queries that the website it is a front-end to works with? > > I think yes, but stick to this definition. > > Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez > website it will actually not work. Hence, it should be no surprise that > it doesn't work either using Bio::DB::EUtilities. On the contrary, I find it a surprise because EUtilities is an interface to NCBI's eutils, not the entrez website. If I had previously read instructions on using eutils: http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls I might (do) expect that I /should/ use + in my term. > Aside from that, one of the advantages of having the service wrapped in > Bioperl is in fact that you can have it accept a wider variety of > parameters that the actual service would allow you to have, e.g., > arrays, hashes, or whatever seems appropriate. I was going to suggest that terms be supplied as an array, leaving Bioperl code to decide how to 'AND' all the terms (elements in the array) together. 
It would also further force the user not to think of how eutils normally works, but to only consider the Bioperl instructions on how to form a query. But I'm not sure of the value of all that. From cjfields at uiuc.edu Thu Oct 5 14:06:50 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 09:06:50 -0500 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <452506D3.5050501@sendu.me.uk> References: <1160050554.18691.11.camel@localhost> <452506D3.5050501@sendu.me.uk> Message-ID: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu> On Oct 5, 2006, at 8:21 AM, Sendu Bala wrote: > Marc Weimer wrote: > [snip] >> my $db_obj = new Bio::DB::SwissProt(-verbose=>1); >> >> my $seq_obj = $db_obj->get_Seq_by_acc('P43780'); > [snip] >> using Bioperl 1.4-1 I get the error message > [snip] >> ------------- EXCEPTION: Bio::Root::Exception ------------- >> MSG: swissprot stream with no ID. Not swissprot in my book > [snip] >> Any suggestions? > > It works with the latest Bioperl. I'm not sure if 1.5.1 works (the > most > recent official release), but 1.5.2 does > (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS > (http://bioperl.org/wiki/Getting_BioPerl#CVS). Mark, you'll have to update to 1.5.2 or CVS, as Sendu suggested. There were server changes for biofetch which were fixed about 4-6 months ago (post rel. 1.5.1); I think several changes were made to Bio::SeqIO::swiss as well during this period. I think the error here results from Bio::SeqIO::swiss trying to parse an empty byte stream. Sendu, do you think that Bio::SeqIO::swiss (and other SeqIO parsers) should throw a more specific message for getting an empty byte stream? Or is it more trouble than it's worth? Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Thu Oct 5 14:14:40 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 15:14:40 +0100 Subject: [Bioperl-l] Bio::DB::SwissProt Error In-Reply-To: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu> References: <1160050554.18691.11.camel@localhost> <452506D3.5050501@sendu.me.uk> <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu> Message-ID: <45251350.5030608@sendu.me.uk> Chris Fields wrote: > >>> ------------- EXCEPTION: Bio::Root::Exception ------------- >>> MSG: swissprot stream with no ID. Not swissprot in my book [snip] > I think the error here results from Bio::SeqIO::swiss trying to parse an > empty byte stream. Sendu, do you think that Bio::SeqIO::swiss (and > other SeqIO parsers) should throw a more specific message for getting an > empty byte stream? Or is it more trouble than it's worth? Trouble wise, I've no idea without looking into it. Generally speaking though I can say that the error message is pretty useless and I'm always in favour of better error messages. From hlapp at gmx.net Thu Oct 5 14:21:49 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 10:21:49 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <452511C1.5020709@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> Message-ID: On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote: >> >> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote: >> >>> This is actually a general question and not limited to >>> EUtilities. As I >>> see it EUtiltiies lets you do queries in Bioperl that you can do >>> on a >>> website. The question is, should a Bioperl module always work with >>> queries that the website it is a front-end to works with? >> >> I think yes, but stick to this definition. >> >> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez >> website it will actually not work. 
Hence, it should be no surprise >> that >> it doesn't work either using Bio::DB::EUtilities. > > On the contrary, I find it a surprise because EUtilities is an > interface > to NCBI's eutils, not the entrez website. > > If I had previously read instructions on using eutils: > http://www.ncbi.nlm.nih.gov/books/bv.fcgi? > rid=coursework.section.constructing-urls > I might (do) expect that I /should/ use + in my term. This is my point - stick to your definitions. Are you wrapping a query form on a website or are you wrapping a web service (i.e., a URL)? The examples you give are about wrapping a web-service. Your original question was about wrapping a website. Yet another question is what the author of Bio::DB::EUtilities intended to wrap. The other thing to consider is user-friendliness. If you are wrapping a web-service, do you still make not URL-encoding the user input the default? What will 90% of the users probably want or expect to be able to do? URL-encode all input themselves or expect the module to do this for them unless they turn it off? As far as I'm concerned, I'll happily count myself among those who are lazy and ignorant, don't read NCBI's documentation, don't want to know how to URL encode and why this needs to be done, but just want it to work. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Thu Oct 5 14:31:06 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 09:31:06 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4524B20B.5010703@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> Message-ID: On Oct 5, 2006, at 2:19 AM, Sendu Bala wrote: > This is actually a general question and not limited to EUtilities. > As I > see it EUtiltiies lets you do queries in Bioperl that you can do on a > website. 
The question is, should a Bioperl module always work with
> queries that the website it is a front-end to works with?
>
> So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is
> essentially a frontend onto:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=
>
> With a web-browser you can complete that url by supplying a term. For
> example, the term 'BRCA2+9606[taxid]' works and returns results:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid]
>
> If you supply the exact same term to EUtilities::esearch like so:
>
> my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
> "gene", -term "BRCA2+9606[taxid]");
>
> The search fails. From my 'user' perspective this is highly
> unexpected.
> Chris (the author) and I both understand /why/ it fails, but Chris
> doesn't think it is a bug, or at least something than can/should be
> changed. What do other people think? At the very least, if something
> unexpected happens, I'd suggest making a note of it in the POD
> somewhere. Eg. "Do not use + in term strings, even though they might
> work on the website".
>
> Chris: what is the disadvantage of always submitting '+' as '+' to the
> server?

A few reasons:

1) According to NCBI, you can use '+' in queries, but not as a boolean.
Globally changing '+' to a space may change the meaning of the query on
a few rare occasions. So, if you really wanted to search for the string
'BRCA2+ATG', NCBI looks for that term literally.

2) '+' is a reserved character in URIs that stands for a space.
Therefore, any parameter containing a literal '+' is URI-encoded with
%2B, which is decoded on NCBI's end back to '+' (this is demonstrable
with current EUtilities output and the returned XML data).

3) Why not just use a space (implicit AND)? Or an explicit boolean? Or
'&' (which apparently works but is not specified in the NCBI Entrez
docs)?

The bug is in the query and not in the code, i.e.
it is a user-generated bug, not an EUtilities bug. And it shouldn't be
unexpected, as NCBI has very specific rules for building queries for
Entrez (just like any other database). If I were to use nonstandard
queries for MySQL, BioFetch, UCSC, or anything else, I would expect to
get bad results. As the old saying goes, garbage in, garbage out.

The following link has their updated rules:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.chapter.EntrezHelp

Here is their old one:

http://www.ncbi.nlm.nih.gov/entrez/query/static/help/helpdoc.html

We could, of course, put something in POD, but you never presented that
option to me before. I'll grant that the EUtilities API needs some
cleaning up, which is not easy to do when the returned data varies
between utilities. But it does get the URL encoding correct, at least in
this case.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From bix at sendu.me.uk Thu Oct 5 14:32:49 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:32:49 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To:
References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk>
Message-ID: <45251791.9040409@sendu.me.uk>

Hilmar Lapp wrote:
>
> On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote:
>
>> On the contrary, I find it a surprise because EUtilities is an interface
>> to NCBI's eutils, not the entrez website.
>>
>> If I had previously read instructions on using eutils:
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
>>
>> I might (do) expect that I /should/ use + in my term.
>
> This is my point - stick to your definitions. Are you wrapping a query
> form on a website or are you wrapping a web service (i.e., a URL)?
>
> The examples you give are about wrapping a web-service. Your original
> question was about wrapping a website.

Right...
I don't see that that changes the answer to my question though does it? "The question is, should a Bioperl module always work with queries that the web-service it is a front-end to works with?" For me, the answer is still yes. > As far as I'm concerned, I'll happily count myself among those who are > lazy and ignorant, don't read NCBI's documentation, don't want to know > how to URL encode and why this needs to be done, but just want it to work. That's a reasonable attitude to take. Which comes back to the question I asked of Chris - naively, if you send + as + you can please everyone, can't you? Both people who have read the docs on the web-service and those who haven't? Or are there real queries in which a user may want to search for a phrase with a literal + in it (and where such a search works via eutils)? From bix at sendu.me.uk Thu Oct 5 14:44:33 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 15:44:33 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> Message-ID: <45251A51.6020802@sendu.me.uk> Chris Fields wrote: > The bug is in the query and not in the code, i.e. is is a > user-generated bug, not an EUtilities bug. And it shouldn't be > unexpected, as NCBI has very specific rules for building queries for > Entrez (just like any other database). So I guess this comes down to something Hilmar mentioned and I never even considered before. You consider your EUtilities stuff as a frontend to entrez, and therefore consider valid queries as queries that are valid for entrez and not eutils? If that's the case, fine. I understand why you don't think this is a bug. Again, something that might warrant a mention in the POD. Currently the naming of the modules and the explicit references to eutils (and me knowing the implementation uses eutils) got me confused. 
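
The Entrez-term versus eutils-URL distinction discussed above can be illustrated from the receiving side. The sketch below is an assumption-laden illustration, not NCBI's or Bioperl's code: form_decode is a hypothetical helper that applies standard form decoding ('+' to space, %XX to the character), which is what the thread says happens on the server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl, URI, or NCBI's server):
# decode one form-encoded query-string value.
sub form_decode {
    my ($value) = @_;
    # '+' stands for a space; do this before percent-decoding so that
    # an encoded %2B survives as a literal '+'.
    $value =~ s/\+/ /g;
    # %XX becomes the character with that hex code.
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $value;
}

# What a hand-typed URL carries versus what an encoding client sends
# for the Perl string 'BRCA2+9606[taxid]'.
my %sent = (
    'typed into the URL'    => 'BRCA2+9606[taxid]',
    'encoded by the client' => 'BRCA2%2B9606%5Btaxid%5D',
);

for my $how (sort keys %sent) {
    printf "%-22s %-26s => '%s'\n", $how, $sent{$how}, form_decode($sent{$how});
}
```

Read this way, the hand-typed URL and the module call do not submit "the exact same term": after decoding, one is two words separated by a space and the other is a single string containing a plus sign.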
From cjfields at uiuc.edu Thu Oct 5 14:51:28 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 09:51:28 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <452511C1.5020709@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> Message-ID: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote: >>> This is actually a general question and not limited to >>> EUtilities. As I >>> see it EUtiltiies lets you do queries in Bioperl that you can do >>> on a >>> website. The question is, should a Bioperl module always work with >>> queries that the website it is a front-end to works with? >> >> I think yes, but stick to this definition. >> >> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez >> website it will actually not work. Hence, it should be no surprise >> that >> it doesn't work either using Bio::DB::EUtilities. > > On the contrary, I find it a surprise because EUtilities is an > interface > to NCBI's eutils, not the entrez website. It uses NCBI's CGI interface for eutils, not the SOAP interface. Very different. I have considered using the NCBI SOAP-based interface, but the web services are still somewhat incomplete, unlike the CGI interface. > If I had previously read instructions on using eutils: > http://www.ncbi.nlm.nih.gov/books/bv.fcgi? > rid=coursework.section.constructing-urls > I might (do) expect that I /should/ use + in my term. You are looking at part of the naked URL on that page. Here's what that page says: "When constructing URLs for the eUtils, please use lowercase characters for all parameters except &WebEnv. There is no required order for the URL parameters in an eUtils URL, and null values or inappropriate parameters are ignored. Avoid placing spaces in the URLs, particularly in queries. 
If a space is required, use a plus sign (+) instead of a space:

* Incorrect: &id=352, 25125, 234, ...
* Correct: &id=352,25125,234,...
* Incorrect: &term=biomol mrna[properties] AND mouse[organism]
* Correct: &term=biomol+mrna[properties]+AND+mouse[organism]

Other special characters, such as the # symbol used in referring to a
query key on the History server, should be represented by their URL
encodings (%23 for #)."

I use URI for building the URL with the parameters. URI specifically
encodes all of this for you, so spaces convert to '+' and '+' converts
to %2B.

>> Aside from that, one of the advantages of having the service
>> wrapped in
>> Bioperl is in fact that you can have it accept a wider variety of
>> parameters that the actual service would allow you to have, e.g.,
>> arrays, hashes, or whatever seems appropriate.
>
> I was going to suggest that terms be supplied as an array, leaving
> Bioperl code to decide how to 'AND' all the terms (elements in the
> array) together. It would also further force the user not to think of
> how eutils normally works, but to only consider the Bioperl
> instructions
> on how to form a query. But I'm not sure of the value of all that.

Why do we need to intuit what the user is thinking at any particular
time? How would I know that someone actually wanted to search using the
literal string 'abc+123' as opposed to 'abc 123'?

I see value in your last suggestion but I think a class or set of
classes would be best suited for that:

MySQL Query  |  in                          out  | MySQL Query
Entrez Query |-----> Generic Query class ------->| Entrez Query
SRS Query    |                                   | SRS Query
ad infinitum...

The generic query object could then be used in DB searches as an option
besides using a raw string. Though it would get tricky with SQL's
complexity...

Christopher Fields
Postdoctoral Researcher
Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Thu Oct 5 14:54:04 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 10:54:04 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251791.9040409@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <45251791.9040409@sendu.me.uk> Message-ID: <9916EDEE-EA3C-4C55-A004-A46F37B559BF@gmx.net> On Oct 5, 2006, at 10:32 AM, Sendu Bala wrote: >> The examples you give are about wrapping a web-service. Your >> original question was about wrapping a website. > > Right... I don't see that that changes the answer to my question > though does it? > > "The question is, should a Bioperl module always work with > queries that the web-service it is a front-end to works with?" > > For me, the answer is still yes. The answer is still yes. My point was the query that works with a website is not necessarily the query that works with a web-service, even if that web-service also powers the website. > >> As far as I'm concerned, I'll happily count myself among those who >> are lazy and ignorant, don't read NCBI's documentation, don't want >> to know how to URL encode and why this needs to be done, but just >> want it to work. > > That's a reasonable attitude to take. Which comes back to the > question I asked of Chris - naively, if you send + as + you can > please everyone, can't you? Both people who have read the docs on > the web-service and those who haven't? Or are there real queries in > which a user may want to search for a phrase with a literal + in it > (and where such a search works via eutils)? So are you suggesting to URL-encode some characters but not others? This would move you into muddy waters and I'm wondering what the gain is from that, and for whom it is a gain. 
It sounds like it will mostly benefit those who have studied the NCBI documentation and know exactly the URL they want to send and want to ignore the EUtilities POD. My humble guess is the far majority of people will either not read any documentation, or read the module's POD. Maybe a better way to serve both types of people is to accept a parameter -querystring that is expected to include everything from 'term=' onwards (including 'term=' itself) which gives you complete control and freedom if you know what you are doing, and otherwise implement what you suggested before: > I was going to suggest that terms be supplied as an array, leaving > Bioperl code to decide how to 'AND' all the terms (elements in the > array) together. It would also further force the user not to think of > how eutils normally works, but to only consider the Bioperl > instructions > on how to form a query. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Thu Oct 5 15:02:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 16:02:01 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> Message-ID: <45251E69.7040507@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote: > >> On the contrary, I find it a surprise because EUtilities is an interface >> to NCBI's eutils, not the entrez website. > > It uses NCBI's CGI interface for eutils, not the SOAP interface. Very > different. I have considered using the NCBI SOAP-based interface, but > the web services are still somewhat incomplete, unlike the CGI interface. I don't know anything about the SOAP interface. 
I'm talking about the CGI interface that you use. >> If I had previously read instructions on using eutils: >> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls >> >> I might (do) expect that I /should/ use + in my term. > > You are looking at part of the naked URL on that page. Here's what that > page says: I know what it says... > * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] The correct query is the one that has +s in it. > I use URI for building the URL with the parameters. URI specifically > encodes all of this for you, so spaces convert to '+' and '+' converts > to %2B. Well, yes. This causes what I thought of as a bug. It prevents me from submitting a /correct/ eutils term. However it isn't a bug if you explain to users they shouldn't be submitting valid eutils terms, but only valid /entrez/ terms. From cjfields at uiuc.edu Thu Oct 5 15:15:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:15:49 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251A51.6020802@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <45251A51.6020802@sendu.me.uk> Message-ID: On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote: > Chris Fields wrote: >> The bug is in the query and not in the code, i.e. is is a user- >> generated bug, not an EUtilities bug. And it shouldn't be >> unexpected, as NCBI has very specific rules for building queries >> for Entrez (just like any other database). > > So I guess this comes down to something Hilmar mentioned and I > never even considered before. You consider your EUtilities stuff as > a frontend to entrez, and therefore consider valid queries as > queries that are valid for entrez and not eutils? The eutils tools access the same databases as the web page, in the same way, using the same search terms. 
From the EUtilities docs: "The eUtils access the core search and retrieval engine of the Entrez system and, therefore, are only capable of retrieving data that are already in Entrez." > If that's the case, fine. I understand why you don't think this is > a bug. Again, something that might warrant a mention in the POD. > Currently the naming of the modules and the explicit references to > eutils (and me knowing the implementation uses eutils) got me > confused. I'll note that in there is URI encoding in POD, but that should be a no-brainer. I don't think every Bio::DB* class specifies this, mainly because it is taken for granted. Pretty much anything that builds URL strings needs to encode based on the URI standard, and any server that accepts URLs is expected to decode using the same standard. So, again, why does that have to be specifically outlined in POD? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 15:24:39 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:24:39 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251E69.7040507@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> Message-ID: >> I use URI for building the URL with the parameters. URI >> specifically encodes all of this for you, so spaces convert to '+' >> and '+' converts to %2B. > > Well, yes. This causes what I thought of as a bug. It prevents me > from submitting a /correct/ eutils term. However it isn't a bug if > you explain to users they shouldn't be submitting valid eutils > terms, but only valid /entrez/ terms. I can specify in POD that URI encoding is in effect if that placates you, and maybe add a bit about how terms are to be built (based on the website). 
I also noticed that the esearch POD doesn't have a demo in the SYNOPSIS yet (my fault). However, I think this is all a bit silly. This is something most people already realize and take for granted (it's standard for any CGI interface to use URI encoding). Also, most Entrez users do not use a term like 'BRCA2+Human [ORGANISM]'. They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human [ORGANISM]', the latter which is implicit. All of this is on the Entrez website. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From MEC at stowers-institute.org Thu Oct 5 15:12:02 2006 From: MEC at stowers-institute.org (Cook, Malcolm) Date: Thu, 5 Oct 2006 10:12:02 -0500 Subject: [Bioperl-l] using nfreeze instead of freeze in Bio::SeqFeature::Store Message-ID: Lincoln, I committed a change to Bio::SeqFeature::Store to use nfreeze instead of freeze which should allow SeqFeature objects to survive database freeze/thaw cycles across architectures. I hope I was not presumptuous or in error in doing this.... Regards, Malcolm Cook Database Applications Manager - Bioinformatics Stowers Institute for Medical Research - Kansas City, Missouri From bix at sendu.me.uk Thu Oct 5 15:28:55 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 16:28:55 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> <45251A51.6020802@sendu.me.uk> Message-ID: <452524B7.5080003@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> The bug is in the query and not in the code, i.e. is is a >>> user-generated bug, not an EUtilities bug. And it shouldn't be >>> unexpected, as NCBI has very specific rules for building queries for >>> Entrez (just like any other database). >> >> So I guess this comes down to something Hilmar mentioned and I never >> even considered before. 
You consider your EUtilities stuff as a >> frontend to entrez, and therefore consider valid queries as queries >> that are valid for entrez and not eutils? > > The eutils tools access the same databases as the web page, in the same > way, using the same search terms. It doesn't. The eutils interface behaves differently with +s than does the entrez website interface. In eutils + means space, whilst in entrez, + means the plus symbol. >> If that's the case, fine. I understand why you don't think this is a >> bug. Again, something that might warrant a mention in the POD. >> Currently the naming of the modules and the explicit references to >> eutils (and me knowing the implementation uses eutils) got me confused. > > I'll note that in there is URI encoding in POD, but that should be a > no-brainer. Just that it is URI encoded isn't the problem. The problem is the difference in behaviour outlined above. > I don't think every Bio::DB* class specifies this, mainly > because it is taken for granted. Pretty much anything that builds URL > strings needs to encode based on the URI standard, and any server that > accepts URLs is expected to decode using the same standard. > > So, again, why does that have to be specifically outlined in POD? Because they're different. If I construct a valid eutils query it might not work. You ought to explain why. "EUtilities takes any valid entrez query and transforms it into a valid eutils query for submission. 
Do not try and provide a valid eutils query of your own, or the extra transformation will result in no results" From bix at sendu.me.uk Thu Oct 5 15:30:44 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 16:30:44 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> Message-ID: <45252524.7030006@sendu.me.uk> Chris Fields wrote: >>> I use URI for building the URL with the parameters. URI specifically >>> encodes all of this for you, so spaces convert to '+' and '+' >>> converts to %2B. >> >> Well, yes. This causes what I thought of as a bug. It prevents me from >> submitting a /correct/ eutils term. However it isn't a bug if you >> explain to users they shouldn't be submitting valid eutils terms, but >> only valid /entrez/ terms. > > I can specify in POD that URI encoding is in effect if that placates > you, and maybe add a bit about how terms are to be built (based on the > website). I also noticed that the esearch POD doesn't have a demo in > the SYNOPSIS yet (my fault). > > However, I think this is all a bit silly. This is something most people > already realize and take for granted (it's standard for any CGI > interface to use URI encoding). > > Also, most Entrez users do not use a term like 'BRCA2+Human[ORGANISM]'. > They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human[ORGANISM]', the > latter which is implicit. All of this is on the Entrez website. Exactly. You're assuming an entrez user and expecting an entrez query. I don't think its silly given the name of the modules for the user to assume the code needs an eutils query, which is a different thing with different behaviour /independent/ of URI encoding. 
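
The behaviour argued over in this thread (spaces become '+', a literal '+' becomes %2B) can be sketched in a few lines of core Perl. This is only an illustration under stated assumptions: form_encode is a hypothetical helper written for this example, not part of Bio::DB::EUtilities or the URI module, and it handles ASCII input only. It roughly mimics what a form-style query encoder does to a term before it is placed in the URL.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not a BioPerl or URI
# function): form-encode a single query value.
sub form_encode {
    my ($value) = @_;
    # Percent-encode everything except unreserved characters and the space.
    $value =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord $1)/ge;
    # Spaces are written as '+' in a form-encoded query string.
    $value =~ tr/ /+/;
    return $value;
}

# A plain Entrez-style term, as typed into the website search box.
my $entrez_term = 'BRCA2 AND Human[ORGANISM]';

# The same term copied from an already-encoded URL.
my $pre_encoded = 'BRCA2+AND+Human[ORGANISM]';

print form_encode($entrez_term), "\n";   # BRCA2+AND+Human%5BORGANISM%5D
print form_encode($pre_encoded), "\n";   # BRCA2%2BAND%2BHuman%5BORGANISM%5D
```

The second line of output shows why a term pasted from a URL stops working once it is encoded again: each '+' is sent as a literal plus sign rather than as a space.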
From cjfields at uiuc.edu Thu Oct 5 15:50:51 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:50:51 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45251E69.7040507@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> Message-ID: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> > I know what it says... Ah, that's the Sendu I know and love. > >> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] > > The correct query is the one that has +s in it. Yes, that's because it's a URL, not a raw search term string (it has been URI-encoded so spaces are converted to '+'). If you use that as a direct query in Entrez you will not get the same response. You do get something if you use the new NCBI global query form on the main page, but clicking on the nucleotide or PMC hits reveals that the URL is malformed and no term is present. That is exactly the same response in EUtilities: 0 0 0 Note the QueryTranslation tag is empty. The only noticeable difference is using egquery (which I just fixed in CVS yesterday). The returned XML gives no hits for any database, which is true based on individual esearch queries for those database, and is actually more consistent than the website version. >> I use URI for building the URL with the parameters. URI specifically >> encodes all of this for you, so spaces convert to '+' and '+' >> converts >> to %2B. > > Well, yes. This causes what I thought of as a bug. It prevents me from > submitting a /correct/ eutils term. However it isn't a bug if you > explain to users they shouldn't be submitting valid eutils terms, but > only valid /entrez/ terms. If you mean that most users will actually use a URL-like search term, then I would say you have a point. But that simply isn't the case. If clarifying the docs makes it better, then so be it. 
Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 15:59:53 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 10:59:53 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45252524.7030006@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> Message-ID: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote: > Chris Fields wrote: >>>> I use URI for building the URL with the parameters. URI >>>> specifically encodes all of this for you, so spaces convert to >>>> '+' and '+' converts to %2B. >>> >>> Well, yes. This causes what I thought of as a bug. It prevents me >>> from submitting a /correct/ eutils term. However it isn't a bug >>> if you explain to users they shouldn't be submitting valid eutils >>> terms, but only valid /entrez/ terms. >> I can specify in POD that URI encoding is in effect if that >> placates you, and maybe add a bit about how terms are to be built >> (based on the website). I also noticed that the esearch POD >> doesn't have a demo in the SYNOPSIS yet (my fault). >> However, I think this is all a bit silly. This is something most >> people already realize and take for granted (it's standard for any >> CGI interface to use URI encoding). >> Also, most Entrez users do not use a term like 'BRCA2+Human >> [ORGANISM]'. They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human >> [ORGANISM]', the latter which is implicit. All of this is on the >> Entrez website. > > Exactly. You're assuming an entrez user and expecting an entrez > query. 
I don't think its silly given the name of the modules for > the user to assume the code needs an eutils query, which is a > different thing with different behaviour /independent/ of URI > encoding. It's a silly distinction. The POD for Bio::DB::EUtilities states: Bio::DB::EUtilities - interface for handling web queries and data retrieval from NCBI's Entrez Utilities. My question is this : why would anyone (particularly the everyday bioperl user) want to use URL-encoded parameters for a query? That seems to be your main argument here. If so, wouldn't I just paste them together then send them off NCBI eutils? Would I devote ~ 10 classes to that? I could do that in a short program using an array, join, and LWP::Simple. The purpose is quite clearly stated, but if you feel that by badgering me to add something to POD I consider common sense, then you're right. You've succeeded. Bravo. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Thu Oct 5 16:02:05 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 17:02:05 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> Message-ID: <45252C7D.3050009@sendu.me.uk> Chris Fields wrote: > >>> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] >> >> The correct query is the one that has +s in it. > > Yes, that's because it's a URL, not a raw search term string (it has > been URI-encoded so spaces are converted to '+'). If you use that as a > direct query in Entrez you will not get the same response. But we're not doing Entrez queries. 
We're using a module called EUtilities to do an eutils query, which involves forming a URL in which spaces should be converted to +. That's the source of confusion. Is the user supposed to do this, or is EUtilities?

All you had to do 8 emails ago is tell me that EUtilities is supposed to do that. You /still/ haven't told me that. I give up.

From cjfields at uiuc.edu Thu Oct 5 16:12:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 11:12:11 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45252C7D.3050009@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> <45252C7D.3050009@sendu.me.uk>
Message-ID:

On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>>>> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>>
>>> The correct query is the one that has +s in it.
>> Yes, that's because it's a URL, not a raw search term string (it
>> has been URI-encoded so spaces are converted to '+'). If you use
>> that as a direct query in Entrez you will not get the same response.
>
> But we're not doing Entrez queries. We're using a module called
> EUtilities to do an eutils query, which involves forming a URL in
> which spaces should be converted to +. That's the source of
> confusion. Is the user supposed to do this, or is EUtilities?
>
> All you had to do 8 emails ago is tell me that EUtilities is
> supposed to do that. You /still/ haven't told me that. I give up.

It should be apparent from the documentation and the URLs posted in debugging output the first few times you used it. Again, why would I dedicate ~ 10 classes to pasting together URI-encoded strings?

Christopher Fields
Postdoctoral Researcher
Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Thu Oct 5 16:22:36 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 17:22:36 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> Message-ID: <4525314C.7020205@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote: > >> Exactly. You're assuming an entrez user and expecting an entrez query. >> I don't think its silly given the name of the modules for the user to >> assume the code needs an eutils query, which is a different thing with >> different behaviour /independent/ of URI encoding. > > It's a silly distinction. The POD for Bio::DB::EUtilities states: > > Bio::DB::EUtilities - interface for handling web queries and data > retrieval from NCBI's Entrez Utilities. > > My question is this : why would anyone (particularly the everyday > bioperl user) want to use URL-encoded parameters for a query? Well I'll tell you why I was trying to use URL-encoded parameters, if that helps you any. I read the pod for EUtilities but all the examples have very simple -term s defined with just a single word. So I wonder how I'm supposed to make an 'AND' term. I also have no idea what utilities I'm supposed to use, or what databases etc. I need to get the answer I want. The POD points me here: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html Combined with the EUtilities synopsis I know I'm supposed to start with esearch so I look at: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html And figure out what my terms are supposed to be. 
Then I test some example terms in my web browser using the esearch base url (http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?) to see if they work, and copy/paste the terms into my EUtilities-using perl script, replacing variable terms with perl variables. Then I find that my terms don't work, ask you about it, and you fail to tell me I should be testing my terms at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=gene. If you think I'm stupid, fine, but I'm probably not the only stupid person on the planet. Which is why I suggested a POD addition. You don't have to make any POD change if you don't want to. I simply thought it might help avoid anyone 'badgering' you in the future with a similar problem. From bix at sendu.me.uk Thu Oct 5 16:28:51 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 17:28:51 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> <45252C7D.3050009@sendu.me.uk> Message-ID: <452532C3.9030804@sendu.me.uk> Chris Fields wrote: > > On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote: > >> Chris Fields wrote: >>> >>>>> * Correct: &term=biomol+mrna[properties]+AND+mouse[organism] >>>> >>>> The correct query is the one that has +s in it. >>> Yes, that's because it's a URL, not a raw search term string (it has >>> been URI-encoded so spaces are converted to '+'). If you use that as >>> a direct query in Entrez you will not get the same response. >> >> But we're not doing Entrez queries. We're using a module called >> EUtilities to do an eutils query, which involves forming a url in >> which spaces should to be converted to +. That's the source of >> confusion. Is the user supposed to do this, or is EUtilities? 
>> >> All you had to do 8 emails ago is tell me that EUtilities is supposed >> to do that. You /still/ haven't told me that. I give up. > > It should be apparent from the documentation and the URLs posted in > debugging output the first few times you used it. Again, why would I > dedicate ~ 10 classes to pasting together URI-encoded strings? I'm not sure how not doing URI-encoding would suddenly make your classes worthless. I find them to be very useful (even when I didn't know there was any URI-encoding, was incorrectly using +s and it happened to work anyway). From bernd.web at gmail.com Thu Oct 5 14:09:38 2006 From: bernd.web at gmail.com (Bernd Web) Date: Thu, 5 Oct 2006 16:09:38 +0200 Subject: [Bioperl-l] Eutilities Batch Message-ID: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com> Hi, I am using the new EUtilities. It looks great. I was trying to use epost followed by elink but i get an error. The same error is actually given with the example on http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html: Can't call method "get_databases" on an undefined value at EU.pl line 25. For completeness, the code is shown below too. Any suggestions what is going wrong? 
Regards,
Bernd

# chain EUtilities for complex queries

use Bio::DB::EUtilities;

my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                       -db         => 'pubmed',
                                       -term       => 'hutP',
                                       -usehistory => 'y');

$esearch->get_response; # parse the response, fetch a cookie

my $elink = Bio::DB::EUtilities->new(-eutil  => 'elink',
                                     -db     => 'protein,taxonomy',
                                     -dbfrom => 'pubmed',
                                     -cookie => $esearch->next_cookie,
                                     -cmd    => 'neighbor');

# this retrieves the Bio::DB::EUtilities::ElinkData object

my ($linkset) = $elink->next_linkset;
my @ids;

# step through IDs for each linked database in the ElinkData object

for my $db ($linkset->get_databases) {
    @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
    # do something here
}

From cjfields at uiuc.edu Thu Oct 5 17:31:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:31:33 -0500
Subject: [Bioperl-l] Eutilities Batch
In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
Message-ID:

I'll look into it. I'm busy updating the EUtilities tools now.

Chris

On Oct 5, 2006, at 9:09 AM, Bernd Web wrote:

> Hi,
>
> I am using the new EUtilities. It looks great.
> I was trying to use epost followed by elink but i get an error. The
> same error is actually given with the example on
> http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
> Can't call method "get_databases" on an undefined value at EU.pl
> line 25.
>
> For completeness, the code is shown below too.
>
> Any suggestions what is going wrong?
>
> Regards,
> Bernd
>
> # chain EUtilities for complex queries
>
> use Bio::DB::EUtilities;
>
> my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                        -db         => 'pubmed',
>                                        -term       => 'hutP',
>                                        -usehistory => 'y');
>
> $esearch->get_response; # parse the response, fetch a cookie
>
> my $elink = Bio::DB::EUtilities->new(-eutil  => 'elink',
>                                      -db     => 'protein,taxonomy',
>                                      -dbfrom => 'pubmed',
>                                      -cookie => $esearch->next_cookie,
>                                      -cmd    => 'neighbor');
>
> # this retrieves the Bio::DB::EUtilities::ElinkData object
>
> my ($linkset) = $elink->next_linkset;
> my @ids;
>
> # step through IDs for each linked database in the ElinkData object
>
> for my $db ($linkset->get_databases) {
>     @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
>     # do something here
> }
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From daniel.lang at biologie.uni-freiburg.de Thu Oct 5 17:12:02 2006
From: daniel.lang at biologie.uni-freiburg.de (Daniel Lang)
Date: Thu, 05 Oct 2006 19:12:02 +0200
Subject: [Bioperl-l] Bio::DB::SeqFeature
Message-ID: <45253CE2.1070208@biologie.uni-freiburg.de>

Hi,

we are storing Bio::SeqFeature::Gene::GeneStructure objects (with multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db (latest bioperl-live checkout). The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch out of a database.

The first observation is that it seems to work (fetched objects behave like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we get these warnings:

Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into lib/auto/Storable/_freeze.al) line 287, line 1.
Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into lib/auto/Storable/_freeze.al) line 287, line 1.
(in cleanup) Not a CODE reference at /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1. prepare_cached(SELECT f.id,f.object FROM feature as f WHERE ( f.seqid=? AND f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?) OR (f.tier=? AND f.bin between ? AND ?) OR (f.tier=? AND f.bin between ? AND ?) OR (f.tier=? AND f.bin between ? AND ?) OR (f.tier=? AND f.bin between ? AND ?) OR (f.tier=? AND f.bin between ? AND ?) OR (f.tier=? AND f.bin between ? AND ?)) ) ) statement handle DBI::st=HASH(0x1c317cf0) still Active at /home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422 (in cleanup) Not a CODE reference at /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1. Is this something serious? Does this mean that the stored object doesn't have everything it had before freezing? Or are we using Bio::DB::SeqFeature inappropriately? The other question would be, if we can visualize these stored feature objects easily using gbrowse? I didn't find a hint mentioning Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages... Is it working already? Will it? Thanks in advance, Daniel -- Daniel Lang University of Freiburg, Plant Biotechnology Schaenzlestr. 1, D-79104 Freiburg fax: +49 761 203 6945 phone: +49 761 203 6974 homepage: http://www.plant-biotech.net/ e-mail: daniel.lang at biologie.uni-freiburg.de ################################################# My software never has bugs. It just develops random features. 
#################################################

From cjfields at uiuc.edu Thu Oct 5 17:45:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:45:40 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452532C3.9030804@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu> <45252C7D.3050009@sendu.me.uk> <452532C3.9030804@sendu.me.uk>
Message-ID: <003DD8C4-6E59-44C2-9A1C-117E036D93BC@uiuc.edu>

On Oct 5, 2006, at 11:28 AM, Sendu Bala wrote:

> I'm not sure how not doing URI-encoding would suddenly make your
> classes worthless. I find them to be very useful (even when I
> didn't know there was any URI-encoding, was incorrectly using +s
> and it happened to work anyway).

That's not my point (and sincerest apologies for the 'badgering' bit). If you made the assumption that all the parameters had to be URI-encoded, why couldn't I do something like:

my %param = ( );  # make up your list of parameters here
my $eutil = 'esearch';
my $url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/$eutil.fcgi";
# join the key value pairs with '=', then join all those with &
# add to end of url
# post and retrieve via LWP::Simple

It's more user-friendly to set up the parameters so that you wouldn't have to encode everything yourself, esp. when the most reliable way to encode URI strings is to 'use URI'.

Christopher Fields
Postdoctoral Researcher
Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 18:11:25 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 13:11:25 -0500 Subject: [Bioperl-l] Eutilities Batch In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com> References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com> Message-ID: <4A340977-C6AD-4728-8947-BF5A8A782807@uiuc.edu> On Oct 5, 2006, at 9:09 AM, Bernd Web wrote: > Hi, > > I am using the new EUtilities. It looks great. > I was trying to use epost followed by elink but i get an error. The > same error is actually given with the example on > http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html: > Can't call method "get_databases" on an undefined value at EU.pl > line 25. > > For completeness, the code is shown below too. > > Any suggestions what is going wrong? > > Regards, > Bernd Grr...that's my error, sorry Bernd. The POD wasn't updated to match the change I made and has a few errors. The elink object, for starters, doesn't fetch the response using get_response(). Also, the ElinkData method has changed slightly but accomplishes the same thing. Odd, since I copied and pasted that from working code... Just a note: these are considered highly experimental at the moment, though they should be ready for general use and toying around. I would like any suggestions on methods and so on you may have (Sendu has made some very helpful ones off-list which I plan on implementing). Feel free to let me know if something doesn't work. Note that, because of their experimental nature, you will want to take note of any methods changes in particular as I try to solidify the API and clean up the POD, so expect some momentary 'outages'. 
I plan on setting up a remedial interface for all the container objects (like ElinkData) which will help clarify things and solidify the API in the next few weeks, at least to a point where the class methods have a consistent naming scheme. I plan on using this as a backend web agent for a general Entrez interface at some point to get data into Bio* objects.

In the meantime, try this:

use Bio::DB::EUtilities;

my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                       -db         => 'pubmed',
                                       -term       => 'hutP',
                                       -usehistory => 'y');

$esearch->get_response; # parse the response, fetch a cookie

my $elink = Bio::DB::EUtilities->new(-eutil  => 'elink',
                                     -db     => 'protein,taxonomy',
                                     -dbfrom => 'pubmed',
                                     -cookie => $esearch->next_cookie,
                                     -cmd    => 'neighbor');

$elink->get_response;

# this retrieves the Bio::DB::EUtilities::ElinkData object

my $linkset = $elink->next_linkset;
my @ids;

# step through IDs for each linked database in the ElinkData object

for my $db ($linkset->get_all_linkdbs) {
    @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
    print join q(,), @ids;
    # do something here
}

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From dmessina at wustl.edu Thu Oct 5 18:07:56 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 5 Oct 2006 13:07:56 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
Message-ID: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>

I'm pleased to announce a revised version of the BioPerl Deobfuscator is now available. Many thanks to Mauricio Cuadra for updating bioperl.org's installation:

http://bioperl.org/cgi-bin/deob_interface.cgi

I've incorporated many of the suggestions you all sent in after the first release, and many of the modules that had non-standard documentation have been updated in the meantime, too, so hopefully you'll find it much improved. There are still some issues with a few modules; please report any problems you see.
Also, it's now indexing bioperl-live instead of 1.4, which should make it a little more useful, too.

A complete list of changes is below. I welcome your bug reports and suggestions for improvements, via email, this list, Bugzilla, or the Wiki page.

Thanks,
Dave

Changes

0.0.3 Mon Oct 2 20:01:45 CDT 2006

    FIX: change default $deob_detail_path to be a relative URL instead of having localhost hardcoded. Thanks to Jason Stajich for pointing this out.

    FIX: Bio::Ontology modules are no longer missing their prefix in the class list, and their methods are now shown in the lower pane as expected. Thanks to Hilmar Lapp for reporting this bug.

    FIX: can now handle (and ignore) VERSION POD section.

    FIX: missing SYNOPSIS section now handled properly. In fact, the SYNOPSIS and DESCRIPTION sections can be in reverse order now, although for consistency this is not recommended.

    FIX: Bug #2114: "Obfuscator doesn't show "Bio:Matrix:Generic" has been fixed. This bug turned out to afflict multiple modules, which weren't getting parsed correctly by deob_index.pl.

    NEW: Table cells have been padded out to get rid of that "scrunched" look. Thanks to Sendu Bala for this great suggestion.

    NEW: If the 'Returns' subsection of a method's documentation contains a POD L<> link, the Deobfuscator assumes this to be a package name, and wraps it in an href for display. This feature is not robust, but seems to work well enough for now.

    NEW: the list of classes is now sorted alphabetically depth-first, so that subclasses appear just after their parent class. Thanks to Amir Karger for noticing the strange sorting behavior.

    NEW: HTML page title now 'BioPerl Deobfuscator' to distinguish it from other Deobfuscators out there. Thanks to Amir Karger for suggesting this.

    NEW: 'No match' search string now more prominent. Yep, kudos to Amir Karger again -- another great idea!

    NEW: Search box caption now explicitly states that only package names can be searched. Big ups to Amir Karger for this suggestion.
The ability to search method names is planned for a future version. NEW: added -x option to deob_index.pl. This allows the use of an 'excluded modules' file. This feature was added to resolve an issue with four modules which rely on external modules to compile. Class::Inspector, used by the Deobfuscator needs to load a module to traverse its inheritance tree, and modules must compile before they can be loaded. CHANGE: using short name now when traversing with File::Find to help identify excluded modules (deob_index.pl). From lincoln.stein at gmail.com Thu Oct 5 18:41:08 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Thu, 5 Oct 2006 14:41:08 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC1 In-Reply-To: References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net> Message-ID: <6dce9a0b0610051141x6b61407ar1c0a13cf7616b35f@mail.gmail.com> The non-numeric comparison bug in Bio::DB::SeqFeature is fixed in the latest CVS. Do I need to do anything special to get the CVS fixes into the release candidate? Lincoln On 10/2/06, Chris Fields wrote: > > [I won't create a wiki account just to report this.] > > > > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG > > not set. Lots of warnings about missing packages and all, but this > > looks interesting: > > > > Argument "+" isn't numeric in numeric lt (<) at Bio/DB/ > > SeqFeature/Segment.pm line 423. > > This is verified on Mac OS X. > > > Otherwise: > > > > Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, > > 99.99% okay. > > > > The failed test is: > > > > t/ESEfinder..................dubious > > Test returned status 255 (wstat 65280, 0xff00) > > DIED. FAILED test 15 > > What do you get when you run that set of tests using 'perl -I. -w t/ > ESEFinder.t'? The bad status code is odd and could be a remote > server issue. > > Chris > > > > > > florin > > > > -- > > If we wish to count lines of code, we should not regard them as lines > > produced but as lines spent. 
-- Edsger Dijkstra > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From MEC at stowers-institute.org Thu Oct 5 19:18:08 2006 From: MEC at stowers-institute.org (Cook, Malcolm) Date: Thu, 5 Oct 2006 14:18:08 -0500 Subject: [Bioperl-l] using nfreeze instead of freeze in Bio::SeqFeature::Store Message-ID: Yes, there is overhead (c.f. perldoc Storable) "When writing in network order, all fields are written out as standard lengths, which allows full interworking, but takes longer to read and write)" And, I suppose there is also risk of loosing precision in using network order: You can also store data in network order to allow easy sharing across multiple platforms, or when storing on a socket known to be remotely connected. The routines to call have an initial "n" prefix for *network*, as in "nstore" and "nstore_fd". At retrieval time, your data will be correctly restored so you don't have to know whether you're restoring from native or network ordered data. Double values are stored stringified to ensure portability as well, at the slight risk of loosing some precision in the last decimals. So, I agree, it should be configuration option, perhaps defaulting to using network order. 
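For anyone following along, the difference between the two calls can be sketched as below. This is a minimal illustration using only the core Storable module; the $feature hash is a made-up stand-in, not a real Bio::DB::SeqFeature object, and nothing here is taken from the Store.pm code itself.

```perl
use strict;
use warnings;
use Storable qw(freeze nfreeze thaw);

# Made-up stand-in for a feature record (not a BioPerl object).
my $feature = { seqid => 'chr1', start => 100, end => 250 };

# freeze() writes in the native byte order of the current machine;
# nfreeze() writes in network order, so the frozen string can be
# thawed on a machine with a different architecture.
my $native  = freeze($feature);
my $network = nfreeze($feature);

# thaw() detects which format it was given, so the reading side
# does not need to know how the data was frozen.
my $from_native  = thaw($native);
my $from_network = thaw($network);

print "native:  $from_native->{seqid}:$from_native->{start}-$from_native->{end}\n";
print "network: $from_network->{seqid}:$from_network->{start}-$from_network->{end}\n";
```

Since thaw() accepts either format, switching the writer to nfreeze should not require changes on the reading side; the cost is the extra read/write time the Storable documentation mentions.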
However, given the factoring of ../Bio/DB/SeqFeature/Store.pm I'm not sure how to best make it a configuration option since the two provided serializers don't share a common interface. Possibly something like:

=head1 Methods for Connecting and Initializing a Database

=head2 new

 Title   : new
 Usage   : $db = Bio::DB::SeqFeature::Store->new(@options)
 Function: connect to a database
 Returns : A descendent of Bio::DB::SeqFeature::Store
 Args    : several - see below
 Status  : public

This class method creates a new database connection. The following -name=E<gt>$value arguments are accepted:

 Name               Value
 ----               -----

 -adaptor           The name of the Adaptor class (default DBI::mysql)

 -serializer        The name of the serializer class (default Storable)

 -network_order     Strive to 'preserve network order' (if the
                    serializer implements it). Currently, only
                    Storable.pm does, and this will cause it to use
                    nfreeze instead of freeze. (default 1)

 -index_subfeatures Whether or not to make subfeatures searchable
                    (default true)

 -cache             Activate LRU caching feature -- size of cache

 -compress          Compresses features before storing them in database
                    using Compress::Zlib

Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri

> -----Original Message-----
> From: Lincoln Stein [mailto:lincoln.stein at gmail.com]
> Sent: Thursday, October 05, 2006 1:43 PM
> To: Cook, Malcolm
> Cc: lstein at cshl.org; bioperl-l
> Subject: Re: using nfreeze instead of freeze in Bio::SeqFeature::Store
>
> I think it's fine unless there is a significant performance hit, in
> which case the change should be made into a configuration option. Do
> you know if there is any overhead on doing this?
> > Lincoln > > On 10/5/06, Cook, Malcolm wrote: > > Lincoln, > > > > I committed a change to Bio::SeqFeature::Store to use > nfreeze instead of > > freeze which should allow SeqFeature objects to survive database > > freeze/thaw cycles across architectures. > > > > I hope I was not presumptuous or in error in doing this.... > > > > Regards, > > > > Malcolm Cook > > Database Applications Manager - Bioinformatics > > Stowers Institute for Medical Research - Kansas City, Missouri > > > > > > > -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > (516) 367-8380 (voice) > (516) 367-8389 (fax) > FOR URGENT MESSAGES & SCHEDULING, > PLEASE CONTACT MY ASSISTANT, > SANDRA MICHELSEN, AT michelse at cshl.edu > From lincoln.stein at gmail.com Thu Oct 5 18:32:40 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Thu, 5 Oct 2006 14:32:40 -0400 Subject: [Bioperl-l] Bio::DB::SeqFeature In-Reply-To: <45253CE2.1070208@biologie.uni-freiburg.de> References: <45253CE2.1070208@biologie.uni-freiburg.de> Message-ID: <6dce9a0b0610051132p7d7fcf84g27578731f9727f3f@mail.gmail.com> Hi Daniel, The warnings you are seeing are occurring because Bio::SeqFeature::Gene::GeneStructure contains a CODE reference. I think it must be registering a cleanup method via its Bio::Root::Root ancestor. When Storable serializes the object, it complains that it can't serialize the CODE reference and instead converts it into the string "CODE(0xXXXXX)". Then, after you thaw the object, Bio::Root::Root is complaining that the CODE reference is invalid because it is a string, not a reference. Yuck. I think, however, that I can fix this by setting some magic variables in Storable version 2.05 that will decompile and compile the CODE references. I will try this and send you a note when the code is in CVS. GBrowse does run off Bio::DB::SeqFeature::Store and is noticeably faster than the original Bio::DB::GFF adaptor. 
Nothing really changes except that you set the db_adaptor option to Bio::DB::SeqFeature::Store. I haven't tried it using Bio::SeqFeature::Gene::GeneStructure, so no guarantees, but I am hopeful that it will work. Lincoln On 10/5/06, Daniel Lang wrote: > Hi, > > we are storing Bio::SeqFeature::Gene::GeneStructure objects (with > multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db > (latest bioperl-live checkout). > > The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch > out of a database. > > The first observation is that is seems to work (fetched objects behave > like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we > get these warnings: > > Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into > lib/auto/Storable/_freeze.al) line 287, line 1. > Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into > lib/auto/Storable/_freeze.al) line 287, line 1. > (in cleanup) Not a CODE reference at > /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1. > prepare_cached(SELECT f.id,f.object > FROM feature as f > WHERE ( f.seqid=? > AND f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?) > OR (f.tier=? AND f.bin between ? AND ?)) > ) > > ) statement handle DBI::st=HASH(0x1c317cf0) still Active at > /home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm > line 1422 > (in cleanup) Not a CODE reference at > /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, line 1. > > Is this something serious? Does this mean that the stored object doesn't > have everything it had before freezing? Or are we using > Bio::DB::SeqFeature inappropriately? > > The other question would be, if we can visualize these stored feature > objects easily using gbrowse? 
I didn't find a hint mentioning > Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages... > Is it working already? Will it? > > Thanks in advance, > Daniel > > -- > > Daniel Lang > University of Freiburg, Plant Biotechnology > Schaenzlestr. 1, D-79104 Freiburg > fax: +49 761 203 6945 > phone: +49 761 203 6974 > homepage: http://www.plant-biotech.net/ > e-mail: daniel.lang at biologie.uni-freiburg.de > > ################################################# > My software never has bugs. > It just develops random features. > ################################################# > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From hlapp at gmx.net Thu Oct 5 20:34:49 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 5 Oct 2006 16:34:49 -0400 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <4525314C.7020205@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> Message-ID: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote: > If you think I'm stupid, fine, but I'm probably not the only stupid > person on the planet. That's a great suggestion that I hope we can all agree on? 
I'll happily count myself among the stupid ones too so you're not alone, and stupid people and even more so those who are lucky enough not to be stupid have an obligation to document stuff so that even the stupid can understand, no matter how silly the documentation might get. Is that agreeable without causing yet more progressive hair loss? Actually - I'm having second thoughts. Isn't it a distinguishing feature of stupid people that - among other things - they are stupid enough to believe they don't need to read documentation? You admitted publicly that you read documentation - are you just faking the stupid? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Thu Oct 5 21:11:06 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 16:11:06 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> Message-ID: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> On Oct 5, 2006, at 3:34 PM, Hilmar Lapp wrote: > > On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote: > >> If you think I'm stupid, fine, but I'm probably not the only stupid >> person on the planet. > > That's a great suggestion that I hope we can all agree on? 
I'll > happily count myself among the stupid ones too so you're not alone, > and stupid people and even more so those who are lucky enough not > to be stupid have an obligation to document stuff so that even the > stupid can understand, no matter how silly the documentation might > get. > > Is that agreeable without causing yet more progressive hair loss? > > Actually - I'm having second thoughts. Isn't it a distinguishing > feature of stupid people that - among other things - they are > stupid enough to believe they don't need to read documentation? You > admitted publicly that you read documentation - are you just faking > the stupid? > > -hilmar If lack of good documentation == stupid, I know of a few other modules in trouble besides mine. Based on that we're in for a whole lot of stupid! And I feel stupid for my earlier remarks, Sendu, so apologies. And Hilmar, you're too late on the hair loss, at least on my end. I have corrected the EUtilities POD to reflect that all text input needs to be raw as URI encoding is done in the module, which should work (I think). I plan on committing it tonight. It also indicates that EUtilities search queries need to be made as if they are regular Entrez queries. Would that be sufficient? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From pmiguel at purdue.edu Thu Oct 5 20:42:00 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Thu, 05 Oct 2006 16:42:00 -0400 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> Message-ID: <45256E18.3080103@purdue.edu> David Messina wrote: > I'm pleased to announce a revised version of the BioPerl Deobfuscator > is now available. 
Many thanks to Mauricio Cuadra for updating > bioperl.org's installation: > > http://bioperl.org/cgi-bin/deob_interface.cgi > > I've incorporated many of the suggestions you all sent in after the > first release, and many of the modules that had non-standard > documentation have been updated in the meantime, too, so hopefully > you'll find it much improved. There are still some issues with a few > modules; please report any problems you see. Also, it's now indexing > bioperl-live instead of 1.4, which should make it a little more > useful, too. A complete list of changes is below. > > I welcome your bug reports and suggestions for improvements, via > email, this list, Bugzilla, or the Wiki page. > > > Thanks, > Dave > > Here are some comments: Would be good to have the column headings for the methods table in the fixed part of the page, rather than the scroll box. That way you could always see the column headings from anywhere in the list. Second, I've noticed that there are a fair number of methods that have "not documented" for "Returns" and "Usage". But in every case I've checked both of these were documented. For example, consider methods for Bio::Seq::SeqWithQuality. The method "accession_number" is listed as "not documented". But if you click on Bio::Seq:SeqWithQuality link to the documentation, usage is defined as: "$unique_biological_key = $obj->accession_number;" and returns is defined as "A string". Finally, it would be good to have the version of bioperl being deobfuscated on the deob_interface.cgi page. Just as a quick sanity-checking measure. After poking around a bit I found that bioperl-live is being indexed in the wiki. But, I can tell, it is just the sort of thing I'm going to forget and look for every time come back to the page after a few months... Overall very nice, though. Just what is needed when I'm trying to remember "which was the method that returns subseq string and which one returns an object?" 
Phillip SanMiguel Purdue University From bix at sendu.me.uk Thu Oct 5 21:24:34 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 05 Oct 2006 22:24:34 +0100 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> Message-ID: <45257812.5050008@sendu.me.uk> Chris Fields wrote: > > I have corrected the EUtilities POD to reflect that all text input needs > to be raw as URI encoding is done in the module, which should work (I > think). I plan on committing it tonight. It also indicates that > EUtilities search queries need to be made as if they are regular Entrez > queries. Would that be sufficient? You may not even need to mention anything about URI encoding, which might frighten some people. Something as simple as: =head1 SYNOPSIS use Bio::DB::EUtilities; my $esearch = Bio::DB::EUtilities->new(-eutil => 'esearch', -db => 'pubmed', -term => 'hutP AND xyz', ... and/or some POD for the new() method: =head2 new Title : new ... Args : -eutil => ... -db => ... -term => string, an entrez-style query =cut would get the point across, I think. BTW, can the term string be supplied anywhere else other than new()? It doesn't matter at all if it can't, I'm just idly wondering if I missed anything. 
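To illustrate why the term should be passed in raw, the sketch below percent-encodes a query string roughly the way a URI-encoding step would. The helper name escape_term is hypothetical and this is not the code Bio::DB::EUtilities uses internally; it only shows what happens when a term that was already encoded by hand gets encoded a second time.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): percent-encode every
# character except the URI "unreserved" set. Assumes ASCII input.
sub escape_term {
    my ($term) = @_;
    $term =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $term;
}

my $raw   = 'hutP AND xyz';       # what the user should pass as -term
my $once  = escape_term($raw);    # encoded a single time
my $twice = escape_term($once);   # pre-encoded input, encoded again

print "raw:   $raw\n";
print "once:  $once\n";
print "twice: $twice\n";
```

In the double-encoded string the '%' signs are themselves escaped, so the server would no longer see spaces between the search words. That is the practical reason for documenting -term as a plain Entrez-style query.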
From dmessina at wustl.edu Thu Oct 5 21:42:49 2006 From: dmessina at wustl.edu (David Messina) Date: Thu, 5 Oct 2006 16:42:49 -0500 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <45256E18.3080103@purdue.edu> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> Message-ID: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu> Thanks so much, Phillip, for taking the time to check out the new version and send your comments. I really appreciate it! I've added them to the wiki page so I can track them. Best, Dave From cjfields at uiuc.edu Thu Oct 5 21:50:11 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 16:50:11 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45257812.5050008@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> <45257812.5050008@sendu.me.uk> Message-ID: Sendu, I have the parameters all set up as get/sets at this point, but I'm open to suggestions on that. Note in the BEGIN block the heredoc eval {} block. Yes, nasty I know, but I hate AUTOLOAD. It works as a quick way of getting parameter get/sets up-and-running. I plan on making those explicit get/sets as soon as I can then sorting out particular ones to the various eutil modules where they are primarily used. Long story short, every parameter is a get/set at this time (including term()). The common ones needed for most EUtilities are initialized in the parent EUtilities::_initialize(), and eutil- specific parameters are initialized in the individual eutil plugins. 
Each eutil plugin only sets whatever parameters may be needed for operation (though you could circumvent that, since all of them are inherited via EUtilities). We could always simplify it to accept simple key-value pairs, but get/ sets (at least to me) allow more flexibility as long as you remember which parameters are set and to what. Chris On Oct 5, 2006, at 4:24 PM, Sendu Bala wrote: > Chris Fields wrote: >> I have corrected the EUtilities POD to reflect that all text input >> needs to be raw as URI encoding is done in the module, which >> should work (I think). I plan on committing it tonight. It also >> indicates that EUtilities search queries need to be made as if >> they are regular Entrez queries. Would that be sufficient? > > You may not even need to mention anything about URI encoding, which > might frighten some people. Something as simple as: > > =head1 SYNOPSIS > > use Bio::DB::EUtilities; > > my $esearch = Bio::DB::EUtilities->new(-eutil => 'esearch', > -db => 'pubmed', > -term => 'hutP AND > xyz', > ... > > and/or some POD for the new() method: > > =head2 new > > Title : new > ... > Args : -eutil => ... > -db => ... > -term => string, an entrez-style query > > =cut > > would get the point across, I think. > > BTW, can the term string be supplied anywhere else other than new > ()? It doesn't matter at all if it can't, I'm just idly wondering > if I missed anything. Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 5 21:51:06 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 5 Oct 2006 16:51:06 -0500 Subject: [Bioperl-l] EUtilities term handling In-Reply-To: <45257812.5050008@sendu.me.uk> References: <4524B20B.5010703@sendu.me.uk> <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net> <452511C1.5020709@sendu.me.uk> <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu> <45251E69.7040507@sendu.me.uk> <45252524.7030006@sendu.me.uk> <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu> <4525314C.7020205@sendu.me.uk> <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net> <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu> <45257812.5050008@sendu.me.uk> Message-ID: <5B2E844F-7B8B-4F69-9005-138826B835FB@uiuc.edu> > You may not even need to mention anything about URI encoding, which > might frighten some people. Something as simple as: > > =head1 SYNOPSIS > > use Bio::DB::EUtilities; > > my $esearch = Bio::DB::EUtilities->new(-eutil => 'esearch', > -db => 'pubmed', > -term => 'hutP AND > xyz', > ... > > and/or some POD for the new() method: > > =head2 new > > Title : new > ... > Args : -eutil => ... > -db => ... > -term => string, an entrez-style query > > =cut > > would get the point across, I think. Oops, forgot. I'll add this in and update new() when I can. Thanks! Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From arareko at campus.iztacala.unam.mx Thu Oct 5 22:12:49 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 05 Oct 2006 17:12:49 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45256E18.3080103@purdue.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu>
Message-ID: <45258361.8080803@campus.iztacala.unam.mx>

Phillip San Miguel wrote:
> Finally, it would be good to have the version of bioperl being
> deobfuscated on the deob_interface.cgi page. Just as a quick
> sanity-checking measure. After poking around a bit I found that
> bioperl-live is being indexed in the wiki. But, I can tell, it is just
> the sort of thing I'm going to forget and look for every time come back
> to the page after a few months...

Dave,

I think this value can be stored in one of the index files and passed as an argument to the deob_index.pl script. What do you think?

Mauricio.

--
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From lincoln.stein at gmail.com Thu Oct 5 18:42:41 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:42:41 -0400
Subject: [Bioperl-l] using nfreeze instead of freeze in Bio::SeqFeature::Store
In-Reply-To:
References:
Message-ID: <6dce9a0b0610051142h56479843ofc5429d959cb6e3@mail.gmail.com>

I think it's fine unless there is a significant performance hit, in which case the change should be made into a configuration option. Do you know if there is any overhead on doing this?

Lincoln

On 10/5/06, Cook, Malcolm wrote:
> Lincoln,
>
> I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
> freeze which should allow SeqFeature objects to survive database
> freeze/thaw cycles across architectures.
> > I hope I was not presumptuous or in error in doing this.... > > Regards, > > Malcolm Cook > Database Applications Manager - Bioinformatics > Stowers Institute for Medical Research - Kansas City, Missouri > > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From torsten.seemann at infotech.monash.edu.au Fri Oct 6 05:26:10 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Fri, 06 Oct 2006 15:26:10 +1000 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> References: <452344D4.8070908@infotech.monash.edu.au> <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> Message-ID: <4525E8F2.1000704@infotech.monash.edu.au> Hilmar, > I don't think there's a need to deprecate - if the methods just plain > delegate to whatever File:: module is appropriate their > implementation (supposedly) will become very simple and hence won't > pose a maintenance burden anymore. >> I have an uncommitted simplified version of Bio::Root::IO which does >> this, and "all tests pass". The functions currently (silently) >> dispatch >> directly to their native counterparts. >> >> The only tricky function is tempfile() which is *mostly* like >> File::Temp::tempfile(), but does some voodoo of converting >> (TEMPLATE=>'xxx') to the non-hash first parameter of the File:: >> version, >> so I'm hesitant to commit. It may do other magic - Hilmar? > > Not that I would know of. If the tests pass (without having to change > them!) I'd give it a try. Tempfile.t had two tests that failed. It seems that Bio::Root::IO had some magic whereby it would keep a list of all tempfilenames created with UNLINK != 0 and when the Bio::Root::IO object was destroyed (eg. undef $obj) it would MANUALLY unlink each of them. 
This would occur before File::Temp got to unlink them. Not sure why it was written like this (as File::Temp will delete them at the end of the script anyway) but maybe it was legacy for when File::Temp::tempfile WASN'T available. Anyway, I've kept backward compatibility there, although I think eventually it should be removed and Tempfile.t adjusted. Although all tests pass with my new trim Bio/Root/IO.pm I am still concerned about committing as the assumption is that the BioPerl test suite is good enough to handle such a change to an important module, but the reality may be different :-) Let me know if you think I should commit anyway, Your advice is appreciated. -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From dmessina at wustl.edu Fri Oct 6 05:25:56 2006 From: dmessina at wustl.edu (David Messina) Date: Fri, 6 Oct 2006 00:25:56 -0500 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <45258361.8080803@campus.iztacala.unam.mx> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> <45258361.8080803@campus.iztacala.unam.mx> Message-ID: On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote: > I think this value can be stored in one of the index files and > passed as an argument to the deob_index.pl script. What do you think? Yep, I think that works nicely. I added this feature and committed it to CVS. Here's what the new header looks like if you do deob_index.pl -s "bioperl-live": ? Thanks for the suggestions, guys. Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: deob_header.jpg Type: image/jpeg Size: 25739 bytes Desc: not available URL: From deep_ans at yahoo.com Fri Oct 6 13:22:49 2006 From: deep_ans at yahoo.com (deepak shingan) Date: Fri, 6 Oct 2006 06:22:49 -0700 (PDT) Subject: [Bioperl-l] Sort blast file result according to evalues Message-ID: <20061006132249.49450.qmail@web51711.mail.yahoo.com> Hi , Is there any way to parse the blast file according to evalue for each hit. I want the output sorted according to hit evalue. I am using SearchIO algorithm and already tried sorting the hits according to bits, gaps, but I am not able to sort the hits by evalue. As evalues are mainly associated with hsp and each hit may have multiple hsps. waiting for help. Thanks, Dun Dansi --------------------------------- How low will we go? Check out Yahoo! Messenger?s low PC-to-Phone call rates. From hlapp at gmx.net Fri Oct 6 14:03:04 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 6 Oct 2006 10:03:04 -0400 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <4525E8F2.1000704@infotech.monash.edu.au> References: <452344D4.8070908@infotech.monash.edu.au> <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> <4525E8F2.1000704@infotech.monash.edu.au> Message-ID: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net> This is a 1.5, i.e. developers release that's in the works, and also you'd be doing this on the main trunk. If you get the tests to pass there's no reason to hold back. You may be right and in reality it has repercussions somewhere, but those will be the opportunities to improve our test suite. -hilmar On Oct 6, 2006, at 1:26 AM, Torsten Seemann wrote: > Although all tests pass with my new trim Bio/Root/IO.pm I am still > concerned about committing as the assumption is that the BioPerl > test suite is good enough to handle such a change to an important > module, but the reality may be different :-) > > Let me know if you think I should commit anyway, > > Your advice is appreciated. 
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From cjfields at uiuc.edu Fri Oct 6 14:58:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 6 Oct 2006 09:58:09 -0500
Subject: [Bioperl-l] Sort blast file result according to evalues
In-Reply-To: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
References: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
Message-ID:

The evalue for the hit is retrieved by the BlastHit::significance() method, if I remember correctly. So if $hit is a Bio::Search::Hit::BlastHit object, you use $hit->significance. If you want individual HSP evalues, you would use $hsp->evalue for the individual HSP objects.

The hits are normally returned in the order they appear in the alignments and table, which is typically by increasing evalue or decreasing bits (score). So they are already sorted. If you wanted to run a sort yourself you could use a sort block using '{$a->significance() <=> $b->significance()} @hits', but as pointed out on the wiki it may be safer to run a Schwartzian transform instead:

http://www.bioperl.org/wiki/Bioperl_Best_Practices#Sorting

Chris

On Oct 6, 2006, at 8:22 AM, deepak shingan wrote:

> Hi ,
> Is there any way to parse the blast file according to evalue for
> each hit. I want the output sorted according to hit evalue. I am
> using SearchIO algorithm and already tried sorting the hits
> according to bits, gaps, but I am not able to sort the hits by evalue.
> As evalues are mainly associated with hsp and each hit may have
> multiple hsps.
>
> waiting for help.
>
> Thanks,
> Dun Dansi
>
> ---------------------------------
> How low will we go? Check out Yahoo! Messenger's low PC-to-Phone
> call rates.
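The Schwartzian transform mentioned in the reply above can be sketched as follows. So that it runs without BioPerl installed, the example uses a hypothetical MockHit class that only mimics the significance() accessor; with real SearchIO results, @hits would instead be collected from $result->next_hit. Treat it as an illustration of the sorting idiom, not as the code from the wiki page.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for Bio::Search::Hit::BlastHit.
    package MockHit;
    sub new          { my ($class, %args) = @_; return bless {%args}, $class }
    sub name         { return $_[0]{name} }
    sub significance { return $_[0]{significance} }
}

my @hits = (
    MockHit->new(name => 'hitA', significance => 1e-5),
    MockHit->new(name => 'hitB', significance => 3e-50),
    MockHit->new(name => 'hitC', significance => 2e-20),
);

# Schwartzian transform: call significance() once per hit, sort on the
# cached value, then unwrap the original objects again.
my @sorted = map  { $_->[1] }
             sort { $a->[0] <=> $b->[0] }
             map  { [ $_->significance, $_ ] } @hits;

# Smallest evalue (most significant hit) comes first.
printf "%s\t%g\n", $_->name, $_->significance for @sorted;
```

The point of the transform is that the accessor is called once per hit rather than once per comparison, and the sort block only ever sees plain numbers.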
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Oct 6 15:03:45 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 6 Oct 2006 10:03:45 -0500 Subject: [Bioperl-l] Clean-up of Bio::Root::IO In-Reply-To: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net> References: <452344D4.8070908@infotech.monash.edu.au> <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net> <4525E8F2.1000704@infotech.monash.edu.au> <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net> Message-ID: <265AD609-F74E-4545-B3DD-FF94290BE0B4@uiuc.edu> On Oct 6, 2006, at 9:03 AM, Hilmar Lapp wrote: > This is a 1.5, i.e. developers release that's in the works, and also > you'd be doing this on the main trunk. If you get the tests to pass > there's no reason to hold back. > > You may be right and in reality it has repercussions somewhere, but > those will be the opportunities to improve our test suite. > > -hilmar Agreed, though I think Sendu only wants bug fixes for 1.5.2. You could always commit to CVS HEAD and it could be in 1.5.3. Let me rethink that. There were some subtle tempfile/tempdir issues popping up on WinXP where some tempfiles were not being deleted because of permission issues; I had planned on adding that to Bugzilla today or tomorrow. Maybe changing to File::Temp would fix that, so in essence it would be a bug fix! I'll go ahead and post the bug. Chris >> Although all tests pass with my new trim Bio/Root/IO.pm I am still >> concerned about committing as the assumption is that the BioPerl >> test suite is good enough to handle such a change to an important >> module, but the reality may be different :-) >> >> Let me know if you think I should commit anyway, >> >> Your advice is appreciated.
> > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From pmiguel at purdue.edu Fri Oct 6 15:06:56 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Fri, 06 Oct 2006 11:06:56 -0400 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu> References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu> Message-ID: <45267110.7030905@purdue.edu> David Messina wrote: > Thanks so much, Phillip, for taking the time to check out the new > version and send your comments. I really appreciate it! I've added > them to the wiki page so I can track them. > > Best, > Dave > Dave, No problem. I've just added a "keyword" to search BioPerl Deobfuscator to my Firefox browser. That way I can just type "deob qual" in my URL bar in firefox and the browser jumps directly to BioPerl Deobfuscator (like a bookmark) but it pre-submits the search item "qual". I heard about the Firefox "keywords" in a TWiT/FLOSS episode on mozilla. You just go to any search page and right-click in the search box of interest and one of the choices is "Add a Keyword for this Search". Then you just have to fill out "Name" and "Keyword" fields and drop the keyword into whatever folder you like. The "Keyword" then becomes the word to invoke that search with parameters that follow it when it is typed into the URL bar. 
Phillip From arareko at campus.iztacala.unam.mx Fri Oct 6 15:18:02 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Fri, 06 Oct 2006 10:18:02 -0500 Subject: [Bioperl-l] BioPerl Deobfuscator updated In-Reply-To: References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu> <45256E18.3080103@purdue.edu> <45258361.8080803@campus.iztacala.unam.mx> Message-ID: <452673AA.7070305@campus.iztacala.unam.mx> Looks great! I'll update it during the weekend. Mauricio. David Messina wrote: > > On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote: >> I think this value can be stored in one of the index files and passed >> as an argument to the deob_index.pl script. What do you think? > > Yep, I think that works nicely. I added this feature and committed it to > CVS. Here's what the new header looks like if you do deob_index.pl -s > "bioperl-live": > > > Thanks for the suggestions, guys. > > Dave > > > ------------------------------------------------------------------------ > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From bix at sendu.me.uk Fri Oct 6 15:27:14 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 06 Oct 2006 16:27:14 +0100 Subject: [Bioperl-l] Sort blast file result according to evalues In-Reply-To: References: <20061006132249.49450.qmail@web51711.mail.yahoo.com> Message-ID: <452675D2.9090803@sendu.me.uk> Chris Fields wrote: > The evalue for the hit is retrieved by the BlastHit::significance() > method, if I remember correctly. So if $hit is a > Bio::Search::Hit::BlastHit object, you use $hit->significance. If > you want individual HSP evalues, you would use $hsp->evalue for the > individual HSP objects.
> > The output is normally sorted by the order they appear in the > alignments and table, which is typically by increasing evalue or > decreasing bits (score). So they are already sorted. Concur. > If you wanted to run a sort yourself you could use a sort block using > '{$a->significance() <=> $b->significance()} @hits' Actually, it is best to use the sort_hits() method of the result object prior to asking for any hits. (As this allows for potential optimization in the parser.) ->significance is still the thing you need to sort on though. From cjfields at uiuc.edu Fri Oct 6 15:52:57 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 6 Oct 2006 10:52:57 -0500 Subject: [Bioperl-l] Sort blast file result according to evalues In-Reply-To: <452675D2.9090803@sendu.me.uk> References: <20061006132249.49450.qmail@web51711.mail.yahoo.com> <452675D2.9090803@sendu.me.uk> Message-ID: <31A6FC3A-8BEB-42B8-B51D-66E659EF7495@uiuc.edu> On Oct 6, 2006, at 10:27 AM, Sendu Bala wrote: >> If you wanted to run a sort yourself you could use a sort block using >> '{$a->significance() <=> $b->significance()} @hits' > > Actually, it is best to use the sort_hits() method of the result > object > prior to asking for any hits. (As this allows for potential > optimization > in the parser.) Ah, forgot about that one! Chris Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Fri Oct 6 18:36:49 2006 From: jason at bioperl.org (Jason Stajich) Date: Fri, 6 Oct 2006 11:36:49 -0700 Subject: [Bioperl-l] tempfile cleanup In-Reply-To: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu> References: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu> Message-ID: <0FCEC6B2-E190-4800-AAB1-89559C552FA6@bioperl.org> I think the magic trickery in there for cleanup is that File::Temp only cleans up tempfiles when Perl exits not when the Root::IO object goes out of scope -- so this can be a problem for people on CGI scripts that stay resident in memory and don't ever have tempfiles cleaned up. The managing the list aspect allows us to call _cleanup periodically (perhaps before the start of every Blast run) to insure that tempfiles are removed. perhaps newer File::Temp versions can solve this better now but I believe that was the behavior we were trying to deal with with managing the list of to-be-deleted files by the Root::IO object. This is some hackery that also had to do with not expecting File::Temp to be installed I believe. -jason From torsten.seemann at infotech.monash.edu.au Mon Oct 9 04:52:29 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Mon, 09 Oct 2006 14:52:29 +1000 Subject: [Bioperl-l] Multiple packages in the one .pm file Message-ID: <4529D58D.1080004@infotech.monash.edu.au> Hi all, The following modules have more than one "package xxxx;" declaration in them. For small, internal classes I guess this is fine, but for others, they should be split up into the filesystem - otherwise they are troublesome to locate and the online documentation doesn't list them! eg. 
bioperl-run/Bio/Tools/Run/Analysis/Job.pm is in bioperl-run/Bio/Tools/Run/Analysis.pm Here's the culprits: % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | sed 's/:.*$//' | sort | uniq -d ; done bioperl-live/Bio/AnalysisI.pm bioperl-live/Bio/DB/Fasta.pm bioperl-live/Bio/DB/GFF.pm bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm bioperl-live/Bio/DB/SeqFeature/Store/memory.pm bioperl-live/Bio/SeqIO/interpro.pm bioperl-run/Bio/Tools/Run/Analysis.pm bioperl-run/Bio/Tools/Run/Analysis/soap.pm -- Dr Torsten Seemann http://www.vicbioinformatics.com Victorian Bioinformatics Consortium, Monash University, Australia From pmiguel at purdue.edu Mon Oct 9 19:57:12 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Mon, 09 Oct 2006 15:57:12 -0400 Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC? Message-ID: <452AA998.5010104@purdue.edu> I found a bug in Bio::SeqIO::phd and am wondering if the fix will propagate into the next release candidate? The bug is here: http://bugzilla.open-bio.org/show_bug.cgi?id=2120 I also created a patch that fixes it (on my machine, anyway). It is a fairly minor change, so it seems like it would be worth propagating it into the next release candidate. -- Phillip SanMiguel From bix at sendu.me.uk Mon Oct 9 20:57:28 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 09 Oct 2006 21:57:28 +0100 Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC? In-Reply-To: <452AA998.5010104@purdue.edu> References: <452AA998.5010104@purdue.edu> Message-ID: <452AB7B8.4040404@sendu.me.uk> Phillip San Miguel wrote: > I found a bug in Bio::SeqIO::phd and am wondering if the fix will > propagate into the next release candidate? > > The bug is here: > > http://bugzilla.open-bio.org/show_bug.cgi?id=2120 > > I also created a patch that fixes it (on my machine, anyway). 
It is a > fairly minor change, so it seems like it would be worth propagating it > into the next release candidate. If it gets committed to HEAD before I make the next candidate, then yes. I'll do that if no one beats me to it (and if someone does, please add a new test for this). BTW Phillip, thank you for the bug report but in future use the attachment capabilities for files, please don't paste them into the comments box. From bix at sendu.me.uk Mon Oct 9 21:01:56 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 09 Oct 2006 22:01:56 +0100 Subject: [Bioperl-l] Analysis soap problem Message-ID: <452AB8C4.1010704@sendu.me.uk> I thought I'd 'advertise' this bug on the list so more people see it: http://bugzilla.open-bio.org/show_bug.cgi?id=2117 I don't want to make the next 1.5.2 release candidate until its fixed. Does anyone have any idea about it? Even if you can't fix it, just explaining what's (supposed) to be going on would help a lot. Thank you, Sendu. From Kevin.M.Brown at asu.edu Mon Oct 9 22:40:54 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Mon, 9 Oct 2006 15:40:54 -0700 Subject: [Bioperl-l] Analysis soap problem Message-ID: <1A4207F8295607498283FE9E93B775B40219690B@EX02.asurite.ad.asu.edu> If I had to guess from looking at the snippet provided, the variable $seq holds no data so when you try to setup the regex /^$seq$/ you end up with /^$/ (blank line) and the warning. > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Monday, October 09, 2006 2:02 PM > To: bioperl-l List > Subject: [Bioperl-l] Analysis soap problem > > I thought I'd 'advertise' this bug on the list so more people see it: > http://bugzilla.open-bio.org/show_bug.cgi?id=2117 > > I don't want to make the next 1.5.2 release candidate until > its fixed. > Does anyone have any idea about it? 
Even if you can't fix it, just > explaining what's (supposed) to be going on would help a lot. > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Tue Oct 10 02:34:23 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 9 Oct 2006 21:34:23 -0500 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <452AB8C4.1010704@sendu.me.uk> References: <452AB8C4.1010704@sendu.me.uk> Message-ID: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> I have 'fixed' this in CVS. Note the quotes; it depends on what you might consider fixed. Multiple calls to results() were returning empty hash refs, so no data was being returned. For now, I stored the hash reference in a variable then tested each one. All tests now pass, including the 'outseq' one. Maybe it's just me, but shouldn't results() either consistently return the same information, or contain documentation that it doesn't do so? Anyway, I have left the bugzilla report open for now. Chris On Oct 9, 2006, at 4:01 PM, Sendu Bala wrote: > I thought I'd 'advertise' this bug on the list so more people see it: > http://bugzilla.open-bio.org/show_bug.cgi?id=2117 > > I don't want to make the next 1.5.2 release candidate until its fixed. > Does anyone have any idea about it? Even if you can't fix it, just > explaining what's (supposed) to be going on would help a lot. > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bosborne11 at verizon.net Tue Oct 10 02:09:45 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 09 Oct 2006 22:09:45 -0400 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au> Message-ID: Torsten, Fixed interpro.pm, it could have been written more simply (or more like other SeqIO modules). Can't really address the others. Brian O. On 10/9/06 12:52 AM, "Torsten Seemann" wrote: > Hi all, > > The following modules have more than one "package xxxx;" declaration in > them. For small, internal classes I guess this is fine, but for others, > they should be split up into the filesystem - otherwise they are > troublesome to locate and the online documentation doesn't list them! > > eg. > bioperl-run/Bio/Tools/Run/Analysis/Job.pm > is in > bioperl-run/Bio/Tools/Run/Analysis.pm > > Here's the culprits: > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | > sed 's/:.*$//' | sort | uniq -d ; done > > bioperl-live/Bio/AnalysisI.pm > bioperl-live/Bio/DB/Fasta.pm > bioperl-live/Bio/DB/GFF.pm > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > bioperl-live/Bio/SeqIO/interpro.pm > > bioperl-run/Bio/Tools/Run/Analysis.pm > bioperl-run/Bio/Tools/Run/Analysis/soap.pm From bix at sendu.me.uk Tue Oct 10 07:03:20 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 08:03:20 +0100 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> References: <452AB8C4.1010704@sendu.me.uk> <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> Message-ID: <452B45B8.8010401@sendu.me.uk> Chris Fields wrote: > I have 'fixed' this in CVS. Note the quotes; it depends on what you > might consider fixed. 
Multiple calls to results() were returning > empty hash refs, so no data was being returned. For now, I stored > the hash reference in a variable then tested each one. All tests now > pass, including the 'outseq' one. > > Maybe it's just me, but shouldn't results() either consistently > return the same information, or contain documentation that it doesn't > do so? Anyway, I have left the bugzilla report open for now. Judging by the tests there seems a clear expectation that multiple calls to results() should work, and certainly that makes sense and seems natural. So I'd say that results() should be fixed and the test script reverted. From cjfields at uiuc.edu Tue Oct 10 11:42:33 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 06:42:33 -0500 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <452B45B8.8010401@sendu.me.uk> References: <452AB8C4.1010704@sendu.me.uk> <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> <452B45B8.8010401@sendu.me.uk> Message-ID: I agree, though I think Martin Senger should be contacted, at least to get his thoughts. Has anyone tried yet? Chris On Oct 10, 2006, at 2:03 AM, Sendu Bala wrote: > Chris Fields wrote: >> I have 'fixed' this in CVS. Note the quotes; it depends on what you >> might consider fixed. Multiple calls to results() were returning >> empty hash refs, so no data was being returned. For now, I stored >> the hash reference in a variable then tested each one. All tests now >> pass, including the 'outseq' one. >> >> Maybe it's just me, but shouldn't results() either consistently >> return the same information, or contain documentation that it doesn't >> do so? Anyway, I have left the bugzilla report open for now. > > Judging by the tests there seems a clear expectation that multiple > calls > to results() should work, and certainly that makes sense and seems > natural. So I'd say that results() should be fixed and the test script > reverted. 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Tue Oct 10 12:14:31 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 13:14:31 +0100 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: References: <452AB8C4.1010704@sendu.me.uk> <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu> <452B45B8.8010401@sendu.me.uk> Message-ID: <452B8EA7.1080800@sendu.me.uk> Chris Fields wrote: > I agree, though I think Martin Senger should be contacted, at least to > get his thoughts. Has anyone tried yet? He's CCd on the bug report, but I haven't tried directly, no. Do you want to tackle this (contacting him and/or fixing the bug)? Cheers, Sendu. From cjfields at uiuc.edu Tue Oct 10 13:20:03 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 08:20:03 -0500 Subject: [Bioperl-l] Analysis soap problem In-Reply-To: <452B8EA7.1080800@sendu.me.uk> Message-ID: <001801c6ec6e$cc016900$15327e82@pyrimidine> I'll try giving it a closer look, just didn't have much time yesterday. I'll also try contacting Martin. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Tuesday, October 10, 2006 7:15 AM > To: bioperl-l > Subject: Re: [Bioperl-l] Analysis soap problem > > Chris Fields wrote: > > I agree, though I think Martin Senger should be contacted, at least to > > get his thoughts. Has anyone tried yet? > > He's CCd on the bug report, but I haven't tried directly, no. Do you > want to tackle this (contacting him and/or fixing the bug)? > > Cheers, > Sendu. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From pmiguel at purdue.edu Tue Oct 10 14:26:35 2006 From: pmiguel at purdue.edu (Phillip San Miguel) Date: Tue, 10 Oct 2006 10:26:35 -0400 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452AB7B8.4040404@sendu.me.uk> References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk> Message-ID: <452BAD9B.5010903@purdue.edu> Sendu Bala wrote: > > BTW Phillip, thank you for the bug report but in future use the > attachment capabilities for files, please don't paste them into the > comments box. > Sendu, Sounds reasonable to me. I should note, however, that when I entered the bug, I was looking for some method to attach files. There is none on the "Enter Bug: Bioperl" page: http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl Also, "bug writing guidelines" makes no mention of it. I vaguely remembered there being some method to do it--but given the "bug writing guidelines" exhortations to be specific and detailed, I thought I must put the information somewhere. So I put them in the only place offered (on that page)--"Description:" I see that, once submitted, attachments can be added to a bug report. Is that normally how it is done? Doesn't each attachment result in a separate email to the bioperl guts email list? Anyway, I've just added the files to the bug report as attachments, in case someone needs them to construct a test.
-- Phillip From bix at sendu.me.uk Tue Oct 10 15:10:25 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 16:10:25 +0100 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452BAD9B.5010903@purdue.edu> References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk> <452BAD9B.5010903@purdue.edu> Message-ID: <452BB7E1.5020200@sendu.me.uk> Phillip San Miguel wrote: > Sendu Bala wrote: >> BTW Phillip, thank you for the bug report but in future use the >> attachment capabilities for files, please don't paste them into the >> comments box. >> > Sendu, Sounds reasonable to me. I should note, however; when I > entered the bug, I was looking for some method to attach files. There > is none on the "Enter Bug: Bioperl" page: > > http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl > > Also, "bug writing guidelines" makes no mention of it. I vaguely > remembered there being some method to do it--but given the "bug > writing guidelines" exhortations to be specific and detailed, I > thought I must put the information somewhere. So I put them them the > only place offered (on that page)--"Description:" I agree that things could be better here. Who looks after bugzilla, and is this an alterable feature? > I see that, once submitted, attachments can be added to a bug report. > Is that normally how it is done? Yes, AFAIK. > Doesn't each attachment result in a separate email to the bioperl > guts email list? Yes, but that's not a problem. In fact, doing it this way means you don't email everyone subscribed to guts your big files in plain text, but instead they get a small email with a link to the download. > Anyway, I've just added the files to the bug report as attachments, > in case someone needs them to construct a test. Thank you. 
From arareko at campus.iztacala.unam.mx Tue Oct 10 15:14:00 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Tue, 10 Oct 2006 10:14:00 -0500 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452BAD9B.5010903@purdue.edu> References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk> <452BAD9B.5010903@purdue.edu> Message-ID: <452BB8B8.40409@campus.iztacala.unam.mx> Phillip San Miguel wrote: > I see that, once submitted, attachments can be added to a bug report. > Is that normally how it is done? Yes, it's the normal method: create the bug report, then attach files. > Doesn't each attachment result in a separate email to the bioperl > guts email list? Adding a file will generate an informative email per bug change (attaching the file in this case) but won't send the attachment to the list. Regards, Mauricio. -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From cjfields at uiuc.edu Tue Oct 10 15:20:55 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 10:20:55 -0500 Subject: [Bioperl-l] Bug reports and attachments In-Reply-To: <452BAD9B.5010903@purdue.edu> Message-ID: <002801c6ec7f$ae8d85f0$15327e82@pyrimidine> > Also, "bug writing guidelines" makes no mention of it. I vaguely > remembered there being some method to do it--but given the "bug writing > guidelines" exhortations to be specific and detailed, I thought I must > put the information somewhere. So I put them in the only place offered > (on that page)--"Description:" > I see that, once submitted, attachments can be added to a bug > report. Is that normally how it is done? Doesn't each attachment result > in a separate email to the bioperl guts email list? > Anyway, I've just added the files to the bug report as attachments, > in case someone needs them to construct a test.
Phillip, Initial bug reports only require the general description, OS used, bioperl version, etc. That's quite normal. Any relevant attachments are added afterward. We should probably make that clearer upfront on the wiki page; I don't know if anyone can make similar changes to bugzilla. Any bug changes, CVS commits, etc are mailed to bioperl-guts, yes. That isn't an issue though; it keeps the developers updated on the various bugs/commits that are going on and is a pretty common practice. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 10 16:48:22 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 11:48:22 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au> References: <4529D58D.1080004@infotech.monash.edu.au> Message-ID: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> There are a number of other bioperl-run examples (the Bio::Tools::Run::Analysis::soap issue I looked into revealed such). I agree with both points, 1) that it depends on the size of the classes, and 2) from a maintainability standpoint, it can be very frustrating when looking for documentation. Is there really any advantage to doing this? Chris On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote: > Hi all, > > The following modules have more than one "package xxxx;" > declaration in > them. For small, internal classes I guess this is fine, but for > others, > they should be split up into the filesystem - otherwise they are > troublesome to locate and the online documentation doesn't list them! > > eg. 
> bioperl-run/Bio/Tools/Run/Analysis/Job.pm > is in > bioperl-run/Bio/Tools/Run/Analysis.pm > > Here's the culprits: > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | > sed 's/:.*$//' | sort | uniq -d ; done > > bioperl-live/Bio/AnalysisI.pm > bioperl-live/Bio/DB/Fasta.pm > bioperl-live/Bio/DB/GFF.pm > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > bioperl-live/Bio/SeqIO/interpro.pm > > bioperl-run/Bio/Tools/Run/Analysis.pm > bioperl-run/Bio/Tools/Run/Analysis/soap.pm > > -- > Dr Torsten Seemann http://www.vicbioinformatics.com > Victorian Bioinformatics Consortium, Monash University, Australia > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign
From lzhtom at hotmail.com Tue Oct 10 19:42:48 2006 From: lzhtom at hotmail.com (zhihua li) Date: Tue, 10 Oct 2006 19:42:48 +0000 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? Message-ID: Hi netters. I've installed Bioperl 1.5.1, both core and run modules. But when I tried to use the Pise module, an error occured saying that there's no "new" method in this package.
My script is: use strict; use warnings; use Bio::Tools::Run::AnalysisFactory::Pise; my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); my $program=$factory->program('mfold'); $program->seq('my_input_file'); my $job = $program->run(); print STDERR $job->contect('mfold.out'); The error message I got is: Can't locate object method "new" via package "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load "Bio::Tools::Run::AnalysisFactor::Pise"?) I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm and it DOES contain a sub new. So what's going on? Anyone could give me a hint? Thanks a lot! From cjfields at uiuc.edu Tue Oct 10 20:27:27 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 15:27:27 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Message-ID: Makes sense to me. I think, as long as they're documented, it shouldn't be a problem. I think the main point is that the class methods for these don't show up using perldoc (something I ran into with Bio::DB::Fasta's inclusion of Bio::PrimarySeq::Fasta), but they do show up when using other documentation. So 'perldoc Bio::DB::Fasta' works, but 'perldoc Bio::PrimarySeq::Fasta' doesn't. So these can be problematic when looking for specific methods. However, I think pod2html handles multiple package declarations in one module, and the PDOC online do as well. Does the Deobfuscator? 
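[Editorial sketch, not part of the original message] The perldoc problem described above (a package that lives inside another module's file cannot be looked up by its own name) can be reproduced on a throwaway tree, using the same grep pipeline quoted earlier in this thread. The directory layout and the package names Demo::Single, Demo::Outer and Demo::Outer::Iterator are invented for illustration and are not BioPerl modules; this is only a minimal sketch of the idea.

```shell
#!/bin/sh
# Build a temporary tree: one file with a single package, one with two.
# All names below are hypothetical and exist only for this demo.
tmp=$(mktemp -d)
mkdir -p "$tmp/lib/Demo"

cat > "$tmp/lib/Demo/Single.pm" <<'EOF'
package Demo::Single;
1;
EOF

cat > "$tmp/lib/Demo/Outer.pm" <<'EOF'
package Demo::Outer;
sub new { bless {}, shift }
package Demo::Outer::Iterator;
sub next { return }
1;
EOF

# grep -r prints "file:matching line". Cutting at the first colon leaves one
# file name per package declaration, so "uniq -d" keeps only the files that
# declare more than one package.
multi=$(cd "$tmp" && grep -r '^package ' lib | sed 's/:.*$//' | sort | uniq -d)
echo "$multi"

# To find which file hosts a package that has no .pm file of its own,
# search for its declaration directly.
owner=$(cd "$tmp" && grep -rl '^package Demo::Outer::Iterator;' lib)
echo "$owner"

rm -rf "$tmp"
```

Both commands report lib/Demo/Outer.pm. Against a real checkout the second form would be run with the package name that perldoc failed to find.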
Chris On Oct 10, 2006, at 3:11 PM, Lincoln Stein wrote: > Hi, > > These ones are all mine: > > > bioperl-live/Bio/DB/Fasta.pm > > bioperl-live/Bio/DB/GFF.pm > > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > > In each case, the second modules are teeny tiny ones that implement > iterators which are at most two methods long (typically a new() and > a next()). I prefer not to split them out because they will just > clutter up the file tree with stuff that is already well documented > in the "parent ship" modules. > > Lincoln > > > On 10/10/06, Chris Fields wrote: There are a > number of other bioperl-run examples (the > Bio::Tools::Run::Analysis::soap issue I looked into revealed such). > > I agree with both points, 1) that it depends on the size of the > classes, and 2) from a maintainability standpoint, it can be very > frustrating when looking for documentation. Is there really any > advantage to doing this? > > Chris > > On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote: > > > Hi all, > > > > The following modules have more than one "package xxxx;" > > declaration in > > them. For small, internal classes I guess this is fine, but for > > others, > > they should be split up into the filesystem - otherwise they are > > troublesome to locate and the online documentation doesn't list > them! > > > > eg. 
> > bioperl-run/Bio/Tools/Run/Analysis/Job.pm > > is in > > bioperl-run/Bio/Tools/Run/Analysis.pm > > > > Here's the culprits: > > > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/ > Bio | > > sed 's/:.*$//' | sort | uniq -d ; done > > > > bioperl-live/Bio/AnalysisI.pm > > bioperl-live/Bio/DB/Fasta.pm > > bioperl-live/Bio/DB/GFF.pm > > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > > bioperl-live/Bio/SeqIO/interpro.pm > > > > bioperl-run/Bio/Tools/Run/Analysis.pm > > bioperl-run/Bio/Tools/Run/Analysis/soap.pm > > > > -- > > Dr Torsten Seemann http://www.vicbioinformatics.com > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > -- > Lincoln D. Stein > Cold Spring Harbor Laboratory > 1 Bungtown Road > Cold Spring Harbor, NY 11724 > (516) 367-8380 (voice) > (516) 367-8389 (fax) > FOR URGENT MESSAGES & SCHEDULING, > PLEASE CONTACT MY ASSISTANT, > SANDRA MICHELSEN, AT michelse at cshl.edu Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 10 20:30:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 15:30:16 -0500 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? 
In-Reply-To: References: Message-ID: <870B7500-AA83-42D7-965B-865B91AA8E7F@uiuc.edu> On Oct 10, 2006, at 2:42 PM, zhihua li wrote: > Hi netters. > > I've installed Bioperl 1.5.1, both core and run modules. But when > I tried to use the Pise module, an error occured saying that > there's no "new" method in this package. > > My script is: > > use strict; > use warnings; > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); > my $program=$factory->program('mfold'); > $program->seq('my_input_file'); > my $job = $program->run(); > print STDERR $job->contect('mfold.out'); > > The error message I got is: > > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) > > I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/ > Pise.pm and it DOES contain a sub new. > > So what's going on? Anyone could give me a hint? > > Thanks a lot! Well, according to your error output you have AnalysisFactory misspelled ('AnalysisFactor'), which should tell you what the problem is. Look for the same thing in your script. Chris > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Tue Oct 10 20:43:06 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 10 Oct 2006 21:43:06 +0100 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? In-Reply-To: References: Message-ID: <452C05DA.5050803@sendu.me.uk> zhihua li wrote: > Hi netters. > > I've installed Bioperl 1.5.1, both core and run modules. But when I > tried to use the Pise module, an error occured saying that there's no > "new" method in this package. 
> > My script is: > > use strict; > use warnings; > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); > my $program=$factory->program('mfold'); > $program->seq('my_input_file'); > my $job = $program->run(); > print STDERR $job->contect('mfold.out'); > > The error message I got is: > > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) > > I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm > and it DOES contain a sub new. > > So what's going on? Anyone could give me a hint? You have a typo. Bio::Tools::Run::AnalysisFactory::Pise, not Bio::Tools::Run::AnalysisFactor::Pise From lincoln.stein at gmail.com Tue Oct 10 20:11:00 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 10 Oct 2006 16:11:00 -0400 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> Message-ID: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Hi, These ones are all mine: > bioperl-live/Bio/DB/Fasta.pm > bioperl-live/Bio/DB/GFF.pm > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm In each case, the second modules are teeny tiny ones that implement iterators which are at most two methods long (typically a new() and a next()). I prefer not to split them out because they will just clutter up the file tree with stuff that is already well documented in the "parent ship" modules. Lincoln On 10/10/06, Chris Fields wrote: > > There are a number of other bioperl-run examples (the > Bio::Tools::Run::Analysis::soap issue I looked into revealed such). 
> > I agree with both points, 1) that it depends on the size of the > classes, and 2) from a maintainability standpoint, it can be very > frustrating when looking for documentation. Is there really any > advantage to doing this? > > Chris > > On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote: > > > Hi all, > > > > The following modules have more than one "package xxxx;" > > declaration in > > them. For small, internal classes I guess this is fine, but for > > others, > > they should be split up into the filesystem - otherwise they are > > troublesome to locate and the online documentation doesn't list them! > > > > eg. > > bioperl-run/Bio/Tools/Run/Analysis/Job.pm > > is in > > bioperl-run/Bio/Tools/Run/Analysis.pm > > > > Here's the culprits: > > > > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | > > sed 's/:.*$//' | sort | uniq -d ; done > > > > bioperl-live/Bio/AnalysisI.pm > > bioperl-live/Bio/DB/Fasta.pm > > bioperl-live/Bio/DB/GFF.pm > > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm > > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm > > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm > > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm > > bioperl-live/Bio/SeqIO/interpro.pm > > > > bioperl-run/Bio/Tools/Run/Analysis.pm > > bioperl-run/Bio/Tools/Run/Analysis/soap.pm > > > > -- > > Dr Torsten Seemann http://www.vicbioinformatics.com > > Victorian Bioinformatics Consortium, Monash University, Australia > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From asjo at koldfront.dk Tue Oct 10 20:04:35 2006 From: asjo at koldfront.dk (Adam =?iso-8859-1?Q?Sj=F8gren?=) Date: Tue, 10 Oct 2006 22:04:35 +0200 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? References: Message-ID: <871wpglyy4.fsf@topper.koldfront.dk> On Tue, 10 Oct 2006 19:42:48 +0000, zhihua wrote: > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); ^ y [...] > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) You missed a 'y' in "Factory". Best wishes, -- "We've reached a special place... Spiritually... Adam Sj?gren ecumenically... grammatically." asjo at koldfront.dk From dmessina at wustl.edu Tue Oct 10 21:08:45 2006 From: dmessina at wustl.edu (David Messina) Date: Tue, 10 Oct 2006 16:08:45 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Message-ID: > However, I think pod2html handles multiple package declarations in > one module, and the PDOC online do as well. Does the Deobfuscator? Nope. From my cursory examination at the time they mostly were, as Lincoln said, short and sweet, so I didn't consider it a big deal. I do think the Deobfuscator should theoretically handle such cases anyway, though. I'll add it as a feature request on the wiki page. Or if you're chomping at the bit for it, I could certainly be beer- suaded to do it sooner rather than later... 
:) Dave From cjfields at uiuc.edu Tue Oct 10 21:33:39 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 10 Oct 2006 16:33:39 -0500 Subject: [Bioperl-l] Multiple packages in the one .pm file In-Reply-To: References: <4529D58D.1080004@infotech.monash.edu.au> <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu> <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com> Message-ID: <7F35F565-7D28-4B06-A501-4D4083652C5C@uiuc.edu> Me? I'm a lowly postdoc. Lincoln's got the cash! Chris On Oct 10, 2006, at 4:08 PM, David Messina wrote: >> However, I think pod2html handles multiple package declarations in >> one module, and the PDOC online do as well. Does the Deobfuscator? > > Nope. From my cursory examination at the time they mostly were, as > Lincoln said, short and sweet, so I didn't consider it a big deal. > > I do think the Deobfuscator should theoretically handle such cases > anyway, though. I'll add it as a feature request on the wiki page. > Or if you're chomping at the bit for it, I could certainly be beer- > suaded to do it sooner rather than later... :) > > Dave > Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From sdavis2 at mail.nih.gov Wed Oct 11 09:43:35 2006 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Wed, 11 Oct 2006 05:43:35 -0400 Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise? In-Reply-To: References: Message-ID: <452CBCC7.30108@mail.nih.gov> zhihua li wrote: > Hi netters. > > I've installed Bioperl 1.5.1, both core and run modules. But when I > tried to use the Pise module, an error occured saying that there's no > "new" method in this package. 
> > My script is: > > use strict; > use warnings; > use Bio::Tools::Run::AnalysisFactory::Pise; > my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new(); > my $program=$factory->program('mfold'); > $program->seq('my_input_file'); > my $job = $program->run(); > print STDERR $job->contect('mfold.out'); > > The error message I got is: > > Can't locate object method "new" via package > "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load > "Bio::Tools::Run::AnalysisFactor::Pise"?) > > I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm > and it DOES contain a sub new. > > So what's going on? Anyone could give me a hint? > > Thanks a lot! The module name is Bio::Tools::Run::AnalysisFactory::Pise. Note that it is not "factor" but "factory". That should probably fix your problem. Sean From jay at jays.net Sat Oct 7 22:34:23 2006 From: jay at jays.net (Jay Hannah) Date: Sat, 07 Oct 2006 17:34:23 -0500 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult Message-ID: <45282B6F.1030308@jays.net> I just updated my bioperl-live this morning, so I think I'm current. :) perldoc Bio::Search::Result::GenericResult ------------ SYNOPSIS # typically one gets Results from a SearchIO stream use Bio::SearchIO; my $io = new Bio::SearchIO(-format => 'blast', -file => 't/data/HUMBETGLOA.tblastx'); while( my $result = $io->next_result) { # process all search results within the input stream while( my $hit = $result->next_hits()) { ------------- Except that "next_hits()" does not exist. Should be "next_hit()". (Should I have posted a patch instead?) Thanks, j From bosborne11 at verizon.net Tue Oct 10 22:42:25 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 10 Oct 2006 18:42:25 -0400 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult In-Reply-To: <45282B6F.1030308@jays.net> Message-ID: j, No need, not for something so simple. Brian O. 
On 10/7/06 6:34 PM, "Jay Hannah" wrote: > Except that "next_hits()" does not exist. Should be "next_hit()". > > (Should I have posted a patch instead?) From zchou at cau.edu.cn Wed Oct 11 06:34:24 2006 From: zchou at cau.edu.cn (zhuocheng Hou) Date: Wed, 11 Oct 2006 14:34:24 +0800 Subject: [Bioperl-l] about retreive alinged sequence Message-ID: <000a01c6ecff$4ea4b2f0$0915020a@zchou> Hello, everyone, I am a new user of Bioperl. I want to align multiple DNA sequences based on translated proteins by using clustalw. However, I don't know how to retrieve the aligned sequences. The code is as follows (from the HOWTOPAML tutorial): # # This code runs and I can see the clustalw output printed on the screen ....... my $aa_aln = $aln_factory->align(\@prots, @params); # project the protein alignment back to CDS coordinates my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs); my @each = $dna_aln->each_seq(); # The following code was written by me. However, it doesn't work. I want to get the $dna_aln contents and write them to a file. my $in = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta'); my $aln=$dna_aln; my $out = Bio::AlignIO->new(-file => ">out.msf" , -format => 'msf'); #print $out $_ while <$in>; while ($aln = $in->next_aln() ) { my $out->write_aln($aln); } Best regards, Zhuocheng CAU From n.haigh at sheffield.ac.uk Wed Oct 11 14:00:33 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 11 Oct 2006 15:00:33 +0100 Subject: [Bioperl-l] about retreive alinged sequence In-Reply-To: <000a01c6ecff$4ea4b2f0$0915020a@zchou> References: <000a01c6ecff$4ea4b2f0$0915020a@zchou> Message-ID: <452CF901.6020409@sheffield.ac.uk> Dear Zhuocheng I'm not familiar with the aa_to_dna_aln method, but it appears from your code that it returns an alignment object. Please find comments inserted below - hope they help! Nathan zhuocheng Hou wrote: > Hello, everyone, > > I am a new user of Bioperl. I want to align multiple DNA sequences based on translated proteins by using clustalw.
However, I don't know how to retrieve the aligned sequences. > > The code is as follows (from the HOWTOPAML tutorial): > > # > # This code runs and I can see the clustalw output printed on the screen > ....... > my $aa_aln = $aln_factory->align(\@prots, @params); > # project the protein alignment back to CDS coordinates > my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs); > $dna_aln should already be an alignment object (a Bio::SimpleAlign, not a Bio::AlignIO stream), so there is no need to read it back in. All you need to do is set up an output stream to write the alignment object, similar to what you wrote below, i.e. my $out = Bio::AlignIO->new(-file => ">out.msf" , -format => 'msf'); Then simply write the alignment ($dna_aln) to the output stream with this: $out->write_aln($dna_aln); (note: no "my" in front of the method call). > my @each = $dna_aln->each_seq(); > > # The following code was written by me. However, it doesn't work. I want to get the $dna_aln contents and write them to a file. > > > my $in = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta'); > my $aln=$dna_aln; > my $out = Bio::AlignIO->new(-file => ">out.msf" , > -format => 'msf'); > #print $out $_ while <$in>; > while ($aln = $in->next_aln() ) { > my $out->write_aln($aln); > } > > > Best regards, > > Zhuocheng > CAU > > > ------------------------------------------------------------------------ > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From melcher at rescomp.berkeley.edu Wed Oct 11 21:09:17 2006 From: melcher at rescomp.berkeley.edu (Graham Melcher) Date: Wed, 11 Oct 2006 14:09:17 -0700 Subject: [Bioperl-l] Accessing GO through MYSQL? Message-ID: <20061011210917.GA783@rescomp.berkeley.edu> Hey all, Preface:: This is my first post to this list, please redirect if my questions belong elsewhere. I need to look up GO ontology information given GO:Accessors, and I have a local mysql db that mirrors the GO db from that website.
I am not sure if the Bio::Ontology::* libraries were designed to be used in a dynamic, load-as-you-need sort of way, and am wondering how other people have gone about solving this problem. Details follow... Right now I'm using Class::DBI to access the MySQL database, and I have made a new set of subclasses of Bio::Ontology::TermI and Bio::Ontology::RelationshipI which use these Class::DBI objects to access the relevant information in the database on the fly. Unfortunately, I got stuck implementing some of the other Bio::Ontology::*I interfaces, especially Ontology. Making all of these subclasses seems infeasible, or at least enough work that it might already be available somewhere. Are MySQL accessors out there and I just haven't found them, or is Bio::Ontology possibly not the way to go? Alternatively, if I end up having to write this sort of Bio::Ontology - Class::DBI interface, would anyone be interested in it being made generally usable and available? Finally, I just found go-perl, but although I haven't had a lot of time to look into it, it doesn't seem to use mysql either. Thanks! Graham -- Graham Melcher From sdavis2 at mail.nih.gov Thu Oct 12 11:51:14 2006 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Thu, 12 Oct 2006 07:51:14 -0400 Subject: [Bioperl-l] Accessing GO through MYSQL? In-Reply-To: <20061011210917.GA783@rescomp.berkeley.edu> References: <20061011210917.GA783@rescomp.berkeley.edu> Message-ID: <452E2C32.7070502@mail.nih.gov> Graham Melcher wrote: > Finally, I just found go-perl, but although I haven't had a lot of time > to look into it, it doesn't seem to use mysql either. > Yep. Keep going.
Go-perl and Go-db-perl: http://www.godatabase.org/dev/go-db-perl/doc/go-db-perl-doc.html Sean From hlapp at gmx.net Thu Oct 12 04:44:49 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 12 Oct 2006 00:44:49 -0400 Subject: [Bioperl-l] NESCent Phyloinformatics Hackathon Message-ID: <939B253E-2F87-450A-A277-78B5645D3494@gmx.net> (apologies in advance to those who receive this multiple times) The National Evolutionary Synthesis Center (NESCent) in collaboration with Arlin Stoltzfus (U. Maryland, NIST), Aaron Mackey (GSK), Rutger Vos (UBC), and Mark Holder (FSU) sponsors a Phyloinformatics Hackathon to take place Dec 11-15 in Durham, NC. The (wiki) website with more information and a formal proposal is at https://www.nescent.org/wg_phyloinformatics/ In short, the goal is to leverage the Bio* toolkits to provide the "glue" for evolutionary analyses of various types that depend on automation, interoperability, and data integration. CALL FOR INPUT: The specific objectives are driven by "use cases", that is, specific target problems of interest to evolutionary biologists (click 'Use Cases' at the above website). We invite community input in order to focus efforts on the most urgent or pervasive problems. The wiki for the hackathon allows direct editing of the use cases after registration. You may also upload data files, or add comments to the "Forum" page. Alternatively, send email to hlapp at nescent.org. You may also contact any of the organizers with questions or comments. ATTENDANCE: The hackathon is scheduled for Dec 11-15, 2006 in Durham NC. Space is limited, and attendance is by invitation. If you have not been contacted but desire to attend, please contact Hilmar Lapp (hlapp at nescent.org). 
ORGANIZERS: Hilmar Lapp (NESCent; hlapp at nescent.org) Aaron Mackey (GSK; aaron.j.mackey at gsk.com) Mark Holder (FSU; mholder at scs.fsu.edu) Arlin Stoltzfus (CARB, NIST; arlin.stoltzfus at nist.gov) Todd Vision (NESCent; tjv at bio.unc.edu) Rutger Vos (UBC; rvosa at sfu.ca)
From neetisomaiya at gmail.com Thu Oct 12 06:03:20 2006 From: neetisomaiya at gmail.com (neeti somaiya) Date: Thu, 12 Oct 2006 11:33:20 +0530 Subject: [Bioperl-l] need help urgently Message-ID: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> We are using BioPerl to parse a BLAST output file, and then we want to load full alignments into a CLOB column in one of our database tables. We are trying to use sql loader for the same. Anyone has an idea how we can go about it? We have tried loading sequences into CLOB columns using sql loader, and that works fine, but the same syntax when used for loading alignments, is not working. -- -Neeti Even my blood says, B positive
From sayali_salodkar at persistent.co.in Thu Oct 12 10:16:34 2006 From: sayali_salodkar at persistent.co.in (Sayali) Date: Thu, 12 Oct 2006 15:46:34 +0530 Subject: [Bioperl-l] regarding polyphred output Message-ID: <006e01c6ede7$7f10bef0$00b1580a@persistent.co.in> Hi, I want to parse the output of polyphred http://droog.gs.washington.edu/PolyPhred.html. Is there a parser already available in Bioperl which would help me in doing the same. Thanks, Sayali DISCLAIMER ========== This e-mail may contain privileged and confidential information which is the property of Persistent Systems Pvt. Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Pvt. Ltd. does not accept any liability for virus infected mails.
From sdavis2 at mail.nih.gov Thu Oct 12 10:40:12 2006 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Thu, 12 Oct 2006 06:40:12 -0400 Subject: [Bioperl-l] need help urgently In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> Message-ID: <200610120640.12250.sdavis2@mail.nih.gov> On Thursday 12 October 2006 02:03, neeti somaiya wrote: > We are using BioPerl to parse a BLAST output file, and then we want to load > full alignments into a CLOB column in one of our database tables. We are > trying to use sql loader for the same. Anyone has an idea how we can go > about it? > We have tried loading sequences into CLOB columns using sql loader, and > that works fine, but the same syntax when used for loading alignments, is > not working. Neeti, You'll need to be a bit more specific about what you are doing. Can you post the code you are using and error messages? Also, what is "sql loader"? And what database are you trying to use? Sean
From crabtree at tigr.ORG Thu Oct 12 11:28:06 2006 From: crabtree at tigr.ORG (Jonathan Crabtree) Date: Thu, 12 Oct 2006 07:28:06 -0400 Subject: [Bioperl-l] need help urgently In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com> Message-ID: <452E26C6.6040800@tigr.org> Hi Neeti- neeti somaiya wrote: > We are using BioPerl to parse a BLAST output file, and then we want to load > full alignments into a CLOB column in one of our database tables. We are > trying to use sql loader for the same. Anyone has an idea how we can go > about it? > This doesn't sound like a BioPerl issue per se, so this list might not be the best venue for your question. Since SQL*Loader is an Oracle utility you may have better luck in a forum frequented by Oracle DBAs and/or general bioinformatics people. (Not that this isn't such a forum, but unless your difficulty is actually being caused by BioPerl, or there's some kind of SQL*Loader wrapper in BioPerl--which I don't think is the case--you run the risk of having people complain that your question doesn't have enough to do with BioPerl.) > We have tried loading sequences into CLOB columns using sql loader, and that > works fine, but the same syntax when used for loading alignments, is not > working. > It's been a while since I've done any work with SQL*Loader, but I'd guess that the reason it works with sequences and not alignments is because there are characters in the alignments (newlines, perhaps?) that SQL*Loader is incorrectly interpreting as either column (field) or row (record) delimiters. You may need to change your flat file encoding to use delimiters other than the defaults (and alter the SQL*Loader control file accordingly.)
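If newlines inside the alignments really are what is confusing the loader, one way to keep them out of the main data file is to write each alignment to its own file and load only an id and a file name per record. The following is just a sketch of the Perl side under that assumption: the sub name write_lob_files, the aln_NNNN.txt naming and the alignments.idx index file are all made up for illustration, the alignment text is assumed to be already extracted into a hash, and the SQL*Loader control file would still have to be written by hand to match.

```perl
use strict;
use warnings;
use File::Spec;

# Write each alignment to its own file in $dir and create an index file
# holding one "id<TAB>filename" record per alignment. Returns the path
# of the index file. (Hypothetical helper; names are illustrative.)
sub write_lob_files {
    my ( $dir, $alignments ) = @_;    # $alignments: { id => multi-line text }
    my $index = File::Spec->catfile( $dir, 'alignments.idx' );
    open my $idx, '>', $index or die "Cannot write $index: $!";
    my $n = 0;
    for my $id ( sort keys %$alignments ) {
        # Numbered file names avoid problems with odd characters in ids.
        my $name = sprintf 'aln_%04d.txt', ++$n;
        my $path = File::Spec->catfile( $dir, $name );
        open my $out, '>', $path or die "Cannot write $path: $!";
        print {$out} $alignments->{$id};
        close $out or die "Cannot close $path: $!";
        print {$idx} join( "\t", $id, $name ), "\n";
    }
    close $idx or die "Cannot close $index: $!";
    return $index;
}

# Usage: perl write_lobs.pl /path/to/output_dir
if (@ARGV) {
    my %alignments = (    # made-up sample data
        hit1 => "Query  1  ACGT\nSbjct  7  ACGT\n",
    );
    print write_lob_files( $ARGV[0], \%alignments ), "\n";
}
```

The index file then contains only single-line, tab-separated records, so the multi-line alignment text never passes through the field and record delimiter handling of the main input file.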
As Sean pointed out, however, it's difficult to be much help without seeing an example of a failed input and the corresponding error(s)! One other thing I remember about SQL*Loader (as of Oracle 8-9 or so) is that all the CLOB values had to appear *last* in the SQL*Loader record, at least if you were using variable-length fields. But since you've loaded sequences successfully, I doubt this is the issue. One final thought is that I believe SQL*Loader has an option whereby you can place your LOB values in files external to the main SQL*Loader input file, which sidesteps the field/row delimiter issue completely; you may want to look into this if you're not already loading your Oracle database this way. Jonathan From bix at sendu.me.uk Fri Oct 13 08:56:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 13 Oct 2006 09:56:01 +0100 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au> References: <4521E74E.1040404@infotech.monash.edu.au> Message-ID: <452F54A1.7010908@sendu.me.uk> Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's certainly interface-like, but doesn't follow the normal interface naming convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed WrapperBaseI? Left alone? From cjfields at uiuc.edu Fri Oct 13 12:20:58 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 13 Oct 2006 07:20:58 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <452F54A1.7010908@sendu.me.uk> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> Message-ID: <43CC4E80-8F15-4C83-929D-DDC719360C8F@uiuc.edu> I would say, according to BioPerl convention, it should be renamed WrapperBaseI. It has a few interface-like methods and (importantly) lacks a constructor. Unless someone else out there has other reasoning? Note that this will require lots of bioperl-run changes as well, at least I think it will. 
Chris On Oct 13, 2006, at 3:56 AM, Sendu Bala wrote: > Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's > certainly interface-like, but doesn't follow the normal interface > naming > convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed > WrapperBaseI? Left alone? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From avilella at gmail.com Fri Oct 13 15:26:47 2006 From: avilella at gmail.com (Albert Vilella) Date: Fri, 13 Oct 2006 16:26:47 +0100 Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method Message-ID: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com> Hi all, While using the remove_gaps method in Bio::SimpleAlign I found out that if the alignment is (bad enough for) having no columns without any gap at all, the method will give a: Use of uninitialized value in split at this line in add_seq: map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq); So my idea was to tweak this line to something like: map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || ''); But I am unsure about any other side effects this may have. Anyone? Albert. From cjfields at uiuc.edu Fri Oct 13 15:51:38 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 13 Oct 2006 10:51:38 -0500 Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method In-Reply-To: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com> References: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com> Message-ID: You can check to see if it passes all tests. I'm guessing SimpleAlign.t tests this method out in some way (though it's always safer to check). 
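The guard Albert proposes can be tried outside Bio::SimpleAlign. This is a minimal sketch in plain Perl; collect_symbols is a hypothetical stand-in for the symbol bookkeeping in add_seq, not a BioPerl method. It tests for definedness rather than using || '', on the assumption that a defined but false string such as "0" should still be split.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the '_symbols' bookkeeping in add_seq;
# collect_symbols is NOT part of Bio::SimpleAlign.
sub collect_symbols {
    my ($symbols, $seq_string) = @_;

    # Treat an undefined sequence as empty, so split never sees undef
    # and no "uninitialized value" warning is raised.
    $seq_string = '' unless defined $seq_string;

    $symbols->{$_} = 1 for split //, $seq_string;
    return $symbols;
}

my %symbols;
collect_symbols(\%symbols, 'AC-GT');
collect_symbols(\%symbols, undef);    # e.g. nothing left after gap removal
print join(',', sort keys %symbols), "\n";
```

With the guard in place an undefined sequence simply contributes no symbols. Whether an empty sequence is acceptable to the rest of add_seq is a separate question that the test suite would have to answer.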
Chris On Oct 13, 2006, at 10:26 AM, Albert Vilella wrote: > Hi all, > > While using the remove_gaps method in Bio::SimpleAlign I found out > that if the alignment is (bad enough for) having no columns without > any gap at all, the method will give a: > > Use of uninitialized value in split at this line in add_seq: > > map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq); > > So my idea was to tweak this line to something like: > > map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || ''); > > But I am unsure about any other side effects this may have. > > Anyone? > > Albert. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From jay at jays.net Fri Oct 13 16:09:16 2006 From: jay at jays.net (Jay Hannah) Date: Fri, 13 Oct 2006 11:09:16 -0500 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult In-Reply-To: References: Message-ID: <452FBA2C.7070003@jays.net> Thanks Brian! My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :) /home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v ---------------------------- revision 1.27 date: 2006/10/10 22:41:46; author: bosborne; state: Exp; lines: +4 -4 next_hit, not next_hits ---------------------------- I'm a simple man who takes great satisfaction in the simple things. :) j Brian Osborne wrote: > j, > > No need, not for something so simple. > > Brian O. > > > On 10/7/06 6:34 PM, "Jay Hannah" wrote: >> Except that "next_hits()" does not exist. Should be "next_hit()". >> >> (Should I have posted a patch instead?) > From jay at jays.net Fri Oct 13 16:24:48 2006 From: jay at jays.net (Jay Hannah) Date: Fri, 13 Oct 2006 11:24:48 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? 
Message-ID: <452FBDD0.2070008@jays.net> So I'm doing the following: 1) Using Bio::SeqIO to read in a genbank file and kick out fasta. 2) Reading that fasta file w/ command line formatdb. 3) Using that output for command line blastall. 4) Using Bio::SearchIO to read the blast results. (If there's a better way, do tell. -grin-) This sequence is working great for nucleotide BLASTing, but I'm stuck on step 1 when trying protein BLAST. my $seq_in = Bio::SeqIO->new( -file => " "genbank", -alphabet => "protein" ); my $seq_out_protein = Bio::SeqIO->new( -file => ">out", -format => 'fasta', -alphabet => 'protein' ); while (my $inseq = $seq_in->next_seq) { $inseq->molecule("protein"); $seq_out_protein->write_seq($inseq); } This creates a nucleotide file "out". Setting -alphabet doesn't seem to do anything. Setting molecule("protein") doesn't seem to do anything either. I was expecting that it would just pull all the CDS strings out of the genbank file and dump those into fasta format? Am I missing something obvious? Thanks, j From bosborne11 at verizon.net Fri Oct 13 16:54:02 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 13 Oct 2006 12:54:02 -0400 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <452FBDD0.2070008@jays.net> Message-ID: Jay, You're looking for the "translation" string in the CDS section, yes? You need to delve a bit into features, the CDS is considered to be a feature of the main or parent nucleotide sequence and the translation is part of CDS feature: http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank Brian O. 
On 10/13/06 12:24 PM, "Jay Hannah" wrote: > Am I missing something From bix at sendu.me.uk Fri Oct 13 16:59:46 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 13 Oct 2006 17:59:46 +0100 Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult In-Reply-To: <452FBA2C.7070003@jays.net> References: <452FBA2C.7070003@jays.net> Message-ID: <452FC602.3080302@sendu.me.uk> Jay Hannah wrote: > Thanks Brian! > > My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :) > > /home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v > ---------------------------- > revision 1.27 > date: 2006/10/10 22:41:46; author: bosborne; state: Exp; lines: +4 -4 > next_hit, not next_hits > ---------------------------- Congratulations! :D Next it will be two byte corrections and from there, the sky's the limit! :) From hlapp at gmx.net Fri Oct 13 17:28:50 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 13 Oct 2006 13:28:50 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <452F54A1.7010908@sendu.me.uk> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> Message-ID: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> What does the POD (and the code) say about instantiating it? -hilmar On Oct 13, 2006, at 4:56 AM, Sendu Bala wrote: > Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's > certainly interface-like, but doesn't follow the normal interface > naming > convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed > WrapperBaseI? Left alone? 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From jay at jays.net Fri Oct 13 18:56:38 2006 From: jay at jays.net (Jay Hannah) Date: Fri, 13 Oct 2006 13:56:38 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: References: Message-ID: <452FE166.5080405@jays.net> Brian Osborne wrote: > You're looking for the "translation" string in the CDS section, yes? You > need to delve a bit into features, the CDS is considered to be a feature of > the main or parent nucleotide sequence and the translation is part of CDS > feature: > > http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank Yes. Thanks. I "rolled my own" -- I'm now doing this: while (my $inseq = $seq_in->next_seq) { my @features = $inseq->get_SeqFeatures(); foreach my $feat ( @features ) { next unless ($feat->primary_tag eq "CDS"); my @db_xrefs = $feat->annotation->get_Annotations("db_xref"); @db_xrefs = grep { /^GI:/ } @db_xrefs; die "Panic! More than one GI: db_xref?" if (@db_xrefs > 1); die "Panic! No GI: db_xref?" unless (@db_xrefs == 1); my $gi = $db_xrefs[0]; $gi =~ s/^GI://; my @translations = $feat->annotation->get_Annotations("translation"); die "Panic! More than one translation?" if (@translations > 1); my @protein_ids = $feat->annotation->get_Annotations("protein_id"); die "Panic! More than one protein_id?" if (@protein_ids > 1); my @product = $feat->annotation->get_Annotations("product"); die "Panic! More than one product?" if (@product > 1); print ">gi|$gi|gb|$protein_ids[0]|"; print $inseq->id . " $product[0]\n"; print "$translations[0]\n"; } } To generate a homebrew fasta file for a protein BLAST. 
I just thought that -alphabet and molecule() would do that stuff for me? What else would "protein" mean in those? Does anyone use -alphabet and/or molecule()? For what? How? Again, here's what I'm talking about: ========== my $seq_out_protein = Bio::SeqIO->new( -file => ">out", -format => 'fasta', -alphabet => 'protein' # No effect? ); while (my $inseq = $seq_in->next_seq) { $inseq->molecule("protein"); # No effect? ========== Thanks, j From bosborne11 at verizon.net Fri Oct 13 21:20:40 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 13 Oct 2006 17:20:40 -0400 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <452FE166.5080405@jays.net> Message-ID: Jay, Yes, people use the -alphabet parameter. If you set it to something then Bioperl will not try to determine whether the sequence is protein, rna, or dna and this is particularly useful when the sequence contains characters that Bioperl would object to (sequences with distasteful characters can be created by various applications, for example, or you might introduce some weird character for some reason). Setting the -alphabet would also speed up Bioperl a bit, for the same reason. Brian O. On 10/13/06 2:56 PM, "Jay Hannah" wrote: > > I just thought that -alphabet and molecule() would do that stuff for me? What > else would "protein" mean in those? From jay at jays.net Sat Oct 14 15:25:05 2006 From: jay at jays.net (Jay Hannah) Date: Sat, 14 Oct 2006 10:25:05 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: References: Message-ID: <45310151.5050901@jays.net> Brian Osborne wrote: > Yes, people use the -alphabet parameter. 
If you set it to something then > Bioperl will not try to determine whether the sequence is protein, rna, or > dna and this is particularly useful when the sequence contains characters > that Bioperl would object to (sequences with distasteful characters can be > created by various applications, for example, or you might introduce some > weird character for some reason). Setting the -alphabet would also speed up > Bioperl a bit, for the same reason. Huh. That's what I assumed when I stumbled into the -alphabet parameter. So I thought this would read the protein sequences out of my genbank file and write a fasta file for me: my $seq_in = Bio::SeqIO->new( -file => "<$file", -format => "genbank", -alphabet => "protein" # No effect? ); my $seq_out = Bio::SeqIO->new( -file => ">$outfile", -format => "fasta", -alphabet => "protein" # No effect? ); while (my $inseq = $seq_in->next_seq) { $inseq->molecule("protein"); # No effect? $seq_out->write_seq($inseq); } It didn't. Would it be a Good Thing if it did what I was expecting? (Like I said I rolled my own, but I'm always looking for ways to enhance BioPerl that other people might find useful... Someday I will contribute something useful, by golly. -grin-) (Background: I'm doing protein BLASTs from genbank files. To make formatdb happy I have to have fasta files full of the protein sequences.) j From bosborne11 at verizon.net Sat Oct 14 18:40:21 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Sat, 14 Oct 2006 14:40:21 -0400 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <45310151.5050901@jays.net> Message-ID: Jay, What you expected was that setting the -alphabet to "protein" would make Bioperl translate the input nucleotide sequence to output protein. In Bioperl this is accomplished by using the translate() method, no surprise there. 
If you take a look at the documentation on translate() in the online Bioperl Tutorial you'll see that this is a fairly sophisticated method, you can do all sorts of different things with it. So using -alphabet for this purpose won't really work, there are too many different ways to translate. Brian O. On 10/14/06 11:25 AM, "Jay Hannah" wrote: > Would it be a Good Thing if it did what I was expecting? From cjfields at uiuc.edu Sun Oct 15 00:44:04 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sat, 14 Oct 2006 19:44:04 -0500 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? In-Reply-To: <45310151.5050901@jays.net> Message-ID: <000601c6eff3$084663c0$15327e82@pyrimidine> ... > Huh. That's what I assumed when I stumbled into the -alphabet parameter. > So I thought this would read the protein sequences out of my genbank file > and write a fasta file for me: You have to think about it this way: the GenBank record you are using is for the nucleotide sequence only, and all other information in that record describes the sequence. Similarly, if you used a 'GenPept' sequence, the focus would be the protein sequence. Both normally contain annotations which describe the sequence globally, such as references, organism info, etc. Both also may contain features (or SeqFeatures), which describe a feature bound to a particular location on the sequence. However, features are not an absolute requirement for a sequence; they're sort of 'window dressing', albeit almost always essential for describing the main sequence. I would do exactly as Brian suggests. See the Feature/Annotation HOWTO for ideas on how to screen out the particular features you want and either grab the 'translation' tag data or get the sequence object from the feature and translate it directly. You should get the same result either way though getting the tag may be faster. ... > It didn't. Would it be a Good Thing if it did what I was expecting? 
(Like > I said I rolled my own, but I'm always looking for ways to enhance BioPerl > that other people might find useful... Someday I will contribute something > useful, by golly. -grin-) > > (Background: I'm doing protein BLASTs from genbank files. To make formatdb > happy I have to have fasta files full of the protein sequences.) > > j You could, theoretically, write up a method to retrieve only the features that correspond to coding regions (CDS). You may want to optionally screen out pseudogenes but that's up to you. Chris From avilella at gmail.com Sun Oct 15 11:08:23 2006 From: avilella at gmail.com (Albert Vilella) Date: Sun, 15 Oct 2006 12:08:23 +0100 Subject: [Bioperl-l] no_residues test in SimpleAlign.t Message-ID: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com> Hi all, Can somebody check the SimpleAlign.t test? perl t/SimpleAlign.t I get a few errors; I am looking at one that deals with no_residues. I don't understand if this is supposed to work: sub no_residues { my $self = shift; my $count = 0; foreach my $seq ($self->each_seq) { my $str = $seq->seq(); $count += ($str =~ s/[^A-Za-z]//g); #is this the same as: # $str =~ s/[^A-Za-z]//g; # $count += length($str); } Cheers, Albert. return $count; } From cjfields at uiuc.edu Sun Oct 15 17:53:50 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 15 Oct 2006 12:53:50 -0500 Subject: [Bioperl-l] no_residues test in SimpleAlign.t In-Reply-To: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com> References: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com> Message-ID: Albert, I get all 75 tests passing. SimpleAlign.t was recently switched over to Test::More, so you should be seeing more explicit test descriptions. It looks like test 27 is no_residues(). Were there any more that failed? I usually run 'perl -I. t/test.t' from the main bioperl directory to check individual tests from the local directory.
Otherwise you are checking your installed version which may be older (and may not match tests and recent bug fixes). Could that be the problem? Chris On Oct 15, 2006, at 6:08 AM, Albert Vilella wrote: > Hi all, > > Can somebody check the SimpleAlign.t test? > > perl t/SimpleAlign.t > > I get a few errors, I am looking at one that deals with no_residues. I > don't understand if this is suposed to work: > > sub no_residues { > my $self = shift; > my $count = 0; > > foreach my $seq ($self->each_seq) { > my $str = $seq->seq(); > > $count += ($str =~ s/[^A-Za-z]//g); > #is this the same as: > # $str =~ s/[^A-Za-z]//g; > # $count += length($str); > } > > Cheers, > > Albert. > return $count; > } > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign
From bix at sendu.me.uk Mon Oct 16 08:08:34 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 09:08:34 +0100 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> Message-ID: <45333E02.9070808@sendu.me.uk> Hilmar Lapp wrote: > What does the POD (and the code) say about instantiating it? =head1 SYNOPSIS # do not use this object directly, it provides the following methods # for its subclasses ... =head1 DESCRIPTION This is a basic module from which to build executable wrapper modules. It has some basic methods to help when implementing new modules. There is no new() method. From bix at sendu.me.uk Mon Oct 16 13:23:41 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 14:23:41 +0100 Subject: [Bioperl-l] Bio::WebAgent sleep warning Message-ID: <453387DD.3040105@sendu.me.uk> Hi, Does anyone think it's appropriate for Bio::WebAgent to issue warnings every time it sleeps? I'd consider the sleeping part of its normal, expected and desired behaviour so I don't need to be warned about it. Perhaps change the $self->warn to a $self->debug? From cjfields at uiuc.edu Mon Oct 16 14:12:10 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 09:12:10 -0500 Subject: [Bioperl-l] Bio::WebAgent sleep warning In-Reply-To: <453387DD.3040105@sendu.me.uk> Message-ID: <000c01c6f12d$121b5000$15327e82@pyrimidine> > Hi, > > Does anyone think it's appropriate for Bio::WebAgent to issue warnings > every time it sleeps? I'd consider the sleeping part of its normal, > expected and desired behaviour so I don't need to be warned about it.
> Perhaps change the $self->warn to a $self->debug? That sounds fine. Using debugging output for sleep would be similar behavior to Bio::DB::NCBIHelper and Bio::DB::GenericWebDBI. You may want to pass it by Heikki (I think that's his module). The only reason I would want to see sleep output, personally, is to make sure it is working properly. It almost looks like that class has the same intent that GenericWebDBI has (even down to using LWP::UserAgent as a superclass). I may look into it to see if I can use this as a superclass for GenericWebDBI. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Mon Oct 16 14:26:21 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Mon, 16 Oct 2006 15:26:21 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig Message-ID: <4533968D.6040009@sheffield.ac.uk> Did anyone reconfigure the bioperl web server (whichever server hosts http://bioperl.org/DIST) by adding the following line to the httpd.conf file: RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*) http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1 This is required as a workaround for a bug in ActivePerl 5.8.8.819 that causes Bioperl installs via PPM to fail. Cheers Nath From n.haigh at sheffield.ac.uk Mon Oct 16 15:30:16 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Mon, 16 Oct 2006 16:30:16 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A257.2000207@campus.iztacala.unam.mx> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> Message-ID: <4533A588.9020505@sheffield.ac.uk> Mauricio Herrera Cuadra wrote: > Done. Could you please check if it works as it should? > > Cheers, > Mauricio. Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got someone to pop it in http://bioperl/DIST Volunteers?
BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for the PPD? I seem to remember that there was talk about having to maintain a separate Bundle::BioPerl for each release of Bioperl. Any ideas on this front? Nath From arareko at campus.iztacala.unam.mx Mon Oct 16 15:16:39 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 16 Oct 2006 10:16:39 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533968D.6040009@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> Message-ID: <4533A257.2000207@campus.iztacala.unam.mx> Done. Could you please check if it works as it should? Cheers, Mauricio. Nathan Haigh wrote: > Did anyone reconfigure the bioperl web server (which ever server hosts > http://bioperl.org/DIST) by adding the following lines to the httpd.conf > file: > > RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*) > http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1 > > This will be required as a workaround to a bug in ActivePerl 5.8.8.819 > which will result in a failed install of Bioperl via PPM. > > Cheers > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From arareko at campus.iztacala.unam.mx Mon Oct 16 15:33:33 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 16 Oct 2006 10:33:33 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A588.9020505@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> Message-ID: <4533A64D.6040203@campus.iztacala.unam.mx> Nathan Haigh wrote: > Mauricio Herrera Cuadra wrote: >> Done.
Could you please check if it works as it should? >> >> Cheers, >> Mauricio. > Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got > someone to pop it in http://bioperl/DIST > > Volunteers? You can send it to me. -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From akarger at CGR.Harvard.edu Mon Oct 16 15:54:33 2006 From: akarger at CGR.Harvard.edu (Amir Karger) Date: Mon, 16 Oct 2006 11:54:33 -0400 Subject: [Bioperl-l] Bio::Location::Split Message-ID: I recently came across bug 2101, where Bio::Location::Split::to_FTstring gives the incorrect order for multi-sublocation locations on the minus strand. That is, I found it by getting incorrect results, and then found it in Bugzilla and in the September archives. I'm converting CDS files from one format to another. E.g., I read an EMBL file with a chromosome and CDS features, and want to output the location in a FASTA header. If I do something like: while (my $seq = $in->next_seq) { foreach my $feat ($seq->get_SeqFeatures) { print $feat->location->to_FTstring() } } I get the wrong results for multi-exon CDSs on the -1 strand, as described in the bug report. Is there a relatively easy way around this? I assume I can't get at the original string of the location, which in this case is all I need. Can I just flip the order of the exons in certain cases? Chris F, can you tell me the preliminary solution you mentioned? I must say I'm sort of surprised this wasn't found before. It seems like a not-that-rare occurrence. Oh well.
Thanks, - Amir Karger Research Computing Life Sciences Division Harvard University From bix at sendu.me.uk Mon Oct 16 16:14:39 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 17:14:39 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A588.9020505@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> Message-ID: <4533AFEF.8080103@sendu.me.uk> Nathan Haigh wrote: > Mauricio Herrera Cuadra wrote: >> Done. Could you please check if it works as it should? >> >> Cheers, >> Mauricio. > Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got > someone to pop it in http://bioperl/DIST > > Volunteers? I'm sure Mauricio would be happy to do it, but so am I. You may want to hold off a little while until I release rc2, which may be a few hours away. > BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for > the PPD? I seem to remember that there was talk about having to maintain > a separate Bundle::BioPerl for each release of Bioperl. Any ideas on > this front? It depends on what is in the PPD and what kind of auto-dependency features the ActiveState installer has. Given Perl 5.8 and your current PPD, does Bioperl install with the same or fewer number of skips if you also install Bundle::BioPerl first? That is, does Bundle::BioPerl even do anything useful anymore? If not, obviously don't bother making it a pre-req. If it does, my opinion is that you make it a pre-req. If people really don't want to install the optional stuff they can download the .zip file and install manually without even a make. From Kevin.M.Brown at asu.edu Mon Oct 16 16:14:51 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Mon, 16 Oct 2006 09:14:51 -0700 Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only? Message-ID: <1A4207F8295607498283FE9E93B775B402196FAA@EX02.asurite.ad.asu.edu> > > Yes, people use the -alphabet parameter. 
If you set it to > something then > > Bioperl will not try to determine whether the sequence is > protein, rna, or > > dna and this is particularly useful when the sequence > contains characters > > that Bioperl would object to (sequences with distasteful > characters can be > > created by various applications, for example, or you might > introduce some > > weird character for some reason). Setting the -alphabet > would also speed up > > Bioperl a bit, for the same reason. > > Huh. That's what I assumed when I stumbled into the -alphabet > parameter. So I thought this would read the protein sequences > out of my genbank file and write a fasta file for me: > > my $seq_in = Bio::SeqIO->new( > -file => "<$file", > -format => "genbank", > -alphabet => "protein" # No effect? > ); > my $seq_out = Bio::SeqIO->new( > -file => ">$outfile", > -format => "fasta", > -alphabet => "protein" # No effect? > ); > while (my $inseq = $seq_in->next_seq) { > $inseq->molecule("protein"); # No effect? > $seq_out->write_seq($inseq); > } > > It didn't. Would it be a Good Thing if it did what I was > expecting? (Like I said I rolled my own, but I'm always > looking for ways to enhance BioPerl that other people might > find useful... Someday I will contribute something useful, by > golly. -grin-) > > (Background: I'm doing protein BLASTs from genbank files. To > make formatdb happy I have to have fasta files full of the > protein sequences.) This might work for your needs (CDS to protein FASTA). my $seq_in = Bio::SeqIO->new( -file => "<$file", -format => "genbank", ); open my $seq_out, ">", $outfile or die "Cannot write $outfile: $!"; while (my $inseq = $seq_in->next_seq) { print $seq_out ">". $inseq->display_id(). "\n"; print $seq_out $inseq->translate()->seq() ."\n"; } close $seq_out; From bix at sendu.me.uk Mon Oct 16 15:44:19 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 16:44:19 +0100 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
Message-ID: <4533A8D3.90709@sendu.me.uk> I think Chris recently deprecated this, but should it be? For me, its POD description justifies its existence, and perhaps more importantly, Bio::Index::Blast relies on it. I took a quick peek at the latter and it didn't seem trivial to move it over to Bio::SearchIO instead. Should it be undeprecated? From n.haigh at sheffield.ac.uk Mon Oct 16 16:39:02 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Mon, 16 Oct 2006 17:39:02 +0100 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533AFEF.8080103@sendu.me.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> <4533AFEF.8080103@sendu.me.uk> Message-ID: <4533B5A6.1070701@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: >> Mauricio Herrera Cuadra wrote: >>> Done. Could you please check if it works as it should? >>> >>> Cheers, >>> Mauricio. >> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got >> someone to pop it in http://bioperl/DIST >> >> Volunteers? > > I'm sure Mauricio would be happy to do it, but so am I. You may want > to hold off a little while until I release rc2, which may be a few > hours away. Just e-mailed Mauricio links to the files off list, It's not a big job for me to remake the bioperl PPD, so Mauricio it's up to you if you want to wait 18hrs for me to make the PPDs for 1.5.2-rc2. > > >> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for >> the PPD? I seem to remember that there was talk about having to maintain >> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on >> this front? > > It depends on what is in the PPD and what kind of auto-dependency > features the ActiveState installer has. Given Perl 5.8 and your > current PPD, does Bioperl install with the same or fewer number of > skips if you also install Bundle::BioPerl first? That is, does > Bundle::BioPerl even do anything useful anymore? 
If not, obviously > don't bother making it a pre-req. If it does, my opinion is that you > make it a pre-req. If people really don't want to install the optional > stuff they can download the .zip file and install manually without > even a make. As far as the PPDs are concerned - no tests are run during installation. PPM more or less just copies files into the correct place for Perl to find, so both approaches result in the same thing. However, I've not tried making a CPAN distribution file for either Bioperl or Bundle::BioPerl - I wouldn't know where to start! Makefile.PL now documents the prereqs in only one place (%packages), and this is used to add the prereqs to the bioperl PPD when issuing "nmake ppd". This way, each release of BioPerl should be up-to-date with its prereqs as long as developers add their modules' prereqs to %packages. If we have Bundle::BioPerl, most of those prereqs need to be moved from the Bioperl PPD to the Bundle::BioPerl PPD - a bit of a pain because there are no guidelines as to what should/should not go in Bundle::BioPerl. Therefore, as far as the PPDs are concerned, it's far easier to do away with Bundle::BioPerl. Nath From hlapp at gmx.net Mon Oct 16 17:04:24 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 16 Oct 2006 13:04:24 -0400 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <45333E02.9070808@sendu.me.uk> References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> <45333E02.9070808@sendu.me.uk> Message-ID: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net> So it looks like an abstract base class, not an interface that defines a contract or API? Should use Root.pm then, would be my vote. -hilmar On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> What does the POD (and the code) say about instantiating it? > > =head1 SYNOPSIS > > # do not use this object directly, it provides the following > methods > # for its subclasses > > ...
> > > =head1 DESCRIPTION > > This is a basic module from which to build executable wrapper modules. > It has some basic methods to help when implementing new modules. > > > There is no new() method. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From hlapp at gmx.net Mon Oct 16 17:08:28 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 16 Oct 2006 13:08:28 -0400 Subject: [Bioperl-l] Bio::WebAgent sleep warning In-Reply-To: <453387DD.3040105@sendu.me.uk> References: <453387DD.3040105@sendu.me.uk> Message-ID: It depends. What triggers the sleeping? If it's part of every request that it processes then I'd agree. If it is triggered by failure to precede the next try then the failure is probably not expected (though possible), and hence should be reported by warn(). If it is just part of the polling cycle then there should probably be a limit up to which the time waited is considered 'normal' and after which it is considered 'excessive' and hence should be reported through warn(). My $0.02. -hilmar On Oct 16, 2006, at 9:23 AM, Sendu Bala wrote: > Hi, > > Does anyone think it's appropriate for Bio::WebAgent to issue warnings > every time it sleeps? I'd consider the sleeping part of its normal, > expected and desired behaviour so I don't need to be warned about it. > Perhaps change the $self->warn to a $self->debug? 
> > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Mon Oct 16 17:13:53 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 18:13:53 +0100 Subject: [Bioperl-l] Bio::WebAgent sleep warning In-Reply-To: References: <453387DD.3040105@sendu.me.uk> Message-ID: <4533BDD1.8060204@sendu.me.uk> Hilmar Lapp wrote: > It depends. What triggers the sleeping? If it's part of every request > that it processes then I'd agree. If it is triggered by failure to > precede the next try then the failure is probably not expected (though > possible), and hence should be reported by warn(). > > If it is just part of the polling cycle then there should probably be a > limit up to which the time waited is considered 'normal' and after which > it is considered 'excessive' and hence should be reported through warn(). =head2 sleep Title : sleep Usage : $self->sleep Function: sleep for a number of seconds indicated by the delay policy Returns : none Args : none NOTE: This method keeps track of the last time it was called and only imposes a sleep if it was called more recently than the delay_policy() allows. =cut It issues a warning every time it actually sleeps. I find it inappropriate that a method warns me that it did what I asked it to do. 
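A minimal sketch of the two options discussed here, assuming a plain delay policy measured in whole seconds. The names seconds_to_wait and report_level and the $limit threshold are hypothetical and are not part of Bio::WebAgent; the only point is that a normal pause could be reported at debug level while an excessive one could still go through warn().

```perl
use strict;
use warnings;

# Seconds still to wait before the next request is allowed.
# $delay: minimum gap between calls, in seconds
# $last:  epoch time of the previous call (undef if there was none)
# $now:   current epoch time
sub seconds_to_wait {
    my ($delay, $last, $now) = @_;
    return 0 unless defined $last;
    my $remaining = $delay - ($now - $last);
    return $remaining > 0 ? $remaining : 0;
}

# Decide how a pause should be reported: not at all when there is no
# pause, at 'debug' level for a normal pause, and at 'warn' level only
# when the pause exceeds the (hypothetical) $limit.
sub report_level {
    my ($wait, $limit) = @_;
    return 'none' if $wait <= 0;
    return $wait > $limit ? 'warn' : 'debug';
}

# Three cases: first call, a quick second call, and an unusually long delay.
for my $case ([3, undef, 100], [3, 100, 101], [30, 100, 101]) {
    my ($delay, $last, $now) = @$case;
    my $wait = seconds_to_wait($delay, $last, $now);
    printf "delay=%d wait=%d report=%s\n",
        $delay, $wait, report_level($wait, 10);
}
```

Keeping the arithmetic separate from the actual sleep call makes the policy easy to check without waiting on a real clock.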
From arareko at campus.iztacala.unam.mx Mon Oct 16 17:14:06 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Mon, 16 Oct 2006 12:14:06 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533B5A6.1070701@sheffield.ac.uk> References: <4533968D.6040009@sheffield.ac.uk> <4533A257.2000207@campus.iztacala.unam.mx> <4533A588.9020505@sheffield.ac.uk> <4533AFEF.8080103@sendu.me.uk> <4533B5A6.1070701@sheffield.ac.uk> Message-ID: <4533BDDE.2040801@campus.iztacala.unam.mx> Nathan Haigh wrote: > Sendu Bala wrote: >> Nathan Haigh wrote: >>> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got >>> someone to pop it in http://bioperl/DIST >>> >>> Volunteers? >> I'm sure Mauricio would be happy to do it, but so am I. You may want >> to hold off a little while until I release rc2, which may be a few >> hours away. > > Just e-mailed Mauricio links to the files off list, It's not a big job > for me to remake the bioperl PPD, so Mauricio it's up to you if you want > to wait 18hrs for me to make the PPDs for 1.5.2-rc2. Too late, I've already placed 1.5.2-rc1 in DIST. hehe :) -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From bix at sendu.me.uk Mon Oct 16 16:32:11 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 17:32:11 +0100 Subject: [Bioperl-l] Swissprot problems Message-ID: <4533B40B.2030908@sendu.me.uk> t/Biofetch.t and t/DB.t are skipping their swissprot database fetches. Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for maintenance but is now back up. However I'm guessing the databases must have changed. I've manually looked for the test case 'YNB3_YEAST' in database 'UniProtKB' and it came back with no result, even though I can find the test case manually at the expasy website. Is this an EBI bug or deliberate change that makes sense to someone?
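For anyone unfamiliar with how such a skip comes about, here is a minimal sketch using a Test::More SKIP block. It assumes nothing about what t/Biofetch.t or t/DB.t actually contain: fetch_entry and %fake_db are hypothetical stand-ins for a remote lookup, and the id P12345 is only a placeholder. A failed fetch skips just the dependent tests and records the reason.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical stand-in for a remote database lookup. An unknown id
# dies, much as a fetch that comes back empty might throw.
my %fake_db = (P12345 => { id => 'P12345' });

sub fetch_entry {
    my ($id) = @_;
    die "no result for $id\n" unless exists $fake_db{$id};
    return $fake_db{$id};
}

SKIP: {
    my $entry = eval { fetch_entry('YNB3_YEAST') };
    if (my $err = $@) {
        chomp $err;
        # Skip only the two tests in this block and say why, instead of
        # failing the suite or silently skipping the whole file.
        skip("remote fetch failed: $err", 2);
    }
    ok(defined $entry, 'entry retrieved');
    is($entry->{id}, 'YNB3_YEAST', 'entry has the requested id');
}
```

The skip reason then appears in the verbose test output, which at least hints at whether the server or the id is the problem.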
From m.weimer at dkfz-heidelberg.de Mon Oct 16 16:43:38 2006 From: m.weimer at dkfz-heidelberg.de (Marc Weimer) Date: Mon, 16 Oct 2006 18:43:38 +0200 Subject: [Bioperl-l] Bio::DB::SwissProt Problem Message-ID: <1161017019.5203.6.camel@localhost> Dear list members, when running ###################################################################### #! /usr/bin/perl -w use strict; use Bio::DB::SwissProt; my $db_obj = new Bio::DB::SwissProt(-verbose => 1); my $seq_obj = $db_obj->get_Seq_by_acc("O02938"); ###################################################################### using Bioperl 1.5.2 I get the following message: ########################################################################################## request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch Content-Length: 49 Content-Type: application/x-www-form-urlencoded format=swissprot&db=UniProtKB&style=raw&id=O02938 ------------- EXCEPTION: Bio::Root::Exception ------------- MSG: acc O02938 does not exist STACK: Error::throw STACK: Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350 STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181 STACK: ./get.test.pl:8 ----------------------------------------------------------- ########################################################################################## But the accession number does exist. Surprisingly, everything worked fine a few days ago. Any ideas of what might have happened? Thanks and best regards, Marc From hlapp at gmx.net Mon Oct 16 17:15:50 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 16 Oct 2006 13:15:50 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533A8D3.90709@sendu.me.uk> References: <4533A8D3.90709@sendu.me.uk> Message-ID: The problem is it is not maintained, and there are outstanding bug reports. If you un-deprecate it, then we need a response to people who come across problems with it when using it.
Either you change the POD to say exactly who and when one should use it (or rather not) and point to the fact that it is unsupported for all other cases. Or what would you suggest? -hilmar On Oct 16, 2006, at 11:44 AM, Sendu Bala wrote: > I think Chris recently deprecated this, but should it be? For me, its > POD description justifies its existence, and perhaps more importantly, > Bio::Index::Blast relies on it. > > I took a quick peek at the latter and it didn't seem trivial to > move it > over to Bio::SearchIO instead. > > Should it be undeprecated? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Mon Oct 16 17:21:46 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:21:46 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533A8D3.90709@sendu.me.uk> Message-ID: <000001c6f147$8efdfd60$15327e82@pyrimidine> Bio::Tools::BPlite was placed on the deprecation list a while back (~ rel 1.5); the other related Bio::Tools::BP* modules were also supposed to be on that list as well. If we want to undeprecate (de-deprecate? reprecate?) BPlite we also would need to do the same for the others. They must be updated to parse current BLAST/PSI-BLAST/bl2seq text output, something that Bio::SearchIO::blast is currently capable of (so the functionality is redundant). And someone needs to take them over. In my opinion it may be more trouble than it's worth as they haven't been touched in a while. Seems if we 'revive' BPlite we're not really moving forward esp. since you have added the PullParser recently and made substantial improvements to SearchIO. 
Maybe Bio::Index::Blast just needs to be deprecated or rewritten to use SearchIO? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Monday, October 16, 2006 10:44 AM > To: bioperl-l > Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? > > I think Chris recently deprecated this, but should it be? For me, its > POD description justifies its existence, and perhaps more importantly, > Bio::Index::Blast relies on it. > > I took a quick peek at the latter and it didn't seem trivial to move it > over to Bio::SearchIO instead. > > Should it be undeprecated? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bix at sendu.me.uk Mon Oct 16 17:21:58 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 18:21:58 +0100 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: References: <4533A8D3.90709@sendu.me.uk> Message-ID: <4533BFB6.5070504@sendu.me.uk> Hilmar Lapp wrote: > The problem is it is not maintained, and there are outstanding been bug > reports. > > If you un-deprecate it, then we need a response to people who come > across problems with it when using it. Either you change the POD to say > exactly who and when one should use it (or rather not) and point to the > fact that it is unsupported for all other cases. > > Or what would you suggest? I'm not sure. Does Bio::Index::Blast even work correctly? Does it suffer from whatever bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should that be deprecated as well? Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO and then Bio::Tools::BPlite can be deprecated. 
But like I say, it didn't seem trivial (or even appropriate). Ultimately I just wanted to solve the warnings in the test suite. Thoughts, Chris? From cjfields at uiuc.edu Mon Oct 16 17:30:05 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:30:05 -0500 Subject: [Bioperl-l] Bioperl Server Reconfig In-Reply-To: <4533A588.9020505@sheffield.ac.uk> Message-ID: <000101c6f148$b8538b20$15327e82@pyrimidine> > Mauricio Herrera Cuadra wrote: > > Done. Could you please check if it works as it should? > > > > Cheers, > > Mauricio. > Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got > someone to pop it in http://bioperl/DIST > > Volunteers? > > BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for > the PPD? I seem to remember that there was talk about having to maintain > a separate Bundle::BioPerl for each release of Bioperl. Any ideas on > this front? > > Nath Nathan, I think Chris Dagdigian still maintains Bundle::Bioperl on CPAN. That version should be the common basis for prereqs for any Bioperl core installation. It's relatively easy to add/remove modules to the Bundle::Bioperl. Contact Chris D. and let him know if anything needs to be changed. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 17:33:50 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:33:50 -0500 Subject: [Bioperl-l] Use of Root.pm versus RootI.pm In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net> Message-ID: <000201c6f149$3ed63490$15327e82@pyrimidine> > So it looks like an abstract base class, not an interface that > defines a contract or API? Should use Root.pm then, would be my vote. > > -hilmar Makes sense to me. Maybe another audit is needed to catch similar instances, or has this been done already? 
Chris > On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote: > > > Hilmar Lapp wrote: > >> What does the POD (and the code) say about instantiating it? > > > > =head1 SYNOPSIS > > > > # do not use this object directly, it provides the following > > methods > > # for its subclasses > > > > ... > > > > > > =head1 DESCRIPTION > > > > This is a basic module from which to build executable wrapper modules. > > It has some basic methods to help when implementing new modules. > > > > > > There is no new() method. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 17:57:35 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 12:57:35 -0500 Subject: [Bioperl-l] Bio::Location::Split In-Reply-To: Message-ID: <000301c6f14c$8fb0e060$15327e82@pyrimidine> > I recently came across bug 2101, where Bio::Location::Split::to_FTstring > gives the incorrect order for multi-sublocation locations on the minus > strand. That is, I found it by getting incorrect results, and then found > it in Bugzilla and in the September archives. > > I'm converting CDS files from one format to another. E.g., I read an > EMBL file with a chromosome and CDS features, and want to output the > location in a FASTA header. If I do something like: > > foreach (<$in>) { > foreach my $feat ($seq->getSeqFeatures) { > print $feat->location->to_FTstring() > } > } > > I get the wrong results for multi-exon CDSs on the -1 strand, as > described in the bug report. > > Is there a relatively easy way around this? I assume I can't get at the > original string of the location, which in this case is all I need. Can I > just flip the order of the exons in certain cases? Chris F, can you tell > me the preliminary solution you mentioned? > > I must say I'm sort of surprised this wasn't found before. It seems like > a not-that-rare occurrence. Oh well. 
> > Thanks, > > - Amir Karger > Research Computing > Life Sciences Division > Harvard University Could you let me know specifically which EMBL file contains the odd locations? The bug report uses theoretical locations, not actual ones, so it would be nice to have a real-world example to test against. As for the lack of catching this, the particular types of locations that cause the issue are quite rare. Note that there are two bugs for that bug report. The first (and more serious) is still unresolved. The second (where remote locations are treated differently in Location::Split, which caused more problems than it was worth) had a fix committed about a month ago. Any fixes I have made for the first bug invariably break several other methods, which use the current Location::Split object logic for retrieving sequences, building feature strings, etc. Since a new RC is imminent and the bug only affects a small number of locations, I have held off until after a final release is made (the last thing I want to do is fix something that breaks ~6-8 other methods), but I'll try looking at it again this week. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 18:29:02 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 13:29:02 -0500 Subject: [Bioperl-l] Swissprot problems In-Reply-To: <4533B40B.2030908@sendu.me.uk> Message-ID: <000401c6f150$f57dfc30$15327e82@pyrimidine> > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Sendu Bala > Sent: Monday, October 16, 2006 11:32 AM > To: bioperl-l > Subject: [Bioperl-l] Swissprot problems > > t/Biofetch.t and t/DB.t are skipping their swissprot database fetches. > Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for > maintenance but is now back up. However I'm guessing the databases must > have changed. 
I've manually looked for the test case 'YNB3_YEAST' in > database 'UniProtKB' and it came back with no result, even though I can > find the test case manually at the expasy website. > > Is this an EBI bug or deliberate change that makes sense to someone? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l I can confirm that. It's not our end, though. Entering the same data on the DBFetch web page also gets no data. I have emailed EBI about the problem and will let you know if I hear anything; I think it's related to the maintenance issue. Notably, nothing on the web page indicates any database name changes yet. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 16 18:29:52 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 13:29:52 -0500 Subject: [Bioperl-l] Bio::DB::SwissProt Problem In-Reply-To: <1161017019.5203.6.camel@localhost> Message-ID: <000501c6f151$12918710$15327e82@pyrimidine> We think there is a problem on the SwissProt (DBFetch) server. I have contacted them about the problem and will post something when I hear something back. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Marc Weimer > Sent: Monday, October 16, 2006 11:44 AM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Bio::DB::SwissProt Problem > > Dear list members, > > when running > > ###################################################################### > #! 
/usr/bin/perl -w > > use strict; > use Bio::DB::SwissProt; > > my $db_obj = new Bio::DB::SwissProt(-verbose => 1); > > my $seq_obj = $db_obj->get_Seq_by_acc("O02938"); > ###################################################################### > > using Bioperl 1.5.2 I get the following message: > > ########################################################################## > ################ > > request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch > Content-Length: 49 > Content-Type: application/x-www-form-urlencoded > > format=swissprot&db=UniProtKB&style=raw&id=O02938 > > > ------------- EXCEPTION: Bio::Root::Exception ------------- > MSG: acc O02938 does not exist > STACK: Error::throw > STACK: > Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350 > STACK: > Bio::DB::WebDBSeqI::get_Seq_by_acc > /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181 > STACK: ./get.test.pl:8 > ----------------------------------------------------------- > > ########################################################################## > ################ > > But the accession number does exist. Surprisingly, everything worked > fine a few days ago. Any ideas of what might have happened? > > Thanks and best regards, > > Marc > > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Mon Oct 16 18:39:28 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 13:39:28 -0500 Subject: [Bioperl-l] SwissProt Down Message-ID: <000601c6f152$6997dbd0$15327e82@pyrimidine> Looks like the swissprot problem stems from maintenance at EBI. From the EBI page http://www.ebi.ac.uk/Information/ (not on the DBFetch page, BTW): Please Note: Monday October 16th 12:00-15:00 - Due to general maintenance, some services from the EBI may be temporarily unavailable. We apologise for any inconvenience. 
At least we know that Test::More skips are working! Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Mon Oct 16 18:51:31 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 16 Oct 2006 19:51:31 +0100 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: References: Message-ID: <4533D4B3.2000809@sendu.me.uk> Brian Osborne wrote: > Sendu, > > I just made a commit that makes Bio::Index::Blast use SearchIO instead of > BPlite. I was concerned about the whole id_parser thing. Did you determine that your change still allows for id_parser to be used and have the intended effect, or that id_parser is in someway meaningless and should be removed as a method? From cjfields at uiuc.edu Mon Oct 16 19:03:33 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 14:03:33 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533BFB6.5070504@sendu.me.uk> Message-ID: <000301c6f155$c7029ff0$15327e82@pyrimidine> > Hilmar Lapp wrote: > > The problem is it is not maintained, and there are outstanding been bug > > reports. > > > > If you un-deprecate it, then we need a response to people who come > > across problems with it when using it. Either you change the POD to say > > exactly who and when one should use it (or rather not) and point to the > > fact that it is unsupported for all other cases. > > > > Or what would you suggest? > > I'm not sure. > > Does Bio::Index::Blast even work correctly? Does it suffer from whatever > bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should > that be deprecated as well? > > Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO > and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't > seem trivial (or even appropriate). > > Ultimately I just wanted to solve the warnings in the test suite. > Thoughts, Chris? 
My opinion is we either have to completely support BPlite (and the others) or drop it altogether. I don't think we can state "use BPLite only with Bio::Index::Blast, use SearchIO everywhere else." That's too inconsistent. It seems simpler to deprecate the various Bio::Tools::BP* classes and either fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working on) or deprecate Bio::Index::Blast as well. The warnings in the test suite belong to BlastIndex.t, correct? I updated using Brian's Bio::Index::blast fix and it passes now w/o warnings. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From akarger at CGR.Harvard.edu Mon Oct 16 19:00:28 2006 From: akarger at CGR.Harvard.edu (Amir Karger) Date: Mon, 16 Oct 2006 15:00:28 -0400 Subject: [Bioperl-l] Bio::Location::Split Message-ID: > -----Original Message----- > From: Chris Fields [mailto:cjfields at uiuc.edu] > > > > I'm converting CDS files from one format to another. E.g., I read an > > EMBL file with a chromosome and CDS features, and want to output the > > location in a FASTA header.> > > > I get the wrong results for multi-exon CDSs on the -1 strand, as > > described in the bug report. > > > > Could you let me know specifically which EMBL file contains the odd > locations? The bug report uses theoretical locations, not > actual ones, so > it would be nice to have a real-world example to test against. I downloaded candida glabrata chromosome B from EBI: http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948 testportal>perl location.pl new_glabrata_B.embl > bio testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/' new_glabrata_B.embl > nonbio testportal>wc bio nonbio 217 217 4537 bio 217 217 4549 nonbio 434 434 9086 total testportal>diff bio nonbio 4c4 < complement(join(10632..11157,10347..10372)) --- > join(complement(10632..11157),complement(10347..10372)) Just one example here, but see below. 
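To spell out why the two strings in the diff differ: as I understand the feature table notation, they would describe the same feature only if the exon order were reversed when the complement is moved outside the join. A minimal sketch of that rewrite in plain Perl, handling simple start..end ranges only; fold_complements is a hypothetical helper, not a Bioperl method.

```perl
use strict;
use warnings;

# Rewrite join(complement(a..b),complement(c..d),...) as
# complement(join(...)) with the sublocation order reversed.
# Only plain start..end ranges are handled; any other input is
# returned unchanged.
sub fold_complements {
    my ($loc) = @_;
    return $loc unless $loc =~ /^join\((.+)\)$/;
    my @ranges;
    for my $part (split /,/, $1) {
        return $loc unless $part =~ /^complement\((\d+\.\.\d+)\)$/;
        push @ranges, $1;
    }
    return 'complement(join(' . join(',', reverse @ranges) . '))';
}

print fold_complements(
    'join(complement(10632..11157),complement(10347..10372))'
), "\n";
```

For the location in the diff this gives complement(join(10347..10372,10632..11157)), i.e. the same two ranges as the "bio" line but in the opposite order.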
> As for the lack of catching this, the particular types of > locations that > cause the issue are quite rare. Really? I guess our definitions of rare depend on which sequences we're working with. I'm doing fungal genomes, and here's a grep for a few species' entire genomes: testportal>foreach i ( *.embl ) foreach? echo $i foreach? grep CDS $i | grep join | grep -c complement foreach? end glabrata_orf.embl 29 hansenii_orf.embl 151 lactis_orf.embl 70 lipolytica_orf.embl 337 pombe_orf.embl 1137 You might like to use pombe as a test case, as it has lots of these complement joins, including ones with multiple introns. Anyway, I'd question the "rare" designation. It seems to me like any species that has introns will have situations like this in their CDSs. Not to mention any other sequence that uses Bio::Location::Split. (Since I'm not a Real Biologist, I can't think up more examples here, but I'm sure they exist.) Or are you saying it's rare to use join (complement(C..D), complement(A..B)) instead of complement(join(A..B, C..D))? In that case, I guess I just got really unlucky in that the five fungal genomes I was using decided to use the "rare" syntax. > Note that there are two bugs > for that bug > report. The first (and more serious) is still unresolved. The second > (where remote locations are treated differently in > Location::Split, which > caused more problems than it was worth) had a fix committed > about a month > ago. Sadly, it's the first (and in my case, more common (I have no remote locations.)) bug for me. > Any fixes I have made for the first bug invariably break several other > methods, which use the current Location::Split object logic > for retrieving > sequences, building feature strings, etc.
Since a new RC is > imminent and > the bug only affects a small number of locations, I have held > off until > after a final release is made (the last thing I want to do is > fix something > that breaks ~6-8 other methods), but I'll try looking at it > again this week. IMO this is a pretty serious bug (if these kinds of sequences aren't that rare as I've shown above), because you're outputting sequence descriptions that are just plain wrong. Anyone who uses FTLocationFactory to read these output description will have incorrect sequence, incorrect translated proteins, etc. And it's even more serious if other methods are depending on it. I know I can't dictate your time, and should be volunteering to work on fixing it. But if it affects other modules, then I will no doubt break things even more than you have in your attempts. -Amir > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > From bosborne11 at verizon.net Mon Oct 16 18:25:14 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 14:25:14 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533A8D3.90709@sendu.me.uk> Message-ID: Sendu, I just made a commit that makes Bio::Index::Blast use SearchIO instead of BPlite. The BlastIndex.t test is giving a few warnings so I need to take a look at that but all tests are passing. An awful lot of work has gone into the SearchIO system, for more on why its approach is deemed to be superior in the context of Bioperl see the SearchIO HOWTO. One key feature of this upcoming release is an emphasis on removing extraneous modules, I think it's safe to say that BPlite has been considered extraneous for a number of years now. Brian O. On 10/16/06 11:44 AM, "Sendu Bala" wrote: > I think Chris recently deprecated this, but should it be? For me, its > POD description justifies its existence, and perhaps more importantly, > Bio::Index::Blast relies on it. 
> > I took a quick peek at the latter and it didn't seem trivial to move it > over to Bio::SearchIO instead. > > Should it be undeprecated? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bosborne11 at verizon.net Mon Oct 16 18:59:38 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 14:59:38 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <4533D4B3.2000809@sendu.me.uk> Message-ID: Sendu, OK. I _think_ this change shouldn't affect id_parser() but I will test this in BlastIndex.t. The id_parser() method is relevant to all these Index* modules - don't know how much it's used but it certainly is nice to have it available. Brian O. On 10/16/06 2:51 PM, "Sendu Bala" wrote: > Brian Osborne wrote: >> Sendu, >> >> I just made a commit that makes Bio::Index::Blast use SearchIO instead of >> BPlite. > > I was concerned about the whole id_parser thing. Did you determine that > your change still allows for id_parser to be used and have the intended > effect, or that id_parser is in someway meaningless and should be > removed as a method? From cjfields at uiuc.edu Mon Oct 16 20:51:08 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 15:51:08 -0500 Subject: [Bioperl-l] Bio::Location::Split In-Reply-To: Message-ID: <000001c6f164$d1380190$15327e82@pyrimidine> ... 
> I downloaded candida glabrata chromosome B from EBI: > http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948 > > testportal>perl location.pl new_glabrata_B.embl > bio > testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/' > new_glabrata_B.embl > nonbio > testportal>wc bio nonbio > 217 217 4537 bio > 217 217 4549 nonbio > 434 434 9086 total > testportal>diff bio nonbio > 4c4 > < complement(join(10632..11157,10347..10372)) > --- > > join(complement(10632..11157),complement(10347..10372)) > > Just one example here, but see below. > > > As for the lack of catching this, the particular types of > > locations that > > cause the issue are quite rare. > > Really? I guess our definitions of rare depend on which sequences we're > working with. I'm doing fungal genomes, and here's a grep for a few > species' entire genomes: > > testportal>foreach i ( *.embl ) > foreach? echo $i > foreach? grep CDS $i | grep join | grep -c complement > foreach? end > glabrata_orf.embl > 29 > hansenii_orf.embl > 151 > lactis_orf.embl > 70 > lipolytica_orf.embl > 337 > pombe_orf.embl > 1137 > > You might like to use pombe as a test case, as it has lots of these > complement joins, including ones with multiple introns. I'll use those. I'll see if an analogous GenBank file exists as well. I can probably make a preliminary fix for FT_string() so that it arranges the sublocations correctly, but I think the best way to go is to have FTLocationFactory not modify the various sublocations to start with, which it currently does when it sets strand() (strand() propagates the strand info to sublocations). > Anyway, I'd question the "rare" designation. It seems to me like any > species that has introns will have situations like this in their CDSs. > Not to mention any other sequence that uses Bio::Location::Split. (Since > I'm not a Real Biologist, I can't think up mor examples here, but I'm > sure they exist.) I think that additional tests are definitely needed for pulling out sequences. 
What I mean by 'rare' is that the majority of sequences do not have problems. Also, this seems to be a 'silent' bug since the error shows up in to_FTstring() but the object sublocations seem to be processed correctly when using the location object directly (such as via SeqFeatureI). Round-tripping the sequence should pick it up though, since complement(join(10632..11157,10347..10372)) is not the same as join(complement(10632..11157),complement(10347..10372)). That is essentially what you are doing, correct? i.e., getting the sequences using Bioperl, saving them (which passes them through SeqIO), reading them again (back through SeqIO with the malformed location string). > Or are you saying it's rare to use join (complement(C..D), > complement(A..B)) instead of complement(join(A..B, C..D)). In that case, > I guess I just got really unlucky in that five fungal genomes I was > using decided to use the "rare" syntax. Location::Split is supposed to handle all variations, but apparently it doesn't. > > Note that there are two bugs > > for that bug > > report. The first (and more serious) is still unresolved. The second > > (where remote locations are treated differently in > > Location::Split, which > > caused more problems than it was worth) had a fix committed > > about a month > > ago. > > Sadly, it's the first (and in my case, more common (I have no remote > locations.)) bug for me. > > > Any fixes I have made for the first bug invariably break several other > > methods, which use the current Location::Split object logic > > for retrieving > > sequences, building feature strings, etc. Since a new RC is > > imminent and > > the bug only affects a small number of locations, I have held > > off until > > after a final release is made (the last thing I want to do is > > fix something > > that breaks ~6-8 other methods), but I'll try looking at it > > again this week.
> > IMO this is a pretty serious bug (if these kinds of sequences aren't > that rare as I've shown above), because you're outputting sequence > descriptions that are just plain wrong. Anyone who uses > FTLocationFactory to read these output description will have incorrect > sequence, incorrect translated proteins, etc. And it's even more serious > if other methods are depending on it. > > I know I can't dictate your time, and should be volunteering to work on > fixing it. But if it affects other modules, then I will no doubt break > things even more than you have in your attempts. > > -Amir I'll give it a look over the next week. Like I mentioned above, I may be able to fix it in Split::to_FTstring() w/o breaking other tests (in which case I'll commit it for the 1.5.2 release), but it would be a temporary hack until I can work out why other tests are failing. Chris From jason at bioperl.org Mon Oct 16 22:45:21 2006 From: jason at bioperl.org (Jason Stajich) Date: Mon, 16 Oct 2006 15:45:21 -0700 Subject: [Bioperl-l] split location problems Message-ID: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org> The whole point of split locations is to represent genes with introns so that is not the "rare" case. I'm confused where the problem is. The locations that I get out with to_FTstring on the location object are exactly the same as those input. I have processed the genbank fungal genomes into GFF3 and have had no problems so I'm confused where you are breaking down. If I write them out as embl I also get the correct thing. This is using the CVS version of bioperl from the HEAD. I've added code to test this to bug 2101 including a C.glabrata chromsome downloaded from genbank. Perhaps the problem is on the EMBL parsing side, I didn't test that. On the technical side, I still am not sure I fully know where the strand information should be stored - the top level container or the sub-features. 
I'll try and stay up on the discussion if anything has been decided that I should know about.

-jason

From torsten.seemann at infotech.monash.edu.au Mon Oct 16 22:23:23 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 17 Oct 2006 08:23:23 +1000
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <000201c6f149$3ed63490$15327e82@pyrimidine>
References: <000201c6f149$3ed63490$15327e82@pyrimidine>
Message-ID: <4534065B.9020309@infotech.monash.edu.au>

Chris Fields wrote:
>> So it looks like an abstract base class, not an interface that
>> defines a contract or API? Should use Root.pm then, would be my vote.
>> -hilmar
>
> Makes sense to me. Maybe another audit is needed to catch similar
> instances, or has this been done already?

The purpose of my original (poorly phrased) question was to try and sort out where Root and RootI were being used the wrong way around. I'm currently "all-audited out" so I leave this task to another volunteer.

--
Dr Torsten Seemann http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia

From cjfields at uiuc.edu Tue Oct 17 01:07:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 20:07:55 -0500
Subject: [Bioperl-l] split location problems
In-Reply-To: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
Message-ID:

On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote:

> The whole point of split locations is to represent genes with
> introns so that is not the "rare" case.
>
> I'm confused where the problem is. The locations that I get out
> with to_FTstring on the location object are exactly the same as
> those input.

The problem is with a subset of split locations described in the bug report.
The following works:

complement(join(2691..4571,4918..5163))

whereas this:

join(complement(4918..5163),complement(2691..4571))

gives this:

complement(join(4918..5163,2691..4571))

which is not equivalent. It should be:

complement(join(2691..4571,4918..5163))

since 'join' implies that the order of the segments to be joined is important ('order' and 'bond' do not, I guess).

> I have processed the genbank fungal genomes into GFF3 and have had
> no problems so I'm confused where you are breaking down. If I
> write them out as embl I also get the correct thing. This is using
> the CVS version of bioperl from the HEAD.
>
> I've added code to test this to bug 2101 including a C.glabrata
> chromosome downloaded from genbank. Perhaps the problem is on the
> EMBL parsing side, I didn't test that.
>
> On the technical side, I still am not sure I fully know where the
> strand information should be stored - the top level container or
> the sub-features. I'll try and stay up on the discussion if
> anything has been decided that I should know about.
>
> -jason

Split::strand() sets the sublocations as well, which seems to confuse the situation more but it is consistent with LocationI, as Hilmar points out. I'm looking into a few solutions now, including a fix in Split::to_FTstring().

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Tue Oct 17 02:48:14 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Oct 2006 19:48:14 -0700
Subject: [Bioperl-l] split location problems
In-Reply-To:
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
Message-ID: <8273f6c20610161948w201537a5v2fcfa189eb809283@mail.gmail.com>

This probably was exposed by the fact that the Split object used to explicitly sort the features by start*strand always.
But with remote locations and needing to be able to explicitly set the order (for features that are not required to be 5' -> 3') that code must have been removed. I think there is just one place that must be missing a 'reverse' on the list of sub-locations when the top-level feature is a complement. I'll wait for your fix before wading in - we probably might want to figure out a 'consolidate' method to shrink redundant and equivalent representations to the shortest possible form. Ugh this really starts to resemble trying to write a boolean logic toolkit.... -jason On 10/16/06, Chris Fields wrote: > > > On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote: > > > The whole point of split locations is to represent genes with > > introns so that is not the "rare" case. > > > > I'm confused where the problem is. The locations that I get out > > with to_FTstring on the location object are exactly the same as > > those input. > > The problem is with the a subset of split locations described in the > bug report. The following works: > > complement(join(2691..4571,4918..5163)) > > whereas this: > > join(complement(4918..5163),complement(2691..4571)) > > gives this: > > complement(join(4918..5163,2691..4571)) > > which is not syntactically the same. It should be: > > complement(join(2691..4571,4918..5163)) > > since 'join' implies that the order of the segments to be joined is > important ('order' and 'bond' do not, I guess). > > > I have processed the genbank fungal genomes into GFF3 and have had > > no problems so I'm confused where you are breaking down. If I > > write them out as embl I also get the correct thing. This is using > > the CVS version of bioperl from the HEAD. > > > > I've added code to test this to bug 2101 including a C.glabrata > > chromsome downloaded from genbank. Perhaps the problem is on the > > EMBL parsing side, I didn't test that. 
> > > > On the technical side, I still am not sure I fully know where the > > strand information should be stored - the top level container or > > the sub-features. I'll try and stay up on the discussion if > > anything has been decided that I should know about. > > > > -jason > > Split::strand() sets the sublocations as well, which seems to confuse > the situation more but it is consistent with LocationI, as Hilmar > points out. I'm looking into a few solutions now, including a fix in > Split::to_FTstring(). > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > -- Jason Stajich jason at bioperl.org http://www.duke.edu/~jes12/ From cjfields at uiuc.edu Tue Oct 17 03:34:25 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 16 Oct 2006 22:34:25 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: References: Message-ID: On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote: > Chris and Sendu, > > Sendu was correct in wondering whether id_parser() in Blast.pm > would work > after the module was altered to use SearchIO but what I've found > out from my > local tests is that id_parser() didn't work when BPlite was being used > either. I can continue to work on this but it's safe to say that > removing > BPlite doesn't cause a problem with id_parser, it was already there. > > Brian O. .... It may be one reason (the main reason?) the method wasn't tested. Maybe it should be removed if it can't be easily fixed; I don't think it makes sense keeping it otherwise. Chris Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bosborne11 at verizon.net Tue Oct 17 03:24:59 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 23:24:59 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? 
In-Reply-To: <000301c6f155$c7029ff0$15327e82@pyrimidine> Message-ID: Chris and Sendu, Sendu was correct in wondering whether id_parser() in Blast.pm would work after the module was altered to use SearchIO but what I've found out from my local tests is that id_parser() didn't work when BPlite was being used either. I can continue to work on this but it's safe to say that removing BPlite doesn't cause a problem with id_parser, it was already there. Brian O. On 10/16/06 3:03 PM, "Chris Fields" wrote: >> Hilmar Lapp wrote: >>> The problem is it is not maintained, and there are outstanding been bug >>> reports. >>> >>> If you un-deprecate it, then we need a response to people who come >>> across problems with it when using it. Either you change the POD to say >>> exactly who and when one should use it (or rather not) and point to the >>> fact that it is unsupported for all other cases. >>> >>> Or what would you suggest? >> >> I'm not sure. >> >> Does Bio::Index::Blast even work correctly? Does it suffer from whatever >> bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should >> that be deprecated as well? >> >> Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO >> and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't >> seem trivial (or even appropriate). >> >> Ultimately I just wanted to solve the warnings in the test suite. >> Thoughts, Chris? > > My opinion is we either have to completely support BPlite (and the others) > or drop it altogether. I don't think we can state "use BPLite only with > Bio::Index::Blast, use SearchIO everywhere else." That's too inconsistent. > > > It seems simpler to deprecate the various Bio::Tools::BP* classes and either > fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working > on) or deprecate Bio::Index::Blast as well. > > The warnings in the test suite belong to BlastIndex.t, correct? I updated > using Brian's Bio::Index::blast fix and it passes now w/o warnings. 
> > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From bosborne11 at verizon.net Tue Oct 17 03:48:56 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Mon, 16 Oct 2006 23:48:56 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: Message-ID: Chris, OK. In fact there's no written guarantee that all Bio::Index* modules have an id_parser() method. It happens that most do, and it's useful. I'll fix the documentation in Bio::Index::Blast and add an enhancement request to Bugzilla, may be able to get around to before 1.5.2 release but no promises. Brian O. On 10/16/06 11:34 PM, "Chris Fields" wrote: > > On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote: > >> Chris and Sendu, >> >> Sendu was correct in wondering whether id_parser() in Blast.pm >> would work >> after the module was altered to use SearchIO but what I've found >> out from my >> local tests is that id_parser() didn't work when BPlite was being used >> either. I can continue to work on this but it's safe to say that >> removing >> BPlite doesn't cause a problem with id_parser, it was already there. >> >> Brian O. > > .... > > It may be one reason (the main reason?) the method wasn't tested. > Maybe it should be removed if it can't be easily fixed; I don't think > it makes sense keeping it otherwise. > > Chris > > Christopher Fields > Postdoctoral Researcher > Lab of Dr. Robert Switzer > Dept of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Tue Oct 17 06:35:43 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh)
Date: Tue, 17 Oct 2006 07:35:43 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
Message-ID: <453479BF.90408@sheffield.ac.uk>

I'm a bit unclear as to what is happening with these files.

Are these files now superseded by the wikified versions? If so, should these files now just simply contain a link to the wikified versions - otherwise things could get in a mess since I updated the wiki version of INSTALL.WIN and I see Sendu updated the cvs versions a couple of weeks ago - hopefully these differences aren't that big.

Nath

From faruque at ebi.ac.uk Tue Oct 17 08:19:44 2006
From: faruque at ebi.ac.uk (Nadeem Faruque)
Date: Tue, 17 Oct 2006 09:19:44 +0100
Subject: [Bioperl-l] split location problems
Message-ID:

EMBL currently outputs join-complements in the format

join(complement(30..40),complement(10..20))

instead of the Genbank preferred

complement(join(10..20,30..40))

EMBL's format may reflect what happens in the cell a little more than Genbank's, but it is less readable and less concise.

NB I've also seen a couple of people construct these incorrectly, eg

join(complement(10..20),complement(30..40))

I believe we are moving to the complement-join format but I can't give a date for the transition.

Having said that, trans-splicing will still give us the joys of complex locations, eg

join(1..5,complement(join(10..20,30..40)))
complement(join(30..40,10..20)) <- looks wrong (unless it is a very small circle) but mis-ordered exons are resolved by the trans-splicing machinery.

Nadeem

--
S.M. Nadeem N.
Faruque EMBL Nucleotide Database Curation Team
EMBL Outstation Tel: +44 1223 494611 Fax: +44 1223 494472
The European Bioinformatics Institute URL: http://www.ebi.ac.uk/
Email for data submissions: datasubs at ebi.ac.uk
Email for updates: update at ebi.ac.uk
========================================================

From bix at sendu.me.uk Tue Oct 17 08:59:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 09:59:36 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
References: <4521E74E.1040404@infotech.monash.edu.au> <452F54A1.7010908@sendu.me.uk> <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net> <45333E02.9070808@sendu.me.uk> <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
Message-ID: <45349B78.8090905@sendu.me.uk>

Hilmar Lapp wrote:
> So it looks like an abstract base class, not an interface that
> defines a contract or API? Should use Root.pm then, would be my vote.

Agreed, that was actually what I did in my local copy when I made a new inheriting class (so discovering the problem). This change is harmless to other modules, but does mean they'll have redundant use of Bio::Root::Root which will want cleaning up at some stage.

From bix at sendu.me.uk Tue Oct 17 10:32:54 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 11:32:54 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
Message-ID: <4534B156.4090501@sendu.me.uk>

Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.

See http://www.bioperl.org/wiki/Release_1.5.2 for instructions on getting and testing this RC.

Developers: This should be the last RC before release ~next Monday. Now would be a good time for last-minute documentation updates and additions.

Users: Even though 1.5.2 is a 'developer' release, we consider it the most stable and capable version of Bioperl, and recommend that you use it in all but the most critical production environments.
Please try it out and let us know of any problems or difficulties you run into.

Thank you,
Sendu.

From cjfields at uiuc.edu Tue Oct 17 11:16:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 06:16:47 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <453479BF.90408@sheffield.ac.uk>
References: <453479BF.90408@sheffield.ac.uk>
Message-ID: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>

The general consensus was to keep text versions available; we could add URL links to the wiki pages for the most up-to-date version. BTW, I have modified INSTALL already. INSTALL.WIN is next in line (I was waiting for your changes).

Chris

On Oct 17, 2006, at 1:35 AM, Nathan S. Haigh wrote:

> I'm a bit unclear as to what is happening with these files.
>
> Are these files now superseded by the wikified versions? If so, should
> these files now just simply contain a link to the wikified versions -
> otherwise things could get in a mess since I updated the wiki
> version of INSTALL.WIN and I see Sendu updated the cvs versions a
> couple of weeks ago - hopefully these differences aren't that big.
>
> Nath

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From n.haigh at sheffield.ac.uk Tue Oct 17 11:45:45 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 12:45:45 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>
References: <453479BF.90408@sheffield.ac.uk> <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>
Message-ID: <4534C269.5050704@sheffield.ac.uk>

Chris Fields wrote:
> The general consensus was to keep text versions available; we could
> add URL links to the wiki pages for the most up-to-date version.
> BTW, I have modified INSTALL already. INSTALL.WIN is next in line (I was
> waiting for your changes).
>

Is it possible to generate these files from the wiki whenever there is a release? I know edits shouldn't be too severe or too often - but I can see things getting a little messy/annoying if edits have to be made in 2 places.

Nath

From cjfields at uiuc.edu Tue Oct 17 14:04:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:04:32 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534C269.5050704@sheffield.ac.uk>
Message-ID: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>

There isn't a very easy way since so many links have to be removed/modified. I have found a few CPAN modules that could help, but for now I just dump the text output from a text browser (elinks) using the 'printable version' page and hand-edit, which works very quickly. That works for the time being until I can find another more automated solution.

Fortunately there have been very few edits to either INSTALL wiki page so they should remain relatively stable.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
> Sent: Tuesday, October 17, 2006 6:46 AM
> To: Chris Fields
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>
> Chris Fields wrote:
> > The general consensus was to keep text versions available; we could
> > add URL links to the wiki pages for the most up-to-date version. BTW,
> > I have modified INSTALL already. INSTALL.WIN is next in line (I was
> > waiting for your changes).
> >
> Is it possible to generate these files from the wiki whenever there is a
> release? I know edits shouldn't be too severe or too often - but I can
> see things getting a little messy/annoying if edits have to be made in 2
> places.
> > Nath From cjfields at uiuc.edu Tue Oct 17 14:12:09 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:12:09 -0500 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: Message-ID: <000401c6f1f6$424b5580$15327e82@pyrimidine> > Chris, > > OK. In fact there's no written guarantee that all Bio::Index* modules have > an id_parser() method. It happens that most do, and it's useful. I'll fix > the documentation in Bio::Index::Blast and add an enhancement request to > Bugzilla, may be able to get around to before 1.5.2 release but no > promises. > > Brian O. Do the various Bio::Index* modules share a common interface? I wouldn't worry too much about it for this release, unless you really have time. It is still, after all, a developer's release, and you've noted it in Bugzilla. We could try for another dev release in winter (rel 1.5.3, I guess) to get any bug fixes or new modules added. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > On 10/16/06 11:34 PM, "Chris Fields" wrote: > > > > > On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote: > > > >> Chris and Sendu, > >> > >> Sendu was correct in wondering whether id_parser() in Blast.pm > >> would work > >> after the module was altered to use SearchIO but what I've found > >> out from my > >> local tests is that id_parser() didn't work when BPlite was being used > >> either. I can continue to work on this but it's safe to say that > >> removing > >> BPlite doesn't cause a problem with id_parser, it was already there. > >> > >> Brian O. > > > > .... > > > > It may be one reason (the main reason?) the method wasn't tested. > > Maybe it should be removed if it can't be easily fixed; I don't think > > it makes sense keeping it otherwise. > > > > Chris > > > > Christopher Fields > > Postdoctoral Researcher > > Lab of Dr. 
Robert Switzer > > Dept of Biochemistry > > University of Illinois Urbana-Champaign > > > > > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Tue Oct 17 14:15:17 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 15:15:17 +0100 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> Message-ID: <4534E575.5050308@sheffield.ac.uk> Chris Fields wrote: > There isn't a very easy way since so many links have to be removed/modified. > I have found a few CPAN modules that could help, but for now I just dump the > text output from a text browser (elinks) using the 'printable version' page > and hand-edit, which works very quickly. That works for the time being > until I can find another more automated solution. > > Fortunately there have been very few edits to either INSTALL wiki page so > they should remain relatively stable. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > So am I correct in saying that the best way is to make all updates to the wikified versions of these files, and then at regular intervals/major releases you (or someone else) will update the CVS version of the files in the way describe above? 
Cheers Nath From bix at sendu.me.uk Tue Oct 17 14:00:39 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 17 Oct 2006 15:00:39 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534E09C.9030707@genomics.dk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> Message-ID: <4534E207.8030508@sendu.me.uk> Niels Larsen wrote: > Greetings, > > I am no perl beginner, but I am a BioPerl beginner. Today I looked > for remote similarity services that can be used from Perl. I found > the EBI SOAP interface where their example script returns > > Can't find method element in the message at > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. What script exactly? There was a problem with the SOAP server that was fixed earlier today. > and the DDBJ service which (from Denmark) returns > > undef What returned undef? Specifics please. > and then the NCBI server accessed through BioPerls RemoteBlast which > seems to spin in a loop that fills TMPDIR with many tempfiles. Will > release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall > is working towards that). What version of Bioperl were you testing with? What did you do to get it to 'spin in a loop'? I can tell you that remote blasting certainly works in Bioperl 1.5.2, but you'll have to give more details on the things you tried and the problems you encountered. You can also answer the questions yourself by trying the release candidate. From B.Beckert at ibmc.u-strasbg.fr Tue Oct 17 13:59:30 2006 From: B.Beckert at ibmc.u-strasbg.fr (Bertrand Beckert) Date: Tue, 17 Oct 2006 15:59:30 +0200 Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast Message-ID: hi, I am running a large number of blasts via a connexion to ncbi blast page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi'). I try to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have some problems. I make a simple example with only one sequence in order to understand how work this module. 
This is my simple input file, a DNA sequence in FASTA format:

> test
> TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA

I have made some modifications to the example available in the Bioperl documentation. It gives me a RID which contains the results of my blast, but I have a problem with "$result=$factory->retrieve_blast($rid)" in my script. The documentation says that $result=$factory->retrieve_blast($rid) returns, when it works, a Bio::Tools::BPlite or Bio::Tools::Blast object. In my case it returns a Bio::SearchIO::blast... I don't understand why the right type of object is not returned (see PART I).

I also tried to resolve the problem by replacing the foreach loop in my script with a new one in order to explore the blast result page, but that doesn't work either (see PART II).

Could you help me please?

Thank you

Bertrand Beckert.

PART I:
Here is my script with a little annotation, and also the shell window output:
----------------------------------------------------------------------------------------------------
#!/usr/bin/perl -w

use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;

sub blast {
    my $prog='blastn';
    my $db='refseq_genomic';
    my $e_val='1e-10';
    my $Input='Seq.fasta';

    my @params = ('-prog' => $prog,
                  '-data' => $db,
                  '-expect' => $e_val,
                  '-readmethod' => 'SearchIO');

    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

    #changes parameters
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
    $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';

    $factory->submit_blast($Input);

    print STDERR "waiting...\n";

    while (my @rids=$factory->each_rid) {
        print "my rid: ",@rids,"\n";
        #return me the ID of the submited blast i.e. RID: 1161079157-766-185099855365.BLASTQ2
        #this page contains the result of my blast...
        foreach my $rid (@rids) {
            $result=$factory->retrieve_blast($rid);
            #line in order to understand what type of object is return by retrieve_blast
            print "rc:", $result,"\n";
        }
    }
}

&blast;
----------------------------------------------------------------------------------------------------
here you can see the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...
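One possible reading of the PART I output, offered as a sketch rather than a confirmed diagnosis: the loop never removes the RID from the factory, so each_rid keeps returning it and the same report is fetched again and again. The Bio::SearchIO::blast object is what one would expect when '-readmethod' => 'SearchIO' is passed. The helper poll_rids below is hypothetical (not part of BioPerl). It relies only on each_rid, retrieve_blast and remove_rid as described in the Bio::Tools::Run::RemoteBlast documentation, and it assumes that retrieve_blast returns 0 while the job is pending and a negative value on error.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl.
#   $factory   : object providing each_rid, retrieve_blast and remove_rid
#                (for example a Bio::Tools::Run::RemoteBlast instance)
#   $on_report : callback invoked as $on_report->($rid, $report)
#   $delay     : seconds to wait before polling a pending RID again
sub poll_rids {
    my ( $factory, $on_report, $delay ) = @_;
    $delay = 5 unless defined $delay;

    while ( my @rids = $factory->each_rid ) {
        foreach my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);

            if ( !ref $rc ) {
                # Assumed convention: negative means the job failed,
                # 0 means the report is not ready yet.
                $factory->remove_rid($rid) if $rc < 0;
                sleep $delay;
                next;
            }

            $on_report->( $rid, $rc );

            # Without this the outer while loop never terminates.
            $factory->remove_rid($rid);
        }
    }
    return;
}

# Usage with a real factory (needs BioPerl and network access):
#
#   poll_rids( $factory, sub {
#       my ( $rid, $report ) = @_;
#       while ( my $result = $report->next_result ) {
#           print $result->query_name, ": ", $result->num_hits, " hits\n";
#       }
#   } );
```

Taking the factory and the callback as arguments keeps the polling logic separate from the report handling, so it can be exercised with a stand-in object before being pointed at the NCBI server.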
PART II:
I also tried to resolve the problem by replacing the foreach loop in my script with:
----------------------------------------------------------------------------------------------------
foreach my $rid (@rids) {
    while(1) {
        $result=$factory->retrieve_blast($rid)->next_result();
        print "rc:", $result,"\n";
        if ($result) {
            print $result->num_hits(),"\n";
        }
----------------------------------------------------------------------------------------------------
With this loop I can explore the BLAST result page. This is what I obtain in the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834)
----

--
Bertrand BECKERT
PhD student
IBMC - UPR 9002 du CNRS - ARN
15, rue Rene Descartes
F-67084 STRASBOURG Cedex
b.beckert at ibmc.u-strasbg.fr

From niels at genomics.dk Tue Oct 17 13:54:36 2006
From: niels at genomics.dk (Niels Larsen)
Date: Tue, 17 Oct 2006 15:54:36 +0200
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4534E09C.9030707@genomics.dk>

Greetings,

I am no perl beginner, but I am a BioPerl beginner. Today I looked for remote similarity services that can be used from Perl.
I found the EBI SOAP interface where their example script returns Can't find method element in the message at /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. and the DDBJ service which (from Denmark) returns undef and then the NCBI server accessed through BioPerls RemoteBlast which seems to spin in a loop that fills TMPDIR with many tempfiles. Will release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall is working towards that). Niels L ------------------------------------------------------------------------ Niels Larsen Danish Genome Institute Gustav Wieds vej 10 C DK-8000 Aarhus C Denmark Electronic mail: niels at genomics.dk Skype: niels_larsen_denmark Telephone: +45-8942-5268 Telefax: +45-8620-1222 ------------------------------------------------------------------------ From cjfields at uiuc.edu Tue Oct 17 14:28:40 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:28:40 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <4534E575.5050308@sheffield.ac.uk> Message-ID: <000501c6f1f8$8b78efe0$15327e82@pyrimidine> ... > So am I correct in saying that the best way is to make all updates to > the wikified versions of these files, and then at regular > intervals/major releases you (or someone else) will update the CVS > version of the files in the way describe above? > > Cheers > Nath Yes. I think the online docs will stay relatively stable. A week or so ago Mauricio and I were discussing moving the dependencies list to it's own CVS document (since they pertain to all Bioperl installations, not just UNIX'y flavors). I haven't done that yet since I was waiting on the INSTALL.WIN changes before I made any more changes. Well, that and I've been really busy doing other things. One way we could make sure that changes to the online docs would match the CVS docs would be to only allow certain wiki users (such as sysadmins) make modifications to those pages. 
That way any changes would have to go through someone who also has CVS access and could make similar changes to the distribution docs. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Tue Oct 17 14:37:38 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 15:37:38 +0100 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <000501c6f1f8$8b78efe0$15327e82@pyrimidine> References: <000501c6f1f8$8b78efe0$15327e82@pyrimidine> Message-ID: <4534EAB2.50609@sheffield.ac.uk> Chris Fields wrote: > ... > >> So am I correct in saying that the best way is to make all updates to >> the wikified versions of these files, and then at regular >> intervals/major releases you (or someone else) will update the CVS >> version of the files in the way describe above? >> >> Cheers >> Nath >> > > Yes. I think the online docs will stay relatively stable. A week or so ago > Mauricio and I were discussing moving the dependencies list to it's own CVS > document (since they pertain to all Bioperl installations, not just UNIX'y > flavors). I haven't done that yet since I was waiting on the INSTALL.WIN > changes before I made any more changes. Well, that and I've been really > busy doing other things. > Sounds good. > One way we could make sure that changes to the online docs would match the > CVS docs would be to only allow certain wiki users (such as sysadmins) make > modifications to those pages. That way any changes would have to go through > someone who also has CVS access and could make similar changes to the > distribution docs. > Ugh, not sure I like the sound of maintaining 2 copies of any files - sounds like a future headache even if they are pretty stable. It also makes it unclear which of the two file should be considered first (i.e. 
is the most up-to-date) on pages such as: http://www.bioperl.org/wiki/Installing_BioPerl It suggests that INSTALL and INSTALL.WIN should be looked at first, but there are online copies of those files available - this should now be the other way around - shouldn't it? I might just be making a mountain out of a molehill, so I'll shut up on this topic and make any future edits to the wiki pages instead. > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > From bosborne11 at verizon.net Tue Oct 17 14:48:54 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 17 Oct 2006 10:48:54 -0400 Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated? In-Reply-To: <000401c6f1f6$424b5580$15327e82@pyrimidine> Message-ID: Chris, The Bio::Index modules either 'use base qw(Bio::Index::Abstract)' or 'use base qw(Bio::Index::AbstractSeq)'. Neither of these modules has an id_parser() method. Brian O. On 10/17/06 10:12 AM, "Chris Fields" wrote: > Do the various Bio::Index* modules share a common interface? From cjfields at uiuc.edu Tue Oct 17 14:45:53 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 09:45:53 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <4534EAB2.50609@sheffield.ac.uk> Message-ID: <000601c6f1fa$f260b560$15327e82@pyrimidine> ... > > One way we could make sure that changes to the online docs would match > the > > CVS docs would be to only allow certain wiki users (such as sysadmins) > make > > modifications to those pages. That way any changes would have to go > through > > someone who also has CVS access and could make similar changes to the > > distribution docs. > > > Ugh, not sure I like the sound of maintaining 2 copies of any files - > sounds like a future headache even if they are pretty stable. It also > makes it unclear which of the two file should be considered first (i.e. 
> is the most up-to-date) on pages such as: > http://www.bioperl.org/wiki/Installing_BioPerl > > It suggests that INSTALL and INSTALL.WIN should be looked at first, but > there are online copies of those files available - this should now be > the other way around - shouldn't it? I might just be making a mountain > out of a molehill, so I'll shut up on this topic and make any future > edits to the wiki pages instead. Yes that should be the other way around (the wiki would be the most up-to-date), so the CVS docs should point to the wiki, not vice-versa. Getting the docs right is as important as getting the code to work. So I don't consider it a 'mountain-out-of-a-molehill' problem. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 17 15:07:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 10:07:49 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534E207.8030508@sendu.me.uk> Message-ID: <001001c6f1fe$02fd4de0$15327e82@pyrimidine> > Niels Larsen wrote: > > Greetings, > > > > I am no perl beginner, but I am a BioPerl beginner. Today I looked > > for remote similarity services that can be used from Perl. I found > > the EBI SOAP interface where their example script returns > > > > Can't find method element in the message at > > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. > > What script exactly? There was a problem with the SOAP server that was > fixed earlier today. > > > > and the DDBJ service which (from Denmark) returns > > > > undef > > What returned undef? Specifics please. > The first problem, like Sendu mentions, was fixed on the remote server (I get them to pass now). Those were from bioperl-run, though, not the bioperl core distribution. As for DDBJ, do you mean EBI or SwissProt? I ask b/c you mention Denmark. EBI were having server maintenance outages yesterday, which was announced here. 
As Sendu mentions, please be more specific. > > and then the NCBI server accessed through BioPerls RemoteBlast which > > seems to spin in a loop that fills TMPDIR with many tempfiles. Will > > release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall > > is working towards that). > > What version of Bioperl were you testing with? What did you do to get it > to 'spin in a loop'? I can tell you that remote blasting certainly works > in Bioperl 1.5.2, but you'll have to give more details on the things you > tried and the problems you encountered. > > You can also answer the questions yourself by trying the release > candidate. The tempfiles showing up are from the repeated RID requests and are deleted after the BLAST run (at least they should be); this is quite normal. They don't 'spin in a loop' unless the BLAST query is taking a particularly long time, which can happen depending on how the BLAST query is set up, i.e. what type of BLAST program is requested, if comp-based stats are requested, length of query, database requested, etc. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Tue Oct 17 15:14:07 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 17 Oct 2006 16:14:07 +0100 Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast In-Reply-To: References: Message-ID: <4534F33F.3070809@sendu.me.uk> Bertrand Beckert wrote: > hi, > > I am running a large number of blasts via a connexion to ncbi blast > page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi'). > I try to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have > some problems. [snip] > In the documentation it wrote that $result=$factory->retrieve_blast > ($rid) return when it work a Bio::Tools::Bplite or Bio::Tools::Blast > object. In my case it returns a Bio::SearchIO::blast... I don't > understand why I don't have the good type of object return (see PART I). 
I take it you're using some old version of Bioperl where unfortunately the documentation was incorrect. In fact you're supposed to get a Bio::SearchIO object, so it is a good thing that you are. The latest version of Bioperl has (as far as I can see) correct documentation and behaviour. Bio::Tools::Bplite and Bio::Tools::Blast are deprecated. You want Bio::SearchIO::blast. All is well. > I also try to resolve the problem by replace the foreach loop in my > script by a new one in order to explore the blast page result but it > also don't work (see part II). I'm not really sure what problem you might be facing there, but take a look at some up-to-date documentation, using the new example code: http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html From n.haigh at sheffield.ac.uk Tue Oct 17 16:10:15 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 17 Oct 2006 17:10:15 +0100 Subject: [Bioperl-l] [Fwd: Re: Bundle::BioPerl] Message-ID: <45350067.6070604@sheffield.ac.uk> FYI on Bundle::BioPerl Nathan -------- Original Message -------- Subject: Re: Bundle::BioPerl Date: Tue, 17 Oct 2006 11:52:00 -0400 From: Chris Dagdigian To: Nathan S. Haigh References: <45348FB8.4050009 at sheffield.ac.uk> Hi Nathan, I've updated the Bundle and uploaded it to CPAN. I *think* the rationale for keeping it still exists but I'm removed enough from Bioperl now that I'll defer to others on the decision. The basic idea was that BioPerl has a heck of a lot of dependencies that it requires of (other perl modules) in order to get all the functionality out of it. Many of these dependencies may not be present in default Perl installations. Tracking down all of the dependencies and installing them (along with all of the dependencies- of-the-dependencies) by hand is a massive pain. 
The nice thing about the Bundle is that it lists the core module dependencies and it works great with the CPAN.pm module to automate the downloading and installation of everything that BioPerl requires. The CPAN module is smart enough that when processing *our* bundle it will also track down and install anything that our bundle entries themselves list as a dependency. So for unix/Linux systems the Bundle is a great one-liner ("perl - MCPAN -e 'install Bundle::BioPerl'" ) way to auto-install or update the many perl modules that BioPerl makes use of. On the windows side, not sure if it is of any help though. Regards, Chris On Oct 17, 2006, at 4:09 AM, Nathan S. Haigh wrote: > Hi Chris > > I've been working on making a PPD for the upcoming Bioperl 1.5.2 > release. During this time I also updated Bundle::BioPerl to include > up-to-date prereqs. I was wondering if you could update the CPAN > package? The updated BioPerl.pm file is attached. > > There is some talk about why and if we need Bundle::BioPerl > anymore. What was the rationale for having it in the first place, > and does it still hold true now? > > Cheers > Nath > From plu5even at gmail.com Tue Oct 17 16:26:34 2006 From: plu5even at gmail.com (Peter H. Baenziger) Date: Tue, 17 Oct 2006 12:26:34 -0400 Subject: [Bioperl-l] LocatableSeq object vs Sequence Object Message-ID: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com> All, This is my first bioperl script (but not my first Perl script) so please forgive my naivety. I've read through documentation and looked through cookbooks and the like but to no avail. Any advice is appreciated. So...I am working with an alignment object of several sequences. My intentions is to loop through all the sequences of the alignment to find what amino acid they have at a known position in the alignment (not the position in the sequence). 
I was thinking I could use:

  foreach $seq ($alignment->each_seq())

to loop through the sequences and call:

  $seq->location_from_column($pos)

on each of the sequences. However, I don't think I have "LocatableSequences"
(the type of object that has the method "location_from_column") being returned
by $alignment->each_seq(). So, how do I bridge this gap? Or is there a
better way?

My appreciation in advance!
Peter

code:

my $swissObj = $swissdb->get_Seq_by_acc($query); # put several of these in @sequenceObjects
...
my $alignFactory = Bio::Tools::Run::Alignment::Clustalw->new();
my $alignment = $alignFactory->align(\@sequenceObjects);
#print $alignment->overall_percentage_identity(); #works

#now we find the "alignment position" of the mutation we have on the human
#version and get the amino acid at that "alignment position" for all seq
my $humanSequence = $prefix."HUMAN";
my $pos = $alignment->column_from_residue_number($humanSequence, $aa_seqpos); #this is the "alignment position" equivalent to the mutation position

#we'll keep track of what amino acid each species has at the "alignment
#equivalent" location listed as being a mutation on the human version
foreach $seq ($alignment->each_seq()) {
    #print $seq->species() . "\n"; #won't work because $alignment->each_seq() actually returns a locatableSeq object, not a normal sequence object
    $speciesAA{$species} = $seq->location_from_column($pos);
}

--
<<->>
Peter H. Baenziger

From akarger at CGR.Harvard.edu Tue Oct 17 16:53:19 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Tue, 17 Oct 2006 12:53:19 -0400
Subject: [Bioperl-l] split location problems
Message-ID:

> From: Jason Stajich [mailto:jason.stajich at gmail.com]
>
> The whole point of split locations is to represent genes with introns
> so that is not the "rare" case.

Absolutely.

> I have processed the genbank fungal genomes into GFF3 and have had no
> problems so I'm confused where you are breaking down.
If I write > them out as embl I also get the correct thing. This is using > the CVS > version of bioperl from the HEAD. > > I've added code to test this to bug 2101 including a C.glabrata > chromsome downloaded from genbank. Perhaps the problem is on the > EMBL parsing side, I didn't test that. Well, I don't know whether it's EMBL parsing, or a bit further down the pipe, but I downloaded C.glabrata chromosome B for GenBank (NC_005968), and it describes the complement/joins in the way that Bioperl is handling correctly. GenBank: CDS complement(join(10347..10372,10632..11157)) /locus_tag="CAGL0B00242g" EMBL: FT CDS join(complement(10632..11157),complement(10347..10372)) FT /locus_tag="CAGL0B00242g" Here's the diff when I run the location-printing script I posted yesterday: diff biogb bio 1c1,5 < complement(join(10347..10372,10632..11157)) --- > complement(1701..2651) > complement(2635..3345) > complement(3980..4408) > complement(join(10632..11157,10347..10372)) > 10379..10615 209a214,217 > 498198..498890 > 499712..500062 > 499851..500702 > 500579..501364 As you can see, the complement/join CDS is written out in a different order, which is Bad. (I looked at at least one of the other differences: the GB file says it's a "misc feature" and EMBL says it's a CDS. But they don't seem to be relevant here.) -Amir > > On the technical side, I still am not sure I fully know where the > strand information should be stored - the top level container or the > sub-features. I'll try and stay up on the discussion if > anything has > been decided that I should know about. > > -jason > > > > From paul.boutros at utoronto.ca Tue Oct 17 16:57:19 2006 From: paul.boutros at utoronto.ca (Paul Boutros) Date: Tue, 17 Oct 2006 12:57:19 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 Message-ID: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca> Hi, Here's a quick make test on AIX 5.2 with Perl 5.8.8. 
I get two failed tests, the first seems to be just a result of me not having
DBD::mysql installed.

Paul

Test Summary
============
Failed Test               Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/BioDBSeqFeature_mysql.t               46   46  1-46
t/SearchIO.t                22  5632  1337 2671  2-1337
2 tests and 106 subtests skipped.
Failed 2/236 test scripts. 1382/11688 subtests failed.
Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = 159.61 CPU)

BioDBSeqFeature_mysql
=====================
pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
1..46
install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at (eval 37) line 3.
Perhaps the DBD::mysql perl module hasn't been fully installed,
or perhaps the capitalisation of 'mysql' isn't right.
Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
 at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208

SearchIO
========
pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
1..1337
ok 1

-------------------- WARNING ---------------------
MSG: XML::SAX::Expat not currently supported; must have local copies of NCBI DTD docs!
--------------------------------------------------- -------------------- WARNING --------------------- MSG: error in parsing a report: 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' does not exist file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd Handler couldn't resolve external entity at line 2, column 82, byte 104 error in processing external entity reference at line 2, column 82, byte 104 at /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line 187 --------------------------------------------------- not ok 2 # Failed test 2 in t/SearchIO.t at line 68 Can't call method "database_name" on an undefined value at t/SearchIO.t line 69. ------------------------------ Message: 10 Date: Tue, 17 Oct 2006 11:32:54 +0100 From: Sendu Bala Subject: [Bioperl-l] Bioperl 1.5.2 RC2 To: bioperl-l at bioperl.org Message-ID: <4534B156.4090501 at sendu.me.uk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. See http://www.bioperl.org/wiki/Release_1.5.2 for instructions on getting and testing this RC. Developers: This should be the last RC before release ~next monday. Now would be a good time for last minute documentaiton updates and additions. Users: Even though 1.5.2 is a 'developer' release, we consider it the most stable and capable version of Bioperl, and recommend that you use it in all but the most critical production environments. Please try it out and let us know of any problems or difficulties you run into. Thank you, Sendu. 
From barry.moore at genetics.utah.edu Tue Oct 17 16:57:48 2006 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Tue, 17 Oct 2006 10:57:48 -0600 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine> Message-ID: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix does a reasonable job of textifying html. You get the links as numbered references at the bottom or: lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | perl -ane 's/\[?\[\d+\](edit\])?//g;print' to remove the links all together. Barry P.S. Looks like this: #Creative Commons copyright Installing Bioperl for Unix From BioPerl Jump to: navigation, search Contents * 1 BIOPERL INSTALLATION * 2 SYSTEM REQUIREMENTS * 3 OPTIONAL * 4 ADDITIONAL INSTALLATION INFORMATION * 5 THE BIOPERL BUNDLE * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' * 8 WHERE ARE THE MAN PAGES? * 9 EXTERNAL PROGRAMS + 9.1 Environment Variables * 10 INSTALLING BIOPERL SCRIPTS * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA * 12 INSTALLING BIOPERL MODULES THE HARD WAY * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION * 14 THE TEST SYSTEM * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE + 15.1 CONFIGURING for BSD and Solaris boxes + 15.2 INSTALLATION * 16 DEPENDENCIES AND Bundle::BioPerl BIOPERL INSTALLATION Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP, and on Mac OS X (see the PLATFORMS file for more details). Following are instructions for installing Bioperl for Unix/Linux/Mac OS X; Windows installation instructions can be found here. For installing Bioperl for Mac OS X using Fink, see Getting BioPerl. SYSTEM REQUIREMENTS * Perl 5.005 or later; version 5.6 and greater are recommended. Note that most modules will work with earlier versions of Perl. 
The only ones that will not are Bio::SimpleAlign and the Bio::Index::* modules. If you don't need these modules and you want to install Bioperl using an earlier version of Perl, edit the "require 5.005;" line in Makefile.PL as necessary. * External modules: Bioperl uses functionality provided in other Perl modules. Some of these are included in the standard perl package but some need to be obtained from the CPAN site. The list of external modules is included at the bottom of this document. The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of these external modules easy. Simply install the bundle using your CPAN shell and all necessary modules will be installed. See THE BIOPERL BUNDLE, below. OPTIONAL * ANSI C or GNU C compiler (gcc) for XS extensions (the bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext PACKAGE, below). ADDITIONAL INSTALLATION INFORMATION * Additional information on Bioperl and MAC OS: + OS 9 - http://bioperl.org/Core/mac-bioperl.html + OSX-http://www.tc.umn.edu/~cann0010/ Bioperl_OSX_install.html + OS X - Installing using Fink (in Getting BioPerl) THE BIOPERL BUNDLE You typically need root privileges to install using CPAN. If you don't have these privileges please see INSTALLING BIOPERL IN A PERSONAL MODULE AREA for additional information. Install Bundle::Bioperl using CPAN. One way: >perl -MCPAN -e "install Bundle::BioPerl" Another way: >perl -MCPAN -e shell cpan>install Bundle::BioPerl On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: > There isn't a very easy way since so many links have to be removed/ > modified. > I have found a few CPAN modules that could help, but for now I just > dump the > text output from a text browser (elinks) using the 'printable > version' page > and hand-edit, which works very quickly. That works for the time > being > until I can find another more automated solution. > > Fortunately there have been very few edits to either INSTALL wiki > page so > they should remain relatively stable. 
> > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > >> -----Original Message----- >> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] >> Sent: Tuesday, October 17, 2006 6:46 AM >> To: Chris Fields >> Cc: bioperl-l >> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN >> >> Chris Fields wrote: >>> The general consensus was to keep text versions available; we could >>> add URL links to the wiki pages for the most up-to-dat version. >>> BTW, >>> I have modified INSTALL already. INSTALL.WIN is next in line (I was >>> waiting for your changes). >>> >> Is it possible to generate these files from the wiki whenever >> there is a >> release? I now edits shouldn't be too severe or too often - but I can >> see things getting a little messy/annoying if edits have to be >> made in 2 >> places. >> >> Nath > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From niels at genomics.dk Tue Oct 17 16:58:14 2006 From: niels at genomics.dk (Niels Larsen) Date: Tue, 17 Oct 2006 18:58:14 +0200 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534E207.8030508@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> Message-ID: <45350BA6.3040102@genomics.dk> Ok, here are ways to reproduce; I sure apologize if I made the test scripts wrong. And I suppose EBI/DDBJ's interfaces are not a bioperl issue really. Niels ------------ EBI I invoked the EBI script http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip like this WSWUBlastClient.pl -p blastn -D embl test.fasta where the content of test.fasta is below, and got Can't find method element in the message at /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. >Planctomyces sp. 
282; Genbank Taxonomy ID: 79927
AATGAACGTTGGCGGCATGGATTAGGCATGCAAGTCGAGGGAGAACCCGCAAGGGGACACCGGCG
AACGGGGTAGGAATACATAGGTAACGTACCCTCAGGACGGGGATAGCCAAGGGAAACTTTGGGTA
ATACCCGATGTGATGGCAAGATGTGAATGCTTGTCATCAAAGGTGAGATTCCACCTGAGGAGCGG
CTTATGCATCATTAGCTTGTTGGCGGGGTAACGGCCCACCAAGGCTGCGATGATTAGGGGGTGTG
AGAGCATGGCCCCCACCACTGGCACTGAGACACTGGCCAGACACCTACGGGTGGCTGCAGTCGAG

I tried with this test sequence in fasta format and with just the
sequence.

------------ DDBJ

Inspired by this page,

http://xml.nig.ac.jp/doc/Blast.txt

I made this test script

------ cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

my ( $service, $seqstr, $result );

use SOAP::Lite;
use Data::Dumper;

$service = SOAP::Lite->service('http://xml.nig.ac.jp/wsdl/Blast.wsdl');

$seqstr = "MSSRIARALALVVTLLHLTRLALSTCPAACHCPLEAPKCAPGVGLVRDGCGCCKVCAKQL";

$result = $service->searchSimple( "blastp", "SWISS", $seqstr );

print Dumper( $result );
------ cut --

which for me prints undef.

------------- NCBI/Bioperl

I installed 1.5.2-RC2, looked at the RemoteBlast example in

http://www.bioperl.org/wiki/Bptutorial.pl

and then put that into this test code, more or less cut/paste,

--- cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

use Bio::Tools::Run::RemoteBlast;
use Data::Dumper;

my ( $remote_blast, $r, $rc, $rid, @rids );

$remote_blast = Bio::Tools::Run::RemoteBlast->new (
    -prog => 'blastn', -data => 'ecoli', -expect => '1e-10' );

$r = $remote_blast->submit_blast("ecoli.fasta");

while ( @rids = $remote_blast->each_rid )
{
    # print Dumper( \@rids );

    for $rid ( @rids )
    {
        $rc = $remote_blast->retrieve_blast($rid);
        # print Dumper( $rc );
    }

    sleep 10;
}
--- cut --

which saves the same blast report to TMPDIR every 10 seconds. The
"ecoli.fasta" file contains this

>test
gggggctctgttggttctcccgcaacgctactctgtttaccaggtcaggtccggaaggaa
gcagccaaggcagatgacgcgtgtgccgggatgtagctggcagggcccccaccc

Maybe I am supposed to add a check for content in $rc and then stop
the inner loop?
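
A check of that kind is what the polling loop in the RemoteBlast module
synopsis does: it calls remove_rid() once a report has arrived, so that
each_rid() eventually returns an empty list and the outer loop ends. What
follows is only a sketch under stated assumptions, not Bioperl code:
wait_for_reports() and its $timeout and $interval arguments are hypothetical
names, and it assumes retrieve_blast() returns an object when the report is
ready, 0 while the job is still running, and a negative value on error, as
the module documentation describes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl). Polls a
# Bio::Tools::Run::RemoteBlast factory until every submitted RID has either
# produced a report or failed, or until $timeout seconds have passed.
# Returns a hash ref mapping RID => report parser object for finished jobs.
sub wait_for_reports {
    my ( $factory, $timeout, $interval ) = @_;
    $timeout  = 600 unless defined $timeout;     # assumed default, in seconds
    $interval = 10  unless defined $interval;    # assumed pause between polls

    my %done;
    my $deadline = time + $timeout;

    while ( my @rids = $factory->each_rid ) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( ref $rc ) {
                # An object came back: the report is ready.
                $done{$rid} = $rc;
                $factory->remove_rid($rid);    # stop polling this RID
            }
            elsif ( defined $rc && $rc < 0 ) {
                # Negative return: the request failed, so give up on it.
                $factory->remove_rid($rid);
            }
            # Otherwise the job is still running; keep the RID for next time.
        }

        # Leave as soon as nothing is pending, or when time has run out.
        my @left = $factory->each_rid;
        last if !@left || time >= $deadline;
        sleep $interval;
    }
    return \%done;
}
```

Called as wait_for_reports($remote_blast, 300, 10) after submit_blast(), this
would return once every RID is finished or roughly 300 seconds have elapsed;
hits could then be read from each returned value with next_result(). RIDs that
are still pending at the deadline are simply left in the factory.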
I could figure that out maybe, but I wish there was a function which simply takes a single sequence + arguments and only returns a list of matches when done, and does not return until then (or until a specified timeout). ------------------------------------------------------------------------ Niels Larsen Danish Genome Institute Gustav Wieds vej 10 C DK-8000 Aarhus C Denmark Electronic mail: niels at genomics.dk Skype: niels_larsen_denmark Telephone: +45-8942-5268 Telefax: +45-8620-1222 ------------------------------------------------------------------------ From bertrand.beckert at gmail.com Tue Oct 17 14:52:36 2006 From: bertrand.beckert at gmail.com (bertrand beckert) Date: Tue, 17 Oct 2006 16:52:36 +0200 Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast Message-ID: <500217090610170752q565cfc08t5208e3b64f99ef7f@mail.gmail.com> hi, I am running a large number of blasts via a connexion to ncbi blast page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi'). I try to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have some problems. I make a simple example with only one sequence in order to understand how work this module. This is my simple input file, a DNA sequence in fasta form: >test TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA I have made some modification of the example available in doc of bioperl. It give me a RID which contain the results of my blast but I have a problem with the "$result=$factory->retrieve_blast($rid)" in my script. In the documentation it wrote that $result=$factory->retrieve_blast ($rid) return when it work a Bio::Tools::Bplite or Bio::Tools::Blast object. In my case it returns a Bio::SearchIO::blast... I don't understand why I don't have the good type of object return (see PART I). 
I also tried to resolve the problem by replacing the foreach loop in my
script with a new one, in order to explore the BLAST result page, but that
doesn't work either (see PART II).

Could you help me, please?

Thank you,
Bertrand Beckert.

PART I:
Here is my script, lightly annotated, together with the shell output:
------------------------------------------------------------------------
#!/usr/bin/perl -w

use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;

sub blast {
    my $prog='blastn';
    my $db='refseq_genomic';
    my $e_val='1e-10';
    my $Input='Seq.fasta';
    my @params = ('-prog' => $prog,
                  '-data' => $db,
                  '-expect' => $e_val,
                  '-readmethod' => 'SearchIO');
    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

    # change parameters
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
    $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';

    $factory->submit_blast($Input);
    print STDERR "waiting...\n";

    while (my @rids=$factory->each_rid) {
        print "my rid: ",@rids,"\n";
        # returns the ID of the submitted BLAST, e.g.
        # RID: 1161079157-766-185099855365.BLASTQ2
        # this page contains the result of my BLAST...
        foreach my $rid (@rids) {
            $result=$factory->retrieve_blast($rid);
            # line to show what type of object is returned by retrieve_blast
            print "rc:", $result,"\n";
        }
    }
}

&blast;
------------------------------------------------------------------------
here you can see the shell window:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2 Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137. rc:Bio::SearchIO::blast=HASH(0x886bbac) my rid: 1161079157-766-185099855365.BLASTQ2 Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137. rc:Bio::SearchIO::blast=HASH(0x89eb5f0) my rid: 1161079157-766-185099855365.BLASTQ2 Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137. rc:Bio::SearchIO::blast=HASH(0x8a2d2d4) my rid: 1161079157-766-185099855365.BLASTQ2 Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137. rc:Bio::SearchIO::blast=HASH(0x84fa054) ... PARTII: I also try to resolve the problem by replace the foreach loop in my script by: ------------------------------------------------------------------------ foreach my $rid (@rids) { while(1) { $result=$factory->retrieve_blast($rid)->next_result(); print "rc:", $result,"\n"; if ($result) { print $result->num_hits(),"\n"; } ------------------------------------------------------------------------ With tis loop I could explore the result Blast page. that is what I obtain in the shell window: bbeckert at tatooine:~/Script_perl$ ./test.pl waiting... my rid: 1161088606-9905-123050755601.BLASTQ4 Use of uninitialized value in print at ./retrieve_blast.pl line 30. rc: Use of uninitialized value in print at ./retrieve_blast.pl line 30. rc: Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137. rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8) 0 Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137. rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c) 0 Parsing of undecoded UTF-8 will give garbage when decoding entities at /usr/share/perl5/LWP/Protocol.pm line 137. 
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834) ---- -- Berrtrand BECKERT PhD student IBMC - UPR 9002 du CNRS - ARN 15, rue Rene Descartes F-67084 STRASBOURG Cedex b.beckert at ibmc.u-strasbg.fr bertrand.beckert at gmail.com From cjfields at uiuc.edu Tue Oct 17 17:50:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 12:50:49 -0500 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu> Message-ID: <001201c6f214$c8934440$15327e82@pyrimidine> (Apologies for the top post, but I thought my response might get lost below) I use elinks in a similar fashion. It tends to format the tables a bit better than lynx. Chris > -----Original Message----- > From: Barry Moore [mailto:barry.moore at genetics.utah.edu] > Sent: Tuesday, October 17, 2006 11:58 AM > To: Chris Fields > Cc: 'Nathan S. Haigh'; 'bioperl-l' > Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN > > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix > > does a reasonable job of textifying html. You get the links as > numbered references at the bottom or: > > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | > perl -ane 's/\[?\[\d+\](edit\])?//g;print' > > to remove the links all together. > > Barry > > P.S. Looks like this: > > #Creative Commons copyright > > Installing Bioperl for Unix > > From BioPerl > > Jump to: navigation, search > > Contents > > * 1 BIOPERL INSTALLATION > * 2 SYSTEM REQUIREMENTS > * 3 OPTIONAL > * 4 ADDITIONAL INSTALLATION INFORMATION > * 5 THE BIOPERL BUNDLE > * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN > * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' > * 8 WHERE ARE THE MAN PAGES? 
> * 9 EXTERNAL PROGRAMS > + 9.1 Environment Variables > * 10 INSTALLING BIOPERL SCRIPTS > * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA > * 12 INSTALLING BIOPERL MODULES THE HARD WAY > * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION > * 14 THE TEST SYSTEM > * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE > + 15.1 CONFIGURING for BSD and Solaris boxes > + 15.2 INSTALLATION > * 16 DEPENDENCIES AND Bundle::BioPerl > > > BIOPERL INSTALLATION > > Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP, > and on Mac OS X (see the PLATFORMS file for more details). > Following are > instructions for installing Bioperl for Unix/Linux/Mac OS X; > Windows > installation instructions can be found here. For installing > Bioperl for > Mac OS X using Fink, see Getting BioPerl. > > > SYSTEM REQUIREMENTS > > * Perl 5.005 or later; version 5.6 and greater are recommended. > Note > that most modules will work with earlier versions of Perl. > The only ones > that will not are Bio::SimpleAlign and the Bio::Index::* > modules. If > you don't need these modules and you want to install Bioperl > using an > earlier version of Perl, edit the "require 5.005;" line in > Makefile.PL > as necessary. > > * External modules: Bioperl uses functionality provided in > other Perl > modules. Some of these are included in the standard perl > package but > some need to be obtained from the CPAN site. The list of > external > modules is included at the bottom of this document. > > The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of > these > external modules easy. Simply install the bundle using your CPAN > shell and > all necessary modules will be installed. See THE BIOPERL BUNDLE, > below. > > > OPTIONAL > > * ANSI C or GNU C compiler (gcc) for XS extensions (the > bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext > PACKAGE, below). 
> > > > ADDITIONAL INSTALLATION INFORMATION > > * Additional information on Bioperl and MAC OS: > + OS 9 - http://bioperl.org/Core/mac-bioperl.html > + OSX-http://www.tc.umn.edu/~cann0010/ > Bioperl_OSX_install.html > + OS X - Installing using Fink (in Getting BioPerl) > > > > THE BIOPERL BUNDLE > > You typically need root privileges to install using CPAN. If you > don't > have these privileges please see INSTALLING BIOPERL IN A PERSONAL > MODULE > AREA for additional information. > > Install Bundle::Bioperl using CPAN. One way: > >perl -MCPAN -e "install Bundle::BioPerl" > > Another way: > >perl -MCPAN -e shell > cpan>install Bundle::BioPerl > > > > On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: > > > There isn't a very easy way since so many links have to be removed/ > > modified. > > I have found a few CPAN modules that could help, but for now I just > > dump the > > text output from a text browser (elinks) using the 'printable > > version' page > > and hand-edit, which works very quickly. That works for the time > > being > > until I can find another more automated solution. > > > > Fortunately there have been very few edits to either INSTALL wiki > > page so > > they should remain relatively stable. > > > > Christopher Fields > > Postdoctoral Researcher - Switzer Lab > > Dept. of Biochemistry > > University of Illinois Urbana-Champaign > > > >> -----Original Message----- > >> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] > >> Sent: Tuesday, October 17, 2006 6:46 AM > >> To: Chris Fields > >> Cc: bioperl-l > >> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN > >> > >> Chris Fields wrote: > >>> The general consensus was to keep text versions available; we could > >>> add URL links to the wiki pages for the most up-to-dat version. > >>> BTW, > >>> I have modified INSTALL already. INSTALL.WIN is next in line (I was > >>> waiting for your changes). > >>> > >> Is it possible to generate these files from the wiki whenever > >> there is a > >> release? 
I now edits shouldn't be too severe or too often - but I can > >> see things getting a little messy/annoying if edits have to be > >> made in 2 > >> places. > >> > >> Nath > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Tue Oct 17 17:52:36 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 12:52:36 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca> Message-ID: <001301c6f215$07a9a070$15327e82@pyrimidine> What do you get when you run the SearchIO.t test by itself using 'perl -I. t/SearchIO.t'? It looks like something pretty catastrophic happened. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Paul Boutros > Sent: Tuesday, October 17, 2006 11:57 AM > To: bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 > > Hi, > Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed > tests, the first seems to be just a result of me not having DBD::mysql > installed. > Paul > > Test Summary > ============ > > Failed Test Stat Wstat Total Fail List of Failed > -------------------------------------------------------------------------- > ----- > t/BioDBSeqFeature_mysql.t 46 46 1-46 > t/SearchIO.t 22 5632 1337 2671 2-1337 > 2 tests and 106 subtests skipped. > Failed 2/236 test scripts. 1382/11688 subtests failed. 
> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = > 159.61 CPU) > > BioDBSeqFeature_mysql > ===================== > pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t > 1..46 > install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC > contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t > /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 > /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi > /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at > (eval 37) line 3. > Perhaps the DBD::mysql perl module hasn't been fully installed, > or perhaps the capitalisation of 'mysql' isn't right. > Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. > at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 > > SearchIO > ======== > pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more > 1..1337 > ok 1 > > -------------------- WARNING --------------------- > MSG: XML::SAX::Expat not currently supported; must have local copies > of NCBI DTD docs! > --------------------------------------------------- > > -------------------- WARNING --------------------- > MSG: error in parsing a report: > > 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' > does not exist > file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd > Handler couldn't resolve external entity at line 2, column 82, byte 104 > error in processing external entity reference at line 2, column 82, > byte 104 at > /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line > 187 > > --------------------------------------------------- > not ok 2 > # Failed test 2 in t/SearchIO.t at line 68 > Can't call method "database_name" on an undefined value at > t/SearchIO.t line 69. 
> > ------------------------------ > > Message: 10 > Date: Tue, 17 Oct 2006 11:32:54 +0100 > From: Sendu Bala > Subject: [Bioperl-l] Bioperl 1.5.2 RC2 > To: bioperl-l at bioperl.org > Message-ID: <4534B156.4090501 at sendu.me.uk> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. > See http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > This should be the last RC before release ~next monday. Now would > be a good time for last minute documentaiton updates and additions. > > Users: > Even though 1.5.2 is a 'developer' release, we consider it the most > stable and capable version of Bioperl, and recommend that you use > it in all but the most critical production environments. Please > try it out and let us know of any problems or difficulties you run > into. > > > Thank you, > Sendu. > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From paul.boutros at utoronto.ca Tue Oct 17 17:59:33 2006 From: paul.boutros at utoronto.ca (Paul Boutros) Date: Tue, 17 Oct 2006 13:59:33 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine> References: <001301c6f215$07a9a070$15327e82@pyrimidine> Message-ID: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca> Hi Chris, Here it is: pcboutro at ccb690[643] >> perl -I. t/SearchIO.t 1..1337 ok 1 -------------------- WARNING --------------------- MSG: XML::SAX::Expat not currently supported; must have local copies of NCBI DTD docs! 
--------------------------------------------------- -------------------- WARNING --------------------- MSG: error in parsing a report: 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' does not exist file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd Handler couldn't resolve external entity at line 2, column 82, byte 104 error in processing external entity reference at line 2, column 82, byte 104 at /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line 187 --------------------------------------------------- not ok 2 # Failed test 2 in t/SearchIO.t at line 68 Can't call method "database_name" on an undefined value at t/SearchIO.t line 69. Quoting Chris Fields : > What do you get when you run the SearchIO.t test by itself using 'perl -I. > t/SearchIO.t'? It looks like something pretty catastrophic happened. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > >> -----Original Message----- >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros >> Sent: Tuesday, October 17, 2006 11:57 AM >> To: bioperl-l at lists.open-bio.org >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 >> >> Hi, >> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed >> tests, the first seems to be just a result of me not having DBD::mysql >> installed. >> Paul >> >> Test Summary >> ============ >> >> Failed Test Stat Wstat Total Fail List of Failed >> -------------------------------------------------------------------------- >> ----- >> t/BioDBSeqFeature_mysql.t 46 46 1-46 >> t/SearchIO.t 22 5632 1337 2671 2-1337 >> 2 tests and 106 subtests skipped. >> Failed 2/236 test scripts. 1382/11688 subtests failed. 
>> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = >> 159.61 CPU) >> >> BioDBSeqFeature_mysql >> ===================== >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t >> 1..46 >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at >> (eval 37) line 3. >> Perhaps the DBD::mysql perl module hasn't been fully installed, >> or perhaps the capitalisation of 'mysql' isn't right. >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. >> at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 >> >> SearchIO >> ======== >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more >> 1..1337 >> ok 1 >> >> -------------------- WARNING --------------------- >> MSG: XML::SAX::Expat not currently supported; must have local copies >> of NCBI DTD docs! >> --------------------------------------------------- >> >> -------------------- WARNING --------------------- >> MSG: error in parsing a report: >> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' >> does not exist >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >> Handler couldn't resolve external entity at line 2, column 82, byte 104 >> error in processing external entity reference at line 2, column 82, >> byte 104 at >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line >> 187 >> >> --------------------------------------------------- >> not ok 2 >> # Failed test 2 in t/SearchIO.t at line 68 >> Can't call method "database_name" on an undefined value at >> t/SearchIO.t line 69. 
>> >> ------------------------------ >> >> Message: 10 >> Date: Tue, 17 Oct 2006 11:32:54 +0100 >> From: Sendu Bala >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2 >> To: bioperl-l at bioperl.org >> Message-ID: <4534B156.4090501 at sendu.me.uk> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >> >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. >> See http://www.bioperl.org/wiki/Release_1.5.2 for >> instructions on getting and testing this RC. >> >> Developers: >> This should be the last RC before release ~next monday. Now would >> be a good time for last minute documentaiton updates and additions. >> >> Users: >> Even though 1.5.2 is a 'developer' release, we consider it the most >> stable and capable version of Bioperl, and recommend that you use >> it in all but the most critical production environments. Please >> try it out and let us know of any problems or difficulties you run >> into. >> >> >> Thank you, >> Sendu. >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From barry.moore at genetics.utah.edu Tue Oct 17 18:07:12 2006 From: barry.moore at genetics.utah.edu (Barry Moore) Date: Tue, 17 Oct 2006 12:07:12 -0600 Subject: [Bioperl-l] INSTALL and INSTALL.WIN In-Reply-To: References: Message-ID: <588DE26B-8F18-4540-BAEE-2B479CBDE8B3@genetics.utah.edu> In fact, I think it was you who taught me that trick in the first place. B On Oct 17, 2006, at 11:40 AM, Brian Osborne wrote: > Barry, > > I second that. lynx does the best job of converting HTML to text > I've seen. > > Brian O. > > > On 10/17/06 12:57 PM, "Barry Moore" > wrote: > >> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix >> >> does a reasonable job of textifying html. 
You get the links as >> numbered references at the bottom or: >> >> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | >> perl -ane 's/\[?\[\d+\](edit\])?//g;print' >> >> to remove the links all together. >> >> Barry >> >> P.S. Looks like this: >> >> #Creative Commons copyright >> >> Installing Bioperl for Unix >> >> From BioPerl >> >> Jump to: navigation, search >> >> Contents >> >> * 1 BIOPERL INSTALLATION >> * 2 SYSTEM REQUIREMENTS >> * 3 OPTIONAL >> * 4 ADDITIONAL INSTALLATION INFORMATION >> * 5 THE BIOPERL BUNDLE >> * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN >> * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' >> * 8 WHERE ARE THE MAN PAGES? >> * 9 EXTERNAL PROGRAMS >> + 9.1 Environment Variables >> * 10 INSTALLING BIOPERL SCRIPTS >> * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA >> * 12 INSTALLING BIOPERL MODULES THE HARD WAY >> * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION >> * 14 THE TEST SYSTEM >> * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE >> + 15.1 CONFIGURING for BSD and Solaris boxes >> + 15.2 INSTALLATION >> * 16 DEPENDENCIES AND Bundle::BioPerl >> >> >> BIOPERL INSTALLATION >> >> Bioperl has been installed on many forms of Unix, Win9X/NT/ >> 2000/XP, >> and on Mac OS X (see the PLATFORMS file for more details). >> Following are >> instructions for installing Bioperl for Unix/Linux/Mac OS X; >> Windows >> installation instructions can be found here. For installing >> Bioperl for >> Mac OS X using Fink, see Getting BioPerl. >> >> >> SYSTEM REQUIREMENTS >> >> * Perl 5.005 or later; version 5.6 and greater are recommended. >> Note >> that most modules will work with earlier versions of Perl. >> The only ones >> that will not are Bio::SimpleAlign and the Bio::Index::* >> modules. If >> you don't need these modules and you want to install Bioperl >> using an >> earlier version of Perl, edit the "require 5.005;" line in >> Makefile.PL >> as necessary. 
>> >> * External modules: Bioperl uses functionality provided in >> other Perl >> modules. Some of these are included in the standard perl >> package but >> some need to be obtained from the CPAN site. The list of >> external >> modules is included at the bottom of this document. >> >> The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of >> these >> external modules easy. Simply install the bundle using your CPAN >> shell and >> all necessary modules will be installed. See THE BIOPERL BUNDLE, >> below. >> >> >> OPTIONAL >> >> * ANSI C or GNU C compiler (gcc) for XS extensions >> (the >> bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext >> PACKAGE, below). >> >> >> >> ADDITIONAL INSTALLATION INFORMATION >> >> * Additional information on Bioperl and MAC OS: >> + OS 9 - http://bioperl.org/Core/mac-bioperl.html >> + OSX-http://www.tc.umn.edu/~cann0010/ >> Bioperl_OSX_install.html >> + OS X - Installing using Fink (in Getting BioPerl) >> >> >> >> THE BIOPERL BUNDLE >> >> You typically need root privileges to install using CPAN. If you >> don't >> have these privileges please see INSTALLING BIOPERL IN A PERSONAL >> MODULE >> AREA for additional information. >> >> Install Bundle::Bioperl using CPAN. One way: >>> perl -MCPAN -e "install Bundle::BioPerl" >> >> Another way: >>> perl -MCPAN -e shell >> cpan>install Bundle::BioPerl >> >> >> >> On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: >> >>> There isn't a very easy way since so many links have to be removed/ >>> modified. >>> I have found a few CPAN modules that could help, but for now I just >>> dump the >>> text output from a text browser (elinks) using the 'printable >>> version' page >>> and hand-edit, which works very quickly. That works for the time >>> being >>> until I can find another more automated solution. >>> >>> Fortunately there have been very few edits to either INSTALL wiki >>> page so >>> they should remain relatively stable. 
>>> >>> Christopher Fields >>> Postdoctoral Researcher - Switzer Lab >>> Dept. of Biochemistry >>> University of Illinois Urbana-Champaign >>> >>>> -----Original Message----- >>>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk] >>>> Sent: Tuesday, October 17, 2006 6:46 AM >>>> To: Chris Fields >>>> Cc: bioperl-l >>>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN >>>> >>>> Chris Fields wrote: >>>>> The general consensus was to keep text versions available; we >>>>> could >>>>> add URL links to the wiki pages for the most up-to-dat version. >>>>> BTW, >>>>> I have modified INSTALL already. INSTALL.WIN is next in line >>>>> (I was >>>>> waiting for your changes). >>>>> >>>> Is it possible to generate these files from the wiki whenever >>>> there is a >>>> release? I now edits shouldn't be too severe or too often - but >>>> I can >>>> see things getting a little messy/annoying if edits have to be >>>> made in 2 >>>> places. >>>> >>>> Nath >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > From bix at sendu.me.uk Tue Oct 17 18:07:04 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 17 Oct 2006 19:07:04 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca> References: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca> Message-ID: <45351BC8.9080507@sendu.me.uk> Paul Boutros wrote: > Hi, > Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed > tests, the first seems to be just a result of me not having DBD::mysql > installed. [snip] Thanks for those, very useful. Not something that's come up before afaik; I'll look into them. 
From cjfields at uiuc.edu Tue Oct 17 18:31:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 13:31:51 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca>
Message-ID: <001401c6f21a$836f9fc0$15327e82@pyrimidine>

Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
backend parser. For some reason BLAST XML parsing doesn't work with that
parser (it tries to verify the XML first before parsing, hence the DTD
error). I may try getting this to work again, but so far I haven't found
an easy way to prevent XML verification via XML::SAX::Expat.

There are two options:

1) install XML::SAX::ExpatXS (the better option), which works AND is 4x
   faster than XML::SAX::Expat, or
2) set the default parser in the ParserDetails.ini file in your local
   XML::SAX installation to use XML::SAX::PurePerl.

BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
hasn't officially happened yet); the latter hasn't had significant
development in about three years.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: Paul Boutros [mailto:paul.boutros at utoronto.ca]
> Sent: Tuesday, October 17, 2006 1:00 PM
> To: Chris Fields
> Cc: bioperl-l at lists.open-bio.org
> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2
>
> Hi Chris,
>
> Here it is:
> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
> 1..1337
> ok 1
>
> -------------------- WARNING ---------------------
> MSG: XML::SAX::Expat not currently supported; must have local copies
> of NCBI DTD docs!
> --------------------------------------------------- > > -------------------- WARNING --------------------- > MSG: error in parsing a report: > > 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' > does not exist > file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd > Handler couldn't resolve external entity at line 2, column 82, byte 104 > error in processing external entity reference at line 2, column 82, > byte 104 at > /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line > 187 > > --------------------------------------------------- > not ok 2 > # Failed test 2 in t/SearchIO.t at line 68 > Can't call method "database_name" on an undefined value at > t/SearchIO.t line 69. > > > Quoting Chris Fields : > > > What do you get when you run the SearchIO.t test by itself using 'perl - > I. > > t/SearchIO.t'? It looks like something pretty catastrophic happened. > > > > Christopher Fields > > Postdoctoral Researcher - Switzer Lab > > Dept. of Biochemistry > > University of Illinois Urbana-Champaign > > > > > >> -----Original Message----- > >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros > >> Sent: Tuesday, October 17, 2006 11:57 AM > >> To: bioperl-l at lists.open-bio.org > >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 > >> > >> Hi, > >> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed > >> tests, the first seems to be just a result of me not having DBD::mysql > >> installed. > >> Paul > >> > >> Test Summary > >> ============ > >> > >> Failed Test Stat Wstat Total Fail List of Failed > >> ----------------------------------------------------------------------- > --- > >> ----- > >> t/BioDBSeqFeature_mysql.t 46 46 1-46 > >> t/SearchIO.t 22 5632 1337 2671 2-1337 > >> 2 tests and 106 subtests skipped. > >> Failed 2/236 test scripts. 1382/11688 subtests failed. 
> >> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = > >> 159.61 CPU) > >> > >> BioDBSeqFeature_mysql > >> ===================== > >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t > >> 1..46 > >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC > >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t > >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 > >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi > >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at > >> (eval 37) line 3. > >> Perhaps the DBD::mysql perl module hasn't been fully installed, > >> or perhaps the capitalisation of 'mysql' isn't right. > >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. > >> at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 > >> > >> SearchIO > >> ======== > >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more > >> 1..1337 > >> ok 1 > >> > >> -------------------- WARNING --------------------- > >> MSG: XML::SAX::Expat not currently supported; must have local copies > >> of NCBI DTD docs! > >> --------------------------------------------------- > >> > >> -------------------- WARNING --------------------- > >> MSG: error in parsing a report: > >> > >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' > >> does not exist > >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd > >> Handler couldn't resolve external entity at line 2, column 82, byte 104 > >> error in processing external entity reference at line 2, column 82, > >> byte 104 at > >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line > >> 187 > >> > >> --------------------------------------------------- > >> not ok 2 > >> # Failed test 2 in t/SearchIO.t at line 68 > >> Can't call method "database_name" on an undefined value at > >> t/SearchIO.t line 69. 
> >> > >> ------------------------------ > >> > >> Message: 10 > >> Date: Tue, 17 Oct 2006 11:32:54 +0100 > >> From: Sendu Bala > >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2 > >> To: bioperl-l at bioperl.org > >> Message-ID: <4534B156.4090501 at sendu.me.uk> > >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed > >> > >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. > >> See http://www.bioperl.org/wiki/Release_1.5.2 for > >> instructions on getting and testing this RC. > >> > >> Developers: > >> This should be the last RC before release ~next monday. Now would > >> be a good time for last minute documentaiton updates and additions. > >> > >> Users: > >> Even though 1.5.2 is a 'developer' release, we consider it the most > >> stable and capable version of Bioperl, and recommend that you use > >> it in all but the most critical production environments. Please > >> try it out and let us know of any problems or difficulties you run > >> into. > >> > >> > >> Thank you, > >> Sendu. > >> > >> > >> > >> _______________________________________________ > >> Bioperl-l mailing list > >> Bioperl-l at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > From cjfields at uiuc.edu Tue Oct 17 19:05:59 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 14:05:59 -0500 Subject: [Bioperl-l] split location problems In-Reply-To: Message-ID: <001b01c6f21f$48640420$15327e82@pyrimidine> > > From: Jason Stajich [mailto:jason.stajich at gmail.com] > > > > The whole point of split locations is to represent genes with > > introns > > so that is not the "rare" case. > > Absolutely. Right, but that specific kind of join statement is not commonly used in GenBank files, which seems to be the format predominately used (no offense to EBI). This may explain why we haven't seen this pop up more often. 
I believe what we're seeing is a difference in the way these locations
are described at NCBI vs EBI, which Nadeem Faruque seems to corroborate.
He indicated that EBI may move to using similar GenBank-like location
strings. Regardless, FTLocationFactory and Bio::Location::Split should
handle both if they are present, but they only seem to like the GenBank
version.

> > I've added code to test this to bug 2101 including a C.glabrata
> > chromosome downloaded from genbank. Perhaps the problem is on the
> > EMBL parsing side, I didn't test that.
>
> Well, I don't know whether it's EMBL parsing, or a bit further down the
> pipe, but I downloaded C.glabrata chromosome B for GenBank (NC_005968),
> and it describes the complement/joins in the way that Bioperl is
> handling correctly.
>
> GenBank:
> CDS complement(join(10347..10372,10632..11157))
> /locus_tag="CAGL0B00242g"
>
> EMBL:
> FT CDS
> join(complement(10632..11157),complement(10347..10372))
> FT /locus_tag="CAGL0B00242g"

Yes, something that I found out independently (and corroborated by
Nadeem).

> Here's the diff when I run the location-printing script I posted
> yesterday:
>
> diff biogb bio
> 1c1,5
> < complement(join(10347..10372,10632..11157))
> ---
> > complement(1701..2651)
> > complement(2635..3345)
> > complement(3980..4408)
> > complement(join(10632..11157,10347..10372))
> > 10379..10615
> 209a214,217
> > 498198..498890
> > 499712..500062
> > 499851..500702
> > 500579..501364
>
> As you can see, the complement/join CDS is written out in a different
> order, which is Bad.

I think this can be handled directly in to_FTstring(). I'll have to add
a method to get the strand info from the Split object w/o going through
strand(). However, I'm thinking about trying a different tack which is a
bit simpler and, if it proves fruitful, may simplify Split locations
somewhat. It won't be ready for 1.5.2 but maybe the next release.
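
To make the two notations concrete, here is a minimal sketch in plain Perl
that rewrites the EMBL-style string quoted above into the GenBank-style one.
It is not how to_FTstring() or FTLocationFactory work; the helper name
embl_to_genbank_location is hypothetical, and it assumes every member of the
join is a simple complement(start..end), leaving anything else untouched.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: turn
#   join(complement(B),complement(A))
# into the equivalent GenBank-style
#   complement(join(A,B))
sub embl_to_genbank_location {
    my ($loc) = @_;

    # Only a top-level join(...) is a candidate for rewriting.
    return $loc unless $loc =~ /^join\((.+)\)$/;

    my @ranges;
    for my $part (split /,/, $1) {
        # Bail out unless every member is complement(start..end).
        return $loc unless $part =~ /^complement\((\d+\.\.\d+)\)$/;
        push @ranges, $1;
    }

    # Moving complement() outside the join reverses the member order.
    return 'complement(join(' . join(',', reverse @ranges) . '))';
}

print embl_to_genbank_location(
    'join(complement(10632..11157),complement(10347..10372))'), "\n";
```

For the CAGL0B00242g CDS this prints
complement(join(10347..10372,10632..11157)), the form found in the GenBank
file; a string that already starts with complement( is returned as-is.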
> (I looked at at least one of the other differences: the GB file says
> it's a "misc feature" and EMBL says it's a CDS. But they don't seem to
> be relevant here.)
> -Amir

Probably not but something to keep in mind.

-c

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From er at xs4all.nl Tue Oct 17 19:01:48 2006
From: er at xs4all.nl (Erikjan)
Date: Tue, 17 Oct 2006 21:01:48 +0200 (CEST)
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
Message-ID: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>

Hello,

I noticed a little problem with the Annotation "DBLink" from GenBank
entries.

When I run:

perl -MBio::DB::GenBank -e 'my $gi = 56205924;
  $db=Bio::DB::GenBank->new(-format => "genbank");
  my $seqio = $db->get_Stream_by_id($gi);
  my $seq = $seqio->next_seq;
  my $ac=$seq->annotation();
  my @annotations = $ac->get_Annotations("dblink");
  for(@annotations) { print $_, "\n";}
  print $INC{ "Bio/Annotation/DBLink.pm" }, "\n";
'

This yields:

GenBank:AL591065.17.17

and the place where the used Bio/Annotation/DBLink.pm resides.

Can others repeat this?

I have dug into the source a little and Bio::Annotation::DBLink seems to
be the place where this happens: it has a concatenation which leads to
that repeated version number.

Is this something that I should fix "client-side", so to speak, or is it
worthwhile to add some logic to that concatenation to prevent this?

Thanks,

Eric

From bosborne11 at verizon.net Tue Oct 17 17:40:54 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 17 Oct 2006 13:40:54 -0400
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>
Message-ID:

Barry,

I second that. lynx does the best job of converting HTML to text I've
seen.

Brian O.
On 10/17/06 12:57 PM, "Barry Moore" wrote: > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix > > does a reasonable job of textifying html. You get the links as > numbered references at the bottom or: > > lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix | > perl -ane 's/\[?\[\d+\](edit\])?//g;print' > > to remove the links all together. > > Barry > > P.S. Looks like this: > > #Creative Commons copyright > > Installing Bioperl for Unix > > From BioPerl > > Jump to: navigation, search > > Contents > > * 1 BIOPERL INSTALLATION > * 2 SYSTEM REQUIREMENTS > * 3 OPTIONAL > * 4 ADDITIONAL INSTALLATION INFORMATION > * 5 THE BIOPERL BUNDLE > * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN > * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make' > * 8 WHERE ARE THE MAN PAGES? > * 9 EXTERNAL PROGRAMS > + 9.1 Environment Variables > * 10 INSTALLING BIOPERL SCRIPTS > * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA > * 12 INSTALLING BIOPERL MODULES THE HARD WAY > * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION > * 14 THE TEST SYSTEM > * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE > + 15.1 CONFIGURING for BSD and Solaris boxes > + 15.2 INSTALLATION > * 16 DEPENDENCIES AND Bundle::BioPerl > > > BIOPERL INSTALLATION > > Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP, > and on Mac OS X (see the PLATFORMS file for more details). > Following are > instructions for installing Bioperl for Unix/Linux/Mac OS X; > Windows > installation instructions can be found here. For installing > Bioperl for > Mac OS X using Fink, see Getting BioPerl. > > > SYSTEM REQUIREMENTS > > * Perl 5.005 or later; version 5.6 and greater are recommended. > Note > that most modules will work with earlier versions of Perl. > The only ones > that will not are Bio::SimpleAlign and the Bio::Index::* > modules. 
If > you don't need these modules and you want to install Bioperl > using an > earlier version of Perl, edit the "require 5.005;" line in > Makefile.PL > as necessary. > > * External modules: Bioperl uses functionality provided in > other Perl > modules. Some of these are included in the standard perl > package but > some need to be obtained from the CPAN site. The list of > external > modules is included at the bottom of this document. > > The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of > these > external modules easy. Simply install the bundle using your CPAN > shell and > all necessary modules will be installed. See THE BIOPERL BUNDLE, > below. > > > OPTIONAL > > * ANSI C or GNU C compiler (gcc) for XS extensions (the > bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext > PACKAGE, below). > > > > ADDITIONAL INSTALLATION INFORMATION > > * Additional information on Bioperl and MAC OS: > + OS 9 - http://bioperl.org/Core/mac-bioperl.html > + OSX-http://www.tc.umn.edu/~cann0010/ > Bioperl_OSX_install.html > + OS X - Installing using Fink (in Getting BioPerl) > > > > THE BIOPERL BUNDLE > > You typically need root privileges to install using CPAN. If you > don't > have these privileges please see INSTALLING BIOPERL IN A PERSONAL > MODULE > AREA for additional information. > > Install Bundle::Bioperl using CPAN. One way: >> perl -MCPAN -e "install Bundle::BioPerl" > > Another way: >> perl -MCPAN -e shell > cpan>install Bundle::BioPerl > > > > On Oct 17, 2006, at 8:04 AM, Chris Fields wrote: > >> There isn't a very easy way since so many links have to be removed/ >> modified. >> I have found a few CPAN modules that could help, but for now I just >> dump the >> text output from a text browser (elinks) using the 'printable >> version' page >> and hand-edit, which works very quickly. That works for the time >> being >> until I can find another more automated solution. 
>>
>> Fortunately there have been very few edits to either INSTALL wiki page so
>> they should remain relatively stable.
>>
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>> -----Original Message-----
>>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
>>> Sent: Tuesday, October 17, 2006 6:46 AM
>>> To: Chris Fields
>>> Cc: bioperl-l
>>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>>>
>>> Chris Fields wrote:
>>>> The general consensus was to keep text versions available; we could
>>>> add URL links to the wiki pages for the most up-to-date version. BTW,
>>>> I have modified INSTALL already. INSTALL.WIN is next in line (I was
>>>> waiting for your changes).
>>>>
>>> Is it possible to generate these files from the wiki whenever there is a
>>> release? I know edits shouldn't be too severe or too often - but I can
>>> see things getting a little messy/annoying if edits have to be made in 2
>>> places.
>>>
>>> Nath
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From cjfields at uiuc.edu Tue Oct 17 20:30:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 15:30:15 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
Message-ID: <0FB91820-B2A1-4F7F-866C-8D4791DD8306@uiuc.edu>

I can confirm this using bioperl-live:

GenBank:AL591065.17.17
/Users/cjfields/src/bioperl-live/Bio/Annotation/DBLink.pm

Could you file a bug report via bugzilla?
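
On the "client-side" option raised in the original report, a minimal sketch of a stopgap, assuming the duplication always shows up as the final ".N" repeated verbatim at the end of the printed link (as in GenBank:AL591065.17.17). The helper name strip_repeated_version is made up for illustration; it is plain string handling, not a Bioperl method, and it does nothing about the concatenation inside Bio::Annotation::DBLink itself.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): collapse a version suffix that
# appears twice in a row at the very end of a stringified DBLink.
# Assumption: the duplicate is an exact repeat of the final ".<digits>".
sub strip_repeated_version {
    my ($link) = @_;
    $link =~ s/(\.\d+)\1\z/$1/;   # ".17.17" at end of string becomes ".17"
    return $link;
}

print strip_repeated_version("GenBank:AL591065.17.17"), "\n";
```

An identifier that legitimately ends in a repeated ".N.N" would be shortened too, so this is probably only reasonable as a temporary measure until the bug report is resolved.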
Chris On Oct 17, 2006, at 2:01 PM, Erikjan wrote: > Hello, > > I noticed a little problem with the Annotation "DBLink" from > GenBank entries > > When I run: > > perl -MBio::DB::GenBank -e 'my $gi = > 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio = > $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > ("dblink"); > for(@annotations) { print $_, "\n";} print $INC{ > "Bio/Annotation/DBLink.pm" }, "\n"; ' > > This yields: > > GenBank:AL591065.17.17 > > and the place where the used Bio/Annotation/DBLink.pm resides. > > Can others repeat this? > > I have dug into the source a little and Bio::Annotation::DBLink > seems to > be the place where this happens: it has a concatenation which leads to > that repeated version number. > > It this something that I should fix "client-side", so to speak, or > is it > worthwhile to add some logic to that concatenation to prevent this? > > > Thanks, > > Eric > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From paul.boutros at utoronto.ca Tue Oct 17 23:49:52 2006 From: paul.boutros at utoronto.ca (Paul Boutros) Date: Tue, 17 Oct 2006 19:49:52 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <001401c6f21a$836f9fc0$15327e82@pyrimidine> References: <001401c6f21a$836f9fc0$15327e82@pyrimidine> Message-ID: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> Hi Chris, Yup, that's it. I installed XML::SAX::ExpatXS (make test output below). Should there be a note somewhere in the INSTALL docs saying basically what you just wrote? Or maybe it's already there somewhere and I missed it. 
Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
if DBD::mysql can be loaded, and if not doesn't run the test. Since
the file is only one line long, here's the modified file rather than a
patch:
################################################################
BEGIN {
    # DBD::mysql is required
    eval {
        require DBD::mysql;
    };
    if ( $@ ) {
        # load Test::More at run time: a 'use' here would execute at
        # compile time whether or not the eval failed
        require Test::More;
        Test::More::plan( skip_all => "DBD::mysql is not installed or is installed incorrectly - skipping BioDBSeqFeature_mysql.t" );
        exit(0);
    }
}

system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 -dsn test";
################################################################

And when I run it I get:
t/BioDBSeqFeature_mysql......skipped
        all skipped: DBD::mysql is not installed or is installed
incorrectly - skipping BioDBSeqFeature_mysql.t

And for the overall make test:
All tests successful, 3 tests and 106 subtests skipped.
Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys =
164.24 CPU)

Hope this helps,
Paul


Quoting Chris Fields :

> Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
> backend parser. For some reason BLAST XML parsing doesn't work with that
> parser (it tries to verify the XML first before parsing, hence the DTD
> error). I may try getting this to work again, but so far I haven't found an
> easy way to prevent XML verification via XML::SAX::Expat.
>
> There are two options: 1) install XML::SAX::ExpatXS (the better option),
> which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
> parser in the ParserDetails.ini file in your local installation to use
> XML::SAX::PurePerl.
>
> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
> hasn't officially happened yet); the latter hasn't had significant
> development in about three years.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept.
of Biochemistry > University of Illinois Urbana-Champaign > > >> -----Original Message----- >> From: Paul Boutros [mailto:paul.boutros at utoronto.ca] >> Sent: Tuesday, October 17, 2006 1:00 PM >> To: Chris Fields >> Cc: bioperl-l at lists.open-bio.org >> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2 >> >> Hi Chris, >> >> Here it is: >> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t >> 1..1337 >> ok 1 >> >> -------------------- WARNING --------------------- >> MSG: XML::SAX::Expat not currently supported; must have local copies >> of NCBI DTD docs! >> --------------------------------------------------- >> >> -------------------- WARNING --------------------- >> MSG: error in parsing a report: >> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' >> does not exist >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >> Handler couldn't resolve external entity at line 2, column 82, byte 104 >> error in processing external entity reference at line 2, column 82, >> byte 104 at >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line >> 187 >> >> --------------------------------------------------- >> not ok 2 >> # Failed test 2 in t/SearchIO.t at line 68 >> Can't call method "database_name" on an undefined value at >> t/SearchIO.t line 69. >> >> >> Quoting Chris Fields : >> >> > What do you get when you run the SearchIO.t test by itself using 'perl - >> I. >> > t/SearchIO.t'? It looks like something pretty catastrophic happened. >> > >> > Christopher Fields >> > Postdoctoral Researcher - Switzer Lab >> > Dept. 
of Biochemistry >> > University of Illinois Urbana-Champaign >> > >> > >> >> -----Original Message----- >> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >> >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros >> >> Sent: Tuesday, October 17, 2006 11:57 AM >> >> To: bioperl-l at lists.open-bio.org >> >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 >> >> >> >> Hi, >> >> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two failed >> >> tests, the first seems to be just a result of me not having DBD::mysql >> >> installed. >> >> Paul >> >> >> >> Test Summary >> >> ============ >> >> >> >> Failed Test Stat Wstat Total Fail List of Failed >> >> ----------------------------------------------------------------------- >> --- >> >> ----- >> >> t/BioDBSeqFeature_mysql.t 46 46 1-46 >> >> t/SearchIO.t 22 5632 1337 2671 2-1337 >> >> 2 tests and 106 subtests skipped. >> >> Failed 2/236 test scripts. 1382/11688 subtests failed. >> >> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys = >> >> 159.61 CPU) >> >> >> >> BioDBSeqFeature_mysql >> >> ===================== >> >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t >> >> 1..46 >> >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC >> >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t >> >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 >> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi >> >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at >> >> (eval 37) line 3. >> >> Perhaps the DBD::mysql perl module hasn't been fully installed, >> >> or perhaps the capitalisation of 'mysql' isn't right. >> >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. 
>> >> at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 >> >> >> >> SearchIO >> >> ======== >> >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more >> >> 1..1337 >> >> ok 1 >> >> >> >> -------------------- WARNING --------------------- >> >> MSG: XML::SAX::Expat not currently supported; must have local copies >> >> of NCBI DTD docs! >> >> --------------------------------------------------- >> >> >> >> -------------------- WARNING --------------------- >> >> MSG: error in parsing a report: >> >> >> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' >> >> does not exist >> >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >> >> Handler couldn't resolve external entity at line 2, column 82, byte 104 >> >> error in processing external entity reference at line 2, column 82, >> >> byte 104 at >> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line >> >> 187 >> >> >> >> --------------------------------------------------- >> >> not ok 2 >> >> # Failed test 2 in t/SearchIO.t at line 68 >> >> Can't call method "database_name" on an undefined value at >> >> t/SearchIO.t line 69. >> >> >> >> ------------------------------ >> >> >> >> Message: 10 >> >> Date: Tue, 17 Oct 2006 11:32:54 +0100 >> >> From: Sendu Bala >> >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2 >> >> To: bioperl-l at bioperl.org >> >> Message-ID: <4534B156.4090501 at sendu.me.uk> >> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >> >> >> >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. >> >> See http://www.bioperl.org/wiki/Release_1.5.2 for >> >> instructions on getting and testing this RC. >> >> >> >> Developers: >> >> This should be the last RC before release ~next monday. Now would >> >> be a good time for last minute documentaiton updates and additions. 
>> >> >> >> Users: >> >> Even though 1.5.2 is a 'developer' release, we consider it the most >> >> stable and capable version of Bioperl, and recommend that you use >> >> it in all but the most critical production environments. Please >> >> try it out and let us know of any problems or difficulties you run >> >> into. >> >> >> >> >> >> Thank you, >> >> Sendu. >> >> >> >> >> >> >> >> _______________________________________________ >> >> Bioperl-l mailing list >> >> Bioperl-l at lists.open-bio.org >> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > >> > >> > > > From cjfields at uiuc.edu Wed Oct 18 00:51:35 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 17 Oct 2006 19:51:35 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> References: <001401c6f21a$836f9fc0$15327e82@pyrimidine> <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> Message-ID: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu> On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote: > Hi Chris, > > Yup, that's it. I installed XML::SAX::ExpatXS (make test output > below). Should there be a note somewhere in the INSTALL docs saying > basically what you just wrote? Or maybe it's already there somewhere > and I missed it. The INSTALL docs should have this, yes. I'll double-check though. Pretty much anything that plugs into XML::SAX except XML::SAX::Expat works (XML::LibXML also works, I found). > Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks > if DBD::mysql can be loaded, and if not doesn't run the test. 
> Since the file is only one line long, here's the modified file rather
> than a patch:
> ################################################################
> BEGIN {
>     # DBD::mysql is required
>     eval {
>         require DBD::mysql;
>     };
>     if ( $@ ) {
>         # load Test::More at run time: a 'use' here would execute at
>         # compile time whether or not the eval failed
>         require Test::More;
>         Test::More::plan( skip_all => "DBD::mysql is not installed or is installed incorrectly - skipping BioDBSeqFeature_mysql.t" );
>         exit(0);
>     }
> }
>
> system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 -dsn test";
> ################################################################
>
> And when I run it I get:
> t/BioDBSeqFeature_mysql......skipped
>         all skipped: DBD::mysql is not installed or is installed
> incorrectly - skipping BioDBSeqFeature_mysql.t
>
> And for the overall make test:
> All tests successful, 3 tests and 106 subtests skipped.
> Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys =
> 164.24 CPU)

It should check this when using 'perl Makefile.PL', since the tests
are only set up if MySQL is present (so you would assume that it
checks for DBD::mysql). I'll look into it.

Chris

> Hope this helps,
> Paul
>
>
> Quoting Chris Fields :
>
>> Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
>> backend parser. For some reason BLAST XML parsing doesn't work with that
>> parser (it tries to verify the XML first before parsing, hence the DTD
>> error). I may try getting this to work again, but so far I haven't found an
>> easy way to prevent XML verification via XML::SAX::Expat.
>>
>> There are two options: 1) install XML::SAX::ExpatXS (the better option),
>> which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
>> parser in the ParserDetails.ini file in your local installation to use
>> XML::SAX::PurePerl.
>>
>> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
>> hasn't officially happened yet); the latter hasn't had significant
>> development in about three years.
>> >> Christopher Fields >> Postdoctoral Researcher - Switzer Lab >> Dept. of Biochemistry >> University of Illinois Urbana-Champaign >> >> >>> -----Original Message----- >>> From: Paul Boutros [mailto:paul.boutros at utoronto.ca] >>> Sent: Tuesday, October 17, 2006 1:00 PM >>> To: Chris Fields >>> Cc: bioperl-l at lists.open-bio.org >>> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2 >>> >>> Hi Chris, >>> >>> Here it is: >>> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t >>> 1..1337 >>> ok 1 >>> >>> -------------------- WARNING --------------------- >>> MSG: XML::SAX::Expat not currently supported; must have local copies >>> of NCBI DTD docs! >>> --------------------------------------------------- >>> >>> -------------------- WARNING --------------------- >>> MSG: error in parsing a report: >>> >>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' >>> does not exist >>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >>> Handler couldn't resolve external entity at line 2, column 82, >>> byte 104 >>> error in processing external entity reference at line 2, column 82, >>> byte 104 at >>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm >>> line >>> 187 >>> >>> --------------------------------------------------- >>> not ok 2 >>> # Failed test 2 in t/SearchIO.t at line 68 >>> Can't call method "database_name" on an undefined value at >>> t/SearchIO.t line 69. >>> >>> >>> Quoting Chris Fields : >>> >>>> What do you get when you run the SearchIO.t test by itself using >>>> 'perl - >>> I. >>>> t/SearchIO.t'? It looks like something pretty catastrophic >>>> happened. >>>> >>>> Christopher Fields >>>> Postdoctoral Researcher - Switzer Lab >>>> Dept. 
of Biochemistry >>>> University of Illinois Urbana-Champaign >>>> >>>> >>>>> -----Original Message----- >>>>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- >>>>> bounces at lists.open-bio.org] On Behalf Of Paul Boutros >>>>> Sent: Tuesday, October 17, 2006 11:57 AM >>>>> To: bioperl-l at lists.open-bio.org >>>>> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2 >>>>> >>>>> Hi, >>>>> Here's a quick make test on AIX 5.2 with Perl 5.8.8. I get two >>>>> failed >>>>> tests, the first seems to be just a result of me not having >>>>> DBD::mysql >>>>> installed. >>>>> Paul >>>>> >>>>> Test Summary >>>>> ============ >>>>> >>>>> Failed Test Stat Wstat Total Fail List of Failed >>>>> ------------------------------------------------------------------ >>>>> ----- >>> --- >>>>> ----- >>>>> t/BioDBSeqFeature_mysql.t 46 46 1-46 >>>>> t/SearchIO.t 22 5632 1337 2671 2-1337 >>>>> 2 tests and 106 subtests skipped. >>>>> Failed 2/236 test scripts. 1382/11688 subtests failed. >>>>> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 >>>>> csys = >>>>> 159.61 CPU) >>>>> >>>>> BioDBSeqFeature_mysql >>>>> ===================== >>>>> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t >>>>> 1..46 >>>>> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC >>>>> (@INC >>>>> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t >>>>> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8 >>>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi >>>>> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/ >>>>> site_perl) at >>>>> (eval 37) line 3. >>>>> Perhaps the DBD::mysql perl module hasn't been fully installed, >>>>> or perhaps the capitalisation of 'mysql' isn't right. >>>>> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge. 
>>>>> at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208 >>>>> >>>>> SearchIO >>>>> ======== >>>>> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more >>>>> 1..1337 >>>>> ok 1 >>>>> >>>>> -------------------- WARNING --------------------- >>>>> MSG: XML::SAX::Expat not currently supported; must have local >>>>> copies >>>>> of NCBI DTD docs! >>>>> --------------------------------------------------- >>>>> >>>>> -------------------- WARNING --------------------- >>>>> MSG: error in parsing a report: >>>>> >>>>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/ >>>>> NCBI_BlastOutput.dtd' >>>>> does not exist >>>>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd >>>>> Handler couldn't resolve external entity at line 2, column 82, >>>>> byte 104 >>>>> error in processing external entity reference at line 2, column >>>>> 82, >>>>> byte 104 at >>>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/ >>>>> Parser.pm line >>>>> 187 >>>>> >>>>> --------------------------------------------------- >>>>> not ok 2 >>>>> # Failed test 2 in t/SearchIO.t at line 68 >>>>> Can't call method "database_name" on an undefined value at >>>>> t/SearchIO.t line 69. >>>>> >>>>> ------------------------------ >>>>> >>>>> Message: 10 >>>>> Date: Tue, 17 Oct 2006 11:32:54 +0100 >>>>> From: Sendu Bala >>>>> Subject: [Bioperl-l] Bioperl 1.5.2 RC2 >>>>> To: bioperl-l at bioperl.org >>>>> Message-ID: <4534B156.4090501 at sendu.me.uk> >>>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed >>>>> >>>>> Bioperl 1.5.2 Release Candidate 2 is ready and available for >>>>> testing. >>>>> See http://www.bioperl.org/wiki/Release_1.5.2 for >>>>> instructions on getting and testing this RC. >>>>> >>>>> Developers: >>>>> This should be the last RC before release ~next monday. Now >>>>> would >>>>> be a good time for last minute documentaiton updates and >>>>> additions. 
>>>>> >>>>> Users: >>>>> Even though 1.5.2 is a 'developer' release, we consider it >>>>> the most >>>>> stable and capable version of Bioperl, and recommend that >>>>> you use >>>>> it in all but the most critical production environments. >>>>> Please >>>>> try it out and let us know of any problems or difficulties >>>>> you run >>>>> into. >>>>> >>>>> >>>>> Thank you, >>>>> Sendu. >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> >>> >> >> >> > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Wed Oct 18 06:52:05 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 07:52:05 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4534B156.4090501@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> Message-ID: <4535CF15.4090502@sendu.me.uk> Sendu Bala wrote: > Bioperl 1.5.2 Release Candidate 2 is ready and available for testing. > See http://www.bioperl.org/wiki/Release_1.5.2 for > instructions on getting and testing this RC. > > Developers: > This should be the last RC before release ~next monday. Now would > be a good time for last minute documentaiton updates and additions. Given the few issues that have come up, it would be prudent to have another RC, so expect one around the time the 'Needs investigation' issues on the release page have been solved. If you think there are more things that need investigation, please add them, but note the bias toward things that affect the successful completion of the test suite as opposed to general bugs which should go to Bugzilla as normal. 
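
For the missing-module skips that keep coming up in this thread (DBD::mysql being the current example), a minimal sketch of a run-time check that a test wrapper could call before deciding to skip. The helper names module_available and skip_reason are made up for illustration and are not part of the Bioperl test suite. One point the sketch tries to respect: 'use' runs at compile time even when it is written inside an 'if' block, so anything loaded conditionally has to go through 'require'.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if the named module can be loaded at run
# time, 0 otherwise (missing or broken installs both count as unavailable).
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Hypothetical helper: return a skip message naming the modules that could
# not be loaded, or undef when every module is available.
sub skip_reason {
    my (@modules) = @_;
    my @missing = grep { !module_available($_) } @modules;
    return @missing ? "skipping: @missing not installed" : undef;
}

# In a .t wrapper this could be followed by something like:
#   if ( my $why = skip_reason('DBD::mysql') ) {
#       require Test::More;                     # run-time load, not 'use'
#       Test::More::plan( skip_all => $why );   # reports the skip and exits
#   }
```

Because the reason string names the missing modules, the harness output would also show why a script was skipped, which may help with spotting tests that are silently passing over absent modules.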
From bix at sendu.me.uk Wed Oct 18 08:55:21 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 09:55:21 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <45350BA6.3040102@genomics.dk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> Message-ID: <4535EBF9.1090706@sendu.me.uk> Niels Larsen wrote: > ------------ EBI > > I invoked the EBI script > > http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip > > like this > > WSWUBlastClient.pl -p blastn -D embl test.fasta > > where the content of test.fasta is below, and got > > Can't find method element in the message at > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. As you admit, this is not a Bioperl issue. I would suggest you contact EBI support. In the mean time/alternatively I'd suggest investigating the Bioperl interface to the SOAP server, which is part of the Bioperl-run package. http://doc.bioperl.org/releases/bioperl-current/bioperl-run/Bio/Tools/Run/Analysis.html > ------------ DDBJ > > Inspired by this page, > > http://xml.nig.ac.jp/doc/Blast.txt > > I made this test script [snip] > which for me prints undef. Again, not something I can really help you with. You'll need to triple-check your code and then seek support from the providers of that SOAP service. > ------------- NCBI/Bioperl > > I installed 1.5.2-RC2, looked at the RemoteBlast example in > > http://www.bioperl.org/wiki/Bptutorial.pl > > and then put that into this test code, more or less cut/paste, [snip] > Maybe I am supposed to add a check for content in $rc and then stop > the inner loop? Yes, the wiki page example isn't really adequate. I'll update it. 
For a better code example see the RemoteBlast documentation:
http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html

> I could figure that out maybe, but I wish there was a
> function which simply takes a single sequence + arguments and only
> returns a list of matches when done, and does not return until then
> (or until a specified timeout).

Yes, I don't find dealing with RIDs very pleasant either. You might like to
add a feature request to Bugzilla.

From n.haigh at sheffield.ac.uk Wed Oct 18 09:58:00 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 10:58:00 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
Message-ID: <4535FAA8.2050506@sheffield.ac.uk>

I get all tests passing except for BioDBSeqFeature_mysql which fails all
tests (1-46).

During perl Makefile.PL I get:
"I see you have Berkeleydb installed. I will create the DBD tests for
Bio::DB::SeqFeature::Store..."

I notice under the "needs investigation" heading there is mention of tests
being generated even if DBD::mysql isn't installed. I assume this is the
problem? If this is the problem should DBD::mysql be added to the
dependencies in Makefile.PL?

Is there an easy way to find out what tests are being skipped due to
absent modules?

Cheers
Nath

From n.haigh at sheffield.ac.uk Wed Oct 18 11:34:21 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 12:34:21 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4535EBF9.1090706@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk>
Message-ID: <4536113D.1080307@sheffield.ac.uk>

I've just added test results for 1.5.2 RC2 to the wiki. There are lots
of failures for packages other than bioperl-live. I'm not sure exactly how
the test fails/skips are/should be handled since my setups are as follows.
Clean WinXP Pro: This is a clean install of WinXP Pro SP2 with no major software installed, other than ActivePerl 5.8.8.819 and a few tools for archive extracting, anti virus etc. Therefore, I'm unsure how tests in bioperl-network and bioperl-db should return. For example, I have made no effort to setup biosql-schema but I thought that maybe there would be a test that would detect this, and fail, then skip over other tests gracefully - like the bioperl-run tests when a piece of software is not installed??? Debian Linux: This is a Bio-Linux machine with quite a lot of bioinformatics software installed in the Path. So most of the tests in bioperl-run should probably have passed. The same goes for bioperl-network and bioperl-db as with my Windows setup. If my thoughts are totally wrong - let me know! Nath From bix at sendu.me.uk Wed Oct 18 12:03:11 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 13:03:11 +0100 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <4535FAA8.2050506@sheffield.ac.uk> References: <4535FAA8.2050506@sheffield.ac.uk> Message-ID: <453617FF.9080508@sendu.me.uk> Nathan Haigh wrote: > I get all tests passing except for BioDBSeqFeature_mysql which fails all > tests (1-46). > > During perl Makefile.PL I get: > "I see you have Berkeleydb installed. I will create the DBD tests for > Bio::DB::SeqFeature::Store..." > > I notice under the "needs investigation" there is mention about tests > been generated even if DBD::mysql isn't installed. I assume this is the > problem? Probably. I'm looking into it. Not sure why it wasn't causing a problem before now. > If this is the problem should DBD::mysql be added to the > dependencies in Makefile.PL? No. You can use the modules in question without mysql (presumably; ie. you have a different sql setup), so it makes no sense to warn people they don't have a module they absolutely do not need. > Is there an easy way to find out what tests are being skipped due to > absent modules? 
Ideally, when the skip occurs the test script will issue a message. I think that happens in most, if not all cases. From bix at sendu.me.uk Wed Oct 18 13:02:50 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 14:02:50 +0100 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <453617FF.9080508@sendu.me.uk> References: <4535FAA8.2050506@sheffield.ac.uk> <453617FF.9080508@sendu.me.uk> Message-ID: <453625FA.6090907@sendu.me.uk> Sendu Bala wrote: > Nathan Haigh wrote: ? >> I notice under the "needs investigation" there is mention about tests >> been generated even if DBD::mysql isn't installed. I assume this is the >> problem? > > Probably. I'm looking into it. Not sure why it wasn't causing a problem > before now. > > > If this is the problem should DBD::mysql be added to the > > dependencies in Makefile.PL? > > No. You can use the modules in question without mysql (presumably; ie. > you have a different sql setup), so it makes no sense to warn people > they don't have a module they absolutely do not need. Oops. It /is/ in the pre-reqs in Makefile.PL. Maybe DBD::mysql is the only supported driver? From bix at sendu.me.uk Wed Oct 18 13:16:24 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 18 Oct 2006 14:16:24 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu> References: <001401c6f21a$836f9fc0$15327e82@pyrimidine> <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca> <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu> Message-ID: <45362928.8070104@sendu.me.uk> Chris Fields wrote: > On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote: > >> Hi Chris, >> >> Yup, that's it. I installed XML::SAX::ExpatXS (make test output >> below). Should there be a note somewhere in the INSTALL docs saying >> basically what you just wrote? Or maybe it's already there somewhere >> and I missed it. > > The INSTALL docs should have this, yes. I'll double-check though. 
>
> Pretty much anything that plugs into XML::SAX except XML::SAX::Expat
> works (XML::LibXML also works, I found).
>
>> Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
>> if DBD::mysql can be loaded,
[snip]
> It should check this when using 'perl Makefile.PL', since the tests
> are only set up if MySQL is present (so you would assume that it
> checks for DBD::mysql). I'll look into it.

This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in
my t directory when I packed it up for release.

I'm tweaking Makefile.PL right now in any case; there are a few errors
and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean.

From cjfields at uiuc.edu Wed Oct 18 13:55:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 08:55:37 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ
Message-ID: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>

Ding dong the witch is dead!

As announced previously, from the latest GenBank release (156.0):

-----------------------------------------------

1.3.8 Feature location syntax X.Y no longer supported

The Feature Table has supported feature locations of the form 'X.Y',
to represent a base position which is greater or equal to X, and less than
or equal to Y. For example:

     misc_feature    1.10..20
     misc_feature    join(100..150,200.210..250)

In the first example, the misc_feature starts somewhere between bases 1
and 10 (inclusive), and ends at basepair 20. In the second, the 51 bases
from 100..150 are joined together with a second basepair interval, which
could be anywhere from 200..250 to 210..250 .

Although this syntax seems like a reasonable way to capture an uncertain
interval, it is used for features on a vanishingly small number of sequence
records, most database submission mechanisms don't support it, and the
meaning of its use in a join() context is not entirely clear.

As of October 2006, this type of location is no longer supported.
Those records with features which utilize X.Y locations will be reviewed and converted to a non-uncertain format. ----------------------------------------------- EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. Not sure about UniProt/SwissProt. I guess we're keeping this in for backwards compatibility only, but how do we handle any bugs that pop up related to this? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Oct 18 14:10:07 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 09:10:07 -0500 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <453617FF.9080508@sendu.me.uk> Message-ID: <001f01c6f2bf$20737270$15327e82@pyrimidine> > Nathan Haigh wrote: > > I get all tests passing except for BioDBSeqFeature_mysql which fails all > > tests (1-46). > > > > During perl Makefile.PL I get: > > "I see you have Berkeleydb installed. I will create the DBD tests for > > Bio::DB::SeqFeature::Store..." > > > > I notice under the "needs investigation" there is mention about tests > > been generated even if DBD::mysql isn't installed. I assume this is the > > problem? > > Probably. I'm looking into it. Not sure why it wasn't causing a problem > before now. Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP because 'perl Makefile.PL' doesn't detect my MySQL installation, so the MySQL-based tests don't run even though I have DBD::mysql installed. I thought this might just be a WinXP issue, but apparently not. If I can get to it I'll run a few checks. > > If this is the problem should DBD::mysql be added to the > > dependencies in Makefile.PL? > > No. You can use the modules in question without mysql (presumably; ie. > you have a different sql setup), so it makes no sense to warn people > they don't have a module they absolutely do not need. 
Agreed, though I don't know if other relational DB's are supported like PostgreSQL. > > Is there an easy way to find out what tests are being skipped due to > > absent modules? > > Ideally, when the skip occurs the test script will issue a message. I > think that happens in most, if not all cases. Yes, though we may run into the same issue we had with XEMBL tests not reporting the reasons it skipped. Each test suite should run an eval{} to check the required modules, then only skip blocks of tests that rely on those modules. I think we have caught most of those, but who knows w/o doing a complete test suite audit? Our eventual complete switchover to Test::More should hopefully clean these up. I don't consider it a pressing issue for this release, though Sendu may feel differently. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Wed Oct 18 14:12:52 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 09:12:52 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <45362928.8070104@sendu.me.uk> Message-ID: <002001c6f2bf$807849c0$15327e82@pyrimidine> ... > This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in > my t directory when I packed it up for release. > > I'm tweaking Makefile.PL right now in any case; there are a few errors > and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean. Okay, makes sense now. No big deal, it's still an RC (a developer's RC at that!). Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Oct 18 14:17:35 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 18 Oct 2006 15:17:35 +0100 Subject: [Bioperl-l] RC2 test results on WinXP In-Reply-To: <001f01c6f2bf$20737270$15327e82@pyrimidine> References: <001f01c6f2bf$20737270$15327e82@pyrimidine> Message-ID: <4536377F.6000408@sheffield.ac.uk> Chris Fields wrote: >> Nathan Haigh wrote: >> >>> I get all tests passing except for BioDBSeqFeature_mysql which fails all >>> tests (1-46). >>> >>> During perl Makefile.PL I get: >>> "I see you have Berkeleydb installed. I will create the DBD tests for >>> Bio::DB::SeqFeature::Store..." >>> >>> I notice under the "needs investigation" there is mention about tests >>> been generated even if DBD::mysql isn't installed. I assume this is the >>> problem? >>> >> Probably. I'm looking into it. Not sure why it wasn't causing a problem >> before now. >> > > Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP > because 'perl Makefile.PL' doesn't detect my MySQL installation, so the > MySQL-based tests don't run even though I have DBD::mysql installed. I > thought this might just be a WinXP issue, but apparently not. If I can get > to it I'll run a few checks. > > This was on WinXP. >> > If this is the problem should DBD::mysql be added to the >> > dependencies in Makefile.PL? >> >> No. You can use the modules in question without mysql (presumably; ie. >> you have a different sql setup), so it makes no sense to warn people >> they don't have a module they absolutely do not need. >> > > Agreed, though I don't know if other relational DB's are supported like > PostgreSQL. > > >>> Is there an easy way to find out what tests are being skipped due to >>> absent modules? >>> >> Ideally, when the skip occurs the test script will issue a message. I >> think that happens in most, if not all cases. 
>> > > Yes, though we may run into the same issue we had with XEMBL tests not > reporting the reasons it skipped. Each test suite should run an eval{} to > check the required modules, then only skip blocks of tests that rely on > those modules. I think we have caught most of those, but who knows w/o > doing a complete test suite audit? > > Our eventual complete switchover to Test::More should hopefully clean these > up. I don't consider it a pressing issue for this release, though Sendu may > feel differently. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > From hlapp at gmx.net Wed Oct 18 14:36:31 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 18 Oct 2006 10:36:31 -0400 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine> References: <001e01c6f2bd$1769aca0$15327e82@pyrimidine> Message-ID: On Oct 18, 2006, at 9:55 AM, Chris Fields wrote: > how do we handle any bugs that pop up related to this? By an evil grin, followed by deflecting the blame to NCBI, followed by another evil grin. -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Wed Oct 18 14:43:31 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 09:43:31 -0500 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: Message-ID: <002401c6f2c3$c83c7e30$15327e82@pyrimidine> > On Oct 18, 2006, at 9:55 AM, Chris Fields wrote: > > > how do we handle any bugs that pop up related to this? > > By an evil grin, followed by deflecting the blame to NCBI, followed > by another evil grin. 
> -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== Sounds good to me! One less thing to worry about. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Wed Oct 18 14:45:57 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Wed, 18 Oct 2006 15:45:57 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4536113D.1080307@sheffield.ac.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> Message-ID: <45363E25.8010806@sheffield.ac.uk> Nathan Haigh wrote: > I've just added test results for 1.5.2 RC2 to the wiki. > > There are lots of fails for packages other than bioperl-live. I'm not > sure excatly how the test fails/skipps are/should be handled since my > setups are as follows. > > Clean WinXP Pro: > This is a clean install of WinXP Pro SP2 with no major software > installed, other than ActivePerl 5.8.8.819 and a few tools for archive > extracting, anti virus etc. Therefore, I'm unsure how tests in > bioperl-network and bioperl-db should return. For example, I have made > no effort to setup biosql-schema but I thought that maybe there would be > a test that would detect this, and fail, then skip over other tests > gracefully - like the bioperl-run tests when a piece of software is not > installed??? > > Debian Linux: > This is a Bio-Linux machine with quite a lot of bioinformatics software > installed in the Path. So most of the tests in bioperl-run should > probably have passed. The same goes for bioperl-network and bioperl-db > as with my Windows setup. > > If my thoughts are totally wrong - let me know! 
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

Just looking into the failed Linux tests. Several of the tests result in
errors like:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Alignment::Exonerate::AUTOLOAD /home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:126
STACK: Bio::Tools::Run::Alignment::Exonerate::new /home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:154
STACK: t/Exonerate.t:32
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: 'arguments' !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Hmmer::AUTOLOAD Bio/Tools/Run/Hmmer.pm:172
STACK: Bio::Tools::Run::Hmmer::_run Bio/Tools/Run/Hmmer.pm:253
STACK: Bio::Tools::Run::Hmmer::run Bio/Tools/Run/Hmmer.pm:228
STACK: t/Hmmer.t:54
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Phrap::AUTOLOAD Bio/Tools/Run/Phrap.pm:137
STACK: Bio::Tools::Run::Phrap::new Bio/Tools/Run/Phrap.pm:165
STACK: t/Phrap.t:34
-----------------------------------------------------------

Any ideas??
Nath From hlapp at gmx.net Wed Oct 18 14:51:36 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Wed, 18 Oct 2006 10:51:36 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4536113D.1080307@sheffield.ac.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> Message-ID: On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote: > For example, I have made > no effort to setup biosql-schema but I thought that maybe there > would be > a test that would detect this I'm afraid there isn't. Bioperl-db is meaningless without biosql-schema. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bosborne11 at verizon.net Wed Oct 18 14:43:06 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 18 Oct 2006 10:43:06 -0400 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine> Message-ID: Chris, I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all of the more recent examples in t/LocationFactory.t come from there. Brian O. On 10/18/06 9:55 AM, "Chris Fields" wrote: > EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. > Not sure about UniProt/SwissProt. From cjfields at uiuc.edu Wed Oct 18 15:00:30 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 10:00:30 -0500 Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ In-Reply-To: Message-ID: <002501c6f2c6$27625540$15327e82@pyrimidine> Do they still use the X.Y notations? Those are the most troublesome. I guess we still don't support the ones containing '?'. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: Brian Osborne [mailto:bosborne11 at verizon.net] > Sent: Wednesday, October 18, 2006 9:43 AM > To: Chris Fields; bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in > GenBank/EMBL/DDBJ > > Chris, > > I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all > of the more recent examples in t/LocationFactory.t come from there. > > Brian O. > > > On 10/18/06 9:55 AM, "Chris Fields" wrote: > > > EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. > > Not sure about UniProt/SwissProt. From Kevin.M.Brown at asu.edu Wed Oct 18 15:16:50 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Wed, 18 Oct 2006 08:16:50 -0700 Subject: [Bioperl-l] Blast information Message-ID: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu> I just recently upgraded to 1.5.1 on WinXP to bring this version closer to live to parse some locally created blast files. I'm trying to find the method that returns the values that are underneath the Identities and Positives information as I'm trying to replicate the output of an old blast parser we have here written in RealBasic which is showing its age. Once I have it replicating the old output I then intend to add more features in terms of filtering returned hits (like not returning self->self hits or a->b so don't show b->a). Example: I'm looking for the methods that will return 117 from identities and 117 from positives. I can't just use num_identical/percent_identity as that isn't 100% accurate. 
>BurkM_2016 Length = 241 Score = 43.2 bits (88), Expect = 7e-005 Identities = 26/117 (22%), Positives = 51/117 (43%) Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL 357 Q F F + A+ ++ + + + L +R GL + P E + A+L Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL 170 Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 Thanks, Kevin From cjfields at uiuc.edu Wed Oct 18 15:25:59 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 10:25:59 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <4536113D.1080307@sheffield.ac.uk> Message-ID: <002601c6f2c9$b6d04a90$15327e82@pyrimidine> > I've just added test results for 1.5.2 RC2 to the wiki. > > There are lots of fails for packages other than bioperl-live. I'm not > sure excatly how the test fails/skipps are/should be handled since my > setups are as follows. > > Clean WinXP Pro: > This is a clean install of WinXP Pro SP2 with no major software > installed, other than ActivePerl 5.8.8.819 and a few tools for archive > extracting, anti virus etc. Therefore, I'm unsure how tests in > bioperl-network and bioperl-db should return. For example, I have made > no effort to setup biosql-schema but I thought that maybe there would be > a test that would detect this, and fail, then skip over other tests > gracefully - like the bioperl-run tests when a piece of software is not > installed??? > > Debian Linux: > This is a Bio-Linux machine with quite a lot of bioinformatics software > installed in the Path. So most of the tests in bioperl-run should > probably have passed. The same goes for bioperl-network and bioperl-db > as with my Windows setup. > > If my thoughts are totally wrong - let me know! 
> Nath

The bioperl-db tests rely on a local BioSQL database and on having a properly
set up configuration file (these are detailed in the bioperl-db INSTALL doc).
Furthermore, there are serious problems with bioperl-db and WinXP (see Bug
1938 in bugzilla). There is a workaround, but it isn't perfect by any means.

http://bugzilla.open-bio.org/show_bug.cgi?id=1938

Many of the bioperl-run tests rely on env. variables being set properly, so
maybe that's why they failed. These should all be detailed in the INSTALL
file (but maybe they aren't?).

I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac OS
X yet but intend to do this within the week. The INSTALL file details
the requirements for the packages (Graph 0.80 is the only one for
bioperl-network, for instance, and there isn't a PPM for that version
available yet).

It would be nice to skip the tests based on absence of the particular modules
or installed programs, and I think the final goal is to possibly attempt to
do this. However, all of the bioperl-related distributions have their own
documentation which outline their installation, requirements, and use. At
least we can point to that, which works for now. We could always start up a
wiki page for the various bioperl distributions to monitor problems or issues
with each based on OS, proposed enhancements/ideas, etc.

Also, most (if not all, including core) have been primarily tested on some
*nix-related system, which means that they may not work on Win32 systems.
Though the Windows support is light-years ahead of what it used to be circa
rel 0.7, I don't think it is foolproof yet, as witnessed by the bioperl-db
bug. Frankly, we need more WinXP users for those packages willing to test
them out and offer suggestions.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry
University of Illinois Urbana-Champaign

From bosborne11 at verizon.net Wed Oct 18 15:13:51 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 11:13:51 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in GenBank/EMBL/DDBJ
In-Reply-To: <002501c6f2c6$27625540$15327e82@pyrimidine>
Message-ID:

Chris,

No, I don't think they use the form X.Y. See below, from
t/LocationFactory.t, we do support most of the forms using ?. Supposedly
these tests accommodate all of the possible fuzzy locations encountered in
Swissprot, I wrote these a year or so ago.

Brian O.

# UNCERTAIN locations and positions (Swissprot)
"?2465..2774" => [$fuzzy_impl,
    2465, 2465, "UNCERTAIN", 2774, 2774, "EXACT", "EXACT", 1, 1],
"22..?64" => [$fuzzy_impl,
    22, 22, "EXACT", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
"?22..?64" => [$fuzzy_impl,
    22, 22, "UNCERTAIN", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
"?..>393" => [$fuzzy_impl,
    undef, undef, "UNCERTAIN", 393, undef, "AFTER", "UNCERTAIN", 1, 1],
"<1..?" => [$fuzzy_impl,
    undef, 1, "BEFORE", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
"?..536" => [$fuzzy_impl,
    undef, undef, "UNCERTAIN", 536, 536, "EXACT", "UNCERTAIN", 1, 1],
"1..?" => [$fuzzy_impl,
    1, 1, "EXACT", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
"?..?" => [$fuzzy_impl,
    undef, undef, "UNCERTAIN", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
# Not working yet:
#"12..?1" => [$fuzzy_impl,
#   1, 1, "UNCERTAIN", 12, 12, "EXACT", "EXACT", 1, 1]

On 10/18/06 11:00 AM, "Chris Fields" wrote:

> Do they still use the X.Y notations? Those are the most troublesome. I
> guess we still don't support the ones containing '?'.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept.
of Biochemistry > University of Illinois Urbana-Champaign > > >> -----Original Message----- >> From: Brian Osborne [mailto:bosborne11 at verizon.net] >> Sent: Wednesday, October 18, 2006 9:43 AM >> To: Chris Fields; bioperl-l at lists.open-bio.org >> Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in >> GenBank/EMBL/DDBJ >> >> Chris, >> >> I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all >> of the more recent examples in t/LocationFactory.t come from there. >> >> Brian O. >> >> >> On 10/18/06 9:55 AM, "Chris Fields" wrote: >> >>> EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations. >>> Not sure about UniProt/SwissProt. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From cjfields at uiuc.edu Wed Oct 18 16:56:07 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 11:56:07 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <002601c6f2c9$b6d04a90$15327e82@pyrimidine> Message-ID: <000401c6f2d6$5144e2f0$15327e82@pyrimidine> ... > I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac > OS All, > X yet but intended on doing this within the week. The INSTALL file > details > the requirements for the packages (Graph 0.80 is the only one for > bioperl-network, for instance, and there isn't a PPM for that version > available yet). ... As a followup in this, I tried bioperl-network and had similar failed tests with Graph 0.79 (the only PPM available from ActiveState). However, the INSTALL docs state that Graph 0.80 is needed, and the test run gave several warnings about not having Graph 0.80 installed. I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and everything passed. Maybe we need to have a Graph PPM available for those who want bioperl-network? 
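A minimal sketch of the kind of version check a test script could make before running the bioperl-network tests, so that an older Graph (such as the 0.79 PPM) leads to a clear skip message rather than failed tests. The helper name module_at_least() is hypothetical and not part of any BioPerl distribution; it relies only on Perl's built-in require and the core VERSION method.

```perl
use strict;
use warnings;

# Return 1 if $module loads and reports at least $min_version, else 0.
# module_at_least() is an illustrative helper, not a BioPerl function.
sub module_at_least {
    my ($module, $min_version) = @_;

    # Only accept plain package names before handing them to string eval.
    return 0 unless $module =~ /\A\w+(?:::\w+)*\z/;

    # require dies if the module is missing; VERSION() dies if the
    # installed version is lower than the one requested.
    my $ok = eval "require $module; $module->VERSION(\$min_version); 1";
    return $ok ? 1 : 0;
}

if ( module_at_least('Graph', '0.80') ) {
    print "Graph 0.80 or later found; bioperl-network tests can run\n";
}
else {
    print "Graph 0.80 or later not found; bioperl-network tests should be skipped\n";
}
```

In a Test::More script the same check could sit at the top of a SKIP block, so only the tests that need Graph are skipped and the reason is reported.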
As for bioperl-run, all tests passed from a new CVS checkout even though I have none of the programs installed, so they seem to skip properly. The test run also printed warnings when a program wasn't available or installed. Chris From bosborne11 at verizon.net Wed Oct 18 17:10:34 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 18 Oct 2006 13:10:34 -0400 Subject: [Bioperl-l] Blast information In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu> Message-ID: Kevin, Are you looking for hsp_length()? See the SearchIO HOWTO for a list of methods: http://www.bioperl.org/wiki/HOWTO:SearchIO Brian O. On 10/18/06 11:16 AM, "Kevin Brown" wrote: > I just recently upgraded to 1.5.1 on WinXP to bring this version closer > to live to parse some locally created blast files. I'm trying to find > the method that returns the values that are underneath the Identities > and Positives information as I'm trying to replicate the output of an > old blast parser we have here written in RealBasic which is showing its > age. Once I have it replicating the old output I then intend to add > more features in terms of filtering returned hits (like not returning > self->self hits or a->b so don't show b->a). > > Example: > I'm looking for the methods that will return 117 from identities and 117 > from positives. I can't just use num_identical/percent_identity as that > isn't 100% accurate. 
> >> BurkM_2016 > Length = 241 > > Score = 43.2 bits (88), Expect = 7e-005 > Identities = 26/117 (22%), Positives = 51/117 (43%) > > Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL > 357 > Q F F + A+ ++ + + + L +R GL + P E + A+L > Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL > 170 > > Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 > A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L > Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 > > Thanks, > Kevin > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From Kevin.M.Brown at asu.edu Wed Oct 18 21:25:48 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Wed, 18 Oct 2006 14:25:48 -0700 Subject: [Bioperl-l] Blast information Message-ID: <1A4207F8295607498283FE9E93B775B4022A71C3@EX02.asurite.ad.asu.edu> Yes, that does indeed look like what I was after. > -----Original Message----- > From: Brian Osborne [mailto:bosborne11 at verizon.net] > Sent: Wednesday, October 18, 2006 10:11 AM > To: Kevin Brown; bioperl-l > Subject: Re: [Bioperl-l] Blast information > > Kevin, > > Are you looking for hsp_length()? See the SearchIO HOWTO for a list of > methods: > > http://www.bioperl.org/wiki/HOWTO:SearchIO > > > Brian O. > > > On 10/18/06 11:16 AM, "Kevin Brown" wrote: > > > I just recently upgraded to 1.5.1 on WinXP to bring this > version closer > > to live to parse some locally created blast files. I'm > trying to find > > the method that returns the values that are underneath the > Identities > > and Positives information as I'm trying to replicate the > output of an > > old blast parser we have here written in RealBasic which is > showing its > > age. 
Once I have it replicating the old output I then intend to add > > more features in terms of filtering returned hits (like not > returning > > self->self hits or a->b so don't show b->a). > > > > Example: > > I'm looking for the methods that will return 117 from > identities and 117 > > from positives. I can't just use > num_identical/percent_identity as that > > isn't 100% accurate. > > > >> BurkM_2016 > > Length = 241 > > > > Score = 43.2 bits (88), Expect = 7e-005 > > Identities = 26/117 (22%), Positives = 51/117 (43%) > > > > Query: 298 > QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL > > 357 > > Q F F + A+ ++ + + + L +R GL + > P E + A+L > > Sbjct: 111 > QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL > > 170 > > > > Query: 358 > MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 > > A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L > > Sbjct: 171 > KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 > > > > Thanks, > > Kevin > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > From n.appleby at uq.edu.au Wed Oct 18 21:58:06 2006 From: n.appleby at uq.edu.au (Nikki Appleby) Date: Thu, 19 Oct 2006 07:58:06 +1000 Subject: [Bioperl-l] CONTIG dealing Message-ID: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD> I have just entered the wonderful new world of BioPerl, so the answer to my question may be obvious to any of the gurus reading this. I need to collect sequence features and ontology annotations. Here goes. I am retrieving sequences from SwissProt via Bio::DB::SwissProt and get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into an RDBMS format that I am happy with I can get at the xref ids. In this case, they are AP003451; BAB86144.1; -; Genomic_DNA. AP008207; BAF07116.1; -; Genomic_DNA. AB103395; BAC81207.1; -; mRNA. 
I can happily go off and fetch those from Bio::DB::GenBank (first column), and Bio::DB::GenPept (second). All good, except... AP008207 is a contig. I don't want to get all of the features for the entire thing, just the single contig that actually matches the original sequence. It takes a couple of hours to get at it and then it gives me way too much. I will come across this problem with other sequences. How do I (a) find out if it is a contig without downloading it in it's entirety and (b) extract the list of sequences that are about to be contigged together. I have searched the web for answers, including this list, but see nothing. Help! Nikki Appleby. From bosborne11 at verizon.net Thu Oct 19 00:54:04 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Wed, 18 Oct 2006 20:54:04 -0400 Subject: [Bioperl-l] LocatableSeq object vs Sequence Object In-Reply-To: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com> Message-ID: Peter, I'm not understanding your question, partly because your letter and your code are saying different things. You say you want to call location_from_column() but your code shows you calling species(). What happens when you call location_from_column? Do you see errors? Brian O. On 10/17/06 12:26 PM, "Peter H. Baenziger" wrote: > I was thinking I could use: > foreach $seq ($alignment->each_seq()) > to loop through the sequences and call: > $seq->location_from_column($pos) > on each of the sequences. From cjfields at uiuc.edu Thu Oct 19 02:46:14 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 18 Oct 2006 21:46:14 -0500 Subject: [Bioperl-l] CONTIG dealing In-Reply-To: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD> References: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD> Message-ID: On Oct 18, 2006, at 4:58 PM, Nikki Appleby wrote: > > I have just entered the wonderful new world of BioPerl, so the > answer to my > question may be obvious to any of the gurus reading this. 
>
> I need to collect sequence features and ontology annotations. Here
> goes.
>
> I am retrieving sequences from SwissProt via Bio::DB::SwissProt and
> get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into
> an RDBMS
> format that I am happy with I can get at the xref ids. In this
> case, they
> are
>
> AP003451; BAB86144.1; -; Genomic_DNA.
> AP008207; BAF07116.1; -; Genomic_DNA.
> AB103395; BAC81207.1; -; mRNA.
>
> I can happily go off and fetch those from Bio::DB::GenBank (first
> column),
> and Bio::DB::GenPept (second). All good, except...
>
> AP008207 is a contig. I don't want to get all of the features for
> the entire
> thing, just the single contig that actually matches the original
> sequence.
> It takes a couple of hours to get at it and then it gives me way
> too much.
>
> I will come across this problem with other sequences. How do I (a)
> find out
> if it is a contig without downloading it in it's entirety and (b)
> extract
> the list of sequences that are about to be contigged together.
>
> I have searched the web for answers, including this list, but see
> nothing.
> Help!
>
> Nikki Appleby.

The default setting for the retrieval format for GenBank is
'gbwithparts' (which gets the full sequence at all times). You can
set this to 'gb' using request_format() to retrieve the sequence file
with the contig information instead of the sequence, if it contains
such (otherwise it just retrieves the sequence anyway).

However, I have noticed this particular file does not represent a
true contig record but is the entire chromosome sequence. The contig
information is in the comments section, probably b/c the record is
converted over. You could just download the sequence record and run
regexp to grab the comments section, then parse out the contigs (a
pain) if you really want that. Or you could try to find the
equivalent GenBank record, such as the ones derived from the WGS
records.
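A minimal sketch of the request_format() approach described above, assuming BioPerl is installed and NCBI is reachable. The helper name fetch_without_parts() and the script name in the comment are hypothetical; for a record like AP008207 that is not a true CONTIG record, the full sequence would presumably still come back, as noted.

```perl
use strict;
use warnings;

# Fetch a GenBank record in 'gb' format rather than the default
# 'gbwithparts', so that a true CONTIG record is returned without the
# full assembled sequence. fetch_without_parts() is a hypothetical name.
sub fetch_without_parts {
    my ($accession) = @_;

    # Loaded at run time so the script still compiles without BioPerl.
    require Bio::DB::GenBank;

    my $db = Bio::DB::GenBank->new();
    $db->request_format('gb');
    return $db->get_Seq_by_acc($accession);
}

# Needs network access, so it only runs when an accession is supplied:
#   perl fetch_gb.pl AP003451
if (@ARGV) {
    my $seq = fetch_without_parts( $ARGV[0] );
    die "No record returned for $ARGV[0]\n" unless defined $seq;

    # With 'gb', a contig record may carry no residues at all.
    my $residues = defined $seq->seq ? length $seq->seq : 0;
    print $seq->accession_number, ": $residues residues retrieved\n";
}
```

Checking the residue count first is a cheap way to decide whether a record is worth downloading in full.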

I did notice the list of dbxrefs in your swissprot record indicate
three EMBL sequences. If the order is consistent for the SwissProt
entries you want, they probably represent:

The contig (what you want):
AP003451; BAB86144.1; -; Genomic_DNA.

The supercontig (chromosome) :
AP008207; BAF07116.1; -; Genomic_DNA.

The cDNA :
AB103395; BAC81207.1; -; mRNA.

I checked the first one (AP003451), which seems to confirm this.
Since the chromosome supercontig is built from the smaller sequence
contigs you could just grab the first EMBL dbxref instead of all of
them. It parses much faster than the chromosome file.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Wed Oct 18 15:47:14 2006
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Oct 2006 08:47:14 -0700
Subject: [Bioperl-l] Blast information
In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
References: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
Message-ID: <6B7D24F3-69F1-498D-AB53-B4CEB14E4F3D@bioperl.org>

I think this will work for you. The seq_inds method parses the
middle homology sequence and classifies each alignment column and
returns a list of the columns meeting the criteria. You can
interrogate query or hit in this case since you are requiring it to
be identical

my $identicalbases = scalar $hsp->seq_inds('query', 'identical');
my $conservedbases = scalar $hsp->seq_inds('query','conserved');

Conserved returns those identical or conserved, if you want just
those with conservative replacements use 'conserved-not-identical'

See http://bioperl.org/wiki/HOWTO:SearchIO#Table_of_Methods for more
info.

-jason

On Oct 18, 2006, at 8:16 AM, Kevin Brown wrote:

> I just recently upgraded to 1.5.1 on WinXP to bring this version
> closer
> to live to parse some locally created blast files.
I'm trying to find > the method that returns the values that are underneath the Identities > and Positives information as I'm trying to replicate the output of an > old blast parser we have here written in RealBasic which is showing > its > age. Once I have it replicating the old output I then intend to add > more features in terms of filtering returned hits (like not returning > self->self hits or a->b so don't show b->a). > > Example: > I'm looking for the methods that will return 117 from identities > and 117 > from positives. I can't just use num_identical/percent_identity as > that > isn't 100% accurate. > >> BurkM_2016 > Length = 241 > > Score = 43.2 bits (88), Expect = 7e-005 > Identities = 26/117 (22%), Positives = 51/117 (43%) > > Query: 298 > QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL > 357 > Q F F + A+ ++ + + + L +R GL + P E + > A+L > Sbjct: 111 > QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL > 170 > > Query: 358 > MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414 > A+ G++L++DV ++ H R ++ L L + +S R +T+ L + L > Sbjct: 171 > KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227 > > Thanks, > Kevin > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From jason at bioperl.org Thu Oct 19 05:00:28 2006 From: jason at bioperl.org (Jason Stajich) Date: Wed, 18 Oct 2006 22:00:28 -0700 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> Message-ID: 
<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> So I'm unsure what we should do here. We can certainly fix the problem which you report which is relying on the "" method -- if you were to do instead: print $_->database, ":", $_->primary_id, "\n"; you'll get the right answer. We at a minimum just fix the auto- string converting method to do The Right Thing. But I am not sure if we should keep the version out of the primary_id field. This will require some rejiggering in several modules when it comes to printing DBlinks and I don't want to do this before the release. I also am not sure if there was an explicit reason why someone did put the version information in the primary_id. (I hope it wasn't me because I don't think I'm going to remember why). Does anyone else have a strong feeling? -jason On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > Hello, > > I noticed a little problem with the Annotation "DBLink" from > GenBank entries > > When I run: > > perl -MBio::DB::GenBank -e 'my $gi = > 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio = > $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > ("dblink"); > for(@annotations) { print $_, "\n";} print $INC{ > "Bio/Annotation/DBLink.pm" }, "\n"; ' > > This yields: > > GenBank:AL591065.17.17 > > and the place where the used Bio/Annotation/DBLink.pm resides. > > Can others repeat this? > > I have dug into the source a little and Bio::Annotation::DBLink > seems to > be the place where this happens: it has a concatenation which leads to > that repeated version number. > > It this something that I should fix "client-side", so to speak, or > is it > worthwhile to add some logic to that concatenation to prevent this? 
> > > Thanks, > > Eric > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From n.haigh at sheffield.ac.uk Thu Oct 19 06:41:02 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 19 Oct 2006 07:41:02 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <000401c6f2d6$5144e2f0$15327e82@pyrimidine> References: <000401c6f2d6$5144e2f0$15327e82@pyrimidine> Message-ID: <45371DFE.6050306@sheffield.ac.uk> > As a followup in this, I tried bioperl-network and had similar failed tests > with Graph 0.79 (the only PPM available from ActiveState). However, the > INSTALL docs state that Graph 0.80 is needed, and the test run gave several > warnings about not having Graph 0.80 installed. > > I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and > everything passed. Maybe we need to have a Graph PPM available for those > who want bioperl-network? > > As for bioperl-run, all tests passed from a new CVS checkout even though I > have none of the programs installed, so they seem to skip properly. The > test run also printed warnings when a program wasn't available or installed. > > > Chris > > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make modifications to integrate them into the package.xml file for PPM4 clients. Nath From n.haigh at sheffield.ac.uk Thu Oct 19 10:40:21 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Thu, 19 Oct 2006 11:40:21 +0100 Subject: [Bioperl-l] bioperl-run t/EMBOSS.t Message-ID: <45375615.1020603@sheffield.ac.uk> Should line 25 read: require Bio::Factory::EMBOSS instead of: require Bio::EMBOSS::Factory; Nath From hlapp at gmx.net Thu Oct 19 13:56:05 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 19 Oct 2006 09:56:05 -0400 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> Message-ID: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> Here is the overload code: use overload '""' => sub { (($_[0]->database ? $_[0]->database . ':' : '' ) . ($_[0]->primary_id ? $_[0]->primary_id : '') . ($_[0]->version ? '.' . $_[0]->version : '')) || '' }; Except that the last '||' is redundant and unnecessary (it either does nothing or replaces an empty string with an empty string), I don't see the potential for duplicating the version number here - unless primary_id() did that, which I don't see it doing. So, to me this seems to come from a parsing error in the beginning, rather than an erroneous mangling of version into primary_id later. Is someone in the position to confirm this? -hilmar On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > So I'm unsure what we should do here. > > We can certainly fix the problem which you report which is relying on > the "" method -- if you were to do instead: > print $_->database, ":", $_->primary_id, "\n"; > > you'll get the right answer. We at a minimum just fix the auto- > string converting method to do The Right Thing. > > But I am not sure if we should keep the version out of the primary_id > field. This will require some rejiggering in several modules when it > comes to printing DBlinks and I don't want to do this before the > release. 
I also am not sure if there was an explicit reason why > someone did put the version information in the primary_id. (I hope it > wasn't me because I don't think I'm going to remember why). > > Does anyone else have a strong feeling? > > -jason > On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > >> Hello, >> >> I noticed a little problem with the Annotation "DBLink" from >> GenBank entries >> >> When I run: >> >> perl -MBio::DB::GenBank -e 'my $gi = >> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio = >> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >> ("dblink"); >> for(@annotations) { print $_, "\n";} print $INC{ >> "Bio/Annotation/DBLink.pm" }, "\n"; ' >> >> This yields: >> >> GenBank:AL591065.17.17 >> >> and the place where the used Bio/Annotation/DBLink.pm resides. >> >> Can others repeat this? >> >> I have dug into the source a little and Bio::Annotation::DBLink >> seems to >> be the place where this happens: it has a concatenation which >> leads to >> that repeated version number. >> >> It this something that I should fix "client-side", so to speak, or >> is it >> worthwhile to add some logic to that concatenation to prevent this? 
>> >> >> Thanks, >> >> Eric >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich, PhD > Miller Research Fellow > University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From dmessina at wustl.edu Thu Oct 19 13:55:31 2006 From: dmessina at wustl.edu (David Messina) Date: Thu, 19 Oct 2006 08:55:31 -0500 Subject: [Bioperl-l] missing documentation (request for help) Message-ID: <69453D5F-7794-4DC7-BAE1-A8B2191752E6@wustl.edu> Hi all, There are a few modules missing a one-line description, and by one- line description, I'm referring to the part that comes after the module name in the POD. e.g. in =head1 NAME Bio::SearchIO - Driver for parsing Sequence Database Searches (BLAST, FASTA, ...) =head1 SYNOPSIS [etc...] "Driver for parsing Sequence Database Searches (BLAST, FASTA, ...)" is the one-line description (even though it falls onto two lines) :). I fixed the modules that I knew something about, but there are some I haven't used. Perhaps the author, or someone else familiar with these modules, could fill in an appropriate short description? 
Here is the list of affected modules: Bio::DB::Expression Bio::Expression::Contact Bio::Expression::DataSet Bio::Expression::Platform Bio::Expression::Sample Bio::Search::Processor Bio::DB::EUtilities::ElinkData Bio::DB::GFF::Adaptor::memory::feature_serializer Bio::DB::SeqFeature::Store::DBI::Iterator Bio::Expression::FeatureGroup::FeatureGroupMas50 Bio::Expression::FeatureSet::FeatureSetMas50 Bio::Matrix::PSM::PsmHeaderI Bio::OntologyIO::Handlers::BaseSAXHandler Some of these are missing other POD parts as well -- please add those too if you can. Thanks, Dave From mckays at cshl.edu Thu Oct 19 13:51:18 2006 From: mckays at cshl.edu (Sheldon McKay) Date: Thu, 19 Oct 2006 09:51:18 -0400 Subject: [Bioperl-l] chromosome ideograms Message-ID: <6b0de00426b3c04b0d0d7641bc8e14e3@cshl.edu> Hi, Sorry for the late reply. I have been working on a karyotype drawing tool as part of the Generic Genome Browser that may be useful. In addition to drawing features next to chromosome ideograms, it also supports making chromosome 'bands' from any kind of scored features to create a sort of heat map on the chromosome itself. I have a demo running at http://mckay.cshl.edu/cgi-bin/gbrowse_karyotype and the source is available from the GMOD CVS HEAD http://www.gmod.org/cvs Sheldon -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Sheldon McKay, PhD Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 From n.haigh at sheffield.ac.uk Thu Oct 19 15:37:31 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Thu, 19 Oct 2006 15:37:31 +0000 Subject: [Bioperl-l] bioperl-run t/EMBOSS.t In-Reply-To: <45375615.1020603@sheffield.ac.uk> References: <45375615.1020603@sheffield.ac.uk> Message-ID: <45379BBB.1040400@sheffield.ac.uk> Thanks for committing that change Brian. 
Now the tests proceed from this point, I get the following error: ------------- EXCEPTION: Bio::Root::NotImplemented ------------- MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not implemented by package Bio::Tools::Run::EMBOSSApplication. This is not your fault - author of Bio::Tools::Run::EMBOSSApplication should be blamed! STACK: Error::throw STACK: Bio::Root::Root::throw /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350 STACK: Bio::Root::RootI::throw_not_implemented /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522 STACK: Bio::Tools::Run::WrapperBase::program_dir /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346 STACK: Bio::Tools::Run::WrapperBase::program_path /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327 STACK: Bio::Tools::Run::WrapperBase::executable /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297 STACK: t/EMBOSS.t:58 ---------------------------------------------------------------- From N.Haigh at sheffield.ac.uk Thu Oct 19 15:03:00 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 16:03:00 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45379BBB.1040400@sheffield.ac.uk> References: <45375615.1020603@sheffield.ac.uk> <45379BBB.1040400@sheffield.ac.uk> Message-ID: <1161270180.453793a432e4f@webmail.shef.ac.uk> I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be consistent with other tests. Failing that - Is there a good test writing style I should follow in one of the other test files? Thanks Nathan From bosborne11 at verizon.net Thu Oct 19 15:06:08 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Thu, 19 Oct 2006 11:06:08 -0400 Subject: [Bioperl-l] bioperl-run t/EMBOSS.t In-Reply-To: <45379BBB.1040400@sheffield.ac.uk> Message-ID: Nathan, Yes, I see. 
Those EMBOSS programs work a bit differently from the typical app run by bioperl-run, there's no need for WrapperBase methods like program_dir(), executable(), it seems. Well, I can try and take a look at this tonight but there's probably someone better suited to this than me, I've spent very little time with bioperl-run. Volunteer? Brian O. On 10/19/06 11:37 AM, "Nathan S. Haigh" wrote: > Thanks for committing that change Brian. Now the tests proceed from this > point, I get the following error: > > ------------- EXCEPTION: Bio::Root::NotImplemented ------------- > MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not > implemented by package Bio::Tools::Run::EMBOSSApplication. > This is not your fault - author of Bio::Tools::Run::EMBOSSApplication > should be blamed! > > STACK: Error::throw > STACK: Bio::Root::Root::throw > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350 > STACK: Bio::Root::RootI::throw_not_implemented > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522 > STACK: Bio::Tools::Run::WrapperBase::program_dir > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346 > STACK: Bio::Tools::Run::WrapperBase::program_path > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327 > STACK: Bio::Tools::Run::WrapperBase::executable > /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297 > STACK: t/EMBOSS.t:58 > ---------------------------------------------------------------- > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From niels at genomics.dk Thu Oct 19 15:16:37 2006 From: niels at genomics.dk (Niels Larsen) Date: Thu, 19 Oct 2006 17:16:37 +0200 Subject: [Bioperl-l] From EBI support re WU-Blast SOAP service In-Reply-To: <4535EBF9.1090706@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> 
<45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> Message-ID: <453796D5.2070808@genomics.dk> Sendu Bala wrote: >> I invoked the EBI script >> >> http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip >> >> like this >> >> WSWUBlastClient.pl -p blastn -D embl test.fasta >> >> where the content of test.fasta is below, and got >> >> Can't find method element in the message at >> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311. > > As you admit, this is not a Bioperl issue. I would suggest you contact > EBI support. > To use EBI's WU-blast SOAP interface from Perl, EBI support says one must use SOAP::Lite v 0.60 (no later version) and include '--email you.example.com' on the command line. This is evident neither from their web pages nor from the script usage statement, but they promised to fix it. ------------------------------------------------------------------------ Niels Larsen Danish Genome Institute Gustav Wieds vej 10 C DK-8000 Aarhus C Denmark Electronic mail: niels at genomics.dk Skype: niels_larsen_denmark Telephone: +45-8942-5268 Telefax: +45-8620-1222 ------------------------------------------------------------------------ From cjfields at uiuc.edu Thu Oct 19 15:31:45 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 10:31:45 -0500 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <45371DFE.6050306@sheffield.ac.uk> Message-ID: <001501c6f393$b66bd4a0$15327e82@pyrimidine> > > As a followup in this, I tried bioperl-network and had similar failed > tests > > with Graph 0.79 (the only PPM available from ActiveState). However, the > > INSTALL docs state that Graph 0.80 is needed, and the test run gave > several > > warnings about not having Graph 0.80 installed. > > > > I made a PPM of Graph 0.80, installed, retried bioperl-network tests, > and > > everything passed. Maybe we need to have a Graph PPM available for > those > > who want bioperl-network?
> > > > As for bioperl-run, all tests passed from a new CVS checkout even though > I > > have none of the programs installed, so they seem to skip properly. The > > test run also printed warnings when a program wasn't available or > installed. > > > > > > Chris > > > > > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make > modifications to integrate them into the package.xml file for PPM4 > clients. > > Nath Will do. Should these be forwarded to Mauricio? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From N.Haigh at sheffield.ac.uk Thu Oct 19 15:38:05 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 16:38:05 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <001501c6f393$b66bd4a0$15327e82@pyrimidine> References: <001501c6f393$b66bd4a0$15327e82@pyrimidine> Message-ID: <1161272285.45379bdd1aea4@webmail.shef.ac.uk> > > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make > > modifications to integrate them into the package.xml file for PPM4 > > clients. > > > > Nath > > Will do. Should these be forwarded to Mauricio? > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > If you don't have access to the web, you can send them to me - I now have an account on that server. Cheers Nath From cjfields at uiuc.edu Thu Oct 19 15:45:00 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 10:45:00 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk> Message-ID: <001601c6f395$8a752ed0$15327e82@pyrimidine> > I thought I'd have my first proper try at writing some tests. I was > wondering if there is a template test file that I should use/study in > order to be > consistent with other tests. > > Failing that - Is there a good test writing style I should follow in one > of the other test files? 
> > Thanks > Nathan I would start with the Test::Simple and Test::More perldoc; they're pretty self-explanatory. You can look at the various test suites using Test::More as well for pointers. By far, most tests will use is(). You can use SKIP blocks to skip tests that have a requirement, or skip all tests if they all require something. Pretty flexible. We should probably get a wiki page for the developers underway, maybe a HOWTO on writing tests. At least have these focus on BioPerl, OOP, remote DB tests, etc. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Thu Oct 19 16:23:40 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 11:23:40 -0500 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> Message-ID: <001b01c6f39a$f0288ba0$15327e82@pyrimidine> > Here is the overload code: > > use overload '""' => sub { > (($_[0]->database ? $_[0]->database . ':' : '' ) > . ($_[0]->primary_id ? $_[0]->primary_id : '') > . ($_[0]->version ? '.' . $_[0]->version : '')) > || '' }; > > Except that the last '||' is redundant and unnecessary (it either > does nothing or replaces an empty string with an empty string), I > don't see the potential for duplicating the version number here - > unless primary_id() did that, which I don't see it doing. > > So, to me this seems to come from a parsing error in the beginning, > rather than an erroneous mangling of version into primary_id later. > > Is someone in the position to confirm this? > > -hilmar I have attached a script to the bug report on Bugzilla, as well as the test output sequence and the actual GenBank record. There are a number of problems: 1) primary_id() is assigned both the id and version. 2) version() is still assigned the version. Together, these explain the repeated version number when the object is printed directly using the overload (it concatenates them).
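A minimal sketch of the two problems listed above, using a hypothetical stand-in class (My::DBLink, not the real Bio::Annotation::DBLink) that carries the overload code quoted earlier in this message. It stores the version in both primary_id and version, as the GenBank parser is reported to do, and shows how stringification then repeats it. The "fixed" object is only one possible remedy and is an assumption, not the committed change.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Annotation::DBLink; only the three
# accessors used by the overload are modelled.
package My::DBLink;

# Same stringification logic as the overload quoted in the message.
use overload '""' => sub {
    (($_[0]->database ? $_[0]->database . ':' : '' )
     . ($_[0]->primary_id ? $_[0]->primary_id : '')
     . ($_[0]->version ? '.' . $_[0]->version : ''))
    || '' };

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}
sub database   { $_[0]{database} }
sub primary_id { $_[0]{primary_id} }
sub version    { $_[0]{version} }

package main;

my ($id, $version) = ('AL591065', 17);

# Reported behaviour: the version ends up in both fields.
my $doubled = My::DBLink->new(
    database   => 'GenBank',
    primary_id => "$id.$version",
    version    => $version,
);

# One possible fix (an assumption): keep the version out of primary_id.
my $fixed = My::DBLink->new(
    database   => 'GenBank',
    primary_id => $id,
    version    => $version,
);

print "$doubled\n";   # GenBank:AL591065.17.17
print "$fixed\n";     # GenBank:AL591065.17
```

Any code that prints primary_id on its own and expects accession.version would need to change along with such a fix.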
However, there are a few more issues. The ID is printed normally (accession.version), but the source DB is not present when SeqIO handles the sequence. I have attached the output and the original GenBank record to the bug report. I can look into it but it won't be today; got my hands full with enzyme assays. Chris From N.Haigh at sheffield.ac.uk Thu Oct 19 16:50:57 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 17:50:57 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine> References: <001601c6f395$8a752ed0$15327e82@pyrimidine> Message-ID: <1161276657.4537acf1edc80@webmail.shef.ac.uk> Quoting Chris Fields : > > I thought I'd have my first proper try at writing some tests. I was > > wondering if there is a template test file that I should use/study in > > order to be > > consistent with other tests. > > > > Failing that - Is there a good test writing style I should follow in one > > of the other test files? > > > > Thanks > > Nathan > > I would start with the Test::Simple and Test::More perldoc; they're pretty > self-explanatory. You can look at the various test suites using Test::More > as well for pointers. By far, most tests will use is(). You can use SKIP > blocks to skip tests that have a requirement, or skip all tests if they all > require something. Pretty flexible. > > We should probably get a wiki page for the developers underway, maybe a > HOWTO on writing tests. At least have these focus on BioPerl, OOP, remote > DB tests, etc. > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > > I'm just working through some test things now. I thought I'd start on the bioperl-run tests, as they might be a bit more straightforward; I'm familiar with some of them, and they seem to get neglected. I'm heavily commenting my tests with the thought of starting a wiki guide to testing Bioperl modules.
See how far I get! Nath From hlapp at gmx.net Thu Oct 19 17:11:27 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 19 Oct 2006 13:11:27 -0400 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org> Message-ID: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net> Actually you did that Jason: http://tinyurl.com/ye2edk Apparently the motivation was to "parse swissprot fields in genpept file (dbsource)"? It clearly looks wrong to add the version. You've probably had a reason why you did this at the time but if we (you :) can't recover that I guess it's best to just fix it to do the right thing (in both places obviously). -hilmar On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote: > Well there is explicit addition of the version to the primary id so > it isn't so much a parsing error as a deliberate decision to append > it. > see Bio::SeqIO::genbank > > to make the dblink > $annotation- > >add_Annotation > ('dblink', > > Bio::Annotation::DBLink->new > (-primary_id > => $id . "." . $version, > -version => > $version, > -database => > $db, > -tagname => > 'dblink')); > > and the code to print the dblink back out in the writer already > assumes the version number is appended... > > foreach my $ref ( $seq->annotation->get_Annotations > ('dblink') ) { > # if ($ref->comment eq 'DBSOURCE') { > $self->_print('DBSOURCE accession ', > $ref->primary_id, "\n"); > # } > } > > On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > >> Here is the overload code: >> >> use overload '""' => sub { >> (($_[0]->database ? $_[0]->database . ':' : '' ) >> . ($_[0]->primary_id ? $_[0]->primary_id : '') >> . ($_[0]->version ? '.' . 
$_[0]->version : '')) >> || '' }; >> >> Except that the last '||' is redundant and unnecessary (it either >> does nothing or replaces an empty string with an empty string), I >> don't see the potential for duplicating the version number here - >> unless primary_id() did that, which I don't see it doing. >> >> So, to me this seems to come from a parsing error in the >> beginning, rather than an erroneous mangling of version into >> primary_id later. >> >> Is someone in the position to confirm this? >> >> -hilmar >> >> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: >> >>> So I'm unsure what we should do here. >>> >>> We can certainly fix the problem which you report which is >>> relying on >>> the "" method -- if you were to do instead: >>> print $_->database, ":", $_->primary_id, "\n"; >>> >>> you'll get the right answer. We at a minimum just fix the auto- >>> string converting method to do The Right Thing. >>> >>> But I am not sure if we should keep the version out of the >>> primary_id >>> field. This will require some rejiggering in several modules >>> when it >>> comes to printing DBlinks and I don't want to do this before the >>> release. I also am not sure if there was an explicit reason why >>> someone did put the version information in the primary_id. (I >>> hope it >>> wasn't me because I don't think I'm going to remember why). >>> >>> Does anyone else have a strong feeling? 
>>> >>> -jason >>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: >>> >>>> Hello, >>>> >>>> I noticed a little problem with the Annotation "DBLink" from >>>> GenBank entries >>>> >>>> When I run: >>>> >>>> perl -MBio::DB::GenBank -e 'my $gi = >>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my >>>> $seqio = >>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >>>> ("dblink"); >>>> for(@annotations) { print $_, "\n";} print $INC{ >>>> "Bio/Annotation/DBLink.pm" }, "\n"; ' >>>> >>>> This yields: >>>> >>>> GenBank:AL591065.17.17 >>>> >>>> and the place where the used Bio/Annotation/DBLink.pm resides. >>>> >>>> Can others repeat this? >>>> >>>> I have dug into the source a little and Bio::Annotation::DBLink >>>> seems to >>>> be the place where this happens: it has a concatenation which >>>> leads to >>>> that repeated version number. >>>> >>>> It this something that I should fix "client-side", so to speak, or >>>> is it >>>> worthwhile to add some logic to that concatenation to prevent this? 
>>>> >>>> >>>> Thanks, >>>> >>>> Eric >>>> >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >>> -- >>> Jason Stajich, PhD >>> Miller Research Fellow >>> University of California >>> Dept of Plant and Microbial Biology >>> 321 Koshland Hall #3102 >>> Berkeley, CA 94720-3102 >>> lab: 510.642.8441 >>> http://pmb.berkeley.edu/~taylor/people/js.html >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> -- >> =========================================================== >> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >> =========================================================== >> >> >> >> >> > > -- > Jason Stajich, PhD > Miller Research Fellow > University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From N.Haigh at sheffield.ac.uk Thu Oct 19 17:17:33 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 18:17:33 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine> References: <001601c6f395$8a752ed0$15327e82@pyrimidine> Message-ID: <1161278253.4537b32dd3d15@webmail.shef.ac.uk> Quoting Chris Fields : > > I thought I'd have my first proper try at writing some tests. I was > > wondering if there is a template test file that I should use/study in > > order to be > > consistent with other tests. > > > > Failing that - Is there a good test writing style I should follow in one > > of the other test files? 
> > > > Thanks > > Nathan > > I would start with the Test::Simple and Test::More perldoc; they're pretty > self-explanatory. You can look at the various test suites using Test::More > as well for pointers. By far, most tests will use is(). You can use SKIP > blocks to skip tests that have a requirement, or skip all tests if they all > require something. Pretty flexible. > > We should probably get a wiki page for the developers underway, maybe a > HOWTO on writing tests. At least have these focus on BioPerl, OOP, remote > DB tests, etc. > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > I just wrote a small, partial test script for t/Amap.t in bioperl-run. When I run "perl -I. t/Amap.t" I get the following output: 1..10 ok 1 - use Bio::Tools::Run::Alignment::Amap; ok 2 - use Bio::AlignIO; ok 3 - use Bio::SeqIO; ok 4 - use Bio::Root::IO; ok 5 - All the required modules are present ok 6 - new() returned something ok 7 - and its the right class not ok 8 - executable() got the correct filename # Failed test 'executable() got the correct filename' # in t/Amap.t at line 90. # got: undef # expected: 'filename' ok 9 # skip Got incorrect filename for executable ok 10 # skip Got incorrect filename for executable # Looks like you failed 1 test of 10. So far this looks good (well, in that the tests I expect to pass and fail are doing so). However, when I run "make test" the output is unexpected and I don't know why. It seems to die and produce the results of the testing before the rest of the test suite is run: t/Amap....................NOK 8 # Failed test 'executable() got the correct filename' # in t/Amap.t at line 90. # got: undef # expected: 'filename' # Looks like you failed 1 test of 10. t/Amap....................dubious Test returned status 1 (wstat 256, 0x100) DIED.
FAILED test 8 Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay, 70.00%) t/Analysis_soap...........ok 7/17make: *** wait: No child processes. Stop. Is there something I'm missing?? If it's something less obvious, let me know and i'll post whole test file. Nath From cjfields at uiuc.edu Thu Oct 19 17:26:45 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 12:26:45 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <1161278253.4537b32dd3d15@webmail.shef.ac.uk> Message-ID: <002001c6f3a3$c00b9080$15327e82@pyrimidine> ... > Just wrote a partial and small test script for t/Amap.t in bioperl-run. > When I run "perl -I. t/Amap.t" I get the following output: > 1..10 > ok 1 - use Bio::Tools::Run::Alignment::Amap; > ok 2 - use Bio::AlignIO; > ok 3 - use Bio::SeqIO; > ok 4 - use Bio::Root::IO; > ok 5 - All the required modules are present > ok 6 - new() returned something > ok 7 - and its the right class > not ok 8 - executable() got the correct filename > # Failed test 'executable() got the correct filename' > # in t/Amap.t at line 90. > # got: undef > # expected: 'filename' > ok 9 # skip Got incorrect filename for executable > ok 10 # skip Got incorrect filename for executable > # Looks like you failed 1 test of 10. > > > So far this looks good (well, that it's failing passing expected tests). > However, when i run "make test" the output is unexpected and I don't know > why. It seems to die and produce the results of the testing before the > rest of the test suit is run: > t/Amap....................NOK 8 > # Failed test 'executable() got the correct filename' > # in t/Amap.t at line 90. > # got: undef > # expected: 'filename' > # Looks like you failed 1 test of 10. > t/Amap....................dubious > Test returned status 1 (wstat 256, 0x100) > DIED. FAILED test 8 > Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay, > 70.00%) > t/Analysis_soap...........ok 7/17make: *** wait: No child processes. > Stop. 
> > > > Is there something I'm missing?? If it's something less obvious, let me > know and i'll post whole test file. > Nath Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be the problem. The only issue I can think of is that Test::More TODO blocks require a newer version of Test::Harness (which most users have anyway). Are you using a TODO block? You can send me Amap.t and I'll give it a try, but I can't promise I'll get to it immediately (busy day). Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From N.Haigh at sheffield.ac.uk Thu Oct 19 17:38:25 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 18:38:25 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine> References: <002001c6f3a3$c00b9080$15327e82@pyrimidine> Message-ID: <1161279505.4537b811e143f@webmail.shef.ac.uk> > Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be > the problem. The only issue I can think of is that Test::More TODO blocks > require a newer version of Test::Harness (which most users have anyway). > Are you using a TODO block? > > You can send me Amap.t and I'll give it a try, but I can't promise I'll get > to it immediately (busy day). > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > No TODO blocks. I must have done something wrong - it's the first time I've seen this - but then again, I don't look that closely at the output of "make test" unless something shows as a fail. Anyway, below is the short bit of code. 
Thanks
Nath

use strict;
use Bio::Root::IO; # cant test for this, might be needed to get Test::More

BEGIN {
    # Things to do ASAP once the script is run
    # even before anything else in the file is parsed
    use vars qw($NUMTESTS $DEBUG $error);
    $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;

    # Use installed Test module, otherwise fall back
    # to copy of Test.pm located in the t dir
    # (note: 'use lib' takes effect at compile time, so t/lib is
    # added to @INC whether or not the require below fails)
    eval { require Test::More; };
    if ( $@ ) {
        use lib Bio::Root::IO->catfile('t','lib');
    }

    # Currently no errors
    $error = 0;

    # Setup the number of tests to be run
    # what about using:
    # use Test::More 'no_plan';
    use Test::More;
    $NUMTESTS = 10;
    plan tests => $NUMTESTS;

    # Use modules that are needed in this test that are from
    # any of the Bioperl packages: Bioperl-core, Bioperl-run ... etc
    # use_ok('');
    use_ok('Bio::Tools::Run::Alignment::Amap');
    use_ok('Bio::AlignIO');
    use_ok('Bio::SeqIO');
    use_ok('Bio::Root::IO');
}

# Multiple END blocks are run in reverse order of their definition
# Last In, First Out (LIFO)
END {
    # Things to do right at the very end, just
    # when the interpreter finishes/exits
    # E.g. deleting intermediate files produced during the test
    foreach my $file ( qw(cysprot.dnd cysprot1a.dnd) ) {
        unlink $file;
        # check it was deleted
    }
    #unlink qw(cysprot.dnd cysprot1a.dnd)
}

END {
    # Not sure what this is doing?
    #for ( $Test::ntest..$NUMTESTS ) {
    #    skip("Amap program not found. Skipping.\n",1);
    #}
}

# if we got to here, thats OK!
# is this really needed?
ok( 1, 'All the required modules are present');

# setup input files etc
my $inputfilename = Bio::Root::IO->catfile("t","data","cysprot.fa");

# setup output files etc
# none in this test

# setup global objects that are to be used in more than one test
# Also test they were initialised correctly
my @params = ();
my $aln;
my $factory = Bio::Tools::Run::Alignment::Amap->new(@params);
ok( defined $factory, 'new() returned something' );
ok( $factory->isa('Bio::Tools::Run::Alignment::Amap'), ' and its the right class' );

# Now onto the nitty gritty tests of the modules methods
my $executable_file = $factory->executable();
#is( $factory->executable(), 'filename', 'executable() got the correct filename' );

# block of tests to skip if you know the tests will fail
# under some condition. E.g.:
# Need network access,
# Wont work on particular OS,
# Cant find the executable
# Do not just skip tests that seem to fail for an unknown reason
SKIP: {
    # condition used to skip this block of tests
    #skip($why, $how_many_in_block);
    skip("Got incorrect filename for executable", 2)
        unless is($factory->executable(), 'filename', 'executable() got the correct filename');
    ok( -e $executable_file, 'Found executable' );
    ok( $factory->version >= 2.0, 'Code tested on Amap versions >= 2.0' );
}

From jason at bioperl.org Thu Oct 19 17:44:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 10:44:51 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org> <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID:

Yikes - I was worried that it might have been me.....
Okay I'll look into fixing it -- ChrisF - check in with me before diving in, in case I've gotten it done and I expect your enzyme assays might take up the time. -jason On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote: > Actually you did that Jason: http://tinyurl.com/ye2edk > > Apparently the motivation was to "parse swissprot fields in genpept > file (dbsource)"? > > It clearly looks wrong to add the version. You've probably had a > reason why you did this at the time but if we (you :) can't recover > that I guess it's best to just fix it to do the right thing (in > both places obviously). > > -hilmar > > On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote: > >> Well there is explicit addition of the version to the primary id >> so it isn't so much a parsing error as a deliberate decision to >> append it. >> see Bio::SeqIO::genbank >> >> to make the dblink >> $annotation- >> >add_Annotation >> ('dblink', >> >> Bio::Annotation::DBLink->new >> (-primary_id >> => $id . "." . $version, >> -version => >> $version, >> -database => >> $db, >> -tagname => >> 'dblink')); >> >> and the code to print the dblink back out in the writer already >> assumes the version number is appended... >> >> foreach my $ref ( $seq->annotation->get_Annotations >> ('dblink') ) { >> # if ($ref->comment eq 'DBSOURCE') { >> $self->_print('DBSOURCE accession ', >> $ref->primary_id, "\n"); >> # } >> } >> >> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: >> >>> Here is the overload code: >>> >>> use overload '""' => sub { >>> (($_[0]->database ? $_[0]->database . ':' : '' ) >>> . ($_[0]->primary_id ? $_[0]->primary_id : '') >>> . ($_[0]->version ? '.' . $_[0]->version : '')) >>> || '' }; >>> >>> Except that the last '||' is redundant and unnecessary (it either >>> does nothing or replaces an empty string with an empty string), I >>> don't see the potential for duplicating the version number here - >>> unless primary_id() did that, which I don't see it doing. 
>>> >>> So, to me this seems to come from a parsing error in the >>> beginning, rather than an erroneous mangling of version into >>> primary_id later. >>> >>> Is someone in the position to confirm this? >>> >>> -hilmar >>> >>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: >>> >>>> So I'm unsure what we should do here. >>>> >>>> We can certainly fix the problem which you report which is >>>> relying on >>>> the "" method -- if you were to do instead: >>>> print $_->database, ":", $_->primary_id, "\n"; >>>> >>>> you'll get the right answer. We at a minimum just fix the auto- >>>> string converting method to do The Right Thing. >>>> >>>> But I am not sure if we should keep the version out of the >>>> primary_id >>>> field. This will require some rejiggering in several modules >>>> when it >>>> comes to printing DBlinks and I don't want to do this before the >>>> release. I also am not sure if there was an explicit reason why >>>> someone did put the version information in the primary_id. (I >>>> hope it >>>> wasn't me because I don't think I'm going to remember why). >>>> >>>> Does anyone else have a strong feeling? >>>> >>>> -jason >>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: >>>> >>>>> Hello, >>>>> >>>>> I noticed a little problem with the Annotation "DBLink" from >>>>> GenBank entries >>>>> >>>>> When I run: >>>>> >>>>> perl -MBio::DB::GenBank -e 'my $gi = >>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my >>>>> $seqio = >>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >>>>> ("dblink"); >>>>> for(@annotations) { print $_, "\n";} print $INC{ >>>>> "Bio/Annotation/DBLink.pm" }, "\n"; ' >>>>> >>>>> This yields: >>>>> >>>>> GenBank:AL591065.17.17 >>>>> >>>>> and the place where the used Bio/Annotation/DBLink.pm resides. >>>>> >>>>> Can others repeat this? 
>>>>> >>>>> I have dug into the source a little and Bio::Annotation::DBLink >>>>> seems to >>>>> be the place where this happens: it has a concatenation which >>>>> leads to >>>>> that repeated version number. >>>>> >>>>> It this something that I should fix "client-side", so to speak, or >>>>> is it >>>>> worthwhile to add some logic to that concatenation to prevent >>>>> this? >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Eric >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Bioperl-l mailing list >>>>> Bioperl-l at lists.open-bio.org >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>>> -- >>>> Jason Stajich, PhD >>>> Miller Research Fellow >>>> University of California >>>> Dept of Plant and Microbial Biology >>>> 321 Koshland Hall #3102 >>>> Berkeley, CA 94720-3102 >>>> lab: 510.642.8441 >>>> http://pmb.berkeley.edu/~taylor/people/js.html >>>> >>>> >>>> _______________________________________________ >>>> Bioperl-l mailing list >>>> Bioperl-l at lists.open-bio.org >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>>> >>> >>> -- >>> =========================================================== >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : >>> =========================================================== >>> >>> >>> >>> >>> >> >> -- >> Jason Stajich, PhD >> Miller Research Fellow >> University of California >> Dept of Plant and Microbial Biology >> 321 Koshland Hall #3102 >> Berkeley, CA 94720-3102 >> lab: 510.642.8441 >> http://pmb.berkeley.edu/~taylor/people/js.html >> >> > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From cjfields at uiuc.edu 
Thu Oct 19 18:03:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:03:52 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID: <000001c6f3a8$f0a46a00$15327e82@pyrimidine>

Also seems that the DBSOURCE line isn't caught correctly and stuffs it by default into a GenBank dblink (the dbsource in the test case is EMBL, not GenBank).

http://bugzilla.open-bio.org/show_bug.cgi?id=2124

It looks like NCBI may be now using:

DBSOURCE embl accession Z49548.1

instead of the old version:

DBSOURCE embl locus SCYJR048W, accession Z49548.1

I don't recall NCBI mentioning changes regarding DBSOURCE in any of the recent release notes.

Chris

> Actually you did that Jason: http://tinyurl.com/ye2edk
>
> Apparently the motivation was to "parse swissprot fields in genpept
> file (dbsource)"?
>
> It clearly looks wrong to add the version. You've probably had a
> reason why you did this at the time but if we (you :) can't recover
> that I guess it's best to just fix it to do the right thing (in both
> places obviously).
>
> -hilmar

From N.Haigh at sheffield.ac.uk Thu Oct 19 18:06:11 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:06:11 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281171.4537be93b63c9@webmail.shef.ac.uk>

> > Well, Analysis_soap.t ran immediately after, so
Amap.t tests shouldn't be
> the problem. The only issue I can think of is that Test::More TODO blocks
> require a newer version of Test::Harness (which most users have anyway).
> Are you using a TODO block?
>
> You can send me Amap.t and I'll give it a try, but I can't promise I'll get
> to it immediately (busy day).
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>

Never mind about this - it's working as expected! I got confused because a previous run threw errors that weren't included in the final table of failed tests - working now.

Nath

From N.Haigh at sheffield.ac.uk Thu Oct 19 18:14:54 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:14:54 +0100
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>

I have a few questions about how bioperl-run modules work.

1) How does a module define the name of the executable it uses?
2) Is there a way to test what this is?
3) Does $factory->executable return this name, or does it only return the name if it successfully found the executable?

Thanks
Nath

From cjfields at uiuc.edu Thu Oct 19 18:15:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:15:08 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To:
Message-ID: <000001c6f3aa$82845ba0$15327e82@pyrimidine>

Go for it. I haven't got the time to spare at the moment, sucky protein assays....

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Jason Stajich
> Sent: Thursday, October 19, 2006 12:45 PM
> To: Hilmar Lapp
> Cc: bioperl-l at lists.open-bio.org; Erikjan
> Subject: Re: [Bioperl-l] Annotation-DBLink- version numbers repeating
>
> Yikes - I was worried that it might have been me.....
>
> Okay I'll look into fixing it -- ChrisF - check in with me before
> diving in, in case I've gotten it done and I expect your enzyme
> assays might take up the time.
>
> -jason

From cjfields at uiuc.edu Thu Oct 19 18:35:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:35:08 -0500 Subject: [Bioperl-l] bioperl-run executable In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk> Message-ID: <000301c6f3ad$4ec9af10$15327e82@pyrimidine> I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase but I'm not sure. I haven't used them very much myself but plan on making wrappers at some point soon for some programs I use. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: Nathan Haigh [mailto:N.Haigh at sheffield.ac.uk] > Sent: Thursday, October 19, 2006 1:15 PM > To: Chris Fields > Cc: 'bioperl-l' > Subject: bioperl-run executable > > I have a few questions about How bioperl-run modules. > > 1) How do modules define what the name of the executable is that it uses? > 2) Is there a way to test what this is? > 3) Does $factory->executable return this or does it only return the name > if it successfully found it? > > Thanks > Nath From N.Haigh at sheffield.ac.uk Thu Oct 19 18:47:01 2006 From: N.Haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 19 Oct 2006 19:47:01 +0100 Subject: [Bioperl-l] bioperl-run executable In-Reply-To: <000301c6f3ad$4ec9af10$15327e82@pyrimidine> References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine> Message-ID: <1161283620.4537c82501c43@webmail.shef.ac.uk> Quoting Chris Fields : > I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase > but I'm not sure. I haven't used them very much myself but plan on making > wrappers at some point soon for some programs I use. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > On closer inspection of a couple of other modules (Clustalw.pm and TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME and have a sub (program_name) that simply returns this value. 
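
To make that pattern concrete, here is a minimal sketch of a hardcoded default name with an accessor on top of it. It is not code from Clustalw.pm or TCoffee.pm: the package name My::Run::Example, the default 'amap', and the -program_name constructor argument are all made up for illustration, and the accessor is written as a getter/setter only to show how a per-object override could sit next to the package-level default.

```perl
use strict;
use warnings;

# Hypothetical wrapper class for illustration; not a real bioperl-run module.
package My::Run::Example;

# Package-level default, in the style of the hardcoded $PROGRAM_NAME.
our $PROGRAM_NAME = 'amap';

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    # Assumed constructor argument: lets a caller override the default name.
    $self->program_name($args{-program_name}) if defined $args{-program_name};
    return $self;
}

# Getter/setter: a value passed in is stored in the object;
# with no stored value, the package default is returned.
sub program_name {
    my ($self, $name) = @_;
    $self->{_program_name} = $name if defined $name;
    return defined $self->{_program_name}
        ? $self->{_program_name}
        : $PROGRAM_NAME;
}

package main;

my $default = My::Run::Example->new;
my $custom  = My::Run::Example->new(-program_name => 'amap-2.0');

print $default->program_name, "\n";   # prints "amap"
print $custom->program_name,  "\n";   # prints "amap-2.0"
```

Keeping the override in the object rather than in the package variable means two factories in the same script can point at different executables without affecting each other.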
I'd like to see the program_name become a getter/setter so users can change the default and have the string stored in the factory object.

Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core, not bioperl-run? I suppose not, since bioperl-core is a prereq for bioperl-run, but wouldn't it make sense for it to go in bioperl-run?

Nath

From cjfields at uiuc.edu Thu Oct 19 19:07:05 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 14:07:05 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To:
Message-ID: <000701c6f3b1$c5914230$15327e82@pyrimidine>

Jason, Hilmar,

How about changing the default parsed dblink in SeqIO::genbank (line 520) to

if( $dbsource =~ /^(\S*?)\s*accession\s+(\S+)\.(\d+)/ ) {
    my ($db,$id,$version) = ($1,$2,$3);
    $annotation->add_Annotation
        ('dblink',
         Bio::Annotation::DBLink->new
         (-primary_id => $id,
          -version    => $version,
          -database   => $db || 'GenBank',
          -tagname    => 'dblink'));
}

It passes tests and catches the optional database ('embl' for the bugzilla report). The output sequence still doesn't print the DB if it isn't GenBank via write_seq(), but that shouldn't be too hard to fix (famous last words).

Okay, okay, back to the assays...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Jason Stajich
> Sent: Thursday, October 19, 2006 12:45 PM
> To: Hilmar Lapp
> Cc: bioperl-l at lists.open-bio.org; Erikjan
> Subject: Re: [Bioperl-l] Annotation-DBLink- version numbers repeating
>
> Yikes - I was worried that it might have been me.....
>
> Okay I'll look into fixing it -- ChrisF - check in with me before
> diving in, in case I've gotten it done and I expect your enzyme
> assays might take up the time.
> > -jason > On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote: > > > Actually you did that Jason: http://tinyurl.com/ye2edk > > > > Apparently the motivation was to "parse swissprot fields in genpept > > file (dbsource)"? > > > > It clearly looks wrong to add the version. You've probably had a > > reason why you did this at the time but if we (you :) can't recover > > that I guess it's best to just fix it to do the right thing (in > > both places obviously). > > > > -hilmar > > > > On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote: > > > >> Well there is explicit addition of the version to the primary id > >> so it isn't so much a parsing error as a deliberate decision to > >> append it. > >> see Bio::SeqIO::genbank > >> > >> to make the dblink > >> $annotation- > >> >add_Annotation > >> ('dblink', > >> > >> Bio::Annotation::DBLink->new > >> (-primary_id > >> => $id . "." . $version, > >> -version => > >> $version, > >> -database => > >> $db, > >> -tagname => > >> 'dblink')); > >> > >> and the code to print the dblink back out in the writer already > >> assumes the version number is appended... > >> > >> foreach my $ref ( $seq->annotation->get_Annotations > >> ('dblink') ) { > >> # if ($ref->comment eq 'DBSOURCE') { > >> $self->_print('DBSOURCE accession ', > >> $ref->primary_id, "\n"); > >> # } > >> } > >> > >> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > >> > >>> Here is the overload code: > >>> > >>> use overload '""' => sub { > >>> (($_[0]->database ? $_[0]->database . ':' : '' ) > >>> . ($_[0]->primary_id ? $_[0]->primary_id : '') > >>> . ($_[0]->version ? '.' . $_[0]->version : '')) > >>> || '' }; > >>> > >>> Except that the last '||' is redundant and unnecessary (it either > >>> does nothing or replaces an empty string with an empty string), I > >>> don't see the potential for duplicating the version number here - > >>> unless primary_id() did that, which I don't see it doing. 
> >>> > >>> So, to me this seems to come from a parsing error in the > >>> beginning, rather than an erroneous mangling of version into > >>> primary_id later. > >>> > >>> Is someone in the position to confirm this? > >>> > >>> -hilmar > >>> > >>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > >>> > >>>> So I'm unsure what we should do here. > >>>> > >>>> We can certainly fix the problem which you report which is > >>>> relying on > >>>> the "" method -- if you were to do instead: > >>>> print $_->database, ":", $_->primary_id, "\n"; > >>>> > >>>> you'll get the right answer. We at a minimum just fix the auto- > >>>> string converting method to do The Right Thing. > >>>> > >>>> But I am not sure if we should keep the version out of the > >>>> primary_id > >>>> field. This will require some rejiggering in several modules > >>>> when it > >>>> comes to printing DBlinks and I don't want to do this before the > >>>> release. I also am not sure if there was an explicit reason why > >>>> someone did put the version information in the primary_id. (I > >>>> hope it > >>>> wasn't me because I don't think I'm going to remember why). > >>>> > >>>> Does anyone else have a strong feeling? > >>>> > >>>> -jason > >>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: > >>>> > >>>>> Hello, > >>>>> > >>>>> I noticed a little problem with the Annotation "DBLink" from > >>>>> GenBank entries > >>>>> > >>>>> When I run: > >>>>> > >>>>> perl -MBio::DB::GenBank -e 'my $gi = > >>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my > >>>>> $seqio = > >>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my > >>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations > >>>>> ("dblink"); > >>>>> for(@annotations) { print $_, "\n";} print $INC{ > >>>>> "Bio/Annotation/DBLink.pm" }, "\n"; ' > >>>>> > >>>>> This yields: > >>>>> > >>>>> GenBank:AL591065.17.17 > >>>>> > >>>>> and the place where the used Bio/Annotation/DBLink.pm resides. 
> >>>>> > >>>>> Can others repeat this? > >>>>> > >>>>> I have dug into the source a little and Bio::Annotation::DBLink > >>>>> seems to > >>>>> be the place where this happens: it has a concatenation which > >>>>> leads to > >>>>> that repeated version number. > >>>>> > >>>>> It this something that I should fix "client-side", so to speak, or > >>>>> is it > >>>>> worthwhile to add some logic to that concatenation to prevent > >>>>> this? > >>>>> > >>>>> > >>>>> Thanks, > >>>>> > >>>>> Eric > >>>>> > >>>>> > >>>>> > >>>>> _______________________________________________ > >>>>> Bioperl-l mailing list > >>>>> Bioperl-l at lists.open-bio.org > >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>>> > >>>> -- > >>>> Jason Stajich, PhD > >>>> Miller Research Fellow > >>>> University of California > >>>> Dept of Plant and Microbial Biology > >>>> 321 Koshland Hall #3102 > >>>> Berkeley, CA 94720-3102 > >>>> lab: 510.642.8441 > >>>> http://pmb.berkeley.edu/~taylor/people/js.html > >>>> > >>>> > >>>> _______________________________________________ > >>>> Bioperl-l mailing list > >>>> Bioperl-l at lists.open-bio.org > >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l > >>>> > >>> > >>> -- > >>> =========================================================== > >>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > >>> =========================================================== > >>> > >>> > >>> > >>> > >>> > >> > >> -- > >> Jason Stajich, PhD > >> Miller Research Fellow > >> University of California > >> Dept of Plant and Microbial Biology > >> 321 Koshland Hall #3102 > >> Berkeley, CA 94720-3102 > >> lab: 510.642.8441 > >> http://pmb.berkeley.edu/~taylor/people/js.html > >> > >> > > > > -- > > =========================================================== > > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > > =========================================================== > > > > > > > > > > > > -- > Jason Stajich, PhD > Miller Research Fellow > 
University of California > Dept of Plant and Microbial Biology > 321 Koshland Hall #3102 > Berkeley, CA 94720-3102 > lab: 510.642.8441 > http://pmb.berkeley.edu/~taylor/people/js.html > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From jason at bioperl.org Thu Oct 19 18:48:28 2006 From: jason at bioperl.org (Jason Stajich) Date: Thu, 19 Oct 2006 11:48:28 -0700 Subject: [Bioperl-l] bioperl-run executable In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk> References: <002001c6f3a3$c00b9080$15327e82@pyrimidine> <1161281694.4537c09ebd0f8@webmail.shef.ac.uk> Message-ID: <67650240-D61B-4842-AE7C-75F15F608F6F@bioperl.org> program_name() Should return the name of the program executable() Is a function that you don't have to mess with that tries to find the executable named program_name() based on your PATH. -jason On Oct 19, 2006, at 11:14 AM, Nathan Haigh wrote: > I have a few questions about How bioperl-run modules. > > 1) How do modules define what the name of the executable is that it > uses? > 2) Is there a way to test what this is? > 3) Does $factory->executable return this or does it only return the > name if it successfully found it? 
> > Thanks > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From jason at bioperl.org Thu Oct 19 21:06:43 2006 From: jason at bioperl.org (Jason Stajich) Date: Thu, 19 Oct 2006 14:06:43 -0700 Subject: [Bioperl-l] bioperl-run executable In-Reply-To: <1161283620.4537c82501c43@webmail.shef.ac.uk> References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine> <1161283620.4537c82501c43@webmail.shef.ac.uk> Message-ID: It can be reset now but of course this not a very nice way of doing it: $Bio::Tools::Run::Alignment::Clustalw::PROGRAM_NAME = 'clustalw_smp'; I am not sure if there are pros and cons to making it a getter- setter, but if you want to run with it, please do. The whole run system has been hard to keep people adhering to a standard (and the standard has changed a bit) so some auditing is warranted. -jason On Oct 19, 2006, at 11:47 AM, Nathan Haigh wrote: > Quoting Chris Fields : > >> I think a lot of the bioperl-run modules use >> Bio::Tools::Run::WrapperBase >> but I'm not sure. I haven't used them very much myself but plan >> on making >> wrappers at some point soon for some programs I use. >> >> Christopher Fields >> Postdoctoral Researcher - Switzer Lab >> Dept. of Biochemistry >> University of Illinois Urbana-Champaign >> > > On closer inspection of a couple of other modules (Clustalw.pm and > TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME > and have a sub > (program_name) that simply returns this value. I'd like to see the > program_name become a getter/setter so users can change the default > and have the > string stored in the factory object. 
> > Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core > not bioperl-run? I suppose not since bioperl-core is a prerep for > bioperl-run but > wouldn't it make sence to go in bioperl-run? > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From torsten.seemann at infotech.monash.edu.au Thu Oct 19 23:24:03 2006 From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann) Date: Fri, 20 Oct 2006 09:24:03 +1000 Subject: [Bioperl-l] test::more template In-Reply-To: <1161279505.4537b811e143f@webmail.shef.ac.uk> References: <002001c6f3a3$c00b9080$15327e82@pyrimidine> <1161279505.4537b811e143f@webmail.shef.ac.uk> Message-ID: <45380913.3070506@infotech.monash.edu.au> Nathan, > use strict; > use Bio::Root::IO; # cant test for this, might be needed to get Test::More use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway, and File::Spec is "guaranteed" to be installed with Perl 5.6+. > use lib Bio::Root::IO->catfile('t','lib'); Simpler as: use lib 't/lib'; I understand the 'lib.pm' accepts Unix style directories REGARDLESS of native platform. -- Torsten Seemann Victorian Bioinformatics Consortium, Monash University, Australia From prabubio at gmail.com Fri Oct 20 00:11:36 2006 From: prabubio at gmail.com (Prabu Raja) Date: 20 Oct 2006 00:11:36 -0000 Subject: [Bioperl-l] Prabu Raja sent you this link Message-ID: <20061020001136.86586.qmail@x05.namesdatabase.com> Remember your link from Prabu Raja: http://namesdatabase.com/2m.pl?k2=40892642637&s=nsender2 1 -> Use Prabu Raja's link by clicking above. 2 -> Enter your info for a membership connected to Prabu. 
3 -> Share links with other friends, family and co-workers. 4 -> Use the members-only people search tools. Prabu selected you for this on 09-02-2004 22:52 ET. prabubio at gmail.com (Prabu Raja) initiated this to bioperl-l at lists.open-bio.org at 10-05-2006 02:30 on namesdatabase.com from the IP address 59.92.88.99. If you do not know a Prabu Raja, use http://namesdatabase.com/u.pl?bb2=40892642637&s=nsender2 to halt more reminders about this. For reference, the address of The Names Database is 1253 N. Research Way, Suite Q-2500, Orem, UT 84097. From cjfields at uiuc.edu Fri Oct 20 00:29:11 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 19:29:11 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <45380913.3070506@infotech.monash.edu.au> Message-ID: <000f01c6f3de$c3d91170$15327e82@pyrimidine> > Nathan, > > > use strict; > > use Bio::Root::IO; # cant test for this, might be needed to get > Test::More > > use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway, > and File::Spec is "guaranteed" to be installed with Perl 5.6+. > > > use lib Bio::Root::IO->catfile('t','lib'); > > Simpler as: > use lib 't/lib'; > I understand the 'lib.pm' accepts Unix style directories REGARDLESS of > native > platform. > > -- > Torsten Seemann > Victorian Bioinformatics Consortium, Monash University, Australia That is true, at least for WinXP (not sure about older Windows versions out there). I was using 'Root::IO->catfile' but found 'use lib 't/lib' works. I may have a few of the 'catfile' versions floating around out there, which may be where that originated. Note that if you plan on using Test::More with the bioperl-run test suite, you should add it to the bioperl-run CVS distribution directory in 't/lib'. Most people will have it installed, but you never know. 
Chris From cjfields at uiuc.edu Fri Oct 20 00:33:22 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 19 Oct 2006 19:33:22 -0500 Subject: [Bioperl-l] Prabu Raja sent you this link In-Reply-To: <20061020001136.86586.qmail@x05.namesdatabase.com> Message-ID: <001001c6f3df$598a24c0$15327e82@pyrimidine> That Prabu Raja sure gets around... Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Prabu Raja > Sent: Thursday, October 19, 2006 7:12 PM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Prabu Raja sent you this link > > Remember your link from Prabu Raja: > > http://namesdatabase.com/2m.pl?k2=40892642637&s=nsender2 > > > 1 -> Use Prabu Raja's link by clicking above. > > 2 -> Enter your info for a membership connected to Prabu. > > 3 -> Share links with other friends, family and co-workers. > > 4 -> Use the members-only people search tools. > > Prabu selected you for this on 09-02-2004 22:52 ET. > > > prabubio at gmail.com (Prabu Raja) initiated this to bioperl-l at lists.open- > bio.org > at 10-05-2006 02:30 on namesdatabase.com from the IP address 59.92.88.99. > If you do not know a Prabu Raja, use > http://namesdatabase.com/u.pl?bb2=40892642637&s=nsender2 to halt more > reminders about this. > For reference, the address of The Names Database is 1253 N. Research Way, > Suite Q-2500, Orem, UT 84097. 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From keithplayer at hotmail.com Fri Oct 20 02:13:52 2006
From: keithplayer at hotmail.com (Keith Player)
Date: Fri, 20 Oct 2006 02:13:52 +0000 (UTC)
Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning
Message-ID:

I know that there may be some changes resulting from new GFF3 implementations, but thought I would see if the following is useful anyway.

I implemented the R-tree binning schema as used by Bio::DB::GFF::Util::Binning and as mentioned in this article:

I tested the following query on a normal table (no binning), but it assumes that you know the longest range in the table. So, for example, take a table of human genes, where the longest gene we know of is around 2.4Mb:

    SELECT COUNT(*) as count
    FROM groups g
    WHERE g.start > max(0,[start-2.4Mb])
    AND g.start < [end]
    AND g.end > [start]
    AND g.chromosome = '1'

so for 100Mb:101Mb

    SELECT COUNT(*) as count
    FROM groups g
    WHERE g.start > 97600000
    AND g.start < 101000000
    AND g.end > 100000000
    AND g.chromosome = '1'

where [start] and [end] define the region of interest.

This query outperforms the R-Tree implementation on all tests that I have performed (for lengths of 200bp to 10Mb across a whole chromosome).

Could this be of some practical use?
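For illustration, the bounded-window idea in the query above can be sketched in plain Perl against an in-memory list instead of a database. This is only a sketch under assumptions: the feature coordinates and the overlapping() helper are invented for the example and are not part of Bio::DB::GFF; only the 2.4Mb maximum length and the 100Mb:101Mb region come from the message.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the 'groups' table: [start, end] pairs
# on a single chromosome.
my @genes = sort { $a->[0] <=> $b->[0] } (
    [  97_500_000,  97_900_000 ],   # starts before the window, ends before the region
    [  98_000_000, 100_200_000 ],   # overlaps the left edge of the region
    [ 100_300_000, 100_400_000 ],   # lies inside the region
    [ 100_900_000, 101_500_000 ],   # overlaps the right edge of the region
    [ 101_000_000, 101_200_000 ],   # starts exactly at the region end: excluded by '<'
);

# Assumed length of the longest feature in the table (2.4Mb in the message).
my $max_len = 2_400_000;

# Return the features overlapping [$start, $end], using the same three
# conditions as the SQL WHERE clause.
sub overlapping {
    my ($start, $end, $max_len, $features) = @_;
    my $lower = $start - $max_len;
    $lower = 0 if $lower < 0;        # max(0, start - max_len)
    return grep {
           $_->[0] > $lower          # g.start > lower bound (limits the scan)
        && $_->[0] < $end            # g.start < [end]
        && $_->[1] > $start          # g.end   > [start]
    } @$features;
}

my @hits = overlapping(100_000_000, 101_000_000, $max_len, \@genes);
printf "%d..%d\n", @$_ for @hits;
```

The lower bound exists only so that an index on start can limit how many rows are examined. It is safe only while no feature is longer than the assumed maximum, so that value would need to be kept up to date as features are loaded.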
From jason at bioperl.org Thu Oct 19 15:50:49 2006 From: jason at bioperl.org (Jason Stajich) Date: Thu, 19 Oct 2006 08:50:49 -0700 Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> References: <001301c6f215$07a9a070$15327e82@pyrimidine> <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl> <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org> <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net> Message-ID: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org> Well there is explicit addition of the version to the primary id so it isn't so much a parsing error as a deliberate decision to append it. see Bio::SeqIO::genbank to make the dblink $annotation- >add_Annotation ('dblink', Bio::Annotation::DBLink->new (-primary_id => $id . "." . $version, -version => $version, -database => $db, -tagname => 'dblink')); and the code to print the dblink back out in the writer already assumes the version number is appended... foreach my $ref ( $seq->annotation->get_Annotations ('dblink') ) { # if ($ref->comment eq 'DBSOURCE') { $self->_print('DBSOURCE accession ', $ref->primary_id, "\n"); # } } On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote: > Here is the overload code: > > use overload '""' => sub { > (($_[0]->database ? $_[0]->database . ':' : '' ) > . ($_[0]->primary_id ? $_[0]->primary_id : '') > . ($_[0]->version ? '.' . $_[0]->version : '')) > || '' }; > > Except that the last '||' is redundant and unnecessary (it either > does nothing or replaces an empty string with an empty string), I > don't see the potential for duplicating the version number here - > unless primary_id() did that, which I don't see it doing. > > So, to me this seems to come from a parsing error in the beginning, > rather than an erroneous mangling of version into primary_id later. > > Is someone in the position to confirm this? > > -hilmar > > On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote: > >> So I'm unsure what we should do here. 
>> >> We can certainly fix the problem which you report which is relying on >> the "" method -- if you were to do instead: >> print $_->database, ":", $_->primary_id, "\n"; >> >> you'll get the right answer. We at a minimum just fix the auto- >> string converting method to do The Right Thing. >> >> But I am not sure if we should keep the version out of the primary_id >> field. This will require some rejiggering in several modules when it >> comes to printing DBlinks and I don't want to do this before the >> release. I also am not sure if there was an explicit reason why >> someone did put the version information in the primary_id. (I hope it >> wasn't me because I don't think I'm going to remember why). >> >> Does anyone else have a strong feeling? >> >> -jason >> On Oct 17, 2006, at 12:01 PM, Erikjan wrote: >> >>> Hello, >>> >>> I noticed a little problem with the Annotation "DBLink" from >>> GenBank entries >>> >>> When I run: >>> >>> perl -MBio::DB::GenBank -e 'my $gi = >>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my >>> $seqio = >>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my >>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations >>> ("dblink"); >>> for(@annotations) { print $_, "\n";} print $INC{ >>> "Bio/Annotation/DBLink.pm" }, "\n"; ' >>> >>> This yields: >>> >>> GenBank:AL591065.17.17 >>> >>> and the place where the used Bio/Annotation/DBLink.pm resides. >>> >>> Can others repeat this? >>> >>> I have dug into the source a little and Bio::Annotation::DBLink >>> seems to >>> be the place where this happens: it has a concatenation which >>> leads to >>> that repeated version number. >>> >>> It this something that I should fix "client-side", so to speak, or >>> is it >>> worthwhile to add some logic to that concatenation to prevent this? 
>>> >>> >>> Thanks, >>> >>> Eric >>> >>> >>> >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> >> -- >> Jason Stajich, PhD >> Miller Research Fellow >> University of California >> Dept of Plant and Microbial Biology >> 321 Koshland Hall #3102 >> Berkeley, CA 94720-3102 >> lab: 510.642.8441 >> http://pmb.berkeley.edu/~taylor/people/js.html >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== > > > > > -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From n.haigh at sheffield.ac.uk Fri Oct 20 08:35:03 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 20 Oct 2006 08:35:03 +0000 Subject: [Bioperl-l] test::more template In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> Message-ID: <45388A37.7040505@sheffield.ac.uk> Chris Fields wrote: >> Nathan, >> >> >>> use strict; >>> use Bio::Root::IO; # cant test for this, might be needed to get >>> >> Test::More >> >> use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway, >> and File::Spec is "guaranteed" to be installed with Perl 5.6+. >> >> >>> use lib Bio::Root::IO->catfile('t','lib'); >>> >> Simpler as: >> use lib 't/lib'; >> I understand the 'lib.pm' accepts Unix style directories REGARDLESS of >> native >> platform. 
>> >> -- >> Torsten Seemann >> Victorian Bioinformatics Consortium, Monash University, Australia >> > > That is true, at least for WinXP (not sure about older Windows versions out > there). I was using 'Root::IO->catfile' but found 'use lib 't/lib' works. > I may have a few of the 'catfile' versions floating around out there, which > may be where that originated. > > Note that if you plan on using Test::More with the bioperl-run test suite, > you should add it to the bioperl-run CVS distribution directory in 't/lib'. > Most people will have it installed, but you never know. > > Chris > > > What is the reason for including Test::More in 't/lib' rather than having it as a prereq? -- > A: Yes. >> Q: Are you sure? >> >>> A: Because it reverses the logical flow of conversation. >>> >>>> Q: Why is top posting frowned upon? >>>> Get Thunderbird From n.haigh at sheffield.ac.uk Fri Oct 20 09:27:19 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Fri, 20 Oct 2006 10:27:19 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> Message-ID: <45389677.1000709@sheffield.ac.uk> Is it really necessary to specify the number of tests that are to be conducted in advance? It seems a bit annoying to have to count the number of tests in the script or to run the test just to see how many tests were done, we could just use: use Test::More 'no_plan'; And then it's up to Test::More to keep a track of how many tests it's run. The only thing then to worry about is how many tests are in a SKIP block if the skip criteria are met. This is unless there is a good reason to use it that I am unaware of. 
Thanks Nath From bix at sendu.me.uk Fri Oct 20 10:01:09 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 11:01:09 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45389677.1000709@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45389677.1000709@sheffield.ac.uk> Message-ID: <45389E65.6080908@sendu.me.uk> Nathan Haigh wrote: > Is it really necessary to specify the number of tests that are to be > conducted in advance? It seems a bit annoying to have to count the > number of tests in the script or to run the test just to see how many > tests were done, we could just use: > use Test::More 'no_plan'; It's very important to have a plan. That way you know all the tests actually ran and weren't skipped (either due to an actual SKIP block or an if block that returned false due to a bug, or a for/foreach/while that didn't loop enough times due to a bug, or any number of other reasons). From bix at sendu.me.uk Fri Oct 20 10:04:48 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 11:04:48 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45388A37.7040505@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45388A37.7040505@sheffield.ac.uk> Message-ID: <45389F40.5060601@sendu.me.uk> Nathan S. Haigh wrote: > Chris Fields wrote: > >> Note that if you plan on using Test::More with the bioperl-run test suite, >> you should add it to the bioperl-run CVS distribution directory in 't/lib'. >> Most people will have it installed, but you never know. > > What is the reason for including Test::More in 't/lib' rather than > having it as a prereq? Because we want to ensure that the test suite runs and tells you real problems (if any) about the code (Bioperl) that it is testing, not problems about actually running the tests (which are NOT required for using Bioperl, so cannot be considered 'pre-requisites'). 
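The two points above (an explicit plan, and a bundled Test::More as a fallback) might be combined along these lines. This is a minimal sketch, assuming the bundled copy lives in t/lib and that BIOPERLDEBUG gates tests needing network access; the test bodies are placeholders, not real BioPerl tests.

```perl
use strict;
use warnings;

BEGIN {
    # Prefer an installed Test::More; otherwise fall back to a copy
    # bundled under t/lib (assumed layout).
    eval { require Test::More; 1 } or do {
        require lib;
        lib->import('t/lib');
        require Test::More;
    };
    # Explicit plan: the run is reported as broken if fewer or more
    # than three tests are seen.
    Test::More->import( tests => 3 );
}

ok( 1 + 1 == 2, 'local test always runs' );

SKIP: {
    # The skip count has to match the number of tests in this block,
    # otherwise the plan no longer adds up.
    skip( 'set BIOPERLDEBUG=1 to run tests needing network access', 2 )
        unless $ENV{BIOPERLDEBUG};

    ok( 1, 'placeholder for a remote-database test' );
    ok( 1, 'placeholder for a second remote test' );
}
```

With a plan in place, a loop that silently runs too few iterations shows up as a plan mismatch instead of passing unnoticed, which is the failure mode 'no_plan' would hide.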
From n.haigh at sheffield.ac.uk Fri Oct 20 10:54:30 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Fri, 20 Oct 2006 11:54:30 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <45389E65.6080908@sendu.me.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk> Message-ID: <4538AAE6.5070600@sheffield.ac.uk> If there are known bugs in a particular version of software, what is the best approach for dealing with tests that would fail due to this bug? Simply skip those tests that would be affected by the bug, or to fail if the affected version is detected and report the reason so the user is informed? Or simply bump the minimum version to one above the affected versions? For example, t/Clustalw has a test for at least version 1.8. It then has some profile alignment tests that are only run if version > 1.82 is installed. It states that versions 1.81 and 1.82 are affected by a profile alignment bug - which i assume would make the tests fail. Cheers Nath From bix at sendu.me.uk Fri Oct 20 11:06:07 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 12:06:07 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <4538AAE6.5070600@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk> <4538AAE6.5070600@sheffield.ac.uk> Message-ID: <4538AD9F.8040003@sendu.me.uk> Nathan Haigh wrote: > If there are known bugs in a particular version of software, what is the > best approach for dealing with tests that would fail due to this bug? > Simply skip those tests that would be affected by the bug, or to fail if > the affected version is detected and report the reason so the user is > informed? Or simply bump the minimum version to one above the affected > versions? > > For example, t/Clustalw has a test for at least version 1.8. 
It then has > some profile alignment tests that are only run if version > 1.82 is > installed. It states that versions 1.81 and 1.82 are affected by a > profile alignment bug - which i assume would make the tests fail. Specific cases like this, I'd discuss on the list/ with the author of the module in question. Maybe there is some great need to allow usage with <1.81? My view, based purely on what you've said above, bump the pre-requisite to a version that works. From cjfields at uiuc.edu Fri Oct 20 12:36:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 07:36:37 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <45388A37.7040505@sheffield.ac.uk> References: <000f01c6f3de$c3d91170$15327e82@pyrimidine> <45388A37.7040505@sheffield.ac.uk> Message-ID: <80A2D210-B0DB-4CD2-9B56-A38097F4F63F@uiuc.edu> >> ,,, >> > What is the reason for including Test::More in 't/lib' rather than > having it as a prereq? We could do that. Many CPAN modules include it in 't/lib' b/c it is only needed for testing purposes. Chris > > -- >> A: Yes. >>> Q: Are you sure? >>> >>>> A: Because it reverses the logical flow of conversation. >>>> >>>>> Q: Why is top posting frowned upon? >>>>> > Get Thunderbird Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Oct 20 14:44:29 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 15:44:29 +0100 Subject: [Bioperl-l] Updated Makefile.PL Message-ID: <4538E0CD.1030908@sendu.me.uk> Hi, I've just committed an updated Makefile.PL to HEAD for bioperl-live. Could some people test it on multiple platforms and confirm it is ok (try out the different possible options as well)? (NB. 
in the below, 'pre-reqs' are things the makefile considers optional dependencies) Note that some pre-reqs have been removed: # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end up requiring it but only after the user makes an explicit choice by typing 'DBD::mysql' in their own code to supply as an option to Bioperl code) # File::Temp (standard in 5.6.1) This pre-req was wrong: # Data::Stag::Writer and has been replaced with: Data::Stag::XMLWriter Also, I note that very many Bioperl modules need IO::String, including Bio::SeqIO, so I'm not sure to what extent we can pretend it is an optional module. I didn't make any change though. I don't know if these changes affect the Windows ppm Nathan, or anything else (Bundle?)? The INSTALL docs need updating with these new and improved pre-reqs (note that some pre-reqs had wrong/not enough Bioperl modules listed as needing them); does someone want to correct the wiki (based on the new Makefile.PL) and then Chris can re-create the text version? From hlapp at gmx.net Fri Oct 20 15:03:34 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 20 Oct 2006 11:03:34 -0400 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <4538E0CD.1030908@sendu.me.uk> References: <4538E0CD.1030908@sendu.me.uk> Message-ID: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote: > Also, I note that very many Bioperl modules need IO::String, including > Bio::SeqIO, so I'm not sure to what extent we can pretend it is an > optional module. I didn't make any change though. I agree. There's really not that many terribly useful things you can do with Bioperl w/o having IO::String installed, which is in stark contrast to many other dependencies. I don't have a problem with making it (and a few others used all over the place) required, to better contrast them with the dependencies that are really optional (and not needed for 90% of users). 
-hilmar

--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From cjfields at uiuc.edu Fri Oct 20 15:18:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 10:18:32 -0500
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <4538E0CD.1030908@sendu.me.uk>
Message-ID: <001501c6f45b$019103c0$15327e82@pyrimidine>

> Hi,
> I've just committed an updated Makefile.PL to HEAD for bioperl-live.
> Could some people test it on multiple platforms and confirm it is ok
> (try out the different possible options as well)?
>
> (NB. in the below, 'pre-reqs' are things the makefile considers optional
> dependencies)
>
> Note that some pre-reqs have been removed:
> # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end
> up requiring it but only after the user makes an explicit choice by
> typing 'DBD::mysql' in their own code to supply as an option to Bioperl
> code)
> # File::Temp (standard in 5.6.1)

I'll try it out on WinXP and Mac OS X.

BTW, do any of Lincoln's Bio::DB* use DBD::mysql? Bio::DB::GFF comes to mind. I don't think it should be an absolute requirement, though. If we plan on removing those, then we should also remove them from Bundle::Bioperl (if they are present).

> This pre-req was wrong:
> # Data::Stag::Writer
> and has been replaced with:
> Data::Stag::XMLWriter
>
>
> Also, I note that very many Bioperl modules need IO::String, including
> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
> optional module. I didn't make any change though.

Do they all require IO::String or is it an option? There are a few instances (WebDBSeqI-implementing, for instance) where this is presented as an option for most OS's (along with the default, pipeline, and tempfile). However, it is currently used by default with Windows due to lack of pipe/fork support at the time.
BTW, the latter may now work with WinXP ActivePerl. ActiveState has been working on WinXP fork() emulation for a while, but I think it is still somewhat experimental. > I don't know if these changes affect the Windows ppm Nathan, or anything > else (Bundle?)? > > The INSTALL docs need updating with these new and improved pre-reqs > (note that some pre-reqs had wrong/not enough Bioperl modules listed as > needing them); does someone want to correct the wiki (based on the new > Makefile.PL) and then Chris can re-create the text version? Easier to just modify the text version based on what is changed in the wiki, at least for the time being. The text dumping from elinks/lynx isn't foolproof re: tables and such, which is one reason I think we should move the prereqs to a separate file as it's easier to maintain long-term (this seems to be where most changes occur anyway). Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Oct 20 15:23:38 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 16:23:38 +0100 Subject: [Bioperl-l] test::more template In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk> References: <45375615.1020603@sheffield.ac.uk> <45379BBB.1040400@sheffield.ac.uk> <1161270180.453793a432e4f@webmail.shef.ac.uk> Message-ID: <4538E9FA.60701@sendu.me.uk> Nathan Haigh wrote: > I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be > consistent with other tests. > > Failing that - Is there a good test writing style I should follow in one of the other test files?
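A test file in the style being asked about might be organised roughly as below. This is only a minimal sketch, assuming Test::More and the BIOPERLDEBUG environment variable that Bioperl tests use to switch on remote database access; fetch_remote() is a hypothetical stand-in for a real network call and is not a Bioperl function.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# A local test that needs no network access always runs.
ok(1, 'local test');

SKIP: {
    # Remote tests run only when BIOPERLDEBUG is set to a true value.
    skip('set BIOPERLDEBUG=1 to run remote database tests', 2)
        unless $ENV{BIOPERLDEBUG};

    # Trap a server failure and skip the dependent tests instead of dying.
    my $result = eval { fetch_remote() };
    skip("remote server problem: $@", 2) if $@;

    ok(defined $result, 'remote call returned something');
    is($result, 'expected', 'remote call returned the expected value');
}

# Hypothetical placeholder: a real test would query the remote server here.
sub fetch_remote { die "no network access in this sketch\n" }
```

The skip counts (2) have to match the number of tests left in the block, otherwise the plan of 3 no longer adds up.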
I originally based mine on one of Chris's EUtilities tests, but now refer to t/ESEfinder.t since it is small and demonstrates all the major tricky things you might have to do - skip remote tests if no BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests under some condition, fall-back to t/lib for Test::More if necessary. (Though I just spotted an oops in the latter...) From cjfields at uiuc.edu Fri Oct 20 15:38:02 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 10:38:02 -0500 Subject: [Bioperl-l] test::more template In-Reply-To: <4538E9FA.60701@sendu.me.uk> Message-ID: <001601c6f45d$bb824350$15327e82@pyrimidine> > Nathan Haigh wrote: > > I thought I'd have my first proper try at writing some tests. I was > wondering if there is a template test file that I should use/study in > order to be > > consistent with other tests. > > > > Failing that - Is there a good test writing style I should follow in one > of the other test files? > > I originally based mine on one of Chris's EUtilities tests, but now > refer to t/ESEfinder.t since it is small and demonstrates all the major > tricky things you might have to do - skip remote tests if no > BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests > under some condition, fall-back to t/lib for Test::More if necessary. > > (Though I just spotted an oops in the latter...) I agree. The EUtilities tests are quite long. I plan on eventually cutting out some of them. Making them somewhat less prone to changes in returned XML data has also been a pain, as demonstrated by some of the tests from MAIN now failing... d'oh! Christopher Fields Postdoctoral Researcher - Switzer Lab Dept.
of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Fri Oct 20 15:39:32 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 20 Oct 2006 16:39:32 +0100 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <001501c6f45b$019103c0$15327e82@pyrimidine> References: <001501c6f45b$019103c0$15327e82@pyrimidine> Message-ID: <4538EDB4.3030500@sendu.me.uk> Chris Fields wrote: > BTW, do any of Lincoln's Bio::DB* > use DBD::mySQL? Bio::DB::GFF comes to mind. No, just a require on a user-passed variable as I described. >> Also, I note that very many Bioperl modules need IO::String, including >> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an >> optional module. I didn't make any change though. > > Do they all require IO::String or is it an option? Oops, I take that back. Bio::SeqIO doesn't use IO::String. That's what you get for relying on grep output... It's still many modules that use it, but I suppose you could do useful things without. So actually, let's keep it optional. From cjfields at uiuc.edu Fri Oct 20 20:32:32 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 20 Oct 2006 15:32:32 -0500 Subject: [Bioperl-l] Error retrieving sequence from BioSQL Message-ID: <000001c6f486$df508930$15327e82@pyrimidine> Seth, Did you work out the problem here? There was a recent CVS update to OBDA tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests apparently left data from tests in the database, which caused problems with repeated test runs. Chris > > -----Original Message----- > > From: bioperl-l-bounces lists.open-bio.org [mailto:bioperl-l- > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > Sent: Saturday, September 30, 2006 6:35 PM > > To: Hilmar Lapp > > Cc: Chris Fields; Bioperl List > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > Here're complete test details: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > ... 
> > > FAILED tests 10-12 > > Failed 3/12 tests, 75.00% okay > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > -------------------------------------------------------------------------- > > ----- > > t\02species.t 65 2 3.08% 63 65 > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > t\16obda.t 12 3 25.00% 10-12 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- Best Regards, Seth Johnson Senior Bioinformatics Associate Ph: (202) 470-0900 Fx: (775) 251-0358 Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From olenka.m at gmail.com Fri Oct 20 21:47:15 2006 From: olenka.m at gmail.com (Olena Morozova) Date: Fri, 20 Oct 2006 14:47:15 -0700 Subject: [Bioperl-l] GO annotations Message-ID: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com> Dear all, Does anyone know an easy way to get GO-BP annotations for ensembl genes?
Thank you very much for your help, Olena From sdavis2 at mail.nih.gov Sat Oct 21 15:05:26 2006 From: sdavis2 at mail.nih.gov (Davis, Sean (NIH/NCI) [E]) Date: Sat, 21 Oct 2006 11:05:26 -0400 Subject: [Bioperl-l] GO annotations References: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com> Message-ID: <014DBF86B19310419F0DF8910FC56457240CE3@nihcesmlbx10.nih.gov> You can use the ensembl perl API, or (more simply) use the Ensembl MART interface: http://www.ensembl.org/Multi/martview Sean -----Original Message----- From: Olena Morozova [mailto:olenka.m at gmail.com] Sent: Fri 10/20/2006 5:47 PM To: bioperl-l Subject: [Bioperl-l] GO annotations Dear all, Does anyone know an easy way to get GO-BP annotations for ensembl genes? Thank you very much for your help, Olena _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Sun Oct 22 10:34:51 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Sun, 22 Oct 2006 10:34:51 +0000 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> References: <4538E0CD.1030908@sendu.me.uk> <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> Message-ID: <453B494B.7040702@sheffield.ac.uk> Hilmar Lapp wrote: > On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote: > > >> Also, I note that very many Bioperl modules need IO::String, including >> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an >> optional module. I didn't make any change though. >> > > I agree. There's really not that many terribly useful things you can > do with Bioperl w/o having IO::String installed, which is in stark > contrast to many other dependencies. > > I don't have a problem with making it (and a few others used all over > the place) required, to better contrast them with the dependencies > that are really optional (and not needed for 90% of users). 
> > -hilmar > > Is it possible to make a distinction in Makefile.PL between those modules that are an absolute must for Bioperl-core and those which are optional and should go into Bundle::BioPerl? Once I'm sure what should be "optional" I'll do the Bundle::BioPerl package and PPD's. Cheers Nath From vitacolonna at appliedgenomics.org Sun Oct 22 13:04:48 2006 From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna) Date: Sun, 22 Oct 2006 15:04:48 +0200 Subject: [Bioperl-l] Submission proposal: ABIF module Message-ID: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> Hi everybody, I would like to submit to CPAN a module for reading and parsing the ABIF files (with .ab1 suffix) produced by Applied Biosystems sequencers. The need for such a module arose in our lab because the existing ABI module we found on CPAN had too limited functionality. As an example, our module allows us to easily produce analysis reports similar to the ones generated by the Sequencing Analysis software. May I call the module Bio::ABIF? Or should I follow other conventions? Nicola From cjfields at uiuc.edu Sun Oct 22 13:54:51 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 22 Oct 2006 08:54:51 -0500 Subject: [Bioperl-l] Submission proposal: ABIF module In-Reply-To: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> Message-ID: On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote: > Hi everybody, > I would like to submit to CPAN a module for reading and parsing the > ABIF files (with .ab1 suffix) produced by Applied Biosystems > sequencers. The need for such a module arose in our lab because the > existing ABI module we found on CPAN had too limited functionality. > As an example, our module allows us to easily produce analysis > reports similar to the ones generated by the Sequencing Analysis > software. > > May I call the module Bio::ABIF? Or should I follow other conventions?
> > Nicola It depends. Does it interact with bioperl in any way? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Oct 22 13:57:18 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 22 Oct 2006 08:57:18 -0500 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <453B494B.7040702@sheffield.ac.uk> References: <4538E0CD.1030908@sendu.me.uk> <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> <453B494B.7040702@sheffield.ac.uk> Message-ID: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu> On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote: > Is it possible to make a distinction in Makefile.PL between those > modules that are an absolute must for Bioperl-core and those which are > optional and should go into Bundle::BioPerl? > > Once I'm sure what should be "option" I'll do the Bundle::BioPerl > package and PPD's. > > Cheers > Nath We probably should steer this way eventually. Do you aim on placing prereqs required for bioperl core in the bioperl PPD and the 'optional' ones with the bundle? Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From vitacolonna at appliedgenomics.org Sun Oct 22 14:16:26 2006 From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna) Date: Sun, 22 Oct 2006 16:16:26 +0200 Subject: [Bioperl-l] Submission proposal: ABIF module In-Reply-To: References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> Message-ID: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org> On 22/ott/06, at 15:54, Chris Fields wrote: > > On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote: > >> Hi everybody, >> I would like to submit to CPAN a module for reading and parsing the >> ABIF files (with .ab1 suffix) [...] >> May I call the module Bio::ABIF? Or should I follow other >> conventions? > > It depends. Does it interact with bioperl in any way? No. 
Can you suggest a suitable pattern for the name? Nicola From cjfields at uiuc.edu Sun Oct 22 14:55:46 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 22 Oct 2006 09:55:46 -0500 Subject: [Bioperl-l] Submission proposal: ABIF module In-Reply-To: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org> References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org> <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org> Message-ID: On Oct 22, 2006, at 9:16 AM, Nicola Vitacolonna wrote: > On 22/ott/06, at 15:54, Chris Fields wrote: > >> >> On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote: >> >>> Hi everybody, >>> I would like to submit to CPAN a module for reading and parsing the >>> ABIF files (with .ab1 suffix) [...] >>> May I call the module Bio::ABIF? Or should I follow other >>> conventions? >> >> It depends. Does it interact with bioperl in any way? > > No. Can you suggest a suitable pattern for the name? > > Nicola I don't think it will be a problem to name it Bio::ABIF; there is already a Bio::ASN1::EntrezGene, and Rutger Vos's Bio::Phylo modules (the latter doesn't require BioPerl either). That said, if you plan on contributing more CPAN modules with similar functionality (such as parsing other trace files), you might want to consider using a namespace that isn't limiting but doesn't conflict with Bioperl core (like Bio::Trace or similar, then name your module Bio::Trace::ABIF). You can use search.cpan.org to check namespaces for conflicts. Just as a note: we have bioperl-ext, which also parses ABI and other trace file formats. It's a bit old now and needs updating, but is supposed to be quite fast (it uses the Staden io_lib C library via PerlXS). -c Christopher Fields Postdoctoral Researcher Lab of Dr.
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From arareko at campus.iztacala.unam.mx Sun Oct 22 17:26:37 2006 From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra) Date: Sun, 22 Oct 2006 12:26:37 -0500 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <4538E0CD.1030908@sendu.me.uk> References: <4538E0CD.1030908@sendu.me.uk> Message-ID: <453BA9CD.4060107@campus.iztacala.unam.mx> Works fine on FreeBSD. Mauricio. Sendu Bala wrote: > Hi, > I've just committed an updated Makefile.PL to HEAD for bioperl-live. > Could some people test it on multiple platforms and confirm it is ok > (try out the different possible options as well)? > > (NB. in the below, 'pre-reqs' are things the makefile considers optional > dependencies) > > Note that some pre-reqs have been removed: > # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end > up requiring it but only after the user makes an explicit choice by > typing 'DBD::mysql' in their own code to supply as an option to Bioperl > code) > # File::Temp (standard in 5.6.1) > > > This pre-req was wrong: > # Data::Stag::Writer > and has been replaced with: > Data::Stag::XMLWriter > > > Also, I note that very many Bioperl modules need IO::String, including > Bio::SeqIO, so I'm not sure to what extent we can pretend it is an > optional module. I didn't make any change though. > > > I don't know if these changes affect the Windows ppm Nathan, or anything > else (Bundle?)? > > The INSTALL docs need updating with these new and improved pre-reqs > (note that some pre-reqs had wrong/not enough Bioperl modules listed as > needing them); does someone want to correct the wiki (based on the new > Makefile.PL) and then Chris can re-create the text version? 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- MAURICIO HERRERA CUADRA arareko at campus.iztacala.unam.mx Laboratorio de Genética Unidad de Morfofisiología y Función Facultad de Estudios Superiores Iztacala, UNAM From n.haigh at sheffield.ac.uk Sun Oct 22 19:37:07 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Sun, 22 Oct 2006 20:37:07 +0100 Subject: [Bioperl-l] Updated Makefile.PL In-Reply-To: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu> References: <4538E0CD.1030908@sendu.me.uk> <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net> <453B494B.7040702@sheffield.ac.uk> <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu> Message-ID: <453BC863.4090803@sheffield.ac.uk> Chris Fields wrote: > > On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote: > >> Is it possible to make a distinction in Makefile.PL between those >> modules that are an absolute must for Bioperl-core and those which are >> optional and should go into Bundle::BioPerl? >> >> Once I'm sure what should be "option" I'll do the Bundle::BioPerl >> package and PPD's. >> >> Cheers >> Nath > > We probably should steer this way eventually. Do you aim on placing > prereqs required for bioperl core in the bioperl PPD and the > 'optional' ones with the bundle? > That's correct. However, PPM will always try to update packages to the latest available. Therefore, if at some point in the future, a dependency is removed, and thus removed from Bundle::BioPerl, a situation may arise where an older version of BioPerl is running with a recent version of Bundle::BioPerl and could have missing dependencies - not ideal but it is how things currently stand. The process of making the Bundle::BioPerl PPD would be simplified if these "optional" dependencies are separated from the "core" dependencies.
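The separation described above could be sketched as two hashes of the kind a Makefile.PL might hold. This is an illustration only: the module names come from this thread, but which of them count as "core" is an assumption (all of them are currently treated as optional), and missing_modules() is a hypothetical helper. The merged hash has the shape that WriteMakefile() accepts as PREREQ_PM.

```perl
use strict;
use warnings;

# Hypothetical split of external dependencies; which module belongs in
# which group is an assumption made purely for illustration.
my %core = (
    'IO::String' => 0,
);
my %optional = (
    'Data::Stag::XMLWriter'   => 0,
    'HTTP::Request::Common'   => 0,
    'Spreadsheet::ParseExcel' => 0,
);

# Return the sorted names of modules in the given hash that fail to load.
sub missing_modules {
    my ($wanted) = @_;
    my @missing;
    for my $module (sort keys %$wanted) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        push @missing, $module unless eval { require $file; 1 };
    }
    return @missing;
}

# Merged hash, in the form WriteMakefile(PREREQ_PM => \%prereq_pm) expects.
my %prereq_pm = (%core, %optional);

print "Missing core modules: @{[ missing_modules(\%core) ]}\n";
print "Missing optional modules: @{[ missing_modules(\%optional) ]}\n";
```

Keeping the two groups apart like this would leave the optional list available on its own, for example as the starting point for a Bundle::BioPerl contents list.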
If one of the following solutions is possible (I'm not sure if they are), it would be very useful: 1) Maintain 2 hashes in Makefile.PL that contain the "core" and "optional" dependencies. I'm unsure of the way dependencies are ordered during a "make ppd", but it may be possible to pass hash references of both to PREREQ_PM in WriteMakefile() and have the "optional" dependencies grouped separately from "core" dependencies in the ppd file - thus making it easy to strip them out into a Bundle::BioPerl ppd. 2) Again, maintain 2 hashes in Makefile.PL that contain the "core" and "optional" dependencies. Have some Makefile setup that allows the generation of a Bundle::BioPerl ppd separately from the main Bioperl ppd. Like I said, these are just some thoughts and I'm not sure if they are even viable options. Nath From chhalling at alumni.ls.berkeley.edu Sun Oct 22 23:45:33 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Sun, 22 Oct 2006 19:45:33 -0400 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl Message-ID: <453C029D.1070708@alumni.ls.berkeley.edu> I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 that prevent these modules from being installed: Data::Stag::Writer (listed as Data::Stag::writer) HTTP::Request::Common (listed as HTTP::Request::Common-) Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel) -- Conrad Halling chhalling at alumni.ls.berkeley.edu From cjfields at uiuc.edu Mon Oct 23 02:24:07 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 22 Oct 2006 21:24:07 -0500 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> Message-ID: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> Thanks for letting us know! Did PPM4 throw errors or just silently pass them over?
Chris On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote: > I have found three misspellings in Bundle::BioPerl 2.1.6 of 17- > Oct-2006 > that prevent these modules from being installed: > > Data::Stag::Writer (listed as Data::Stag::writer) > HTTP::Request::Common (listed as HTTP::Request::Common-) > Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel) > > -- > Conrad Halling > chhalling at alumni.ls.berkeley.edu > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Mon Oct 23 06:45:29 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 06:45:29 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> Message-ID: <453C6509.90005@sheffield.ac.uk> Chris Fields wrote: > Thanks for letting us know! Did PPM4 throw errors or just silently > pass them over? > > Chris > > On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote: > > I believe he is talking about the bundle on cpan and not the ppd. I will get this updated as soon as possible. Sendu/Chris - can you confirm to me which Bioperl modules are essential to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any reason for not putting *all* dependencies into the bundle? 
Nath From bix at sendu.me.uk Mon Oct 23 06:43:36 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 07:43:36 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> Message-ID: <453C6498.5@sendu.me.uk> Conrad Halling wrote: > I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 > that prevent these modules from being installed: > > Data::Stag::Writer (listed as Data::Stag::writer) This should be Data::Stag::XMLWriter > HTTP::Request::Common (listed as HTTP::Request::Common-) > Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel) From bix at sendu.me.uk Mon Oct 23 06:52:47 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 07:52:47 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C6509.90005@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> Message-ID: <453C66BF.1060008@sendu.me.uk> Nathan S. Haigh wrote: > Sendu/Chris - can you confirm to me which Bioperl modules are essential > to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any > reason for not putting *all* dependencies into the bundle? AFAIK, there are no essential external dependencies. Everything in %packages in Makefile.PL, for example, is optional. We had the discussion about making all the easy-to-install ones a forced requirement anyway (so that most things work out of the box), but perhaps we'll hold off on making such a change until after 1.5.2. From jyotikshah at gmail.com Mon Oct 23 07:10:43 2006 From: jyotikshah at gmail.com (Jyoti Shah) Date: Mon, 23 Oct 2006 00:10:43 -0700 Subject: [Bioperl-l] short motif searches Message-ID: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com> Hi, I am interested in searching motifs as small as 6 or 7 nucleotides in genomic databases. 
I need exact matches. Is there any bioperl module available which can help me do this? I tried WU BLAST with word size one, but I am getting warning messages such as "WARNING: the maximum achievable score of 7 in context 0 (frame +1) is less than the ungapped cutoff score S2 (=13). Exit code 0...". Any suggestions? Thanks in advance, Jyoti From bix at sendu.me.uk Mon Oct 23 07:55:40 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 08:55:40 +0100 Subject: [Bioperl-l] short motif searches In-Reply-To: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com> References: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com> Message-ID: <453C757C.1010408@sendu.me.uk> Jyoti Shah wrote: > Hi, > > I am interested in searching motifs as small as 6 or 7 nucleotides in > genomic databases. I need exact matches. Is there any bioperl module > available which can help me do this? At 6 or 7bp long doing a simple exact match I should point out you're going to get very many hits; are you sure this is an appropriate thing to do for your purposes? Assuming yes, you can use Bio::SeqIO, Bio::Index or Bio::DB:: to get your genomic sequences of interest, then simply use a normal perl regexp on the resulting $seq->seq strings. If your motifs are anything like transcription factor binding sites, and you have more information than just a single sequence string for the motif, investigate Bio::Matrix::PSM. From bix at sendu.me.uk Mon Oct 23 08:29:52 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 09:29:52 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C7648.8030004@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> Message-ID: <453C7D80.80207@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: >> Nathan S. 
Haigh wrote: >>> Sendu/Chris - can you confirm to me which Bioperl modules are essential >>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any >>> reason for not putting *all* dependencies into the bundle? >> AFAIK, there are no essential external dependencies. Everything in >> %packages in Makefile.PL, for example, is optional. >> >> We had the discussion about making all the easy-to-install ones a >> forced requirement anyway (so that most things work out of the box), >> but perhaps we'll hold off on making such a change until after 1.5.2. > > How are they forced? They're not. Right now they're optional. I'm suggesting we might change that in the future. If you're asking how we /would/ force them, probably by adding PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs successfully (or should!) without its optional dependencies given in PREREQ_PM because make test succeeds (because tests skip ok when the optional dependency isn't there). I don't really know how CPAN discovers dependencies and auto-installs them before a dependent module though. Anyone care to explain? From n.haigh at sheffield.ac.uk Mon Oct 23 10:09:12 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 10:09:12 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C7D80.80207@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> Message-ID: <453C94C8.5040900@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> Sendu Bala wrote: >>> Nathan S. Haigh wrote: >>>> Sendu/Chris - can you confirm to me which Bioperl modules are >>>> essential >>>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any >>>> reason for not putting *all* dependencies into the bundle? 
>>> AFAIK, there are no essential external dependencies. Everything in >>> %packages in Makefile.PL, for example, is optional. >>> >>> We had the discussion about making all the easy-to-install ones a >>> forced requirement anyway (so that most things work out of the box), >>> but perhaps we'll hold off on making such a change until after 1.5.2. > > >> How are they forced? > > They're not. Right now they're optional. I'm suggesting we might > change that in the future. > If you're asking how we /would/ force them, probably by adding > PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs > successfully (or should!) without its optional dependencies given in > PREREQ_PM because make test succeeds (because tests skip ok when the > optional dependency isn't there). > > I don't really know how CPAN discovers dependencies and auto-installs > them before a dependent module though. Anyone care to explain? I thought so! I misunderstood something earlier which confused me. Just to clarify for my own sanities sake: 1) Currently all dependencies are optional. 2) All dependencies are in %packages 3) all these are passed to PREREQ_PM As far as CPAN discovering dependencies, here is a snip from the CPAN FAQ's: --snip-- I installed a Bundle and had a couple of fails. When I retried, everything resolved nicely. Can this be fixed to work on first try? The reason for this is that CPAN does not know the dependencies of all modules when it starts out. To decide about the additional items to install, it just uses data found in the META.yml file or the generated Makefile. An undetected missing piece breaks the process. But it may well be that your Bundle installs some prerequisite later than some depending item and thus your second try is able to resolve everything. Please note, CPAN.pm does not know the dependency tree in advance and cannot sort the queue of things to install in a topologically correct order. 
It resolves perfectly well IF all modules declare the prerequisites correctly with the PREREQ_PM attribute to MakeMaker or the |requires| stanza of Module::Build. For bundles which fail and you need to install often, it is recommended to sort the Bundle definition file manually. --snip-- Therefore, recent modifications to Makefile.PL should result in a fully operational Bioperl installation, if installed via CPAN. Only Bioperl 1.4 is currently available via CPAN, although it is possible to upload a developer release to CPAN which can only be downloaded via CPAN if specifically asked for - would be good for 1.5.x.: --snip-- How do I install a "DEVELOPER RELEASE" of a module? By default, CPAN will install the latest non-developer release of a module. If you want to install a dev release, you have to specify the partial path starting with the author id to the tarball you wish to install, like so: cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz Note that you can use the |ls| command to get this path listed. --snip-- HTH Nath From bix at sendu.me.uk Mon Oct 23 09:41:52 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 10:41:52 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C94C8.5040900@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> Message-ID: <453C8E60.7000105@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: > >> I don't really know how CPAN discovers dependencies and auto-installs >> them before a dependent module though. Anyone care to explain? > > I thought so! I misunderstood something earlier which confused me. Just > to clarify for my own sanities sake: > > 1) Currently all dependencies are optional.
> 2) All dependencies are in %packages > 3) all these are passed to PREREQ_PM All correct. > As far as CPAN discovering dependencies, here is a snip from the CPAN FAQ's: > --snip-- > > I installed a Bundle and had a couple of fails. When I retried, > everything resolved nicely. Can this be fixed to work on first try? > > The reason for this is that CPAN does not know the dependencies of > all modules when it starts out. To decide about the additional items > to install, it just uses data found in the META.yml file or the > generated Makefile. An undetected missing piece breaks the process. > But it may well be that your Bundle installs some prerequisite later > than some depending item and thus your second try is able to resolve > everything. Please note, CPAN.pm does not know the dependency tree > in advance and cannot sort the queue of things to install in a > topologically correct order. It resolves perfectly well IF all > modules declare the prerequisites correctly with the PREREQ_PM > attribute to MakeMaker or the |requires| stanza of Module::Build. > For bundles which fail and you need to install often, it is > recommended to sort the Bundle definition file manually. > > --snip-- > > Therefore, recent modifications to Makefile.PL should result in a fully > operational Bioperl installation, if installed via CPAN. Right, thanks for that. > Although only Bioperl 1.4 is available via CPAN currently. It is possible to upload a > developer release to CPAN which can only be ownloaded via CPAN if > specifically asked for - would be good for 1.5.x.: > --snip-- > > How do I install a "DEVELOPER RELEASE" of a module? > > By default, CPAN will install the latest non-developer release of a > module. If you want to install a dev release, you have to specify > the partial path starting with the author id to the tarball you wish > to install, like so: > > cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz > > Note that you can use the |ls| command to get this path listed. 
> > --snip-- That's the user point of view - how does the developer actually tell CPAN that something is a developer release so that normal users don't automatically install it? From bix at sendu.me.uk Mon Oct 23 09:59:52 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 10:59:52 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C8E60.7000105@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> Message-ID: <453C9298.9000900@sendu.me.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> As far as CPAN discovering dependencies, here is a snip from the CPAN >> FAQ's: >> --snip-- >> >> I installed a Bundle and had a couple of fails. When I retried, >> everything resolved nicely. Can this be fixed to work on first try? >> >> The reason for this is that CPAN does not know the dependencies of >> all modules when it starts out. To decide about the additional items >> to install, it just uses data found in the META.yml file or the >> generated Makefile. An undetected missing piece breaks the process. >> But it may well be that your Bundle installs some prerequisite later >> than some depending item and thus your second try is able to resolve >> everything. Please note, CPAN.pm does not know the dependency tree >> in advance and cannot sort the queue of things to install in a >> topologically correct order. It resolves perfectly well IF all >> modules declare the prerequisites correctly with the PREREQ_PM >> attribute to MakeMaker or the |requires| stanza of Module::Build. >> For bundles which fail and you need to install often, it is >> recommended to sort the Bundle definition file manually. 
>> >> --snip-- >> >> Therefore, recent modifications to Makefile.PL should result in a fully >> operational Bioperl installation, if installed via CPAN. > > Right, thanks for that. Oh, so this effectively means that our 'optional' dependencies are installed for CPAN users, which matches up to my 'force the optional ones anyway' desire, leaving Bundle::BioPerl without any use. Makefile.PL could be altered again to remove from PREREQ_PM those modules the user didn't already have installed, thus CPAN would only install Bioperl itself and nothing optional. The user could then install Bundle::BioPerl if they wanted a quick way of getting all the optional stuff to work. I'm happy either way; what do other people think? From n.haigh at sheffield.ac.uk Mon Oct 23 11:22:17 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 11:22:17 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C9298.9000900@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> Message-ID: <453CA5E9.1060406@sheffield.ac.uk> Sendu Bala wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> As far as CPAN discovering dependencies, here is a snip from the >>> CPAN FAQ's: >>> --snip-- >>> >>> I installed a Bundle and had a couple of fails. When I retried, >>> everything resolved nicely. Can this be fixed to work on first try? >>> >>> The reason for this is that CPAN does not know the dependencies of >>> all modules when it starts out. To decide about the additional >>> items >>> to install, it just uses data found in the META.yml file or the >>> generated Makefile. An undetected missing piece breaks the process. 
>>> But it may well be that your Bundle installs some prerequisite >>> later >>> than some depending item and thus your second try is able to >>> resolve >>> everything. Please note, CPAN.pm does not know the dependency tree >>> in advance and cannot sort the queue of things to install in a >>> topologically correct order. It resolves perfectly well IF all >>> modules declare the prerequisites correctly with the PREREQ_PM >>> attribute to MakeMaker or the |requires| stanza of Module::Build. >>> For bundles which fail and you need to install often, it is >>> recommended to sort the Bundle definition file manually. >>> >>> --snip-- >>> >>> Therefore, recent modifications to Makefile.PL should result in a fully >>> operational Bioperl installation, if installed via CPAN. >> >> Right, thanks for that. > > Oh, so this effectively means that our 'optional' dependencies are > installed for CPAN users, which matches up to my 'force the optional > ones anyway' desire, leaving Bundle::BioPerl without any use. > > Makefile.PL could be altered again to remove from PREREQ_PM those > modules the user didn't already have installed, thus CPAN would only > install Bioperl itself and nothing optional. The user could then > install Bundle::BioPerl if they wanted a quick way of getting all the > optional stuff to work. > > I'm happy either way; what do other people think? From my point of view, removing them from PREREQ_PM makes building Bundle::BioPerl a bit of a pain :o( I prefer the way it is currently set up - most people have fast internet connections and gigabytes of hard drive space. Other than the reason "why install something I won't ever need" I don't see much point in maintaining Bundle::BioPerl and having "optional" dependencies. I think if there are any modules which are not going to be used by the majority of users, then this could be used as the rationale for moving them out of bioperl-core into another package?
Nath From n.haigh at sheffield.ac.uk Mon Oct 23 11:38:05 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 11:38:05 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C8E60.7000105@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> Message-ID: <453CA99D.9060009@sheffield.ac.uk> >> Although only Bioperl 1.4 is available via CPAN currently. It is >> possible to upload a >> developer release to CPAN which can only be downloaded via CPAN if >> specifically asked for - would be good for 1.5.x.: >> --snip-- >> >> How do I install a "DEVELOPER RELEASE" of a module? >> >> By default, CPAN will install the latest non-developer release of a >> module. If you want to install a dev release, you have to specify >> the partial path starting with the author id to the tarball you wish >> to install, like so: >> >> cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz >> >> Note that you can use the |ls| command to get this path listed. >> >> --snip-- > > That's the user point of view - how does the developer actually tell > CPAN that something is a developer release so that normal users don't > automatically install it? I found this: http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt It says that $VERSION should simply be changed from a naked number into a single-quoted number and this should be recognized by the CPAN indexer.
Nath From bix at sendu.me.uk Mon Oct 23 10:47:38 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 11:47:38 +0100 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> Message-ID: <453C9DCA.4020802@sendu.me.uk> Hilmar Lapp wrote: > On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote: > >> For example, I have made no effort to setup biosql-schema but I >> thought that maybe there would be a test that would detect this > > I'm afraid there isn't. Bioperl-db is meaningless without > biosql-schema. Can you suggest a way we might detect if biosql-schema has been installed prior to running the test suite, so we can give some meaningful error message? From bix at sendu.me.uk Mon Oct 23 12:43:30 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 13:43:30 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> Message-ID: <453CB8F2.7070703@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: > >> Makefile.PL could be altered again to remove from PREREQ_PM those >> modules the user didn't already have installed, thus CPAN would only >> install Bioperl itself and nothing optional. The user could then >> install Bundle::BioPerl if they wanted a quick way of getting all the >> optional stuff to work. >> >> I'm happy either way; what do other people think? 
> > From my point of view, removing them from PREREQ_PM means building the > Bundle::BioPerl a bit of a pain :o( Can I ask how you're generating Bundle::BioPerl? That is, how did the typos get in there? Is there a way to certainly avoid typos in the future? From n.haigh at sheffield.ac.uk Mon Oct 23 13:46:17 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 13:46:17 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CB8F2.7070703@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> <453CB8F2.7070703@sendu.me.uk> Message-ID: <453CC7A9.6090609@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: >> Sendu Bala wrote: >> >>> Makefile.PL could be altered again to remove from PREREQ_PM those >>> modules the user didn't already have installed, thus CPAN would only >>> install Bioperl itself and nothing optional. The user could then >>> install Bundle::BioPerl if they wanted a quick way of getting all the >>> optional stuff to work. >>> >>> I'm happy either way; what do other people think? > > >> From my point of view, removing them from PREREQ_PM means building the >> Bundle::BioPerl a bit of a pain :o( > > Can I ask how you're generating Bundle::BioPerl? That is, how did the > typos get in there? Is there a way to certainly avoid typos in the > future? I just modified the list by hand a while back :o( - I'm sure there must be a better way. 
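One way to avoid hand-typed module names would be to generate the Bundle's CONTENTS section from the same hash that feeds PREREQ_PM. The following is only a minimal sketch: the %packages hash here is a hypothetical stand-in (module name => minimum version) and the real hash in Makefile.PL may be laid out differently, and bundle_contents() is a made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the %packages hash in Makefile.PL.
# Assumed layout: module name => minimum version ('0' means "any version").
my %packages = (
    'IO::String'  => '0',
    'XML::Parser' => '0',
    'Graph'       => '0.50',
);

# Build the body of a Bundle's "=head1 CONTENTS" POD section:
# one "Module [version]" entry per module, sorted by name.
sub bundle_contents {
    my (%pkgs) = @_;
    my @entries;
    for my $module (sort keys %pkgs) {
        my $version = $pkgs{$module};
        # Only print a version when a real minimum was given.
        push @entries, $version ? "$module $version" : $module;
    }
    return join("\n\n", @entries) . "\n";
}

print "=head1 CONTENTS\n\n", bundle_contents(%packages);
```

Since the names are copied from the hash rather than retyped, a typo could only come from the hash itself, where a failing prerequisite check would presumably expose it.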
From bix at sendu.me.uk Mon Oct 23 12:58:13 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 13:58:13 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CC7A9.6090609@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> <453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk> Message-ID: <453CBC65.2020202@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: >> Nathan S. Haigh wrote: >>> Sendu Bala wrote: >>> >>>> Makefile.PL could be altered again to remove from PREREQ_PM those >>>> modules the user didn't already have installed, thus CPAN would only >>>> install Bioperl itself and nothing optional. The user could then >>>> install Bundle::BioPerl if they wanted a quick way of getting all the >>>> optional stuff to work. >>>> >>>> I'm happy either way; what do other people think? >>> >>> From my point of view, removing them from PREREQ_PM means building the >>> Bundle::BioPerl a bit of a pain :o( >> >> Can I ask how you're generating Bundle::BioPerl? That is, how did the >> typos get in there? Is there a way to certainly avoid typos in the >> future? > > I just modified the list by hand a while back :o( - I'm sure there must > be a better way. I'm not sure I understand why removing things from PREREQ_PM would be a problem for you then; the %packages hash would remain unchanged (ie. have everything) so you have something to refer to when manually editing the Bundle. http://www.cpan.org/misc/cpan-faq.html#How_make_bundle might be helpful? I didn't really pay too much attention to the advice - does it offer a typo-avoiding solution? 
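For reference, the idea of trimming PREREQ_PM down to modules the user already has, while leaving %packages untouched, could look roughly like the sketch below. It is not the actual Makefile.PL code: %packages is again a hypothetical stand-in, and is_installed() and installed_prereqs() are illustrative names.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the full %packages hash (module => min version).
my %packages = (
    'File::Spec'            => '0',
    'No::Such::Module::Zzz' => '0',   # deliberately missing module
);

# True if the named module can be loaded on this system.
sub is_installed {
    my ($module) = @_;
    # Refuse anything that is not a plain package name before string-eval.
    return 0 unless $module =~ /\A\w+(?:::\w+)*\z/;
    return eval "require $module; 1" ? 1 : 0;
}

# Keep only the entries whose module is already installed.
sub installed_prereqs {
    my (%pkgs) = @_;
    return map { $_ => $pkgs{$_} } grep { is_installed($_) } sort keys %pkgs;
}

# %packages keeps everything; only the filtered copy would go to PREREQ_PM.
my %prereq_pm = installed_prereqs(%packages);
print "PREREQ_PM would contain: ", join(', ', sort keys %prereq_pm), "\n";
```

With this arrangement the complete list stays available as a reference for editing the Bundle, and CPAN would only be asked about modules that are already present.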
From n.haigh at sheffield.ac.uk Mon Oct 23 14:04:12 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 14:04:12 +0000 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CBC65.2020202@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk> <453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk> <453CBC65.2020202@sendu.me.uk> Message-ID: <453CCBDC.6030904@sheffield.ac.uk> > I'm not sure I understand why removing things from PREREQ_PM would be > a problem for you then; the %packages hash would remain unchanged (ie. > have everything) so you have something to refer to when manually > editing the Bundle. > > http://www.cpan.org/misc/cpan-faq.html#How_make_bundle > might be helpful? I didn't really pay too much attention to the advice > - does it offer a typo-avoiding solution? It's helpful in producing the Bundle PPD as all the XML tags are present in the Bioperl PPD and they simply need to be copied over to a Bundle-BioPerl PPD file. Looks like manual editing of the relevant file is required for making a CPAN bundle. Unfortunately - no typo-avoiding solution. 
:o( From dhoworth at mrc-lmb.cam.ac.uk Mon Oct 23 12:46:29 2006 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Mon, 23 Oct 2006 13:46:29 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CA99D.9060009@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> Message-ID: <453CB9A5.2020409@mrc-lmb.cam.ac.uk> >> That's the user point of view - how does the developer actually tell >> CPAN that something is a developer release so that normal users don't >> automatically install it? > > I found this: > http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt > > It says that $VERSION should simply be changed from a naked number into > a single-quoted number and this should be recognized by the CPAN indexer. Cheers, Dave From hlapp at gmx.net Mon Oct 23 13:40:29 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Mon, 23 Oct 2006 09:40:29 -0400 Subject: [Bioperl-l] Bioperl 1.5.2 RC2 In-Reply-To: <453C9DCA.4020802@sendu.me.uk> References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk> <4534E207.8030508@sendu.me.uk> <45350BA6.3040102@genomics.dk> <4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk> <453C9DCA.4020802@sendu.me.uk> Message-ID: <5C22B9C8-CEF0-457B-8565-793D56389A86@gmx.net> You would need a lot of information to make that determination (host, port, db driver, db name, user, password; i.e., the entire connection information, and there is no 'standard'). You might just ask a simple question in Makefile.PL as to whether biosql is installed or not, similar to the DB::GFF tests.
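A rough sketch of what such a question in Makefile.PL might look like, using the prompt() function that ExtUtils::MakeMaker exports. The wording of the question and the answer_is_yes() helper are assumptions made up for illustration; this is not how the DB::GFF tests actually do it.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker qw(prompt);

# Treat any answer beginning with "y" or "Y" as a yes.
sub answer_is_yes {
    my ($answer) = @_;
    return $answer =~ /^\s*y/i ? 1 : 0;
}

# prompt() falls back to the default ('n') when run non-interactively,
# so automated installs are not blocked waiting for input.
my $answer = prompt(
    'Have you already loaded the biosql-schema into a database? y/n', 'n'
);

if (answer_is_yes($answer)) {
    print "OK, the database tests can be attempted.\n";
}
else {
    print "biosql-schema is required by bioperl-db; ",
          "the database tests will not be meaningful without it.\n";
}
```

The answer only records what the user claims; checking that the schema is really there would still need the full connection details mentioned above.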
-hilmar On Oct 23, 2006, at 6:47 AM, Sendu Bala wrote: > Hilmar Lapp wrote: >> On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote: >> >>> For example, I have made no effort to setup biosql-schema but I >>> thought that maybe there would be a test that would detect this >> >> I'm afraid there isn't. Bioperl-db is meaningless without >> biosql-schema. > > Can you suggest a way we might detect if biosql-schema has been > installed prior to running the test suite, so we can give some > meaningful error message? > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Mon Oct 23 13:59:23 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 14:59:23 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CB9A5.2020409@mrc-lmb.cam.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> Message-ID: <453CCABB.2060308@sendu.me.uk> Dave Howorth wrote: >>> That's the user point of view - how does the developer actually tell >>> CPAN that something is a developer release so that normal users don't >>> automatically install it? >> I found this: >> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >> >> Is says that $VERSION should simply be changed from a naked number into >> a single quoted number and this should be recognized by the CPAN indexer. > > Thanks for that. 
I guess from that the 1.5.2 version number should be: $VERSION = 1.05_02 And 1.6 would be $VERSION = 1.06 But will this cause a problem wrt 1.4? 1.4 has: $VERSION = 1.4; Is 1.4 lower than 1.06? Should we keep to a single digit version, so 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them version fifty and version sixty? 1.50_02, 1.60? From cjfields at uiuc.edu Mon Oct 23 14:12:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 09:12:16 -0500 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C9298.9000900@sendu.me.uk> Message-ID: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> ... > > Right, thanks for that. > > Oh, so this effectively means that our 'optional' dependencies are > installed for CPAN users, which matches up to my 'force the optional > ones anyway' desire, leaving Bundle::BioPerl without any use. > > Makefile.PL could be altered again to remove from PREREQ_PM those > modules the user didn't already have installed, thus CPAN would only > install Bioperl itself and nothing optional. The user could then install > Bundle::BioPerl if they wanted a quick way of getting all the optional > stuff to work. > > I'm happy either way; what do other people think? I think that we should have it so Bioperl installs as-is (no additional reqs) and have Bundle::BioPerl used as a convenient way to install all optional modules for full functionality. The catch is to make sure that any optional installations do not crash tests during a CPAN bioperl installation, otherwise they aren't considered optional by CPAN, and the install won't work without forcing it. Frankly, most users will find themselves wanting to install the Bundle anyway to get full functionality, so we could always 'strongly recommend' preceding the bioperl installation with a Bundle::Bioperl CPAN installation to avoid problems, at least for this release. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 23 14:23:04 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 09:23:04 -0500 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk> Message-ID: <002101c6f6ae$c14d7860$15327e82@pyrimidine> ... > >> Right, thanks for that. > > > > Oh, so this effectively means that our 'optional' dependencies are > > installed for CPAN users, which matches up to my 'force the optional > > ones anyway' desire, leaving Bundle::BioPerl without any use. > > > > Makefile.PL could be altered again to remove from PREREQ_PM those > > modules the user didn't already have installed, thus CPAN would only > > install Bioperl itself and nothing optional. The user could then > > install Bundle::BioPerl if they wanted a quick way of getting all the > > optional stuff to work. > > > > I'm happy either way; what do other people think? > >From my point of view, removing them from PREREQ_PM means building the > Bundle::BioPerl a bit of a pain :o( > > I prefer the way it is currently set up - most people have fast internet > connections and GB of harddrive space. Other than the reason "why > install something I won't ever need" I don't see much point maintaining > Bundle::BioPerl and having "optional" dependencies. I think if there are > any modules which are not going to be used by the majority of users, > then this could be used as the rationale for removing them from > bioperl-core into another package? > > Nath I think you'll likely find it much easier to maintain a Bundle package long-term and indicate that it should be installed along with bioperl, than to have users complain about a particular Bioperl module failing b/c a particular dependency wasn't installed. 
If we have the Bundle around in CPAN and in PPM for Win32 users, and indicate in the INSTALL docs and the wiki our preference that it be installed prior to or along with a Bioperl installation for beginners, we can mitigate most of those problems. Nip it in the bud, to quote a Mr. Barney Fife. My 2c Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Mon Oct 23 14:29:33 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 09:29:33 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CCABB.2060308@sendu.me.uk> Message-ID: <002201c6f6af$a91e4200$15327e82@pyrimidine> > Dave Howorth wrote: > >>> That's the user point of view - how does the developer actually tell > >>> CPAN that something is a developer release so that normal users don't > >>> automatically install it? > >> I found this: > >> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt > >> > >> Is says that $VERSION should simply be changed from a naked number into > >> a single quoted number and this should be recognized by the CPAN > indexer. > > > > 5.8.8/pod/perlmodstyle.pod#Version_numbering> > > Thanks for that. > > I guess from that the 1.5.2 version number should be: > > $VERSION = 1.05_02 > > And 1.6 would be > > $VERSION = 1.06 > > But will this cause a problem wrt 1.4? 1.4 has: > > $VERSION = 1.4; > > Is 1.4 lower than 1.06? Should we keep to a single digit version, so > 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them > version fifty and version sixty? 1.50_02, 1.60? Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax? It would be much simpler to use that. Simon Cozens wrote about this a while back: http://www.perl.com/pub/a/2000/04/whatsnew.html ... Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Mon Oct 23 14:41:24 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 15:41:24 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <002201c6f6af$a91e4200$15327e82@pyrimidine> References: <002201c6f6af$a91e4200$15327e82@pyrimidine> Message-ID: <453CD494.8070905@sendu.me.uk> Chris Fields wrote: >> Dave Howorth wrote: >>>>> That's the user point of view - how does the developer actually tell >>>>> CPAN that something is a developer release so that normal users don't >>>>> automatically install it? >>>> I found this: >>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >>>> >>>> Is says that $VERSION should simply be changed from a naked number into >>>> a single quoted number and this should be recognized by the CPAN >> indexer. >>> > 5.8.8/pod/perlmodstyle.pod#Version_numbering> >> >> Thanks for that. >> >> I guess from that the 1.5.2 version number should be: >> >> $VERSION = 1.05_02 >> >> And 1.6 would be >> >> $VERSION = 1.06 >> >> But will this cause a problem wrt 1.4? 1.4 has: >> >> $VERSION = 1.4; >> >> Is 1.4 lower than 1.06? Should we keep to a single digit version, so >> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them >> version fifty and version sixty? 1.50_02, 1.60? > > Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax? It would be > much simpler to use that. That does not present us with a way to have 1.5.2 marked as a developer release in CPAN. Also, see the discussion here: http://perldoc.perl.org/functions/require.html Since we require 5.6.1 the backwards-compatible issues maybe don't apply to us, but do these ideas work with modules, or just Perl itself? Is CPAN et al. happy with this form of versioning? /Something/ needs to be done about Bioperl versioning, because the current 1.4 or 1.5 is completely inadequate. 
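To make the "Is 1.4 lower than 1.06?" question concrete, here is a small sketch of how decimal version numbers compare when treated as plain numbers, and why the quotes matter for an underscore version. The newer() helper is just an illustrative name; how the CPAN indexer treats the quoted form is taken from the discussion above rather than verified here.

```perl
use strict;
use warnings;

# Plain numeric comparison of two decimal version numbers.
sub newer {
    my ($x, $y) = @_;
    return $x > $y ? 1 : 0;
}

# 1.4 is numerically greater than 1.06, so a "1.06" release would look
# OLDER than the existing 1.4 release.
print "1.4 newer than 1.06?  ", (newer(1.4, 1.06)  ? 'yes' : 'no'), "\n";  # yes
# Two-digit minor versions sort the way we want relative to 1.4.
print "1.50 newer than 1.4?  ", (newer(1.50, 1.4)  ? 'yes' : 'no'), "\n";  # yes
print "1.60 newer than 1.50? ", (newer(1.60, 1.50) ? 'yes' : 'no'), "\n";  # yes

# An unquoted underscore is just a digit separator to the Perl parser,
# so the bare literal 1.05_02 is the number 1.0502 ...
my $bare = 1.05_02;
print "bare literal:  $bare\n";      # 1.0502

# ... while the quoted form keeps the underscore in the string.
my $quoted = '1.05_02';
print "quoted string: $quoted\n";    # 1.05_02
```

So of the schemes floated in this thread, 1.50_02 and 1.60 would at least compare as newer than 1.4, whereas 1.05_02 and 1.06 would not.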
From bix at sendu.me.uk Mon Oct 23 14:51:25 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 15:51:25 +0100 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> Message-ID: <453CD6ED.5050507@sendu.me.uk> Chris Fields wrote: [option 1] >> Oh, so this effectively means that our 'optional' dependencies are >> installed for CPAN users, which matches up to my 'force the >> optional ones anyway' desire, leaving Bundle::BioPerl without any >> use. [option 2] >> Makefile.PL could be altered again to remove from PREREQ_PM those >> modules the user didn't already have installed, thus CPAN would >> only install Bioperl itself and nothing optional. The user could >> then install Bundle::BioPerl if they wanted a quick way of getting >> all the optional stuff to work. >> >> I'm happy either way; what do other people think? > > I think that we should have it so Bioperl installs as-is (no > additional reqs) and have Bundle::BioPerl used as a convenient way to > install all optional modules for full functionality. Note we're specifically considering a CPAN install here. If you download the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is still needed as a convenience if you want to install the optional external dependencies. > The catch is to make sure that any optional installations do not > crash tests during a CPAN bioperl installation, otherwise they aren't > considered optional by CPAN, and the install won't work without > forcing it. I'm pretty sure this isn't a problem, though it would be nice if someone could test it on a clean system: does 'make test' pass all ok with none of the optional modules installed? Anyway, to reiterate the question: Do we care if CPAN users get all the optional external dependencies installed for them automatically, or do we want to force them to install Bundle? 
The current situation is: CPAN users will get all optional external dependencies without using Bundle::BioPerl. Manual installers of bioperl (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to get full functionality. From n.haigh at sheffield.ac.uk Mon Oct 23 16:30:34 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 16:30:34 +0000 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CCABB.2060308@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> Message-ID: <453CEE2A.8000002@sheffield.ac.uk> Sendu Bala wrote: > Dave Howorth wrote: > >>>> That's the user point of view - how does the developer actually tell >>>> CPAN that something is a developer release so that normal users don't >>>> automatically install it? >>>> >>> I found this: >>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >>> >>> Is says that $VERSION should simply be changed from a naked number into >>> a single quoted number and this should be recognized by the CPAN indexer. >>> >> >> > > Thanks for that. > > I guess from that the 1.5.2 version number should be: > > $VERSION = 1.05_02 > > And 1.6 would be > > $VERSION = 1.06 > > But will this cause a problem wrt 1.4? 1.4 has: > > $VERSION = 1.4; > > Is 1.4 lower than 1.06? Should we keep to a single digit version, so > 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them > version fifty and version sixty? 1.50_02, 1.60? 
I believe the link to the documentation above describes a common CPAN versioning scheme as follows: 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32 Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would be better as 1.52. Then to indicate that the 1.5 series is a developer release, you append the underscore and at least 2 digits. This results in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be 1.52_01. The only thing I'm unsure about is when the _01 gets incremented. I suspect we would probably not increment this number since each release would be an increment of the minor release number e.g. 1.52_01, 1.53_01, 1.54_01 etc. Although I'm still not sure how this versioning would affect bioperl 1.4 since 1.4 uses a non-standard versioning scheme :o( As I understand it, the versioning of the Perl releases uses the x.y.z scheme. But apparently CPAN modules should use the above versioning scheme. Nath From cjfields at uiuc.edu Mon Oct 23 15:36:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 10:36:37 -0500 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <453CD6ED.5050507@sendu.me.uk> Message-ID: <000c01c6f6b9$0781af40$15327e82@pyrimidine> ... > > Note we're specifically considering a CPAN install here. If you download > the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is > still needed as a convenience if you want to install the optional > external dependencies. > Agreed. I don't think the Bundle is dispensable. For instance, it's very easy for us to just tell beginners to install Bundle::Bioperl before installing bioperl itself, as opposed to having them inundate the mailing list with questions about why some x.pl script didn't work, which could simply be due to a missing required module.
> I'm pretty sure this isn't a problem, though it would be nice if someone > could test it on a clean system: does 'make test' pass all ok with none > of the optional modules installed? So far on WinXP everything passes; I ran a clean perl installation a while ago using nmake and tests passed. > Anyway, to reiterate the question: Do we care if CPAN users get all the > optional external dependencies installed for them automatically, or do > we want to force them to install Bundle? > > The current situation is: CPAN users will get all optional external > dependencies without using Bundle::BioPerl. Manual installers of bioperl > (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to > get full functionality. I don't think forcing is necessary, so a CPAN installation shouldn't force someone to install optional modules. Graph.pm, for instance has a few optional modules, and the tests which use those get skipped and pass so the installation proceeds w/o problems. We could do the same (any tests using those optional modules display the reason why they are skipped). I would strongly state in the INSTALL and INSTALL.WIN docs that (new) users should install Bundle::Bioperl before installing Bioperl core for full functionality. If you are an advanced user and know your way around CPAN/Perl, then you can install the various independent requirements depending on your particular requirements. Chris From n.haigh at sheffield.ac.uk Mon Oct 23 16:38:00 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Mon, 23 Oct 2006 16:38:00 +0000 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <453CD6ED.5050507@sendu.me.uk> References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine> <453CD6ED.5050507@sendu.me.uk> Message-ID: <453CEFE8.4000704@sheffield.ac.uk> Sendu Bala wrote: > Chris Fields wrote: > > [option 1] > >>> Oh, so this effectively means that our 'optional' dependencies are >>> installed for CPAN users, which matches up to my 'force the >>> optional ones anyway' desire, leaving Bundle::BioPerl without any >>> use. >>> > > [option 2] > >>> Makefile.PL could be altered again to remove from PREREQ_PM those >>> modules the user didn't already have installed, thus CPAN would >>> only install Bioperl itself and nothing optional. The user could >>> then install Bundle::BioPerl if they wanted a quick way of getting >>> all the optional stuff to work. >>> >>> I'm happy either way; what do other people think? >>> >> I think that we should have it so Bioperl installs as-is (no >> additional reqs) and have Bundle::BioPerl used as a convenient way to >> install all optional modules for full functionality. >> > > Note we're specifically considering a CPAN install here. If you download > the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is > still needed as a convenience if you want to install the optional > external dependencies. > > > >> The catch is to make sure that any optional installations do not >> crash tests during a CPAN bioperl installation, otherwise they aren't >> considered optional by CPAN, and the install won't work without >> forcing it. >> > > I'm pretty sure this isn't a problem, though it would be nice if someone > could test it on a clean system: does 'make test' pass all ok with none > of the optional modules installed? > > I could definitely do this on WinXP and *possibly* on a Linux system. 
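
For reference, [option 2] quoted above (dropping from PREREQ_PM any optional module the user doesn't already have) could be sketched roughly as below. This is only an illustration, not the actual Bioperl Makefile.PL: the helper name installed_only and the module list are made up for the example, and it ignores minimum-version checks entirely.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): given a PREREQ_PM-style hash of
# module => minimum version, keep only the modules that can be loaded here.
sub installed_only {
    my (%wanted) = @_;
    my %have;
    for my $module (keys %wanted) {
        # String eval so a bareword-style 'require Foo::Bar' is attempted;
        # the module names come from our own list, so this is trusted input.
        $have{$module} = $wanted{$module} if eval "require $module; 1";
    }
    return %have;
}

# Illustrative list only: File::Spec ships with perl, the other name is
# deliberately one that should not exist.
my %optional = (
    'File::Spec'            => 0,
    'No::Such::Module::Xyz' => 0,
);

my %prereq_pm = installed_only(%optional);
print "would keep in PREREQ_PM: $_\n" for sort keys %prereq_pm;
```

The resulting hash is what would be handed to WriteMakefile as PREREQ_PM, so CPAN would never be asked to fetch anything the user did not already have; Bundle::BioPerl would remain the way to pull in the rest.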
> Anyway, to reiterate the question: Do we care if CPAN users get all the > optional external dependencies installed for them automatically, or do > we want to force them to install Bundle? > > I'd prefer any dependencies, whether the are seen as vital to the main functionality of Bioperl or not actually specified in PREREQ_PM (as they currently are). A dependency is a dependency - is it not? If a distinction is to be made based on whether the requiring module is simply adding additional functionality to Bioperl-core, then shouldn't it be moved out of core and into another package as with the run modules if we are to have "optional" dependencies? my 2p Nath > The current situation is: CPAN users will get all optional external > dependencies without using Bundle::BioPerl. Manual installers of bioperl > (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to > get full functionality. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Mon Oct 23 15:39:09 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 10:39:09 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CD494.8070905@sendu.me.uk> Message-ID: <000d01c6f6b9$62033d80$15327e82@pyrimidine> ... > That does not present us with a way to have 1.5.2 marked as a developer > release in CPAN. > > Also, see the discussion here: > http://perldoc.perl.org/functions/require.html > > Since we require 5.6.1 the backwards-compatible issues maybe don't apply > to us, but do these ideas work with modules, or just Perl itself? Is > CPAN et al. happy with this form of versioning? > > /Something/ needs to be done about Bioperl versioning, because the > current 1.4 or 1.5 is completely inadequate. I think using 'require Foo x.y.z' is applicable to modules as well. There is something in Programming Perl about this, just don't have it on hand... 
Not sure about CPAN, so we need to look into it. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From bix at sendu.me.uk Mon Oct 23 15:42:15 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 16:42:15 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CEE2A.8000002@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> Message-ID: <453CE2D7.5080608@sendu.me.uk> Nathan S. Haigh wrote: > I believe the link to the documentation above describes a common CPAN > versioning scheme as follows: > > 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32 > > Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would > be better as 1.52. Then to indicate that the 1.5 series is a developer > release, you append the underscore and at least 2 digits. Thus resulting > in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be > 1.52_01. The only thing i'm unsure about would be when does the _01 get > incremented? I suspect we would probably not increment this number since > each release would be an increment of the minor release number e.g. > 1.52_01, 1.53_01, 1.54_01 etc. > > Although I'm still not sure how this versioning would affect bioperl 1.4 > since 1.4 uses a non-standard versioning scheme :o( Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be treated higher than 1.4? Anyway, we can cross that bridge when we get there, but this seems appropriate now. Cheers, Sendu. 
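
A quick way to see why 1.4 versus 1.06 is a worry while 1.4 versus 1.60 or 1.52_01 is not: the sketch below assumes versions are compared as plain decimal numbers with the underscore dropped. Whether the CPAN indexer does exactly this is an assumption here, and the helper names are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: treat a CPAN-style version string as a decimal number,
# dropping the developer-release underscore ('1.52_01' becomes 1.5201).
sub version_as_number {
    my ($version) = @_;
    (my $plain = $version) =~ tr/_//d;
    return $plain + 0;
}

# Returns -1, 0 or 1, like the <=> operator.
sub compare_versions {
    my ($left, $right) = @_;
    return version_as_number($left) <=> version_as_number($right);
}

for my $candidate ('1.06', '1.60', '1.52_01') {
    my $cmp = compare_versions($candidate, '1.4');
    printf "%-8s is %s than 1.4\n", $candidate,
        $cmp > 0 ? 'higher' : $cmp < 0 ? 'lower' : 'no different';
}
```

Under that assumption 1.06 sorts below the existing 1.4, while 1.52_01 and 1.60 both sort above it.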
From bix at sendu.me.uk Mon Oct 23 15:59:01 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 23 Oct 2006 16:59:01 +0100 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <000c01c6f6b9$0781af40$15327e82@pyrimidine> References: <000c01c6f6b9$0781af40$15327e82@pyrimidine> Message-ID: <453CE6C5.6000108@sendu.me.uk> Chris Fields wrote: > ... >> The current situation is: CPAN users will get all optional external >> dependencies without using Bundle::BioPerl. Manual installers of bioperl >> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to >> get full functionality. > > I don't think forcing is necessary, so a CPAN installation shouldn't force > someone to install optional modules. Graph.pm, for instance has a few > optional modules, and the tests which use those get skipped and pass so the > installation proceeds w/o problems. We could do the same (any tests using > those optional modules display the reason why they are skipped). I should clarify and say that that's what happens in Bioperl as well. The 'forcing' that I talk about is simply what I assume will happen if the user has CPAN set to automatically install dependencies. The user could say 'no' to every question regarding the installation of dependencies that CPAN discovers and Bioperl would still install fine. So really the difference between the current situation and, say, the situation when 1.5.1 was released, is that the CPAN user doesn't have to use Bundle::BioPerl for full functionality anymore, but can still chose not to install all the optional external modules. The difference is the possible default behaviour. Those users that auto-install dependencies get all the optional ones, whereas in the past they would not have. I have to point out the benefit of this behaviour: those people that don't care and just want it to work are more likely to get an installation that does just work. People who know what they're doing can still do what they want. 
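
The skip-and-still-pass behaviour described above can be sketched with Test::More as follows. This is a minimal illustration, not one of the actual Bioperl test scripts; 'Some::Optional::Module' is a placeholder name and have_module is a made-up helper.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical helper: true if the named module can be loaded.
# The name is interpolated into a string eval, so only pass trusted names.
sub have_module {
    my ($module) = @_;
    return eval "require $module; 1" ? 1 : 0;
}

# Placeholder for an optional external dependency.
my $optional = 'Some::Optional::Module';

# Tests of core functionality always run.
ok(1, 'core functionality works without optional modules');

SKIP: {
    # The skip reason is what the user sees in the test output.
    skip("$optional not installed; optional tests skipped", 1)
        unless have_module($optional);
    ok($optional->can('new'), "$optional is usable");
}
```

With the optional module missing, the second test is reported as skipped along with the reason, and the script as a whole still passes, so a CPAN install is not blocked.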
Before we decide what to do I guess we need hard confirmation of how CPAN will actually behave with the current Makefile.PL. Any ideas how we can find out? It would also be good to have more options to break the current tie (Nathan is for keeping PREREQ_PM populated, Chris is for having it empty, I can go either way)... From dhoworth at mrc-lmb.cam.ac.uk Mon Oct 23 15:55:42 2006 From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth) Date: Mon, 23 Oct 2006 16:55:42 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CD494.8070905@sendu.me.uk> References: <002201c6f6af$a91e4200$15327e82@pyrimidine> <453CD494.8070905@sendu.me.uk> Message-ID: <453CE5FE.9070001@mrc-lmb.cam.ac.uk> Sendu Bala wrote: > Chris Fields wrote: >>> Dave Howorth wrote: >>>>>> That's the user point of view - how does the developer actually tell >>>>>> CPAN that something is a developer release so that normal users don't >>>>>> automatically install it? >>>>> I found this: >>>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt >>>>> >>>>> Is says that $VERSION should simply be changed from a naked number into >>>>> a single quoted number and this should be recognized by the CPAN >>> indexer. >>>> >> 5.8.8/pod/perlmodstyle.pod#Version_numbering> >>> >>> Thanks for that. >>> >>> I guess from that the 1.5.2 version number should be: >>> >>> $VERSION = 1.05_02 I believe so - the underscore is key. Look at your favourite CPAN modules and see what they do. >>> And 1.6 would be >>> >>> $VERSION = 1.06 >>> >>> But will this cause a problem wrt 1.4? 1.4 has: I think it will cause a problem, yes. 1.4 > 1.06 As a workaround, you could remove 1.4 from CPAN and require everybody who installs from CPAN to uninstall it before installing 1.06. >>> $VERSION = 1.4; >>> >>> Is 1.4 lower than 1.06? Should we keep to a single digit version, so >>> 1.5_02 and 1.6? Does this really not work with CPAN? I think that would work but see at the end. 
>> Should we call them >>> version fifty and version sixty? 1.50_02, 1.60? Then you can count 1.50_02, 1.50_03, 1.52, 1.53_01 ... if you wish. >> Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax? It would be >> much simpler to use that. > > That does not present us with a way to have 1.5.2 marked as a developer > release in CPAN. > > Also, see the discussion here: > http://perldoc.perl.org/functions/require.html > > Since we require 5.6.1 the backwards-compatible issues maybe don't apply > to us, but do these ideas work with modules, or just Perl itself? Is > CPAN et al. happy with this form of versioning? I'm not an expert :( It's my understanding that there is an awful lot of flexibility in Perl module version numbering (as you might expect :) However, I believe there are some gotchas. So I would recommend (a) finding an expert and (b) trying an experiment! > /Something/ needs to be done about Bioperl versioning, because the > current 1.4 or 1.5 is completely inadequate. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From n.haigh at sheffield.ac.uk Mon Oct 23 17:37:13 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 17:37:13 +0000 Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs In-Reply-To: <453CE6C5.6000108@sendu.me.uk> References: <000c01c6f6b9$0781af40$15327e82@pyrimidine> <453CE6C5.6000108@sendu.me.uk> Message-ID: <453CFDC9.8030107@sheffield.ac.uk> Sendu Bala wrote: > Chris Fields wrote: > >> ... >> >>> The current situation is: CPAN users will get all optional external >>> dependencies without using Bundle::BioPerl. Manual installers of bioperl >>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to >>> get full functionality. >>> >> I don't think forcing is necessary, so a CPAN installation shouldn't force >> someone to install optional modules. 
Graph.pm, for instance has a few >> optional modules, and the tests which use those get skipped and pass so the >> installation proceeds w/o problems. We could do the same (any tests using >> those optional modules display the reason why they are skipped). >> > > I should clarify and say that that's what happens in Bioperl as well. > The 'forcing' that I talk about is simply what I assume will happen if > the user has CPAN set to automatically install dependencies. The user > could say 'no' to every question regarding the installation of > dependencies that CPAN discovers and Bioperl would still install fine. > > So really the difference between the current situation and, say, the > situation when 1.5.1 was released, is that the CPAN user doesn't have to > use Bundle::BioPerl for full functionality anymore, but can still chose > not to install all the optional external modules. > > --snip-- Obviously, we could maintain a Bundle::BioPerl which includes all dependencies required for a fully functional Bioperl. I think the whole idea for a Bundle is to provide a common environment for a particular package. If for example, someone chooses not to install the dependencies through CPAN (in the current setup), that can easily go back and install Bundle::BioPerl and it would retrieve any missing dependencies for a fully functional Bioperl-core. Nath From n.haigh at sheffield.ac.uk Mon Oct 23 18:06:16 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh) Date: Mon, 23 Oct 2006 18:06:16 +0000 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CE2D7.5080608@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> Message-ID: <453D0498.8050206@sheffield.ac.uk> Sendu Bala wrote: > Nathan S. Haigh wrote: > >> I believe the link to the documentation above describes a common CPAN >> versioning scheme as follows: >> >> 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32 >> >> Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would >> be better as 1.52. Then to indicate that the 1.5 series is a developer >> release, you append the underscore and at least 2 digits. Thus resulting >> in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be >> 1.52_01. The only thing i'm unsure about would be when does the _01 get >> incremented? I suspect we would probably not increment this number since >> each release would be an increment of the minor release number e.g. >> 1.52_01, 1.53_01, 1.54_01 etc. >> >> Although I'm still not sure how this versioning would affect bioperl 1.4 >> since 1.4 uses a non-standard versioning scheme :o( >> > > Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be > treated higher than 1.4? Anyway, we can cross that bridge when we get > there, but this seems appropriate now. > > > Cheers, > Sendu. 
Just tried the suggested:

perl -MExtUtils::MakeMaker -le 'print MM->parse_version(shift)' bioperl-1-5-2/Bio/Root/Version.pm

to see how it parses the various version schemes. Here are the results:

1.5     -> 1.5
1.4     -> 1.4
1.60    -> 1.60
1.05_01 -> 1.0501
1.5_01  -> 1.501
1.50_01 -> 1.5001

Nath

From cjfields at uiuc.edu  Mon Oct 23 17:15:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:15:44 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CE6C5.6000108@sendu.me.uk>
Message-ID: <002701c6f6c6$e2622c40$15327e82@pyrimidine>

...
> I should clarify and say that that's what happens in Bioperl as well.
> The 'forcing' that I talk about is simply what I assume will happen if
> the user has CPAN set to automatically install dependencies. The user
> could say 'no' to every question regarding the installation of
> dependencies that CPAN discovers and Bioperl would still install fine.
>
> So really the difference between the current situation and, say, the
> situation when 1.5.1 was released, is that the CPAN user doesn't have to
> use Bundle::BioPerl for full functionality anymore, but can still choose
> not to install all the optional external modules.
>
> The difference is the possible default behaviour. Those users that
> auto-install dependencies get all the optional ones, whereas in the past
> they would not have. I have to point out the benefit of this behaviour:
> those people that don't care and just want it to work are more likely to
> get an installation that does just work. People who know what they're
> doing can still do what they want.

OK with me. Whichever way we go about it, we have to assume that anyone
who has set CPAN to automatically install dependencies would want this
behavior.
> Before we decide what to do I guess we need hard confirmation of how
> CPAN will actually behave with the current Makefile.PL. Any ideas how we
> can find out?
>
> It would also be good to have more options to break the current tie
> (Nathan is for keeping PREREQ_PM populated, Chris is for having it
> empty, I can go either way)...

Frankly I'm for whatever is easiest for the end-user. I think we should
continue maintaining Bundle::Bioperl b/c of its convenience (easier for us
to say 'install Bundle::Bioperl' as opposed to 'install modules a b c d e
f g...'). I should note that Chris D. maintains Bundle::Bioperl via CPAN
and can easily add/remove modules as needed, so all that would be
necessary prior to a release is to make sure the various modules present
in the Bundle are up-to-date.

The only difficulty would be updating the bundle PPM version for Win32; I
agree with Nathan that it would be nice if it were easier to maintain. The
PPD file generated using 'nmake ppd' needs modifications, probably b/c it
is still generated as PPM3-compatible rather than PPM4-compatible.

I also think the idea of having the developer releases available via CPAN
is a good one, as long as they are marked as such (which you are taking
care of with the versioning changes). It makes them a little more
official, even if they are interim developer releases.

Chris

From cjfields at uiuc.edu  Mon Oct 23 17:19:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:19:08 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CFDC9.8030107@sheffield.ac.uk>
Message-ID: <002801c6f6c7$5a58ed60$15327e82@pyrimidine>

...
> > So really the difference between the current situation and, say, the
> > situation when 1.5.1 was released, is that the CPAN user doesn't have to
> > use Bundle::BioPerl for full functionality anymore, but can still choose
> > not to install all the optional external modules.
> >
> >
> --snip--
>
> Obviously, we could maintain a Bundle::BioPerl which includes all
> dependencies required for a fully functional Bioperl. I think the whole
> idea for a Bundle is to provide a common environment for a particular
> package. If for example, someone chooses not to install the dependencies
> through CPAN (in the current setup), they can easily go back and install
> Bundle::BioPerl and it would retrieve any missing dependencies for a
> fully functional Bioperl-core.
>
> Nath

Succinctly put; I would've spent five paragraphs describing that! Too
much coffee (from lab meetings...)

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From cjfields at uiuc.edu  Mon Oct 23 17:26:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:26:57 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To:
Message-ID: <002c01c6f6c8$7163dd20$15327e82@pyrimidine>

Seth,

Did you try this with a clean, taxonomy-installed database? There may be
some junk left over from the previous test runs. I'm looking into it this
week; it may not make the developer release but we'll try to get it in.

BTW, the 03simpleseq.t test failures have to do with a call to gzip. I'll
look into a workaround for that. Sendu has posted a Bio::Root::Root fix
which does get rid of some bugs but introduces others.

One alternative which I found works is cygwin, but there's a catch:
DBD-mysql is hard to install. If it isn't one thing it's another...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign _____ From: Seth Johnson [mailto:johnson.biotech at gmail.com] Sent: Monday, October 23, 2006 11:37 AM To: Chris Fields Cc: bioperl-l Subject: Re: Error retrieving sequence from BioSQL Chris, There's definite improvement: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Failed Test Stat Wstat Total Fail Failed List of Failed ---------------------------------------------------------------------------- --- t/02species.t 65 2 3.08% 63 65 t/03simpleseq.t 1 256 59 106 179.66% 7-59 t/04swiss.t 52 14 26.92% 25 27-34 38-42 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There's some weirdness going on during the 'swiss.t' test. It almost seems to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): ================================ not ok 25 # Test 25 got: '10097078' (t/04swiss.t at line 79) # Expected: '91309150' ok 26 not ok 27 # Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t at line 85) # Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' not ok 28 # Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' (t/04swiss.t at line 86) # Expected: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' not ok 29 # Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' (t/04swiss.t at line 87) # Expected: 'Cell 66 (2), 383-394 (1991)' not ok 30 # Test 30 got: (t/04swiss.t at line 88) # Expected: '91309150' not ok 31 # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t at line 85 fail #2) # Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis, J.E. and Leffers,H.' 
not ok 32 # Test 32 got: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' (t/04swiss.t at line 86 fail #2) # Expected: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' not ok 33 # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail #2) # Expected: 'Gene 134 (2), 283-287 (1993)' not ok 34 # Test 34 got: (t/04swiss.t at line 88 fail #2) # Expected: '94085792' ok 35 ok 36 ok 37 not ok 38 # Test 38 got: (t/04swiss.t at line 88 fail #3) # Expected: '94253723' not ok 39 # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4) # Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.' not ok 40 # Test 40 got: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' (t/04swiss.t at line 86 fail #4) # Expected: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' not ok 41 # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail #4) # Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' not ok 42 # Test 42 got: (t/04swiss.t at line 88 fail #4) # Expected: '99199225' ============================== On 10/20/06, Chris Fields < cjfields at uiuc.edu> wrote: Seth, Did you work out the problem here? There was a recent CVS update to OBDA tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests apparently left data from tests in the database, which caused problems with repeated test runs. 
Chris > > -----Original Message----- > > From: bioperl-l-bounces lists.open-bio.org [mailto: bioperl-l- > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > Sent: Saturday, September 30, 2006 6:35 PM > > To: Hilmar Lapp > > Cc: Chris Fields; Bioperl List > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > Here're complete test details: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > ... > > > FAILED tests 10-12 > > Failed 3/12 tests, 75.00% okay > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > -------------------------------------------------------------------------- > > ----- > > t\02species.t 65 2 3.08% 63 65 > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > t\16obda.t 12 3 25.00% 10-12 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From johnson.biotech at gmail.com Mon Oct 23 16:36:36 2006 From: johnson.biotech at gmail.com (Seth Johnson) Date: Mon, 23 Oct 2006 12:36:36 -0400 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: <000001c6f486$df508930$15327e82@pyrimidine> References: <000001c6f486$df508930$15327e82@pyrimidine> Message-ID: Chris, There's definite improvement: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Failed Test Stat Wstat Total Fail Failed List of Failed ------------------------------------------------------------------------------- t/02species.t 65 2 3.08% 63 65 t/03simpleseq.t 1 256 59 106 179.66% 7-59 t/04swiss.t 52 14 26.92% 25 27-34 38-42 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There's some weirdness going on during the 'swiss.t' test. 
It almost seems to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): ================================ not ok 25 # Test 25 got: '10097078' (t/04swiss.t at line 79) # Expected: '91309150' ok 26 not ok 27 # Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t at line 85) # Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' not ok 28 # Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' (t/04swiss.t at line 86) # Expected: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' not ok 29 # Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' (t/04swiss.t at line 87) # Expected: 'Cell 66 (2), 383-394 (1991)' not ok 30 # Test 30 got: (t/04swiss.t at line 88) # Expected: '91309150' not ok 31 # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t at line 85 fail #2) # Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' not ok 32 # Test 32 got: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' (t/04swiss.t at line 86 fail #2) # Expected: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' not ok 33 # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail #2) # Expected: 'Gene 134 (2), 283-287 (1993)' not ok 34 # Test 34 got: (t/04swiss.t at line 88 fail #2) # Expected: '94085792' ok 35 ok 36 ok 37 not ok 38 # Test 38 got: (t/04swiss.t at line 88 fail #3) # Expected: '94253723' not ok 39 # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4) # Expected: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' 
not ok 40 # Test 40 got: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' (t/04swiss.t at line 86 fail #4) # Expected: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' not ok 41 # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail #4) # Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' not ok 42 # Test 42 got: (t/04swiss.t at line 88 fail #4) # Expected: '99199225' ============================== On 10/20/06, Chris Fields wrote: > > > > Seth, > > Did you work out the problem here? There was a recent CVS update to OBDA > tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests > apparently left data from tests in the database, which caused problems > with > repeated test runs. > > Chris > > > > -----Original Message----- > > > From: bioperl-l-bounces lists.open-bio.org [mailto: bioperl-l- > > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > > Sent: Saturday, September 30, 2006 6:35 PM > > > To: Hilmar Lapp > > > Cc: Chris Fields; Bioperl List > > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > > > Here're complete test details: > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > ... 
> > > > > FAILED tests 10-12 > > > Failed 3/12 tests, 75.00% okay > > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > > > > -------------------------------------------------------------------------- > > > ----- > > > t\02species.t 65 2 3.08% 63 65 > > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > > t\16obda.t 12 3 25.00% 10-12 > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > > From n.haigh at sheffield.ac.uk Mon Oct 23 20:08:00 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Mon, 23 Oct 2006 20:08:00 +0000 Subject: [Bioperl-l] CPAN testing Service Message-ID: <453D2120.9010301@sheffield.ac.uk> We should also check the CPAN testing service (CPANTS) to see how "good" our package is for CPAN and try to increase the Kwalitee score. There only appears to be details for bioperl-1.2.3 for some reason: http://cpants.perl.org/dist/bioperl Nath From pabloivan at gmail.com Sun Oct 22 19:54:35 2006 From: pabloivan at gmail.com (Pablo Ivan) Date: Sun, 22 Oct 2006 16:54:35 -0300 Subject: [Bioperl-l] Bioperl installation under Windows Message-ID: Hello, I have been trying to install Bioperl 1.4 on a Windows XP system, but I didn't get too far; my perl installation was made using ActiveState 5.8.8build 816. I then tried the ppm method of searching for bioperl in the repositories and installing the core package 1.4. It says that the installation was made successfully, but the /Bio folder doesn't show up in /lib, and it's like nothing new was installed at all. I was wondering if using that version of ActiveState could be causing it, but the uninstall option for it isn't showing in Add/Remove, and I'm afraid just deleting the folders and installing version 5.6 of AS could somehow damage and make things worse. 
Or should I just forget about it and try using Cygwin?

Thank you,
Pablo.

From cjfields at uiuc.edu  Mon Oct 23 21:34:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 16:34:47 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To:
Message-ID: <000401c6f6eb$111df040$15327e82@pyrimidine>

Don't know what that particular error is, but it looks ActivePerl-related
(PPM generates HTML from the blib directory). You may need to run 'nmake
clean' in between test cycles to get rid of the old blib and other files.

The carryover issue from old test runs was a definite problem. Brian
fixed that in the bioperl-db CVS recently.

Also, I tried Sendu's fixes to Bio::Root::Root from CVS head and they seem
to fix the problems with Bio::Root::Root. The issue came down to a use of
indirect syntax (a bad perl practice). There are other errors popping up
related to Bio::Species, but these seem fixable at least.

I committed a few changes to bioperl-db CVS to fix the 03simpleseq.t test
failures due to a lack of gzip on WinXP (I didn't see them b/c I had a
copy of GNU gzip in my path). These should pass w/o problems now on WinXP.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

_____

From: Seth Johnson [mailto:johnson.biotech at gmail.com]
Sent: Monday, October 23, 2006 4:22 PM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

Chris,

I have not cleaned my test database yet. I'll purge it and redo the
tests. This error keeps popping up in unexpected places while running
nmake during installation:

"Undefined subroutine &main::UpdateHTML_blib called at -e line 1.
NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code '0xff'"

Is there a way around it??

Seth

On 10/23/06, Chris Fields wrote:

Seth,

Did you try this with a clean, taxonomy-installed database? There may be
some junk left over from the previous test runs.
I'm looking into it this week; it may not make the developer release but we'll try to get it in. BTW, the 02sinmpleseq.t test failures have to do with a call to gzip. I'll look into a workaround for that. Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but introduces others. One alternative which I found works is cygwin, but there's a catch: DBD-mysql is hard to install. If it isn't one thing it's another... Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign _____ From: Seth Johnson [mailto:johnson.biotech at gmail.com] Sent: Monday, October 23, 2006 11:37 AM To: Chris Fields Cc: bioperl-l Subject: Re: Error retrieving sequence from BioSQL Chris, There's definite improvement: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Failed Test Stat Wstat Total Fail Failed List of Failed ---------------------------------------------------------------------------- --- t/02species.t 65 2 3.08% 63 65 t/03simpleseq.t 1 256 59 106 179.66% 7-59 t/04swiss.t 52 14 26.92% 25 27-34 38-42 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There's some weirdness going on during the 'swiss.t' test. It almost seems to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): ================================ not ok 25 # Test 25 got: '10097078' (t/04swiss.t at line 79) # Expected: '91309150' ok 26 not ok 27 # Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t at line 85) # Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' not ok 28 # Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' (t/04swiss.t at line 86) # Expected: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' not ok 29 # Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 
96 (7), 3572-3577 (1999)' (t/04swiss.t at line 87) # Expected: 'Cell 66 (2), 383-394 (1991)' not ok 30 # Test 30 got: (t/04swiss.t at line 88) # Expected: '91309150' not ok 31 # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t at line 85 fail #2) # Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis, J.E. and Leffers,H.' not ok 32 # Test 32 got: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' (t/04swiss.t at line 86 fail #2) # Expected: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' not ok 33 # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail #2) # Expected: 'Gene 134 (2), 283-287 (1993)' not ok 34 # Test 34 got: (t/04swiss.t at line 88 fail #2) # Expected: '94085792' ok 35 ok 36 ok 37 not ok 38 # Test 38 got: (t/04swiss.t at line 88 fail #3) # Expected: '94253723' not ok 39 # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4) # Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.' not ok 40 # Test 40 got: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' (t/04swiss.t at line 86 fail #4) # Expected: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' not ok 41 # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail #4) # Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' not ok 42 # Test 42 got: (t/04swiss.t at line 88 fail #4) # Expected: '99199225' ============================== On 10/20/06, Chris Fields < cjfields at uiuc.edu > wrote: Seth, Did you work out the problem here? There was a recent CVS update to OBDA tests (16obda.t) that fixed similar problems on Mac OS X. 
Old OBDA tests apparently left data from tests in the database, which caused problems with repeated test runs. Chris > > -----Original Message----- > > From: bioperl-l-bounces lists.open-bio.org [mailto: bioperl-l- > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > Sent: Saturday, September 30, 2006 6:35 PM > > To: Hilmar Lapp > > Cc: Chris Fields; Bioperl List > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > Here're complete test details: > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > ... > > > FAILED tests 10-12 > > Failed 3/12 tests, 75.00% okay > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > -------------------------------------------------------------------------- > > ----- > > t\02species.t 65 2 3.08% 63 65 > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > t\16obda.t 12 3 25.00% 10-12 > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- Best Regards, Seth Johnson Senior Bioinformatics Associate Ph: (202) 470-0900 Fx: (775) 251-0358 From cjfields at uiuc.edu Mon Oct 23 21:53:27 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 23 Oct 2006 16:53:27 -0500 Subject: [Bioperl-l] Bioperl installation under Windows In-Reply-To: References: Message-ID: <9994CFF6-FCA1-4C7F-9A33-31765C6AE255@uiuc.edu> It won't install in Perl\lib, but in Perl\site\lib. Check there. We are working intently on the next developer release for BioPerl and plan on having several PPMs available, but we are only supporting ActivePerl 5.8.8.819. I would suggest that you upgrade your ActivePerl installation to that if possible, since PPM has undergone major changes (they use PPM4 now, which has a GUI by default). Most repositories are now moving over to PPM4, so you'll likely be seeing fewer PPM3-compatible packages being made.
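
A quick way to check where (or whether) the package landed is to search Perl's module path directly. The following is only a minimal sketch: `find_module_dirs` is a hypothetical helper written for this illustration, and `Bio::Seq` is used on the assumption that it is part of the installed core package.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return every directory in @dirs that contains
# the .pm file for $module (e.g. Bio::Seq -> Bio/Seq.pm).
sub find_module_dirs {
    my ($module, @dirs) = @_;
    my @parts = split /::/, $module;
    $parts[-1] .= '.pm';
    # Skip code references that may be present in @INC.
    return grep { !ref($_) && -f File::Spec->catfile($_, @parts) } @dirs;
}

my @found = find_module_dirs('Bio::Seq', @INC);
if (@found) {
    print "Bio::Seq found under: $_\n" for @found;
} else {
    print "Bio::Seq not found; Perl searched:\n";
    print "  $_\n" for @INC;
}
```

On ActivePerl the site\lib directory should appear in the list of searched directories, so a successful PPM install would be reported under it rather than under lib.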
Chris On Oct 22, 2006, at 2:54 PM, Pablo Ivan wrote: > Hello, > > I have been trying to install Bioperl 1.4 on a Windows XP system, > but I > didn't get too far; my perl installation was made using ActiveState > 5.8.8build 816. I then tried the ppm method of searching for bioperl > in the > repositories and installing the core package 1.4. It says that the > installation was made successfully, but the /Bio folder doesn't > show up in > /lib, and it's like nothing new was installed at all. I was > wondering if > using that version of ActiveState could be causing it, but the > uninstall > option for it isn't showing in Add/Remove, and I'm afraid just > deleting the > folders and installing version 5.6 of AS could somehow damage and make > things worse. Or should I just forget about it and try using Cygwin? > > Thank you, > > Pablo. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From johnson.biotech at gmail.com Mon Oct 23 21:22:13 2006 From: johnson.biotech at gmail.com (Seth Johnson) Date: Mon, 23 Oct 2006 17:22:13 -0400 Subject: [Bioperl-l] Error retrieving sequence from BioSQL In-Reply-To: <002c01c6f6c8$7163dd20$15327e82@pyrimidine> References: <002c01c6f6c8$7163dd20$15327e82@pyrimidine> Message-ID: Chris, I have not cleaned my test database yet. I'll purge it and redo the tests. This error keeps popping up in unexpected places while running nmake during installation: "Undefined subroutine &main::UpdateHTML_blib called at -e line 1. NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code '0xff'" Is there a way around it?? Seth On 10/23/06, Chris Fields wrote: > > Seth, > > Did you try this with a clean, taxonomy-installed database? There may be > some junk left over tfrom the previous test runs. 
> > I'm looking into it this week; it may not make the developer release but > we'll try to get it in. BTW, the 02sinmpleseq.t test failures have to do > with a call to gzip. I'll look into a workaround for that. > > Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but > introduces others. One alternative which I found works is cygwin, but > there's a catch: DBD-mysql is hard to install. If it isn't one thing it's > another... > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > ------------------------------ > > *From:* Seth Johnson [mailto:johnson.biotech at gmail.com] > *Sent:* Monday, October 23, 2006 11:37 AM > *To:* Chris Fields > *Cc:* bioperl-l > *Subject:* Re: Error retrieving sequence from BioSQL > > > > Chris, > > There's definite improvement: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Failed Test Stat Wstat Total Fail Failed List of Failed > ------------------------------------------------------------------------------- > > t/02species.t 65 2 3.08% 63 65 > t/03simpleseq.t 1 256 59 106 179.66% 7-59 > t/04swiss.t 52 14 26.92% 25 27-34 38-42 > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > There's some weirdness going on during the 'swiss.t' test. It almost > seems to me that expectations of some tests are swapped (27 & 39, 28 & 40, > 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): > ================================ > not ok 25 > # Test 25 got: '10097078' (t/04swiss.t at line 79) > # Expected: '91309150' > ok 26 > not ok 27 > # Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t > at line 85) > # Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' 
> not ok 28 > # Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic > mitochondrial matrix protein' (t/04swiss.t at line 86) > # Expected: 'Functional expression of cloned human splicing factor SF2: > homology to RNA-binding proteins, U1 70K, and Drosophila splicing > regulators' > not ok 29 > # Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' > (t/04swiss.t at line 87) > # Expected: 'Cell 66 (2), 383-394 (1991)' > not ok 30 > # Test 30 got: (t/04swiss.t at line 88) > # Expected: '91309150' > not ok 31 > # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' > (t/04swiss.t at line 85 fail #2) > # Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., > Celis, J.E. and Leffers,H.' > not ok 32 > # Test 32 got: 'Functional expression of cloned human splicing factor SF2: > homology to RNA-binding proteins, U1 70K, and Drosophila splicing > regulators' (t/04swiss.t at line 86 fail #2) > # Expected: 'Cloning and expression of a cDNA covering the complete > coding region of the P32 subunit of human pre-mRNA splicing factor SF2' > not ok 33 > # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail > #2) > # Expected: 'Gene 134 (2), 283-287 (1993)' > not ok 34 > # Test 34 got: (t/04swiss.t at line 88 fail #2) > # Expected: '94085792' > ok 35 > ok 36 > ok 37 > not ok 38 > # Test 38 got: (t/04swiss.t at line 88 fail #3) > # Expected: '94253723' > not ok 39 > # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., > Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4) > # Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.' 
> not ok 40 > # Test 40 got: 'Cloning and expression of a cDNA covering the complete > coding region of the P32 subunit of human pre-mRNA splicing factor SF2' > (t/04swiss.t at line 86 fail #4) > # Expected: 'Crystal structure of human p32, a doughnut-shaped acidic > mitochondrial matrix protein' > not ok 41 > # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail > #4) > # Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' > not ok 42 > # Test 42 got: (t/04swiss.t at line 88 fail #4) > # Expected: '99199225' > ============================== > > On 10/20/06, *Chris Fields* < cjfields at uiuc.edu> wrote: > > > > Seth, > > Did you work out the problem here? There was a recent CVS update to OBDA > tests (16obda.t) that fixed similar problems on Mac OS X. Old OBDA tests > apparently left data from tests in the database, which caused problems > with > repeated test runs. > > Chris > > > > -----Original Message----- > > > From: bioperl-l-bounces lists.open-bio.org [mailto: bioperl-l- > > > bounces lists.open-bio.org] On Behalf Of Seth Johnson > > > Sent: Saturday, September 30, 2006 6:35 PM > > > To: Hilmar Lapp > > > Cc: Chris Fields; Bioperl List > > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL > > > > > > Here're complete test details: > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > ... 
> > > > > FAILED tests 10-12 > > > Failed 3/12 tests, 75.00% okay > > > Failed Test Stat Wstat Total Fail Failed List of Failed > > > > > > -------------------------------------------------------------------------- > > > ----- > > > t\02species.t 65 2 3.08% 63 65 > > > t\03simpleseq.t 1 256 59 106 179.66% 7-59 > > > t\04swiss.t 52 14 26.92% 25 27-34 38-42 > > > t\12ontology.t 2 512 738 1471 199.32% 3-738 > > > t\16obda.t 12 3 25.00% 10-12 > > > _______________________________________________ > > > Bioperl-l mailing list > > > Bioperl-l lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > > > -- Best Regards, Seth Johnson Senior Bioinformatics Associate Ph: (202) 470-0900 Fx: (775) 251-0358 From chhalling at alumni.ls.berkeley.edu Tue Oct 24 01:02:24 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Mon, 23 Oct 2006 21:02:24 -0400 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453C6509.90005@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> Message-ID: <453D6620.5020401@alumni.ls.berkeley.edu> Sorry, I should know better about giving all the details. This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a fresh compile) with Mac OS X 10.4.8. -- Conrad Nathan S. Haigh wrote: > Chris Fields wrote: > >> Thanks for letting us know! Did PPM4 throw errors or just silently >> pass them over? >> >> Chris >> >> On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote: >> >> >> > I believe he is talking about the bundle on cpan and not the ppd. I will > get this updated as soon as possible. > > Sendu/Chris - can you confirm to me which Bioperl modules are essential > to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any > reason for not putting *all* dependencies into the bundle? 
> > Nath > > > > > > -- Conrad Halling chhalling at alumni.ls.berkeley.edu From n.haigh at sheffield.ac.uk Tue Oct 24 07:05:53 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Tue, 24 Oct 2006 08:05:53 +0100 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453D6620.5020401@alumni.ls.berkeley.edu> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453D6620.5020401@alumni.ls.berkeley.edu> Message-ID: <453DBB51.6010505@sheffield.ac.uk> Conrad Halling wrote: > Sorry, I should know better about giving all the details. > > This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a > fresh compile) with Mac OS X 10.4.8. > > -- Conrad > > My apologies Conrad, this was my bad! Are you in need of the corrections being made swiftly or can you wait until the Bioperl 1.5.2 release when I'll ensure the Bundle is updated correctly for that release? Cheers Nath From n.haigh at sheffield.ac.uk Tue Oct 24 09:57:25 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Tue, 24 Oct 2006 10:57:25 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453CE2D7.5080608@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> Message-ID: <453DE385.8010700@sheffield.ac.uk> --snip-- > Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be > treated higher than 1.4? Anyway, we can cross that bridge when we get > there, but this seems appropriate now. > > > Cheers, > Sendu. 
> _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l >

Just been having a think about this versioning. Does it work well, and is it intuitive, for versioning both the official 1.5.2 developer release and the 1.6 stable release?

I'd like to put forward the following versioning scheme for consideration (most of it is the same as now, but hopefully with some clarification):

major-version . minor-version sub-version _ developer-release-version RC-version

The sub-version represents bug-fixes and possibly some minor feature enhancements with no API changes.
The minor-version represents some significant feature enhancements/API changes/bug fixes.
The major-version represents significant rewrites of Bioperl.

For an RC of a developer release the version would have _0x (where x = the RC number)
For a non-RC of a developer release the version would have _10
For an RC of a stable release the version would have _0x (where x = the RC number)
For a non-RC of a stable release the version would not have the underscore suffix

Therefore I would see the following $VERSION being applied:

1.5.2 RC1 = 1.52_01
1.5.2 RC2 = 1.52_02
1.5.2 RC3 = 1.52_03
1.5.2 = 1.52_10
1.6 RC1 = 1.60_01
1.6 RC2 = 1.60_02
1.6 = 1.60
1.6.1 RC1 = 1.61_01
1.6.1 = 1.61

This should satisfy the requirement of CPAN for having underscores in versions to indicate a developer release, which here is a Bioperl release with an odd minor version number or any RC, whether it be of a developer release or a stable release. This should mean that we could have the RCs on CPAN, but by default, CPAN would only install the latest "non-developer release" (i.e. the last package without an underscore in the version).
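
The rules above can be sketched in a few lines of Perl. This is only an illustration of the proposal, not code from Bioperl: `proposed_version` and `is_developer_release` are hypothetical names, and the sketch assumes single-digit minor and sub versions, as in the table.

```perl
use strict;
use warnings;

# Hypothetical helper implementing the proposed scheme:
#   RC n of any release        -> _0n suffix
#   final developer release    -> _10 suffix (odd minor version)
#   final stable release       -> no suffix  (even minor version)
sub proposed_version {
    my ($major, $minor, $sub, $rc) = @_;    # $rc is undef for a final release
    my $base = sprintf '%d.%d%d', $major, $minor, $sub;
    return sprintf('%s_%02d', $base, $rc) if defined $rc;
    return $minor % 2 ? "${base}_10" : $base;
}

# An underscore in the version string marks a developer release.
sub is_developer_release { return $_[0] =~ /_/ ? 1 : 0 }

for my $r ([1, 5, 2, 1], [1, 5, 2, undef], [1, 6, 0, 1], [1, 6, 0, undef], [1, 6, 1, undef]) {
    my $v = proposed_version(@$r);
    printf "%-8s %s\n", $v, is_developer_release($v) ? 'developer/RC' : 'stable';
}
```

The only test for "developer release" in this sketch is the presence of the underscore, which is the property the paragraph above relies on.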
If we are going ahead with the new $VERSION scheme (as it currently is in HEAD), we should, for the sake of clarity, try to talk about Bioperl 1.52 instead of Bioperl 1.5.2 and make an effort to sync the documentation with regards to this. Nath From bix at sendu.me.uk Tue Oct 24 10:19:05 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 11:19:05 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453DE385.8010700@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> Message-ID: <453DE899.4030603@sendu.me.uk> Nathan Haigh wrote: > > Therefore I would see the following $VERSION being applied: > 1.5.2 RC1 = 1.52_01 > 1.5.2 RC2 = 1.52_02 > 1.5.2 RC3 = 1.52_03 > 1.5.2 = 1.52_10 > 1.6 RC1 = 1.60_01 > 1.6 RC2 = 1.60_02 > 1.6 = 1.60 > 1.6.1 RC1 = 1.61_01 > 1.6.1 = 1.61 > > This should satisfy the requirement of CPAN for having underscores in > versions to indicate a developer release, which here is a Bioperl > release with an odd minor version number or any RC whether it be of a > developer release or a stable release. This should mean that we could > have the RC's on CPAN, but by default, CPAN would only install the > latest "non developer release" (i.e. the last package without an > underscore in the version). That all sounds good to me, except I worry about potential confusion if people look manually at the things available in CPAN, see 1.60_02 and think it is more recent than 1.60 and try to install it manually. 
Since $VERSION = 1.52_10; is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, final release version should be $VERSION = 1.6010. > If we are going ahead with the new $VERSION scheme (as it currently is > in HEAD), we should, for the sake of clarity, try to talk about Bioperl > 1.52 instead of Bioperl 1.5.2 and make an effort to sync the > documentation with regards to this. I might disagree with this though. I think perl people, and perhaps unix people in general, should be used to version numbers like '1.5.2', but then getting '1.52' from the code since such a number allows simple numerical comparisons while the former does not. The former is easier to read and understand. This is just how Perl itself behaves. Most users who wouldn't expect such a behaviour aren't going to be checking the version number programmatically anyway. BTW, do we have someone with a CPAN account, or should I get one? From n.haigh at sheffield.ac.uk Tue Oct 24 11:37:12 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Tue, 24 Oct 2006 12:37:12 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453DE899.4030603@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> Message-ID: <453DFAE8.5050602@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: > >> Therefore I would see the following $VERSION being applied: >> 1.5.2 RC1 = 1.52_01 >> 1.5.2 RC2 = 1.52_02 >> 1.5.2 RC3 = 1.52_03 >> 1.5.2 = 1.52_10 >> 1.6 RC1 = 1.60_01 >> 1.6 RC2 = 1.60_02 >> 1.6 = 1.60 >> 1.6.1 RC1 =
1.61_01 >> 1.6.1 = 1.61 >> >> This should satisfy the requirement of CPAN for having underscores in >> versions to indicate a developer release, which here is a Bioperl >> release with an odd minor version number or any RC whether it be of a >> developer release or a stable release. This should mean that we could >> have the RC's on CPAN, but by default, CPAN would only install the >> latest "non developer release" (i.e. the last package without an >> underscore in the version). >> > > That all sounds good to me, except I worry about potential confusion if > people look manually at the things available in CPAN, see 1.60_02 and > think it is more recent than 1.60 and try to install it manually. > > I'm not sure if this would be a problem. As far as I understand, CPAN treats packages with underscores in $VERSION as something distinctly different from the other releases (i.e. as developer releases). If you look at such a page, it is clearly evident that it is a developer release. For example, if you search on CPAN for the latest version of the CPAN module, it shows 1.8802. If you go to that page: http://search.cpan.org/~andk/CPAN-1.8802/ There is also a link for the latest developer release, released 1 day after 1.8802 with a version of 1.88_57 (which would convert to 1.8857). This too appears to be later than 1.8802, but since it is dealt with as a developer release it doesn't seem to matter - CPAN will only install the stable (non-developer) releases by default, while the underscore versions remain available as a convenient way to get pre-release code. Although I'm thinking CPAN uses some hocus pocus with release dates too. > Since > $VERSION = 1.52_10; > is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, > final release version should be > $VERSION = 1.6010. > > > Because they are dealt with separately, I don't think this is an issue (see above).
>> If we are going ahead with the new $VERSION scheme (as it currently is >> in HEAD), we should, for the sake of clarity, try to talk about Bioperl >> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the >> documentation with regards to this. >> > > I might disagree with this though. I think perl people, and perhaps unix > people in general, should be used to version numbers like '1.5.2', but > then getting '1.52' from the code since such a number allows simple > numerical comparisons while the former does not. The former is easier to > read and understand. This is just how Perl itself behaves. > > Most users who wouldn't expect such a behaviour aren't going to be > checking the version number programatically anyway. > > > BTW. do we have someone with a CPAN account, or should I get one? > It says Ewan Birney is the author of Bioperl - I assume it must be possible to have multiple people have the permissions to update a single package. Nath From chhalling at alumni.ls.berkeley.edu Tue Oct 24 11:15:12 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Tue, 24 Oct 2006 07:15:12 -0400 Subject: [Bioperl-l] Misspellings in Bundle::BioPerl In-Reply-To: <453DBB51.6010505@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453D6620.5020401@alumni.ls.berkeley.edu> <453DBB51.6010505@sheffield.ac.uk> Message-ID: <453DF5C0.3040104@alumni.ls.berkeley.edu> Nathan S. Haigh wrote: > Conrad Halling wrote: >> Sorry, I should know better about giving all the details. >> >> This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 >> (a fresh compile) with Mac OS X 10.4.8. >> >> -- Conrad > My apologies Conrad, this was my bad! Are you in need of the > corrections being made swiftly or can you wait until the Bioperl 1.5.2 > release when I'll ensure the Bundle is updated correctly for that > release? > > Cheers > Nath No, I'm fine. 
I used the cpan utility to load the three modules manually. -- Conrad -- Conrad Halling chhalling at alumni.ls.berkeley.edu From bix at sendu.me.uk Tue Oct 24 12:16:54 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 13:16:54 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453DFAE8.5050602@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> Message-ID: <453E0436.3050903@sendu.me.uk> Nathan Haigh wrote: > Sendu Bala wrote: > >> That all sounds good to me, except I worry about potential confusion if >> people look manually at the things available in CPAN, see 1.60_02 and >> think it is more recent than 1.60 and try to install it manually. > > I not sure if this would be a problem. As far as I understand, CPAN > treats these packages with underscores in $VERSION as something > distinctly different to the others releases (i.e. developer releases). > If you look at such a page, it is clearly evident that it is a > developers release. For example, if you search on CPAN for the latest > version of the CPAN module is shows 1.8802. if you go to that page: > http://search.cpan.org/~andk/CPAN-1.8802/ > There is also a link for the latest developer release, released 1 day > after 1.8802 with a version of 1.88_57 (which would convert to 1.8857). [snip] >> Since >> $VERSION = 1.52_10; >> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, >> final release version should be >> $VERSION = 1.6010. 
> > Because they are dealt with separately, I don't think this is an issue > (see above). If you don't notice the dates, or are doing numerical version number comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may not be automatic, but you can still choose to download the developer releases. Which means if we say to someone 'use Bioperl 1.6 or better' they may choose to get the latest version and think it is 1.6002 when in fact 1.60 was the more recent version. 1.6010 solves the problem, is consistent with your 1.52_10 suggestion, and doesn't cause any problems as far as I can see. >>> If we are going ahead with the new $VERSION scheme (as it currently is >>> in HEAD), we should, for the sake of clarity, try to talk about Bioperl >>> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the >>> documentation with regards to this. >>> >> I might disagree with this though. I think perl people, and perhaps unix >> people in general, should be used to version numbers like '1.5.2', but >> then getting '1.52' from the code since such a number allows simple >> numerical comparisons while the former does not. The former is easier to >> read and understand. This is just how Perl itself behaves. >> >> Most users who wouldn't expect such a behaviour aren't going to be >> checking the version number programmatically anyway. >> >> >> BTW. do we have someone with a CPAN account, or should I get one? >> > > It says Ewan Birney is the author of Bioperl - I assume it must be > possible to have multiple people have the permissions to update a single > package. How did you get Bundle::BioPerl updated? Did you just ask Chris Dagdigian to do it for you? Or do you have access to his account? I'll ask Ewan about it.
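
The numerical-comparison point made earlier in this message can be checked with a short sketch. `numeric_version` is a hypothetical helper that strips the underscore by hand; it is assumed to give the same number that `eval $VERSION` yields for these strings.

```perl
use strict;
use warnings;

# Hypothetical helper: the number Perl ends up with once the underscore
# in a version string such as '1.60_02' is dropped.
sub numeric_version {
    my ($v) = @_;
    (my $n = $v) =~ tr/_//d;    # '1.60_02' -> '1.6002'
    return $n + 0;
}

my $rc2    = numeric_version('1.60_02');    # release candidate 2
my $plain  = numeric_version('1.60');       # final release, no suffix
my $padded = numeric_version('1.6010');     # final release, "10" appended

print "RC2 compares greater than a plain 1.60 final\n" if $rc2 > $plain;
print "a 1.6010 final compares greater than RC2\n"     if $padded > $rc2;
```

With a plain 1.60 the release candidate wins a numeric comparison; with 1.6010 the final release does, which is the ordering argued for above.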
From n.haigh at sheffield.ac.uk Tue Oct 24 12:21:56 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Tue, 24 Oct 2006 13:21:56 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0436.3050903@sendu.me.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> <453E0436.3050903@sendu.me.uk> Message-ID: <453E0564.9030302@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: >> Sendu Bala wrote: >> >>> That all sounds good to me, except I worry about potential confusion >>> if people look manually at the things available in CPAN, see 1.60_02 >>> and think it is more recent than 1.60 and try to install it manually. >> >> I not sure if this would be a problem. As far as I understand, CPAN >> treats these packages with underscores in $VERSION as something >> distinctly different to the others releases (i.e. developer releases). >> If you look at such a page, it is clearly evident that it is a >> developers release. For example, if you search on CPAN for the latest >> version of the CPAN module is shows 1.8802. if you go to that page: >> http://search.cpan.org/~andk/CPAN-1.8802/ >> There is also a link for the latest developer release, released 1 day >> after 1.8802 with a version of 1.88_57 (which would convert to 1.8857). > > [snip] > >>> Since >>> $VERSION = 1.52_10; >>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before >>> release, final release version should be >>> $VERSION = 1.6010. 
>> >> Because they are dealt with separately, I don't think this is an issue >> (see above). > > If you don't notice the dates, or are doing numerical version number > comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may > not be automatic, but you can still chose to download the developer > releases. Which means if we say to someone 'use Bioperl 1.6 or better' > they may choose to get the latest version and think it is 1.6002 when > infact 1.60 was the more recent version. 1.6010 solves the problem, is > consistent with your 1.50_10 suggestion, and doesn't cause any > problems as far as I can see. > > I see - you mean for a non-RC release append 10 to the version number and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to the version. --snip-- > > How did you get Bundle::BioPerl updated? Did you just ask Chris > Dagdigian to do it for you? Or do you have access to his account? I'll > ask Ewan about it. I just asked Chris D. to do it for me :o) Nath From bix at sendu.me.uk Tue Oct 24 13:01:22 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 14:01:22 +0100 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0564.9030302@sheffield.ac.uk> References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> <453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk> Message-ID: <453E0EA2.6050306@sendu.me.uk> Nathan Haigh wrote: > I see - you mean for a non-RC release append 10 to the version number > and if it's a 
developer release i.e. 1.5, 1.7 series, you append _10 to
> the version.

Precisely.

1.5.2 RC3 will have in Bio::Root::Version :

$VERSION = 1.52_03;
$VERSION = eval $VERSION; # $VERSION is 1.5203

1.5.2 final release would have:

$VERSION = 1.52_10;
$VERSION = eval $VERSION; # $VERSION is 1.5210

1.6.0 RC1 would have:

$VERSION = 1.60_01;
$VERSION = eval $VERSION; # $VERSION is 1.6001

1.6.0 final release would have:

$VERSION = 1.6010;


Nice thing about putting RCs up on CPAN is that I suppose we'd see the
test results from cpantesters. The more test results the better :)

From n.haigh at sheffield.ac.uk Tue Oct 24 13:05:54 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 14:05:54 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0EA2.6050306@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu> <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu> <453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk> <453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk> <453C8E60.7000105@sendu.me.uk> <453CA99D.9060009@sheffield.ac.uk> <453CB9A5.2020409@mrc-lmb.cam.ac.uk> <453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk> <453DE385.8010700@sheffield.ac.uk> <453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk> <453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk> <453E0EA2.6050306@sendu.me.uk>
Message-ID: <453E0FB2.4080002@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> I see - you mean for a non-RC release append 10 to the version number
>> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
>> the version.
>
> Precisely.
> > 1.5.2 RC3 will have in Bio::Root::Version : > > $VERSION = 1.52_03; > $VERSION = eval $VERSION; # $VERSION is 1.5203 > > 1.5.2 final release would have: > > $VERSION = 1.52_10; > $VERSION = eval $VERSION; # $VERSION is 1.5210 > > 1.6.0 RC1 would have: > > $VERSION = 1.60_01; > $VERSION = eval $VERSION; # $VERSION is 1.6001 > > 1.6.0 final release would have: > > $VERSION = 1.6010; > > > Nice thing about putting RCs up on CPAN is that I suppose we'd see the > test results from cpantesters. The more test results the better :) Did you see the cpants site I sent earlier: http://cpants.perl.org/dist/bioperl But I'm not sure why 1.4 didn't make it in there instead of 1.2.3 From bix at sendu.me.uk Tue Oct 24 13:14:08 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 14:14:08 +0100 Subject: [Bioperl-l] CPAN testing Service In-Reply-To: <453D2120.9010301@sheffield.ac.uk> References: <453D2120.9010301@sheffield.ac.uk> Message-ID: <453E11A0.20304@sendu.me.uk> Nathan S. Haigh wrote: > We should also check the CPAN testing service (CPANTS) to see how "good" > our package is for CPAN and try to increase the Kwalitee score. There > only appears to be details for bioperl-1.2.3 for some reason: > http://cpants.perl.org/dist/bioperl Yes, but I think it will be pretty similar score this time round. We'll resolve the remaining issues for 1.6. From cjfields at uiuc.edu Tue Oct 24 14:24:44 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 24 Oct 2006 09:24:44 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0436.3050903@sendu.me.uk> Message-ID: <000501c6f778$279cee10$15327e82@pyrimidine> ... > >> Since > >> $VERSION = 1.52_10; > >> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, > >> final release version should be > >> $VERSION = 1.6010. > > > > Because they are dealt with separately, I don't think this is an issue > > (see above). 
> > If you don't notice the dates, or are doing numerical version number > comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may > not be automatic, but you can still chose to download the developer > releases. Which means if we say to someone 'use Bioperl 1.6 or better' > they may choose to get the latest version and think it is 1.6002 when > infact 1.60 was the more recent version. 1.6010 solves the problem, is > consistent with your 1.50_10 suggestion, and doesn't cause any problems > as far as I can see. CPAN looks like it can handle 'x.y.z', at least for Pugs: http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/ >From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': our $VERSION = 6.002013; That's also a very perlish-way to do it. And there are no developer versions of Pugs, since it is always under active development. We could try something like: our $VERSION = 1.005002_01; just to tag it as a developer release or release candidate, if that's what you want; I'm neutral to that point. I don't think it's necessary to post every RC to CPAN, though, unless you feel very strongly about it. It just seems like more hassle than it's worth, esp. since you've been releasing about one per week leading up to a final 1.5.2 (due soon). > >> I might disagree with this though. I think perl people, and perhaps > unix > >> people in general, should be used to version numbers like '1.5.2', but > >> then getting '1.52' from the code since such a number allows simple > >> numerical comparisons while the former does not. The former is easier > to > >> read and understand. This is just how Perl itself behaves. > >> > >> Most users who wouldn't expect such a behaviour aren't going to be > >> checking the version number programatically anyway. > >> > >> > >> BTW. do we have someone with a CPAN account, or should I get one? 
> >> > > > > It says Ewan Birney is the author of Bioperl - I assume it must be > > possible to have multiple people have the permissions to update a single > > package. As a quick response to the above, I would read 'rel. 1.5.2' as the second patched release of the second revision (here in a developer cycle) of the first major release. I would read 'rel 1.52' as the 52nd release of the major release (just can't quite make it to version 2, I guess). I don't think we can use the latter as it is just too confusing, especially since we've adopted the 'major.minor.patch' versioning quite early on. As for CPAN, I believe there is usually a person or group responsible for maintaining each distribution. As Ewan seems to be the point man, you'll have to ask him. I suppose it is possible to add more if needed > How did you get Bundle::BioPerl updated? Did you just ask Chris > Dagdigian to do it for you? Or do you have access to his account? I'll > ask Ewan about it. When I inquired about XML::Simple, I emailed Chris D. via his contact information from CPAN. He let me know that adding it would be pretty easy, so all you need to do is let him know about any errors/additions/deletions. I think his wiki page also has some contact info. Which reminds me, if anyone contacts him, could you make sure that XML::Simple is added? I can't remember if it has been. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 24 14:29:11 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 24 Oct 2006 09:29:11 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E0FB2.4080002@sheffield.ac.uk> Message-ID: <000601c6f778$c639f0e0$15327e82@pyrimidine> > Sendu Bala wrote: > > Nathan Haigh wrote: > >> I see - you mean for a non-RC release append 10 to the version number > >> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to > >> the version. > > > > Precisely. 
> > > > 1.5.2 RC3 will have in Bio::Root::Version : > > > > $VERSION = 1.52_03; > > $VERSION = eval $VERSION; # $VERSION is 1.5203 > > > > 1.5.2 final release would have: > > > > $VERSION = 1.52_10; > > $VERSION = eval $VERSION; # $VERSION is 1.5210 > > > > 1.6.0 RC1 would have: > > > > $VERSION = 1.60_01; > > $VERSION = eval $VERSION; # $VERSION is 1.6001 > > > > 1.6.0 final release would have: > > > > $VERSION = 1.6010; > > > > > > Nice thing about putting RCs up on CPAN is that I suppose we'd see the > > test results from cpantesters. The more test results the better :) > Did you see the cpants site I sent earlier: > http://cpants.perl.org/dist/bioperl > > But I'm not sure why 1.4 didn't make it in there instead of 1.2.3 Yes, odd. Another thing to note is that CPAN also list two bugs related to bioperl 1.4. We may need to have some way of either redirecting users from there to bugzilla, or routinely checking the CPAN site. Otherwise we'll miss those. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From JK at novozymes.com Tue Oct 24 14:45:26 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 16:45:26 +0200 Subject: [Bioperl-l] Keeping references around in the objects? Message-ID: <934F95E71B6C9347A873C42AE3C196191299E011@NZT0004E.dknz.nzcorp.net> Hi All. When getting a Bio::Seq object back from a feature it would be really nice to have access to the old objects through the new object as: $featseq->feature()->parent_seq(); Would it be possible to keep the references around for (as an example) to be able to access the global information through the particular feature. Most of the annotation in the general header of a EMBL/Genbank-record also applies to the specific features. Jesper From JK at novozymes.com Tue Oct 24 14:28:22 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 16:28:22 +0200 Subject: [Bioperl-l] Subclassing Bio::Seq ? 
Extending Bio::Perl
Message-ID: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>

Hi.

We're trying to "extend" Bioperl in our own setup. We have some functions
that we'd like to always have available on a Bio::Seq object. As an
example, I'd like to have the sequence digest available via ->digest,
which just returns a hex-encoded message digest of the sequence in the
object. This is really convenient when trying to figure out whether we've
got some computations stored in the cache for this particular sequence.

Another example is that we have some fields we want to be mandatory in
the objects, so adding additional checks in the constructor is necessary.

Our approach has been to subclass Bio::Seq in a new class (Nz::Seq) and
add the functionality there. This generally works fine (->translate()
calls ->can_call_new() and instantiates the correct subclassed object).

But the logic fails when the ->seq of a feature just instantiates a
Bio::PrimarySeq without trying to get the subclassed object.

So the question basically is:
What is the preferred way of extending/subclassing Bioperl objects with
our own methods?

Jesper

From bix at sendu.me.uk Tue Oct 24 15:26:19 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 16:26:19 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <000501c6f778$279cee10$15327e82@pyrimidine>
References: <000501c6f778$279cee10$15327e82@pyrimidine>
Message-ID: <453E309B.9090007@sendu.me.uk>

Chris Fields wrote:
> ...
>>>> Since
>>>> $VERSION = 1.52_10;
>>>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release,
>>>> final release version should be
>>>> $VERSION = 1.6010.
>>> Because they are dealt with separately, I don't think this is an issue
>>> (see above).
>> If you don't notice the dates, or are doing numerical version number
>> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
>> not be automatic, but you can still chose to download the developer
>> releases.
Which means if we say to someone 'use Bioperl 1.6 or better' >> they may choose to get the latest version and think it is 1.6002 when >> infact 1.60 was the more recent version. 1.6010 solves the problem, is >> consistent with your 1.50_10 suggestion, and doesn't cause any problems >> as far as I can see. > > CPAN looks like it can handle 'x.y.z', at least for Pugs: > > http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/ 'handle'? I think it shows up as '6.2.13' simply because it was uploaded with the filename Perl6-Pugs-6.2.13.tar.gz As you point out, the code has the kind of $VERSION number we've been suggesting in this thread: > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': > > our $VERSION = 6.002013; > > That's also a very perlish-way to do it. And there are no developer > versions of Pugs, since it is always under active development. We could try > something like: > > our $VERSION = 1.005002_01; Yes, this was already like one of my suggestions (1.0502_01), but I brought up the concern that 1.05 might be < 1.4. So then we have a question: do we try and fumble a 1.4 compatible number by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no room for RC numbering, or 1.006000010 (1.6.0.10) - the first final release following some 1.006000_001 (1.6.0.01 == rc1) RCs? > just to tag it as a developer release or release candidate, if that's what > you want; I'm neutral to that point. I don't think it's necessary to post > every RC to CPAN, though, unless you feel very strongly about it. It just > seems like more hassle than it's worth, esp. since you've been releasing > about one per week leading up to a final 1.5.2 (due soon). I don't think it would be a hassle; on the contrary it would be very useful to know the CPAN distribution actually works. I'm very happy with the idea that a release candidate gets fully tested... 
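
A minimal sketch of the numeric comparison concern raised in this thread, in plain Perl with no Bioperl dependency. The helper name numeric_version is made up for illustration; it only mimics what "$VERSION = eval $VERSION" does to a version string containing an underscore, and the version strings are the examples discussed above.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the underscore and compare as a plain number,
# which is what "$VERSION = eval $VERSION" amounts to for these strings.
sub numeric_version {
    my ($string) = @_;
    (my $digits = $string) =~ tr/_//d;
    return $digits + 0;
}

my $rc2     = numeric_version('1.60_02');   # 1.6.0 release candidate 2
my $plain   = numeric_version('1.60');      # final release with no suffix
my $final10 = numeric_version('1.6010');    # final release with "10" appended

print "RC2 compares greater than a plain 1.60\n" if $rc2 > $plain;
print "1.6010 compares greater than RC2\n"       if $final10 > $rc2;
print "1.005002 compares lower than 1.4\n"
    if numeric_version('1.005002') < numeric_version('1.4');
```

Under these assumptions the RC string outranks a plain 1.60, which is the case the appended 10 is meant to avoid, and a 1.005002-style number ranks below the existing 1.4.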
From bix at sendu.me.uk Tue Oct 24 15:39:16 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Tue, 24 Oct 2006 16:39:16 +0100 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> Message-ID: <453E33A4.5060004@sendu.me.uk> JK (Jesper Agerbo Krogh) wrote: > Hi. > > We're trying to "extend" bioperl in our own setup. We have some funtions > that we'd like to "allways" have available on a Bio::Seq-object. [snip] > So the question basically is: > What is the preferred way of extending/subclassing Bio-perl -objects > with our own methods? http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit From hlapp at gmx.net Tue Oct 24 16:24:09 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 24 Oct 2006 12:24:09 -0400 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net> Message-ID: I think you've generally taken the right path, but see below. First off, object factories are used extensively already but not yet in each and every place where Bioperl creates an object internally. Achieving your goal may entail fixes to Bioperl to use a factory instead of a hard-coded module name. Also be on the lookout for factory() or seq_factory() methods for classes whose work entails creating sequence objects and that already give you control over the type to be created. The problem that hits you here though isn't one of determining the type of the object to be created, because the respective method doesn't create a sequence object. It only returns the sequence object that the feature has a reference to. 
The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your
extension of the latter is that Perl's reference-counting memory
management can't reclaim circular references. The way we've circumvented
the problem between sequence objects (which hold references to their
feature objects) and feature objects (which need to hold a reference to
their sequence object) is to make Bio::Seq a wrapper around
Bio::PrimarySeq (i.e., Bio::Seq implements Bio::PrimarySeqI by delegating
all the Bio::PrimarySeqI methods to an instance of Bio::PrimarySeq, and
then adds implementations of the Bio::SeqI methods), and then make
feature objects only hold a reference to the 'base' Bio::PrimarySeq
instance. This works because Bio::PrimarySeq doesn't hold features, only
Bio::SeqI objects do.

Having said all that, note that if all you want to do is define
computations on Bio::Seq objects, as opposed to storing values for
additional attributes, the best design approach is not to extend the
class but to create a class with those computations as static methods
(which would accept the seq object on which to compute as an argument;
e.g., print $seqComputations->message_digest($seq)).

	-hilmar

On Oct 24, 2006, at 10:28 AM, JK ((Jesper Agerbo Krogh)) wrote:

> Hi.
>
> We're trying to "extend" bioperl in our own setup. We have some
> funtions
>
> that we'd like to "allways" have available on a Bio::Seq-object. As an
> example,
> I'd like to have the sequence-digest available on ->digest that just
> returns
> A hex-encoded message-digest of the sequence in the object. This is
> really comfortable
> when trying to figure out wether we've got some computations stored in
> the cache
> for this particular sequence.
>
> Another example is that we have some fields we want to be mandatory in
> the objects,
> thus adding additional checks in the constructor is nessesary.
>
> Our approach has been to "subclass" Bio::Seq in a new object:
> (Nz::Seq)
> and add
> the functionality there.
This generally works fine (->translate() > calls > ->can_call_new() > and instantiates the correct subclassed object. > > But the logic fails when the ->seq of a feature just instantiates a > Bio::PrimarySeq > without trying to get the subclassed object. > > So the question basically is: > What is the preferred way of extending/subclassing Bio-perl -objects > with > our own methods? > > Jesper > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 24 16:45:25 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 24 Oct 2006 11:45:25 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <453E309B.9090007@sendu.me.uk> Message-ID: <000001c6f78b$d1c65a30$15327e82@pyrimidine> ... > > 'handle'? I think it shows up as '6.2.13' simply because it was uploaded > with the filename Perl6-Pugs-6.2.13.tar.gz Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is '6.002013'. So maybe we should follow a similar convention. Seems easier and less confusing to me, at least. > As you point out, the code has the kind of $VERSION number we've been > suggesting in this thread: > > > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': > > > > our $VERSION = 6.002013; > > > > That's also a very perlish-way to do it. And there are no developer > > versions of Pugs, since it is always under active development. We could > try > > something like: > > > > our $VERSION = 1.005002_01; > > Yes, this was already like one of my suggestions (1.0502_01), but I > brought up the concern that 1.05 might be < 1.4. 
> > So then we have a question: do we try and fumble a 1.4 compatible number > by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if > it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no > room for RC numbering, or 1.006000010 (1.6.0.10) - the first final > release following some 1.006000_001 (1.6.0.01 == rc1) RCs? I would go for the clean break if it follows perl/CPAN convention. '1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing. If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6 RC1, 1.6 RC2 etc then that would be consistent and perl-compatible. BTW, the reason I looked at Pugs was to see what some of the Perl6 developers were using. Who knows; they'll probably change it! ... > I don't think it would be a hassle; on the contrary it would be very > useful to know the CPAN distribution actually works. I'm very happy with > the idea that a release candidate gets fully tested... So you obviously feel strongly about it! ;> I don't have a problem as long as we stick with doing this from now on (i.e. have a consistent versioning scheme, release policy, CPAN release policy, etc). Would be nice for Jason/Brian/Hilmar to chime in as to the reasoning behind the older versioning scheme. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From JK at novozymes.com Tue Oct 24 17:59:10 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 19:59:10 +0200 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n et> Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net> > > I think you've generally taken the right path, but see below. > > First off, object factories are used extensively already but not yet > in each and every place where Bioperl creates an object internally. 
> Achieving your goal may entail fixes to Bioperl to use a factory > instead of a hard-coded module name. Also be on the lookout for > factory() or seq_factory() methods for classes whose work entails > creating sequence objects and that already give you control over the > type to be created. Can you elaborate/describe this a bit more? > The problem that hits you here though isn't one of determining the > type of the object to be created, because the respective method > doesn't create a sequence object. It only returns the sequence object > that the feature has a reference to. This was what Data::Dumper told me, but stuff I'd likewise would like to change was to get a RichSeq object returned every-time from Bio::Seq, adding in the stuff that allways seems appropriate. > The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your > extension of the latter is that the Perl garbage collector can't deal > with circular references. Doesn't Scalar::Util::weaken solve that? > Having said all that, note that if all what you want to do is > defining computations on Bio::Seq objects, as opposed to storing > values for additional attributes, the best design approach is not to > extend the class but to create a class with those computations as > static methods (which would accept the seq object on which to compute > as an argument; e.g., print $seqComputations->message_digest($seq)). I could but there are some functionality that I'd by design would like to have available on every sequence in the system. This way I would end up coding the functionality for getting the message_digest every place that I needed to get the value (which would be quite often in this application), whereas it by design belongs into the Bio::Seq-stuff. Jesper From JK at novozymes.com Tue Oct 24 17:59:19 2006 From: JK at novozymes.com (JK (Jesper Agerbo Krogh)) Date: Tue, 24 Oct 2006 19:59:19 +0200 Subject: [Bioperl-l] Subclassing Bio::Seq ? 
Extending Bio::Perl References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n et> <453E33A4.5060004@sendu.me.uk> Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FD@NZT0004E.dknz.nzcorp.net> > JK (Jesper Agerbo Krogh) wrote: > > Hi. > > > > We're trying to "extend" bioperl in our own setup. We have some funtions > > that we'd like to "allways" have available on a Bio::Seq-object. > [snip] > > So the question basically is: > > What is the preferred way of extending/subclassing Bio-perl -objects > > with our own methods? > > http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit That is definately a way of extending Bio-perl, thanks. Jesper From hlapp at gmx.net Tue Oct 24 18:57:02 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 24 Oct 2006 14:57:02 -0400 Subject: [Bioperl-l] Subclassing Bio::Seq ? Extending Bio::Perl In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net> References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n et> <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net> Message-ID: On Oct 24, 2006, at 1:59 PM, JK ((Jesper Agerbo Krogh)) wrote: >> >> I think you've generally taken the right path, but see below. >> >> First off, object factories are used extensively already but not yet >> in each and every place where Bioperl creates an object internally. >> Achieving your goal may entail fixes to Bioperl to use a factory >> instead of a hard-coded module name. Also be on the lookout for >> factory() or seq_factory() methods for classes whose work entails >> creating sequence objects and that already give you control over the >> type to be created. > > Can you elaborate/describe this a bit more? See for example the POD of Bio::SeqIO (sorry, the method is called sequence_factory()). > >> The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your >> extension of the latter is that the Perl garbage collector can't deal >> with circular references. 
> > Doesn't Scalar::Util::weaken solve that? You're welcome to test and try. It should be a simple change in Bio::Seq::add_SeqFeature(). You will see that it is this method and not the feature object that makes sure the wrapped primarySeq gets passed as sequence reference. Just change that to creating a new reference to the sequence object and make it a weak reference before passing it to the feature object. (The feature object has no requirement (or knowledge) that the referenced sequence object is a PrimarySeq.) > >> Having said all that, note that if all what you want to do is >> defining computations on Bio::Seq objects, as opposed to storing >> values for additional attributes, the best design approach is not to >> extend the class but to create a class with those computations as >> static methods (which would accept the seq object on which to compute >> as an argument; e.g., print $seqComputations->message_digest($seq)). > > I could but there are some functionality that I'd by design would > like to > have available on every sequence in the system. This way I would > end up > coding the functionality for getting the message_digest every place > that > I needed to get the value (which would be quite often in this > application), > whereas it by design belongs into the Bio::Seq-stuff. I'm not following you why this would make any difference (it would be $seq->message_digest() compared to $seqCompute->message_digest ($seq)), unless what you are saying is that you would like to cache the result of the computation. 
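
To make the comparison above concrete, here is a rough sketch of the helper-class approach. My::SeqCompute and My::FakeSeq are hypothetical names, not Bioperl classes; the stand-in sequence class exists only so the example runs without Bioperl installed, and MD5 is an arbitrary choice of digest.

```perl
package My::SeqCompute;    # hypothetical helper class, not part of Bioperl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Class method: return a hex-encoded MD5 digest of the sequence string.
# $seq can be any object that provides a seq() method.
sub message_digest {
    my ($class, $seq) = @_;
    return md5_hex( $seq->seq );
}

package My::FakeSeq;       # minimal stand-in so the sketch runs on its own

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

sub seq { return $_[0]{seq} }

package main;

my $seq = My::FakeSeq->new( seq => 'ATGCATGC' );
print My::SeqCompute->message_digest($seq), "\n";
```

The call site differs from $seq->message_digest() only in where the method lives. If the digest had to be cached per object, that would be the point at which storing it on a subclass starts to matter.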
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bix at sendu.me.uk Wed Oct 25 10:36:27 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 25 Oct 2006 11:36:27 +0100 Subject: [Bioperl-l] Lagan environment variable Message-ID: <453F3E2B.2040309@sendu.me.uk> Notification to say I'm changing the environmental variable that Bio::Tools::Run::Alignment::Lagan expects to define the location of the lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the default variable that the lagan installation and scripts themselves look for. I hope this isn't too much of a burden, but it seems like the sensible approach to getting Bio::Tools::Run::Alignment::Lagan to actually work. Thank you, Sendu. From n.haigh at sheffield.ac.uk Wed Oct 25 13:07:47 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Wed, 25 Oct 2006 13:07:47 +0000 Subject: [Bioperl-l] Lagan environment variable In-Reply-To: <453F3E2B.2040309@sendu.me.uk> References: <453F3E2B.2040309@sendu.me.uk> Message-ID: <453F61A3.4090904@sheffield.ac.uk> Sendu Bala wrote: > Notification to say I'm changing the environmental variable that > Bio::Tools::Run::Alignment::Lagan expects to define the location of the > lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the > default variable that the lagan installation and scripts themselves look > for. > > I hope this isn't too much of a burden, but it seems like the sensible > approach to getting Bio::Tools::Run::Alignment::Lagan to actually work. > > > Thank you, > Sendu. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > Woudn't it make more sense to change the test? 
That is what I've just done for t/Genscan.t It seemed to fit in with the ENV variable syntax that other modules in Bioperl-run used. Nath -- > A: Yes. >> Q: Are you sure? >> >>> A: Because it reverses the logical flow of conversation. >>> >>>> Q: Why is top posting frowned upon? >>>> Get Thunderbird From bix at sendu.me.uk Wed Oct 25 12:12:00 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Wed, 25 Oct 2006 13:12:00 +0100 Subject: [Bioperl-l] Lagan environment variable In-Reply-To: <453F61A3.4090904@sheffield.ac.uk> References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk> Message-ID: <453F5490.7060808@sendu.me.uk> Nathan S. Haigh wrote: > Sendu Bala wrote: >> Notification to say I'm changing the environmental variable that >> Bio::Tools::Run::Alignment::Lagan expects to define the location of the >> lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the >> default variable that the lagan installation and scripts themselves look >> for. >> >> I hope this isn't too much of a burden, but it seems like the sensible >> approach to getting Bio::Tools::Run::Alignment::Lagan to actually work. > > Woudn't it make more sense to change the test? That is what I've just > done for t/Genscan.t For Genscan.t, the test script looked at the wrong environment variable. Here I'm talking about lagan itself (the thing you get from http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with Bioperl) needing the environment variable LAGAN_DIR to be set in order to work. Since you need to set LAGAN_DIR to make lagan work, it makes sense that the Bioperl front-end to lagan also use the same variable. From n.haigh at sheffield.ac.uk Wed Oct 25 13:16:16 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh)
Date: Wed, 25 Oct 2006 13:16:16 +0000
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F5490.7060808@sendu.me.uk>
References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk> <453F5490.7060808@sendu.me.uk>
Message-ID: <453F63A0.7040609@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Notification to say I'm changing the environmental variable that
>>> Bio::Tools::Run::Alignment::Lagan expects to define the location of
>>> the lagan executables from LAGANDIR to LAGAN_DIR, since the latter
>>> is the default variable that the lagan installation and scripts
>>> themselves look for.
>>>
>>> I hope this isn't too much of a burden, but it seems like the
>>> sensible approach to getting Bio::Tools::Run::Alignment::Lagan to
>>> actually work.
>>
>> Woudn't it make more sense to change the test? That is what I've just
>> done for t/Genscan.t
>
> For Genscan.t, the test script looked at the wrong environment variable.
>
> Here I'm talking about lagan itself (the thing you get from
> http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with
> Bioperl) needing the environment variable LAGAN_DIR to be set in order
> to work.
>
> Since you need to set LAGAN_DIR to make lagan work, it makes sense
> that the Bioperl front-end to lagan also use the same variable.
>

Ah, OK! :-[ That'll teach me to speak up about something I know nothing
about! :-)

FYI, I've been busy this morning installing as much Bioperl-run external
software as I could (those that have tests). I will be posting the
results shortly.
Nath

From massimo.ubaldi at gmail.com Wed Oct 25 14:28:52 2006
From: massimo.ubaldi at gmail.com (Massimo Ubaldi)
Date: Wed, 25 Oct 2006 16:28:52 +0200
Subject: [Bioperl-l] blastxml format
Message-ID: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>

Hi

I'm using the script below to parse blastn output for multiple query
sequences. I got the output from the BLAST web interface, asking for
XML-formatted output.
Everything works fine except that I cannot print the name of each input
sequence (see below).
That is, using the line (see below) $result->query_description I get just
the name of the first sequence. In fact this is defined by the
<BlastOutput_query-def> tag.
What I really want is to extract the name that is defined by the
<Iteration_query-def> tag.
I have searched the bioperl mailing list and other sources but I did not
find anything to solve this.
Can somebody help me?
Thanks a lot
Massimo


This is an example of the output I got

MRDNA_probe
46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor form B (LOC562171), mRNA 68354945 XM_685568
81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA 68420187 XM_684078

This is what I'd like to get

MRDNA_probe
46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor form B (LOC562171), mRNA 68354945 XM_685568
VDRacterm_probe
81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633
ARalpcterm_probe
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA 68420187 XM_684078

This is the script

#!/usr/bin/perl
use strict;
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast',
                           -file   => 'Blastn_danio.bls');
open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, stopped";
my $result = $in->next_result;
print OUTFILE $result->algorithm, "\n";
print OUTFILE $result->database_name, "\n";

print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers", "\t", "GenBank Accession", "\n";

while($result = $in->next_result ) {
    print OUTFILE $result->query_description, "\n";
    while( my $hit = $result->next_hit ) {
        while( my $hsp = $hit->next_hsp ) {

            my $acc=$hit->name;
            my $description= $hit->description;

            $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/;

            print OUTFILE

            $hit->raw_score, "\t", # Score
            $hit->description, "\t", # Description

            $1, "\t",
            $2, "\n";
        }
    }
}

From cjfields at uiuc.edu Wed Oct 25 15:04:14 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Oct 2006 10:04:14 -0500
Subject: [Bioperl-l] blastxml format
In-Reply-To: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>
Message-ID: <000301c6f846$d6227760$15327e82@pyrimidine>

Iterations (which are related to PSIBLAST) aren't currently handled in
blastxml, which is why the <Iteration_query-def> tag isn't being parsed.
I'll give it a look but I don't think it will be properly fixed anytime
soon, since we're gearing up for a developer release and are sorting out
various bugs in relation to that.

In the meantime, you could always try changing the relevant tag in the
%MAPPING hash in your local copy of Bio::SearchIO::blastxml from
'BlastOutput_query-def' to 'Iteration_query-def', which may do the trick
for you. I'm a bit reluctant to change this in CVS as it would be better
to add this in when iterations are handled properly by blastxml, and I'm
not sure all BLAST XML varieties have the <Iteration_query-def> tag.

If you want you can add this to the bioperl bugzilla as an enhancement
request to remind us:

http://bugzilla.open-bio.org/

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Massimo Ubaldi
> Sent: Wednesday, October 25, 2006 9:29 AM
> To: bioperl-l List
> Subject: [Bioperl-l] blastxml format
>
> Hi
> I'm using the script below to parse a blastn output to multiple sequences
> I got the output from the blast web interface asking for xml formatted
> output.
> Everything work fine except that I cannot print the name of each input > sequence (see below). > That is, using the line (see below) $result->query_description I got just > the name of the first sequence. Infact this is defined by the > tag. > What I really want is to extract the name that is defined by the > tag. > Now I digged out the bioperl mailing list and other sources but I did not > find anything to solve this. > Can somebody help me? > Thanks alot > Massimo > > > This is an example of ouput I got > > MRDNA_probe > 46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor form > B > (LOC562171), mRNA 68354945 XM_685568 > 81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633 > PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA > 68420187 XM_684078 > > This what I'd like to get > MRDNA_probe > 46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor form > B > (LOC562171), mRNA 68354945 XM_685568 > VDRacterm_probe > 81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633 > ARalpcterm_probe > PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA > 68420187 XM_684078 > > This is the script > #!/usr/bin/perl > use strict; > use Bio::SearchIO; > my $in = new Bio::SearchIO(-format => 'blast', > -file => 'Blastn_danio.bls'); > open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, > stopped"; > my $result = $in->next_result; > print OUTFILE $result->algorithm, "\n"; > print OUTFILE $result->database_name, "\n"; > > print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers", > "\t", "GenBank Accession", "\n"; > > while($result = $in->next_result ) { > print OUTFILE $result->query_description, "\n"; > while( my $hit = $result->next_hit ) { > while( my $hsp = $hit->next_hsp ) { > > my $acc=$hit->name; > my $description= $hit->description; > > $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/; > > print OUTFILE > > $hit->raw_score, "\t", # Score > $hit->description, "\t", # Description > > $1, "\t", 
$2, "\n"; > } > } > } > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From massimo.ubaldi at gmail.com Wed Oct 25 15:20:49 2006 From: massimo.ubaldi at gmail.com (Massimo Ubaldi) Date: Wed, 25 Oct 2006 17:20:49 +0200 Subject: [Bioperl-l] blastxml format In-Reply-To: <000301c6f846$d6227760$15327e82@pyrimidine> References: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com> <000301c6f846$d6227760$15327e82@pyrimidine> Message-ID: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com> Thanks for the reply. I've already tried this but I got exactly the same results as before. What other can I try? Massimo On 10/25/06, Chris Fields wrote: > > Iterations (which are related to PSIBLAST) aren't currently handled in > blastxml, which is why the tag isn't being parsed. I'll give it a look > but > I don't think it will be properly fixed anytime soon, since we're gearing > up > for a developer release and are sorting out various bugs in relation to > that. > > In the meantime, you could always try changing the relevant tag in the > %MAPPING hash in your local copy of Bio::SearchIO::blastxml from > 'BlastOutput_query-def' to 'Iteration_query-def', which may do the trick > for > you. I'm a bit reluctant to change this in CVS as it would be better to > add > this in when iterations are handled properly by blastxml, and I'm not sure > all BLAST XML varieties have the tag. > > If you want you can add this to the bioperl bugzilla as an enhancement > request to remind us: > > http://bugzilla.open-bio.org/ > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. 
of Biochemistry > University of Illinois Urbana-Champaign > > > -----Original Message----- > > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > > bounces at lists.open-bio.org] On Behalf Of Massimo Ubaldi > > Sent: Wednesday, October 25, 2006 9:29 AM > > To: bioperl-l List > > Subject: [Bioperl-l] blastxml format > > > > Hi > > I'm using the script below to parse a blastn output to multiple > sequences > > I got the output from the blast web interface asking for xml formatted > > output. > > Everything work fine except that I cannot print the name of each input > > sequence (see below). > > That is, using the line (see below) $result->query_description I got > just > > the name of the first sequence. Infact this is defined by the > > tag. > > What I really want is to extract the name that is defined by the > > tag. > > Now I digged out the bioperl mailing list and other sources but I did > not > > find anything to solve this. > > Can somebody help me? > > Thanks alot > > Massimo > > > > > > This is an example of ouput I got > > > > MRDNA_probe > > 46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor > form > > B > > (LOC562171), mRNA 68354945 XM_685568 > > 81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633 > > PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA > > 68420187 XM_684078 > > > > This what I'd like to get > > MRDNA_probe > > 46.1 PREDICTED: Danio rerio similar to mineralocorticoid receptor > form > > B > > (LOC562171), mRNA 68354945 XM_685568 > > VDRacterm_probe > > 81.8 Danio rerio VDR-B mRNA, partial cds 68132043 DQ017633 > > ARalpcterm_probe > > PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA > > 68420187 XM_684078 > > > > This is the script > > #!/usr/bin/perl > > use strict; > > use Bio::SearchIO; > > my $in = new Bio::SearchIO(-format => 'blast', > > -file => 'Blastn_danio.bls'); > > open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, > > stopped"; > > my 
$result = $in->next_result; > > print OUTFILE $result->algorithm, "\n"; > > print OUTFILE $result->database_name, "\n"; > > > > print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers", > > "\t", "GenBank Accession", "\n"; > > > > while($result = $in->next_result ) { > > print OUTFILE $result->query_description, "\n"; > > while( my $hit = $result->next_hit ) { > > while( my $hsp = $hit->next_hsp ) { > > > > my $acc=$hit->name; > > my $description= $hit->description; > > > > $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/; > > > > print OUTFILE > > > > $hit->raw_score, "\t", # Score > > $hit->description, "\t", # Description > > > > $1, "\t", $2, "\n"; > > } > > } > > } > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > From cjfields at uiuc.edu Wed Oct 25 16:56:46 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Wed, 25 Oct 2006 11:56:46 -0500 Subject: [Bioperl-l] blastxml format In-Reply-To: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com> Message-ID: <000001c6f856$8ee44bc0$15327e82@pyrimidine> > Thanks for the reply. I've already tried this but I got exactly the same > > results as before. > What other can I try? > Massimo If you don't mind me asking, what version of perl and Bioperl are you using, and what version of BLAST is used? I want to point out there are a number of problems with your script, now I have had a chance to look at it. 1) You have the SearchIO format set to 'blast'. It should be 'blastxml' if you are parsing XML format. 2) Every time you call next_result() you iterate through each BLAST report. 
In effect, you're doing something like this:

my $result = $in->next_result();
# ... do something here (in first BLAST report)
while ($result = $in->next_result()) { # change to second BLAST report
    # more stuff here (in second BLAST report, if there is one)
}

I don't know if it's intentional though, but it's something to point out.

3) You also use raw_score(), which doesn't return a value for me (this may be related to the bioperl version, which is why I asked above). If you use $hit->bits() or $hit->significance() you can get the bits or hit evalue, respectively.

4) Also, I didn't see a difference between the two XML tags when using BLAST 2.2.15 output (WebBLAST at NCBI), which makes sense since they should originate from the same query sequence anyway. This could be related to the BLAST version.

Here's my version of your script, using WinXP and bioperl-live (CVS):

use Bio::SearchIO;

my $file = shift @ARGV;
my $in = new Bio::SearchIO(-format => 'blastxml',
                           -file   => $file);
# 'or' rather than '||': '||' would bind to the filename string, so die would never run
open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, stopped";
while(my $result = $in->next_result ) {
    print OUTFILE $result->algorithm, "\n";
    print OUTFILE $result->database_name, "\n";
    print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers", "\t", "GenBank Accession", "\n";
    print OUTFILE $result->query_description, "\n";
    while( my $hit = $result->next_hit ) {
        while( my $hsp = $hit->next_hsp ) {
            my $acc=$hit->name;
            my $description= $hit->description;
            if ($acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/) {
                print OUTFILE
                $hit->bits, "\t",        # Score
                $hit->description, "\t", # Description
                $1, "\t", $2, "\n";
            }
        }
    }
}

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

...
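
A minimal sketch of point 2 in the message above, using a plain closure as a stand-in for the SearchIO stream rather than BioPerl itself. The names make_iterator, seen_with_priming_call and seen_in_loop_only, and the report labels, are hypothetical and exist only for illustration. It shows why a call to the iterator before the while loop means the loop never sees the first report.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Bio::SearchIO stream: each call returns the
# next item, then undef once the list is exhausted.
sub make_iterator {
    my @items = @_;
    return sub { return shift @items };
}

# Pattern from the original script: one call before the loop.
sub seen_with_priming_call {
    my $next  = make_iterator(@_);
    my $first = $next->();    # consumes the first report
    my @seen;
    while ( defined( my $item = $next->() ) ) {
        push @seen, $item;    # the loop starts at the second report
    }
    return @seen;
}

# Pattern from the revised script: every call happens in the loop condition.
sub seen_in_loop_only {
    my $next = make_iterator(@_);
    my @seen;
    while ( defined( my $item = $next->() ) ) {
        push @seen, $item;
    }
    return @seen;
}

my @reports = ( 'report1', 'report2', 'report3' );
print join( ',', seen_with_priming_call(@reports) ), "\n";    # report2,report3
print join( ',', seen_in_loop_only(@reports) ),      "\n";    # report1,report2,report3
```

With a multi-query BLAST file this is the difference between losing the first query's hits and seeing all of them.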
From n.haigh at sheffield.ac.uk Thu Oct 26 08:47:27 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 09:47:27 +0100
Subject: [Bioperl-l] More extensive Bioperl-run 1.5.2RC2 tests
Message-ID: <4540761F.6010904@sheffield.ac.uk>

Oops, I posted this to the Biojava list the other day by mistake!

I have recently installed some more software for which there are bioperl-run tests and run the test suite with several versions of the software I could find. I've added info to http://www.bioperl.org/wiki/Release_1.5.2#bioperl-run. If there were any failures in any of the versions I tested I've noted them together with versions that were ok (if any).

There may be another 6 or so programs I'm trying to get hold of to run further tests - I'll update when I get them.

Nath

From n.haigh at sheffield.ac.uk Thu Oct 26 09:14:07 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 10:14:07 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
Message-ID: <45407C5F.40104@sheffield.ac.uk>

I'm thinking that it's not wise to test for things like overall_percentage_identity etc in alignments that are generated by external software like T-Coffee, Clustalw etc. Changes to software algorithms/efficiency, bug fixes etc may well alter the quality of the alignment produced in different versions and thus affect the value returned by such methods. Therefore, I think these methods should only be tested from alignments loaded directly from t/data.
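
One way to sketch the idea above: rather than asserting an exact percentage identity for an alignment produced by an external program, a test can check properties that should hold whichever version of the aligner is installed. The sketch below uses plain Perl data and Test::More only. The helper names (ungap, rows_equal_length) and the toy sequences are made up for illustration and are not part of Bioperl-run.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Toy input sequences and a toy "aligner output" (hypothetical data).
my %input   = ( seqA => 'ACGTAC', seqB => 'ACTAC' );
my %aligned = ( seqA => 'ACGTAC', seqB => 'AC-TAC' );

# Remove gap characters so aligned rows can be compared with the input.
sub ungap {
    my ($row) = @_;
    ( my $seq = $row ) =~ tr/-//d;
    return $seq;
}

# True if every row passed in has the same length.
sub rows_equal_length {
    my %len = map { length($_) => 1 } @_;
    return scalar( keys %len ) == 1;
}

# Version-independent checks: same sequences in, same sequences out.
is( scalar( keys %aligned ), scalar( keys %input ), 'same number of sequences' );
ok( rows_equal_length( values %aligned ), 'all rows have equal length' );
is_deeply(
    { map { $_ => ungap( $aligned{$_} ) } keys %aligned },
    \%input,
    'removing gaps recovers the input sequences'
);
```

Checks of this kind say nothing about alignment quality, which is the trade-off: exact values such as percentage identity would still be tested, but only against fixed alignment files.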
Nath From bix at sendu.me.uk Thu Oct 26 09:48:37 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Thu, 26 Oct 2006 10:48:37 +0100 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45407C5F.40104@sheffield.ac.uk> References: <45407C5F.40104@sheffield.ac.uk> Message-ID: <45408475.30903@sendu.me.uk> Nathan Haigh wrote: > I'm thinking that it's not wise to test for things like > overall_percentage_identity etc in alignments that are generated by > external software like T-Coffee, Clustalw etc. Changes to software > algorithms/efficiency, bug fixes etc may well alter the quality of the > alignment produced in different versions and thus affect the value > returned by such methods. Therefore, I think these methods should only > be tested from alignments loaded directly from t/data. Did you discover some specific problem cases? From n.haigh at sheffield.ac.uk Thu Oct 26 10:04:54 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 26 Oct 2006 11:04:54 +0100 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45408475.30903@sendu.me.uk> References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> Message-ID: <45408846.1050001@sheffield.ac.uk> Sendu Bala wrote: > Nathan Haigh wrote: >> I'm thinking that it's not wise to test for things like >> overall_percentage_identity etc in alignments that are generated by >> external software like T-Coffee, Clustalw etc. Changes to software >> algorithms/efficiency, bug fixes etc may well alter the quality of the >> alignment produced in different versions and thus affect the value >> returned by such methods. Therefore, I think these methods should only >> be tested from alignments loaded directly from t/data. > > Did you discover some specific problem cases? My messages seem to be taking a while to come through, but, yes. 
It may be due to the software changing default parameters, but it makes testing the output for specific details pretty difficult and inconsistent. For example, running T-Coffee, the following command from t/TCoffee.t results in slightly different alignment: $aln = $factory->run('-type' => 'profile', '-profile' => $aln1, '-seq' => Bio::Root::IO->catfile("t","data","cysprot1b.fa")); Of particular note, is the gaps on the last line of the sequences. In 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in CATH_RAT/1-333 ------mwtalpllcagawllsagat----------aeltvnaiek------------fh ftswmkqhqktyss-reyshrlqvfannwrkiqahn----qrnhtfkmglnqfsdmsfae ikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqgacgscwtfs ttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqafeyilynk gimgedsypyigkngqckfnpekavafvknvv-nitlndeaamveavalynpvsfafevt -edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivknswgsnwgnn gyfliergk-nm---cglaacasypipqv >CATL_HUMAN/1-333 --------------------------------mnptlilaafclgiasatltfdhsleaq wtkwkamhnrlygmnee-gwrravweknmkmielhnqeyregkhsftmamnafgdmtsee frqvmngfqnrkpr----kgkvfqeplfyeaprsvdwrekg-yvtpvknqgqcgscwafs atgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdyafqyvqdng gldseesypyeateesckynpkysvandtgfv-dip-kqekalmkavatvgpisvaidag hesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvknswgeewgmg gyvkmakdrrnh---cgiasaasyptv-- >CATL_RAT/1-334 --------------------------------mtpllllavlclgtalatpkfdqtfnaq whqwksthrrlygtnee-ewrravweknmrmiqlhngeysngkhgftmemnafgdmtnee frqivngyrhqkhk----kgrlfqeplmlqipktvdwrekg-cvtpvknqgqcgscwafs asgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfafqyikeng gldseesypyeakdgsckyraeyavandtgfv-dip-qqekalmkavatvgpisvamdas hpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvknswgkewgmd gyikiakdrnnh---cglataasypivn- >PAPA_CARPA/1-345 mamipsiskllfvaiclfvymglsfg-------------dfsivgysqndltsterliql feswmlkhnkiyknidekiyrfeifkdnlkyidetn----kknnsywlglnvfadmsnde fkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgscgscwafs 
avvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsalqlvaqy- gihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysian-qpvsvvleaa gkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yiliknswgtgwgen gyirikrgtgnsygvcglytssfypvkn- >ALEU_HORVU/1-362 maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtrhalr farfavrygksyesaaevrrrfrifsesleevrstn----rkglpyrlginrfsdmswee fqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqahcgscwtfs ttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqafeyikyng gidteesypykgvngvchykaenaavqvldsv-nitlnaedelknavglvrpvsvafqvi -dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywliknswgadwgdn gyfkmemgk-nm---caiatcasypvvaa >CATH_HUMAN/1-335 ------mwatlpllcagawllg--------vpvcgaaelsvnslek------------fh fkswmskhrktys-teeyhhrlqtfasnwrkinahn----ngnhtfkmalnqfsdmsfae ikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqgacgscwtfs ttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqafeyilynk gimgedtypyqgkdgyckfqpgkaigfvkdva-nitiydeeamveavalynpvsfafevt -qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivknswgpqwgmn gyfliergk-nm---cglaacasypiplv >CYS1_DICDI/1-343 -----mkvillfvlavftvfvs---------------srgippeeq------------sq flefqdkfnkkys-heeylerfeifksnlgkieelnliainhkadtkfgvnkfadlssde fknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgqcgscwsfs ttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpnaynyiikng giqtessypytaetgtqcnfnsanigakisnf-tmipknetvmagyivstgplaiaadav -e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivknswgadwgeq gyiylrrgk-nt---cgvsnfvstsii-- While T-Coffee <4.45 returned: >CATH_RAT/1-333 ----------mwtalpllcagawllsagat----------aeltvnaiek---------- --fhftswmkqhqktyss-reyshrlqvfannwrkiqahn----q----rnhtfkmglnq fsdmsfaeikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqga cgscwtfsttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqa feyilynkgimgedsypyigkngqckfnpekavafvknvvn-itlndeaamveavalynp vsfafevt-edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivkns wgsnwgnngyfliergkn----mcglaacasypipqv >PAPA_CARPA/1-345 mamipsiskllfvaiclfvymglsfgdfsivgysqndltsterliqlfeswml------- 
-------------khnkiyknidekiyrf-----eifkdnlkyidetnkknnsywlglnv fadmsndefkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgs cgscwafsavvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsa lq-lvaqygihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysia-nqp vsvvleaagkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yilikns wgtgwgengyirikrgtgnsygvcglytssfypvkn- >CATL_HUMAN/1-333 -----------------------------------------mnptlilaafclgiasatl tfdhsleaqwtkwkamhnrlygmneegwrravweknmkmielhnqeyregkhsftmamna fgdmtseefrqvmngfqnrkprkgkvfqeplf----yeaprsvdwrekg-yvtpvknqgq cgscwafsatgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdya fqyvqdnggldseesypyeateesckynpkysvandtgfvd--ipkqekalmkavatvgp isvaidaghesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvkns wgeewgmggyvkmakdrrnh---cgiasaasyptv-- >CATL_RAT/1-334 -----------------------------------------mtpllllavlclgtalatp kfdqtfnaqwhqwksthrrlygtneeewrravweknmrmiqlhngeysngkhgftmemna fgdmtneefrqivngyrhqkhkkgrlfqeplm----lqipktvdwrekg-cvtpvknqgq cgscwafsasgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfa fqyikenggldseesypyeakdgsckyraeyavandtgfvd--ipqqekalmkavatvgp isvamdashpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvkns wgkewgmdgyikiakdrnnh---cglataasypivn- >ALEU_HORVU/1-362 ----maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtr halrfarfavrygksyesaaevrrrfrifsesleevrstn----r----kglpyrlginr fsdmsweefqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqah cgscwtfsttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqa feyikynggidteesypykgvngvchykaenaavqvldsvn-itlnaedelknavglvrp vsvafqvi-dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywlikns wgadwgdngyfkmemgkn----mcaiatcasypvvaa >CATH_HUMAN/1-335 ----------mwatlpllcagawllg--------vpvcgaaelsvnslek---------- --fhfkswmskhrktys-teeyhhrlqtfasnwrkinahn----n----gnhtfkmalnq fsdmsfaeikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqga cgscwtfsttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqa feyilynkgimgedtypyqgkdgyckfqpgkaigfvkdvan-itiydeeamveavalynp vsfafevt-qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivkns 
wgpqwgmngyfliergkn----mcglaacasypiplv >CYS1_DICDI/1-343 ---------mkvillfvlavftvfvs---------------srgippeeq---------- --sqflefqdkfnkkys-heeylerfeifksnlgkieelnliain----hkadtkfgvnk fadlssdefknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgq cgscwsfsttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpna ynyiiknggiqtessypytaetgtqcnfnsanigakisnft-mipknetvmagyivstgp laiaadav-e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivkns wgadwgeqgyiylrrgkn----tcgvsnfvstsii-- From sanges at biogem.it Thu Oct 26 10:26:36 2006 From: sanges at biogem.it (Remo Sanges) Date: Thu, 26 Oct 2006 11:26:36 +0100 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45408846.1050001@sheffield.ac.uk> References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> <45408846.1050001@sheffield.ac.uk> Message-ID: <45408D5C.1000305@biogem.it> Nathan Haigh wrote: > Sendu Bala wrote: > >> Nathan Haigh wrote: >> >>> I'm thinking that it's not wise to test for things like >>> overall_percentage_identity etc in alignments that are generated by >>> external software like T-Coffee, Clustalw etc. Changes to software >>> algorithms/efficiency, bug fixes etc may well alter the quality of the >>> alignment produced in different versions and thus affect the value >>> returned by such methods. Therefore, I think these methods should only >>> be tested from alignments loaded directly from t/data. >>> >> Did you discover some specific problem cases? >> > My messages seem to be taking a while to come through, but, yes. It may > be due to the software changing default parameters, but it makes testing > the output for specific details pretty difficult and inconsistent. 
For
> example, running T-Coffee, the following command from t/TCoffee.t
> results in slightly different alignment:
> $aln = $factory->run('-type' => 'profile',
> '-profile' => $aln1,
> '-seq' =>
> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>
> Of particular note, is the gaps on the last line of the sequences. In
> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
>

I'm not a T-Coffee user, but you usually come across these problems when you use different scoring parameters when aligning sequences.

Could it be that they have simply changed the default parameters for gap penalties and that kind of stuff? Is it possible to set them?

If so, you can just run the test by defining the scores in the param hash without using the defaults.

HTH

Remo

From n.haigh at sheffield.ac.uk Thu Oct 26 10:33:55 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 11:33:55 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408D5C.1000305@biogem.it>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> <45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it>
Message-ID: <45408F13.9020209@sheffield.ac.uk>

Remo Sanges wrote:
> Nathan Haigh wrote:
>> Sendu Bala wrote:
>>
>>> Nathan Haigh wrote:
>>>
>>>> I'm thinking that it's not wise to test for things like
>>>> overall_percentage_identity etc in alignments that are generated by
>>>> external software like T-Coffee, Clustalw etc. Changes to software
>>>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>>>> alignment produced in different versions and thus affect the value
>>>> returned by such methods. Therefore, I think these methods should only
>>>> be tested from alignments loaded directly from t/data.
>>>>
>>> Did you discover some specific problem cases?
>>>
>> My messages seem to be taking a while to come through, but, yes.
It may
>> be due to the software changing default parameters, but it makes testing
>> the output for specific details pretty difficult and inconsistent. For
>> example, running T-Coffee, the following command from t/TCoffee.t
>> results in slightly different alignment:
>> $aln = $factory->run('-type' => 'profile',
>> '-profile' => $aln1,
>> '-seq' =>
>> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>>
>> Of particular note, is the gaps on the last line of the sequences. In
>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
>>
>
> I'm not a T-coffee user but usually you can come across
> these problems when you use different scoring parameters
> when align sequences.
>
> Could it be possible that they have simply changed the
> default parameters for gap penalties and that kind of
> stuff? It is possible to set them?
>
> If so you can just run the test by defining
> the scores in the param hash without using the default.
>
> HTH
>
> Remo

That is true, but it depends on whether the wrapper is complete enough to be able to set all the parameters provided by the software.

Nath

From n.haigh at sheffield.ac.uk Thu Oct 26 16:13:03 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 17:13:03 +0100
Subject: [Bioperl-l] Bio::Restriction::Enzyme
Message-ID: <4540DE8F.7070501@sheffield.ac.uk>

I'm in the middle of writing some code that uses Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using Bioperl from HEAD.

I find that $enzyme->is_palindromic always seems to return true. Can anyone verify this? If need be, I can send some code.
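
For reference, a recognition site is usually called palindromic when it equals its own reverse complement. The following is a small pure-Perl sketch of that definition only. It is not the Bio::Restriction::Enzyme implementation, the helper names revcom and site_is_palindromic are hypothetical, and ambiguity codes are not handled.

```perl
use strict;
use warnings;

# Reverse-complement an unambiguous DNA string (A, C, G, T only).
sub revcom {
    my ($site) = @_;
    my $rc = scalar reverse uc $site;
    $rc =~ tr/ACGT/TGCA/;
    return $rc;
}

# A site is palindromic if it reads the same on both strands.
sub site_is_palindromic {
    my ($site) = @_;
    return uc($site) eq revcom($site) ? 1 : 0;
}

# GAATTC (EcoRI) and GGATCC (BamHI) are palindromic; GGTCTC (BsaI) is not.
for my $site (qw(GAATTC GGATCC GGTCTC)) {
    printf "%s %s\n", $site,
        site_is_palindromic($site) ? 'palindromic' : 'not palindromic';
}
```

A check like this, run against a site known not to be palindromic, is one way to build the small test case that a bug report would need.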
Thanks Nathan From bosborne11 at verizon.net Thu Oct 26 16:37:06 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Thu, 26 Oct 2006 12:37:06 -0400 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk> Message-ID: Nathan, Perhaps because most restriction sites are palindromes. Anyway, I added tests for palindromic() and is_palindromic() where the site is not a palindrome, these tests pass (t/RestrictionAnalyis.t). Brian O. On 10/26/06 12:13 PM, "Nathan Haigh" wrote: > I'm in the middle of writing some code that uses > Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using > Bioperl from HEAD. > > I seem to find that $enzyme->is_palindromic always seems to return true. > Can anyone verify this? If needs be, I can send some code. > > Thanks > Nathan > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l From n.haigh at sheffield.ac.uk Thu Oct 26 16:49:48 2006 From: n.haigh at sheffield.ac.uk (Nathan Haigh) Date: Thu, 26 Oct 2006 17:49:48 +0100 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: References: Message-ID: <4540E72C.5020800@sheffield.ac.uk> Brian Osborne wrote: > Nathan, > > Perhaps because most restriction sites are palindromes. Anyway, I added > tests for palindromic() and is_palindromic() where the site is not a > palindrome, these tests pass (t/RestrictionAnalyis.t). > > Brian O.
> > > On 10/26/06 12:13 PM, "Nathan Haigh" wrote: > > >> I'm in the middle of writing some code that uses >> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >> Bioperl from HEAD. >> >> I seem to find that $enzyme->is_palindromic always seems to return true. >> Can anyone verify this? If needs be, I can send some code. >> >> Thanks >> Nathan >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > Ok, thanks - nice to know :-) From cjfields at uiuc.edu Thu Oct 26 16:58:34 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 26 Oct 2006 11:58:34 -0500 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk> Message-ID: <001301c6f91f$f9611770$15327e82@pyrimidine> > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- > bounces at lists.open-bio.org] On Behalf Of Nathan Haigh > Sent: Thursday, October 26, 2006 11:13 AM > To: Bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] Bio::Restriction::Enzyme > > I'm in the middle of writing some code that uses > Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using > Bioperl from HEAD. > > I seem to find that $enzyme->is_palindromic always seems to return true. > Can anyone verify this? If needs be, I can send some code. > > Thanks > Nathan You should file a bug report if you have found a test case where this method isn't working as it should, especially if Brian's tests pass and you're still getting the wrong results. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Thu Oct 26 16:57:32 2006 From: jason at bioperl.org (Jason Stajich) Date: Thu, 26 Oct 2006 09:57:32 -0700 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: <45408F13.9020209@sheffield.ac.uk> References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk> <45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it> <45408F13.9020209@sheffield.ac.uk> Message-ID: Nathan - I agree - the values tend to change with different versions of the applications unfortunately. It would make sense to just test that you get out sequences that are in valid alignment format and perhaps have as many ending sequences as you started with. The more restrictive tests probably aren't reliable with mixing and matching versions. One thing we do for PAML is condition tests on the version used - but of course when a new version comes out we have to add more stuff to the tests (or just have some code that skips those tests). -jason On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote: > Remo Sanges wrote: >> Nathan Haigh wrote: >>> Sendu Bala wrote: >>> >>>> Nathan Haigh wrote: >>>> >>>>> I'm thinking that it's not wise to test for things like >>>>> overall_percentage_identity etc in alignments that are >>>>> generated by >>>>> external software like T-Coffee, Clustalw etc. Changes to software >>>>> algorithms/efficiency, bug fixes etc may well alter the quality >>>>> of the >>>>> alignment produced in different versions and thus affect the value >>>>> returned by such methods. Therefore, I think these methods >>>>> should only >>>>> be tested from alignments loaded directly from t/data. >>>>> >>>> Did you discover some specific problem cases? >>>> >>> My messages seem to be taking a while to come through, but, yes. 
>>> It may >>> be due to the software changing default parameters, but it makes >>> testing >>> the output for specific details pretty difficult and >>> inconsistent. For >>> example, running T-Coffee, the following command from t/TCoffee.t >>> results in slightly different alignment: >>> $aln = $factory->run('-type' => 'profile', >>> '-profile' => $aln1, >>> '-seq' => >>> Bio::Root::IO->catfile("t","data","cysprot1b.fa")); >>> >>> Of particular note, is the gaps on the last line of the >>> sequences. In >>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in >>> >> >> I'm not a T-coffee user but usually you can come across >> these problems when you use different scoring parameters >> when align sequences. >> >> Could it be possible that they have simply changed the >> default parameters for gap penalties and that kind of >> stuff? It is possible to set them? >> >> If so you can just run the test by defining >> the scores in the param hash without using the default. >> >> HTH >> >> Remo > That is true, but it depends on the whether the wrapper is complete > enough to be able to set all the parameters provided by the software. > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From cjfields at uiuc.edu Thu Oct 26 22:01:08 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Thu, 26 Oct 2006 17:01:08 -0500 Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally In-Reply-To: Message-ID: <000301c6f94a$3e2a3f10$15327e82@pyrimidine> I have been running into similar issues with EUtilities tests. 
Since the data on the server is constantly updated I have to try and future-proof the tests so they don't constantly fail.

I have been using Test::More and like/unlike or cmp_ok to get around some of those 'fuzzy data' issues. If some methods consistently return a particular type of value, such as an integer, you could use:

like($foo->get_value, qr{^\d+$}, 'value test'); # integer

or similar.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From gbazykin at Princeton.EDU Thu Oct 26 22:49:56 2006
From: gbazykin at Princeton.EDU (Georgii A Bazykin)
Date: Thu, 26 Oct 2006 18:49:56 -0400
Subject: [Bioperl-l] about PAML running within bioperl
In-Reply-To: <001901c6dbcf$9af4de50$0915020a@zchou>
References: <001901c6dbcf$9af4de50$0915020a@zchou>
Message-ID: <185431468.20061026184956@princeton.edu>

I just had the exact same problem, which was also (as in Caleb Davis's case) solved by switching to PAML 3.14 from 3.15.

------------------------------
Tuesday, September 19, 2006, 5:40:07 AM, you wrote:

> Hello, every one,
> I use code in the PAML HOWTO (running PAML from within Bioperl) on
> my Linux OS. And I set ENV as described by instructions. At the
> beginning, it seems that ClustalW run smoothly. However, when the
> programme run to call method "get_MLmatrix", something happened. The
> following information was listed as follows: (What reason or How to solve these problems?)
> ........
> Sequences (2:3) Aligned. Score: 87
> Sequences (2:4) Aligned. Score: 88
> Sequences (2:5) Aligned. Score: 87
> Sequences (2:6) Aligned. Score: 87
> Sequences (2:7) Aligned. Score: 87
> Sequences (2:8) Aligned. Score: 87
> Sequences (3:4) Aligned. Score: 93
> Sequences (3:5) Aligned. Score: 93
> Sequences (3:6) Aligned. Score: 93
> Sequences (3:7) Aligned. Score: 92
> Sequences (3:8) Aligned. Score: 92
> Sequences (4:5) Aligned.
Score: 99 > Sequences (4:6) Aligned. Score: 99 > Sequences (4:7) Aligned. Score: 98 > Sequences (4:8) Aligned. Score: 98 > Sequences (5:6) Aligned. Score: 100 > Sequences (5:7) Aligned. Score: 99 > Sequences (5:8) Aligned. Score: 99 > Sequences (6:7) Aligned. Score: 99 > Sequences (6:8) Aligned. Score: 99 > Sequences (7:8) Aligned. Score: 100 > Guide tree file created: > [/home/zchou/TMPDIR/8QEqLivAKY/JU833u8OTP.dnd] > Start of Multiple Alignment > There are 7 groups > Aligning... > Group 1: Sequences: 2 Score:5875 > Group 2: Sequences: 2 Score:5877 > Group 3: Sequences: 4 Score:5864 > Group 4: Sequences: 5 Score:5537 > Group 5: Sequences: 6 Score:5727 > Group 6: Sequences: 7 Score:5608 > Group 7: Sequences: 8 Score:5607 > Alignment Score 43650 > GCG-Alignment file created > [/home/zchou/TMPDIR/8QEqLivAKY/CussPD56rZ] > aligned aa sequences were: Bio::SimpleAlign=HASH(0x87b93f4) > Can't call method "get_MLmatrix" on an undefined value at > originalpaml.pl line 57, line 332. > Zhuocheng Hou > Department of Animal Genetics and Breeding > China Agricultural University From himanshu.ardawatia at bccs.uib.no Fri Oct 27 01:54:36 2006 From: himanshu.ardawatia at bccs.uib.no (Himanshu Ardawatia) Date: Fri, 27 Oct 2006 03:54:36 +0200 Subject: [Bioperl-l] Query on tree bootstrap values Message-ID: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> Hi, 2 questions : 1. I have a phylogenetic tree and I wish to set (or modify or query) bootstrap values for all internal nodes. How do I do that using BioPerl ? 2. I tried the example script attached below for general purpose for the example newick tree with bootstrap values (also attached below) and It gives strange results even for branch length. It shows Parent ID as 0.71 which actually is the bootstrap value for the last ancestral node for human and chimp and It shows the Child node ID as 'Human' ! Am I missing something in the tree formatting ? Results also attached below. 
Also how to extract / modify / add bootstrap values in this tree ?

Thanks
Himanshu

EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
#################################
(
('Chimp' : 0.052,
'Human' : 0.042) 0.71 : 0.007,
'Gorilla' : 0.060,
('Gibbon' : 0.124,
'Orangutan' : 0.0971) 1 : 0.038
);
#################################

EXAMPLE SCRIPT:
#################################
#!/usr/bin/perl -w
use Bio::Seq;
# use Bio::TreeIO;
use Bio::Tree::TreeI;

# get a Tree::NodeI somehow
# like from a TreeIO
use Bio::TreeIO;
# read in a clustalw NJ in phylip/newick format
my $treeio = new Bio::TreeIO(-format => 'newick', -file => 'example_newick_tree.newick');
my $tree = $treeio->next_tree; # we'll assume it worked for demo purposes
                               # you might want to test that it was defined
my $rootnode = $tree->get_root_node;

# process just the next generation
foreach my $node ( $rootnode->each_Descendent() ) {
    print "branch len is ", $node->branch_length, "\n";
}

# process all the children
my $example_leaf_node;
foreach my $node ( $rootnode->get_Descendents() ) {
    if( $node->is_Leaf ) {
        print "node is a leaf ... ";
        # for example use below
        $example_leaf_node = $node unless defined $example_leaf_node;
    }
    print "branch len is ", $node->branch_length, "\n";
}

# The ancestor() method points to the parent of a node
# A node can only have one parent
my $parent = $example_leaf_node->ancestor;

# parent won't likely have a description because it is an internal node
# but child will because it is a leaf
print "Parent id: ", $parent->id," child id: ", $example_leaf_node->id, "\n";
##########################################

RESULTS:
branch len is 0.007
branch len is 0.060
branch len is 0.038
node is a leaf ... branch len is 0.042
node is a leaf ... branch len is 0.052
branch len is 0.007
node is a leaf ... branch len is 0.060
node is a leaf ... branch len is 0.0971
node is a leaf ...
branch len is 0.124 branch len is 0.038 Parent id: _0.71_ child id: ___'Human'__ From n.haigh at sheffield.ac.uk Fri Oct 27 08:42:23 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 08:42:23 +0000 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: References: Message-ID: <4541C66F.1020404@sheffield.ac.uk> Hi Brian, I wonder if i'm using is_prototype() correctly as I don't seem to get any returning true: my $enz_coll = Bio::Restriction::EnzymeCollection->new(); my $prototype = 0; foreach my $enz ($enz_coll->each_enzyme) { $prototype++ if $enz->is_prototype; } print "$prototype have unique recognition sites\n"; prints: 0 have unique recognition sites Thanks Nath Brian Osborne wrote: > Nathan, > > Perhaps because most restriction sites are palindromes. Anyway, I added > tests for palindromic() and is_palindromic() where the site is not a > palindrome, these tests pass (t/RestrictionAnalyis.t). > > Brian O. > > > On 10/26/06 12:13 PM, "Nathan Haigh" wrote: > > >> I'm in the middle of writing some code that uses >> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >> Bioperl from HEAD. >> >> I seem to find that $enzyme->is_palindromic always seems to return true. >> Can anyone verify this? If needs be, I can send some code. >> >> Thanks >> Nathan >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > > -- > A: Yes. >> Q: Are you sure? >> >>> A: Because it reverses the logical flow of conversation. >>> >>>> Q: Why is top posting frowned upon? >>>> Get Thunderbird From n.haigh at sheffield.ac.uk Fri Oct 27 08:47:21 2006 From: n.haigh at sheffield.ac.uk (Nathan S. 
Haigh)
Date: Fri, 27 Oct 2006 08:47:21 +0000
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <001301c6f91f$f9611770$15327e82@pyrimidine>
References: <001301c6f91f$f9611770$15327e82@pyrimidine>
Message-ID: <4541C799.4090507@sheffield.ac.uk>

Chris Fields wrote:
>> I'm in the middle of writing some code that uses
>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>> Bioperl from HEAD.
>>
>> I seem to find that $enzyme->is_palindromic always seems to return true.
>> Can anyone verify this? If needs be, I can send some code.
>>
>> Thanks
>> Nathan
>
> You should file a bug report if you have found a test case where this method
> isn't working as it should, especially if Brian's tests pass and you're
> still getting the wrong results.

I was doing some filtering of the default set of enzymes and happened to remove the 2 that are not palindromic before I used is_palindromic(). Thus, I didn't see any that were not palindromic - if that makes sense! Since I know very little about restriction enzymes, I'll trust that these are correct :-) and I'm getting the correct results.

Thanks
Nath

From n.haigh at sheffield.ac.uk Fri Oct 27 09:04:40 2006
From: n.haigh at sheffield.ac.uk (Nathan S.
Haigh)
Date: Fri, 27 Oct 2006 09:04:40 +0000
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>
References: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>
Message-ID: <4541CBA8.10006@sheffield.ac.uk>

Chris Fields wrote:
> I have been running into similar issues with EUtilities tests. Since the
> data on the server is constantly updated I have to try and future-proof the
> tests so they don't constantly fail.
>
> I have been using Test::More and like/unlike or cmp_ok to get around some of
> those 'fuzzy data' issues. If some methods consistently return a particular
> type of value, such as an integer, you could use:
>
> like($foo->get_value, qr{^\d+$}, 'value test'); #integer
>
> or similar.
>
> Chris

I think it makes sense to test that data of the expected type was returned by the external resource but not to test the specifics of what was returned. If specifics are tested we are then in the realm of testing whether we believe the data returned by the external resource or not.
We should assume that the domain experts for these resources know what they are doing - in some cases this might not be true :-) but I think we should stick to testing that the objects created hold the expected type of data. I like what Chris had to say (above) but wonder whether tests would/should be tested for in the module itself - i.e. testing that a stored value is an integer and warn/throw if not? Nath From bix at sendu.me.uk Fri Oct 27 09:08:18 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Fri, 27 Oct 2006 10:08:18 +0100 Subject: [Bioperl-l] Query on tree bootstrap values In-Reply-To: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> Message-ID: <4541CC82.2040705@sendu.me.uk> Himanshu Ardawatia wrote: > Hi, > > 2 questions : > > 1. I have a phylogenetic tree and I wish to set (or modify or query) > bootstrap values for all internal nodes. How do I do that using BioPerl ? Does bootstrap() not do what you need? > 2. I tried the example script attached below for general purpose for the > example newick tree with bootstrap values (also attached below) and It gives > strange results even for branch length. It shows Parent ID as 0.71 which > actually is the bootstrap value for the last ancestral node for human and > chimp and It shows the Child node ID as 'Human' ! Am I missing something in > the tree formatting ? Results also attached below. Also how to extract / > modify/ add bootstrap values in this tree ? [snip] > EXAMPLE TREE (Newick with bootstrap values and branch lengths) : > ################################# > ( > ('Chimp' : 0.052, > 'Human' : 0.042) 0.71 : 0.007, > 'Gorilla' : 0.060, > ('Gibbon' : 0.124, > 'Orangutan' : 0.0971) 1 : 0.038 > ); > ################################# Are you sure this is in the correct format? 
For example, with the tree:

( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007,
'Gorilla':0.060,
('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038);

and your script (with a print "--\n" between the two printing loops for clarity) I get...

> ##########################################
>
> RESULTS:
> branch len is 0.007
> branch len is 0.060
> branch len is 0.038
> node is a leaf ... branch len is 0.042
> node is a leaf ... branch len is 0.052
> branch len is 0.007
> node is a leaf ... branch len is 0.060
> node is a leaf ... branch len is 0.0971
> node is a leaf ... branch len is 0.124
> branch len is 0.038
> Parent id: _0.71_ child id: ___'Human'__

...

branch len is 0.007
branch len is 0.060
branch len is 0.038
--
branch len is 0.007
node is a leaf ... branch len is 0.052
node is a leaf ... branch len is 0.042
node is a leaf ... branch len is 0.060
branch len is 0.038
node is a leaf ... branch len is 0.124
node is a leaf ... branch len is 0.0971
Parent id: 'Human_Chimp_Ancestor' child id: 'Chimp'

This seems reasonable to me. What were you expecting?

From n.haigh at sheffield.ac.uk Fri Oct 27 11:36:10 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 11:36:10 +0000
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541CC82.2040705@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk>
Message-ID: <4541EF2A.4050600@sheffield.ac.uk>

Sendu Bala wrote:
> Himanshu Ardawatia wrote:
>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>> #################################
>> (
>> ('Chimp' : 0.052,
>> 'Human' : 0.042) 0.71 : 0.007,
>> 'Gorilla' : 0.060,
>> ('Gibbon' : 0.124,
>> 'Orangutan' : 0.0971) 1 : 0.038
>> );
>> #################################
>
> Are you sure this is in the correct format?

He/she may have a tree that already contains bootstrap values output from another program. If this is so, which program did you use? Without reminding myself of the formats, you should look up the newick format and whether it is possible to store bootstraps in it. In addition you should also look up the nhx format.

> For example, with the tree:
> ( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007,
> 'Gorilla':0.060,
> ('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038);

This tree does not contain any bootstrap values - only branch lengths.

Sorry I can't be much more help at the moment - if I get a spare 10 mins I'll have a closer look.

Nath

From bix at sendu.me.uk Fri Oct 27 11:16:08 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 12:16:08 +0100
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EF2A.4050600@sheffield.ac.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> <4541EF2A.4050600@sheffield.ac.uk>
Message-ID: <4541EA78.3050404@sendu.me.uk>

Nathan S.
Haigh wrote:
> Sendu Bala wrote:
>> Are you sure this is in the correct format?
>
> He/she may have a tree that already contains bootstrap values output
> from another program. If this is so, which program did you use? Without
> reminding myself of the formats, you should look up the newick format and
> whether it is possible to store bootstraps in it. In addition you should
> also look up the nhx format.

Ah, well from a brief google it seemed like some software does store bootstrap values for internal nodes as the node ids when outputting in Newick format. I don't think Bioperl should be able to tell the difference between a normal id and a bootstrap value, so you'll have to detect that yourself and manually use bootstrap() when you get an id that looks like a number.

Or should Bioperl be making this assumption for you? Is that a safe thing to do? Maybe as an option only?

From n.haigh at sheffield.ac.uk Fri Oct 27 12:24:49 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 12:24:49 +0000
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EA78.3050404@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> <4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk>
Message-ID: <4541FA91.3040505@sheffield.ac.uk>

--snip--
>
> Ah, well from a brief google it seemed like some software does store
> bootstrap values for internal nodes as the node ids when outputting in
> Newick format. I don't think Bioperl should be able to tell the
> difference between a normal id and a bootstrap value, so you'll have
> to detect that yourself and manually use bootstrap() when you get an
> id that looks like a number.

If I remember rightly, in programs like Clustal you can specify where bootstrap values are stored - node or branch. I can't remember which is the default way, but TreeView can only see bootstraps if they are stored using the "non-default" setting. This "could" be the same issue here.

>
> Or should Bioperl be making this assumption for you? Is that a safe
> thing to do? Maybe as an option only?

I don't know without a closer look - I'd also need to look at the newick format definition as to whether this is an "extension" to the format or if something is just flouting the newick rules.

Nath

From n.haigh at sheffield.ac.uk Fri Oct 27 12:59:51 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 12:59:51 +0000
Subject: [Bioperl-l] Caching sequences
Message-ID: <454202C7.1040701@sheffield.ac.uk>

I have a script that is capable of downloading sequences from GenBank based on GI numbers. I retrieve them in fasta format in order to save bandwidth, but I'd like to take this one step further and cache the sequences in case the user wants to rerun the script using some of the GI's they used previously.

Does anyone have any guidance on how best to do this?

Cheers
Nath

From bix at sendu.me.uk Fri Oct 27 12:35:13 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 13:35:13 +0100
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <454202C7.1040701@sheffield.ac.uk>
References: <454202C7.1040701@sheffield.ac.uk>
Message-ID: <4541FD01.6090803@sendu.me.uk>

Nathan S. Haigh wrote:
> I have a script that is capable of downloading sequences from GenBank > based on GI numbers.
I retrieve them if fasta format in order to save > bandwidth, but I'd like to take this one step further and cache the > sequences in case the user want to rerun the script using some of the > GI's they used previously. > > Does anyone have any guidance on how best to do this? You'd probably write the sequences out in some suitable format and access them via Bio::Index Or, I'm sure bioperl-db excels at this kind of thing, but is a little more involved if this is only a simple situation. From bosborne11 at verizon.net Fri Oct 27 13:09:30 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Fri, 27 Oct 2006 09:09:30 -0400 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <4541C66F.1020404@sheffield.ac.uk> Message-ID: Nathan, I don't know how this is supposed to work, there would be different ways to make is_prototype true. One way would be to make the enzyme with the first occurrence of a given restriction site the prototype (and the next enzymes with the same site are isoschizomers). Or, one could wait until one site had appeared twice, with 2 different enzymes, then make the first the prototype, etc. I would have done it the first way myself but I took a quick look at IO/withrefm.pm and it looks like it's doing it the second way. That means one can read an enzyme file and end up with no duplicated restriction sites, or prototypes and isoschizomers. Brian O. On 10/27/06 4:42 AM, "Nathan S. Haigh" wrote: > Hi Brian, > > I wonder if i'm using is_prototype() correctly as I don't seem to get > any returning true: > > my $enz_coll = Bio::Restriction::EnzymeCollection->new(); > my $prototype = 0; > foreach my $enz ($enz_coll->each_enzyme) { > $prototype++ if $enz->is_prototype; > } > print "$prototype have unique recognition sites\n"; > > prints: > 0 have unique recognition sites > > Thanks > Nath > > Brian Osborne wrote: >> Nathan, >> >> Perhaps because most restriction sites are palindromes. 
Anyway, I added >> tests for palindromic() and is_palindromic() where the site is not a >> palindrome, these tests pass (t/RestrictionAnalyis.t). >> >> Brian O. >> >> >> On 10/26/06 12:13 PM, "Nathan Haigh" wrote: >> >> >>> I'm in the middle of writing some code that uses >>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using >>> Bioperl from HEAD. >>> >>> I seem to find that $enzyme->is_palindromic always seems to return true. >>> Can anyone verify this? If needs be, I can send some code. >>> >>> Thanks >>> Nathan >>> _______________________________________________ >>> Bioperl-l mailing list >>> Bioperl-l at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l >>> >> >> >> > From n.haigh at sheffield.ac.uk Fri Oct 27 14:19:02 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 14:19:02 +0000 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: References: Message-ID: <45421556.9060300@sheffield.ac.uk> Brian Osborne wrote: > Nathan, > > I don't know how this is supposed to work, there would be different ways to > make is_prototype true. One way would be to make the enzyme with the first > occurrence of a given restriction site the prototype (and the next enzymes > with the same site are isoschizomers). Or, one could wait until one site had > appeared twice, with 2 different enzymes, then make the first the prototype, > etc. I would have done it the first way myself but I took a quick look at > IO/withrefm.pm and it looks like it's doing it the second way. That means > one can read an enzyme file and end up with no duplicated restriction sites, > or prototypes and isoschizomers. > > Brian O. > > Hmm, I'd have done it the first way also. Doing it the second way would mean you only ended up with something as a prototype if there were multiple enzymes with the same restriction site - is that correct biologically? 
Nath

From n.haigh at sheffield.ac.uk Fri Oct 27 14:23:20 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 14:23:20 +0000
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
Message-ID: <45421658.5000103@sheffield.ac.uk>

As you may be aware by now, I'm working with Bio::Restriction::Analysis and friends. I'm doing restriction analysis on large sequences - chromosomes. I need to identify an appropriate enzyme based on the total length of fragments that are of a certain size (e.g. 100 - 500 bp). However, the amount of memory used by Bio::Restriction::Analysis::fragments() is prohibitive.

I have the following code (bottom) which downloads 2 thaliana chromosomes (mito and chloro - so pretty small) and runs an analysis and then loops through the fragments for all enzymes in the default collection. My memory usage just keeps on climbing and none seems to get freed up even when a $ra goes out of scope (start dealing with the next sequence). Is this a memory leak of some sort, is there a way to free up memory as I go?

I'd appreciate any help/advice on how to reduce the amount of memory being consumed as I'd like to use all the thaliana chromosomes (not just mito and chloro), which at the moment probably won't work.

Cheers
Nath

use strict;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012, 26556996 );

my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
    print "Getting GI: $gi\n";
    push @seq_objs, $db->get_Seq_by_id($gi)
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
    my $tot_size = 0;
    print "Processing ", $seq->primary_id,"\n";
    my $ra = Bio::Restriction::Analysis->new(
        -seq=>$seq,
        -enzymes=>$enz_Coll,
    );
    my @all_enzymes = $ra->cutters->each_enzyme;
    print " Calc total length of fragments in range: $min_fragment_size - $max_fragment_size\n";
    foreach my $enzyme ( @all_enzymes ) {
        # fragments() is a real memory hog
        foreach my $frag ($ra->fragments($enzyme)) {
            next if $min_fragment_size && (length $frag < $min_fragment_size);
            next if $max_fragment_size && (length $frag > $max_fragment_size);
            $tot_size += length $frag;
        }
        # do something based on value of $tot_size
        #print " ", $enzyme->name, " total = $tot_size\n";
    }
    print "DONE\n";
}

From avilella at gmail.com Fri Oct 27 13:39:41 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:39:41 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
In-Reply-To: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
References: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
Message-ID: <358f4d650610270639q14870a6erae2e3c4e9063105d@mail.gmail.com>

I respond to myself: I think I found the way:

my $tree = $treeio->next_tree;
my $total_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    $total_branch_length += $node->branch_length;
}
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless (defined($branch_length));
    $node->branch_length($branch_length/$total_branch_length);
    1;
}
my $new_branch_length;
foreach my $node ($tree->get_nodes) {
    $new_branch_length += $node->branch_length;
}
1;

On 10/27/06, Albert Vilella wrote:
> Hi all,
>
> I am in need of a method that would scale the different branch lengths
> of a tree so that after the scaling they all sum up to exactly 1.
>
> Any pointers? Has anyone done that before?
>
> Thanks in advance,
>
> Albert.
>

From cjfields at uiuc.edu Fri Oct 27 14:35:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 09:35:35 -0500
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <4541CBA8.10006@sheffield.ac.uk>
Message-ID: <001501c6f9d5$2e33e120$15327e82@pyrimidine>

...
> I think it makes sense to test that data of the expected type was
> returned by the external resource but not to test the specifics of what
> was returned. If specifics are tested we are then in the realm of testing
> whether we believe the data returned by the external resource or not. We
> should assume that the domain experts for these resources know what they
> are doing - in some cases this might not be true :-) but I think we
> should stick to testing that the objects created hold the expected type
> of data.
>
> I like what Chris had to say (above) but wonder whether tests
> would/should be tested for in the module itself - i.e. testing that a
> stored value is an integer and warn/throw if not?
>
> Nath

Yeah, sorry about the top post (stupid Outlook always sticks the sig at the top of the page!).

Testing in the module would be best but can be tricky for the very same reasons that writing tests entail, even more so. For instance, for NCBI esummary data, I parse the data in a very generic way in order to have access to as much data as possible. For tests, I have to assume that NCBI will always return a particular type of value (string, integer, date). I can test for each of those with a regex in the module fairly simply and throw/warn, as you indicate.
However, if they decide to add new data with a data tag other than the
ones I test for in the module (i.e. String, Integer, Date), I suddenly
have warns/throws showing up and cluttering/clobbering the code for
perfectly valid data.

However, if these are caught in tests and the tests fail, no big loss.
The actual module still works, even if the tests are failing based on a
new unknown value being returned. For me, failed tests are sort of a
warning light to let me know that something has changed, but it doesn't
necessarily mean a module doesn't work. I generally use throw/warn for
something truly catastrophic, like no response from the server or an
error in the XML, which affects downstream methods.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From cjfields at uiuc.edu Fri Oct 27 15:09:36 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 10:09:36 -0500
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <454202C7.1040701@sheffield.ac.uk>
Message-ID: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>

> I have a script that is capable of downloading sequences from GenBank
> based on GI numbers. I retrieve them in fasta format in order to save
> bandwidth, but I'd like to take this one step further and cache the
> sequences in case the user wants to rerun the script using some of the
> GI's they used previously.
>
> Does anyone have any guidance on how best to do this?
>
> Cheers
> Nath

There is Bio::DB::InMemoryCache, which is really an interface but appears
to have several methods defined; you could look for modules which
implement it. Sendu's suggestion of the Bio::Index modules and bioperl-db
are also good starting points.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept.
of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Fri Oct 27 15:21:49 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 10:21:49 -0500 Subject: [Bioperl-l] Bio::Restriction::Enzyme In-Reply-To: <45421556.9060300@sheffield.ac.uk> Message-ID: <001701c6f9db$9f90d160$15327e82@pyrimidine> > Brian Osborne wrote: > > Nathan, > > > > I don't know how this is supposed to work, there would be different ways > to > > make is_prototype true. One way would be to make the enzyme with the > first > > occurrence of a given restriction site the prototype (and the next > enzymes > > with the same site are isoschizomers). Or, one could wait until one site > had > > appeared twice, with 2 different enzymes, then make the first the > prototype, > > etc. I would have done it the first way myself but I took a quick look > at > > IO/withrefm.pm and it looks like it's doing it the second way. That > means > > one can read an enzyme file and end up with no duplicated restriction > sites, > > or prototypes and isoschizomers. > > > > Brian O. > > > > > Hmm, I'd have done it the first way also. Doing it the second way would > mean you only ended up with something as a prototype if there were > multiple enzymes with the same restriction site - is that correct > biologically? > > Nath I had a look at all the Restriction::IO modules a while back; most need serious updating! It just hasn't been a top priority unfortunately. I think the prototype issue may depend on the IO format and whether or not one is defined explicitly in the file being parsed or is just chosen based on what Brian said (order in the file, similar cutting site). By the strictest definition (and cheating by looking at the Fermentas web site), the prototype is supposed to be the first enzyme discovered which cleaves a unique sequence, so it may not be the first enzyme found in the file. Isoschizomers are those discovered to cleave the same sequence subsequent to the prototype. 
Neoschizomers cleave the same sequence as a prototype but at a different site. So this calls into question whether the prototype should be defined at all unless it is specifically indicated in the file. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Fri Oct 27 16:47:53 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Fri, 27 Oct 2006 16:47:53 +0000 Subject: [Bioperl-l] Caching sequences In-Reply-To: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com> References: <454202C7.1040701@sheffield.ac.uk> <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com> Message-ID: <45423839.9040503@sheffield.ac.uk> Jason Stajich wrote: > Bio::DB::FileCache does one better and lets you cache the data in a > persistent file. Not sure this index is shareable among users though > - bioperl-db is a better soln when that is desired. Thanks I'll have a look into it. No need for being sharable among users - not unless the script becomes heavily used. Thanks Nath From cjfields at uiuc.edu Fri Oct 27 16:15:00 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 11:15:00 -0500 Subject: [Bioperl-l] StandAloneFasta.t bioperl-run tests Message-ID: <000101c6f9e3$0e5e95d0$15327e82@pyrimidine> Nathan, The test fails you posted on the wiki seem to indicate that using the wrapper works but the order of the returned hits is off. Does the order of the returned hits match the actual FASTA report order? If it does then the tests need to be fixed in a way to make it more flexible, to account for some data 'fuzziness' due to variations in output based on different versions. Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Fri Oct 27 16:50:54 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 27 Oct 2006 09:50:54 -0700
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EA78.3050404@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com> <4541CC82.2040705@sendu.me.uk> <4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk>
Message-ID: <1230E110-01AB-4D4E-842F-20B939555299@bioperl.org>

I've answered to this effect multiple times in the past on the mailing
list.

Newick format does not distinguish between internal ids and bootstrap
values (or whatever else you want to attach there). Different programs
have different conventions. When both values are present and encoded so
that we can parse out the bootstrap like this: [BOOTSTRAP] the parser
grabs it out.

If you know all the internal ids are bootstraps you can just copy the
values over manually very simply:

for my $node ( grep { ! $_->is_Leaf } $tree->get_nodes ) {   # get all the internal nodes
    $node->bootstrap($node->id) if defined $node->id && length($node->id);   # copy id to bootstrap
    $node->id('');   # set internal id to empty
}

If someone can make this clearer on a wiki page that would be great.

On Oct 27, 2006, at 4:16 AM, Sendu Bala wrote:

> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Himanshu Ardawatia wrote:
>>>>
>>>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>>>> #################################
>>>> (
>>>> ('Chimp' : 0.052,
>>>> 'Human' : 0.042) 0.71 : 0.007,
>>>> 'Gorilla' : 0.060,
>>>> ('Gibbon' : 0.124,
>>>> 'Orangutan' : 0.0971) 1 : 0.038
>>>> );
>>>> #################################
>>>>
>>> Are you sure this is in the correct format?
>>>
>>
>> He/she may have a tree that already contains bootstrap values output
>> from another program. If this is so, which program did you use?
>> Without reminding myself of the formats, you should look up newick
>> format and whether it is possible to store bootstraps in it. In
>> addition you should also look up the nhx format.
>
> Ah, well from a brief google it seemed like some software do store
> bootstrap values for internal nodes as the node ids when outputting in
> Newick format. I don't think Bioperl should be able to tell the
> difference between a normal id and a bootstrap value, so you'll have to
> detect that yourself and manually use bootstrap() when you get an id
> that looks like a number.
>
> Or should Bioperl be making this assumption for you? Is that a safe
> thing to do? Maybe as an option only?
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html

From avilella at gmail.com Fri Oct 27 13:23:07 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:23:07 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
Message-ID: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>

Hi all,

I am in need of a method that would scale the different branch lengths
of a tree so that after the scaling they all sum up to exactly 1.

Any pointers? Has anyone done that before?

Thanks in advance,

Albert.
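
[Editor's sketch] The arithmetic behind the question above, in plain Perl
and independent of any tree class. The function name scale_to_unit_sum and
the sample lengths are made up for illustration; with a Bio::Tree object
the values would presumably be read and written back through
$node->branch_length, skipping nodes where it is undefined.

```perl
use strict;
use warnings;

# Scale a list of branch lengths so that the defined ones sum to 1.
# Undefined entries (e.g. a root node with no branch length) are
# passed through unchanged as undef.
sub scale_to_unit_sum {
    my @lengths = @_;
    my $total = 0;
    for my $len (@lengths) {
        $total += $len if defined $len;
    }
    die "total branch length is zero; nothing to scale\n" if $total == 0;
    return map { defined $_ ? $_ / $total : undef } @lengths;
}

# Hypothetical branch lengths; undef stands in for the root.
my @lengths = (undef, 0.052, 0.042, 0.007, 0.060, 0.124, 0.0971, 0.038);
my @scaled  = scale_to_unit_sum(@lengths);

# Re-sum the scaled values as a sanity check.
my $sum = 0;
$sum += $_ for grep { defined } @scaled;
printf "sum after scaling = %.6f\n", $sum;
```

Because of floating-point rounding the scaled values may not add up to
exactly 1, so any check on the result is safer with a small tolerance
than with ==.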
From cjfields at uiuc.edu Fri Oct 27 18:34:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 13:34:57 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
Message-ID: <000001c6f9f6$9ab12710$15327e82@pyrimidine>

I am working on refactoring the AlignIO::stockholm parser to get it
reading and writing Pfam/Rfam alignments, and noticed that many
alignments have EMBL-like annotations attached, which pertain to the
entire alignment:

# STOCKHOLM 1.0
#=GF ID ykkC-yxkD
#=GF AC RF00442
#=GF DE ykkC-yxkD element
#=GF AU Moxon SJ
#=GF GA 20.0
#=GF NC 0.1
#=GF TC 59.4
#=GF SE Barrick JE, Breaker RR
#=GF SS Predicted; Barrick JE, Breaker RR
#=GF TP Cis-reg; riboswitch;
#=GF BM cmbuild CM SEED
#=GF BM cmsearch -W 175 CM SEQDB
#=GF RN [1]
#=GF RM 15096624
#=GF RT New RNA motifs suggest an expanded scope for riboswitches in
#=GF RT bacterial genetic control.
#=GF RA Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee
#=GF RA M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR;
#=GF RL Proc Natl Acad Sci U S A 2004;101:6421-6426.
#=GF CC This family represents the bacterial ykkC/yxkD element. The function of
#=GF CC this family is unclear although it has been suggested that it may function
#=GF CC to switch on efflux pumps and detoxification systems in response to harmful
#=GF CC environmental molecules [1]. The Thermoanaerobacter tengcongensis sequence
#=GF CC EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that the two
#=GF CC riboswitches may work in conjunction to regulate the the upstream gene
#=GF CC which codes for Swiss:Q8RC62, a member of Pfam:PF00860 (Personal obs. Moxon
#=GF CC SJ).
#=GF SQ 16

SimpleAlign, as implemented, seemingly doesn't have a way to store this
information.

I'll work on getting the core alignment IO working, but would there be
any interest in having a way to store annotations in Bio::SimpleAlign?
I'm guessing the methods would be similar to the various Bio::Seq
Annotation methods.
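
[Editor's sketch] One way the #=GF lines could be gathered into a
per-alignment structure before deciding where SimpleAlign should keep
them, in plain Perl. This is not the AlignIO::stockholm code;
parse_gf_lines is a hypothetical helper, and the only assumption about
the format is the "#=GF <tag> <text>" layout visible in the example above.

```perl
use strict;
use warnings;

# Collect "#=GF <tag> <text>" lines into a hash of tag => [values].
# Repeated tags (RT, RA, CC, BM ...) keep every line, in input order.
# Lines that are not #=GF markup are ignored.
sub parse_gf_lines {
    my @lines = @_;
    my %annotation;
    for my $line (@lines) {
        next unless $line =~ /^#=GF\s+(\S+)\s+(.*?)\s*$/;
        push @{ $annotation{$1} }, $2;
    }
    return \%annotation;
}

# A few header lines taken from the example in the message.
my @header = (
    '# STOCKHOLM 1.0',
    '#=GF ID ykkC-yxkD',
    '#=GF AC RF00442',
    '#=GF RT New RNA motifs suggest an expanded scope for riboswitches in',
    '#=GF RT bacterial genetic control.',
);

my $gf = parse_gf_lines(@header);
print "ID: $gf->{ID}[0]\n";
print "Title: ", join(' ', @{ $gf->{RT} }), "\n";
```

Keeping every value as a list, even for tags that occur once, avoids
special-casing multi-line fields such as RT and CC when they are later
joined or handed to an annotation object.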
Chris Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From hlapp at gmx.net Fri Oct 27 20:23:46 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Fri, 27 Oct 2006 16:23:46 -0400 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <000001c6f9f6$9ab12710$15327e82@pyrimidine> References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> Message-ID: You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose this is what you meant by the 'various Bio::Seq Annotation methods' too.) Just to make sure I'm not misunderstanding, I suppose the annotation pertains to the entire alignment? -hilmar On Oct 27, 2006, at 2:34 PM, Chris Fields wrote: > I am working an refactoring the AlignIO::stockholm parser to get it > reading > and writing Pfam/Rfam alignments, and noticed that many alignments > have > EMBL-like annotations attached, which pertain to the entire alignment: > > # STOCKHOLM 1.0 > #=GF ID ykkC-yxkD > #=GF AC RF00442 > #=GF DE ykkC-yxkD element > #=GF AU Moxon SJ > #=GF GA 20.0 > #=GF NC 0.1 > #=GF TC 59.4 > #=GF SE Barrick JE, Breaker RR > #=GF SS Predicted; Barrick JE, Breaker RR > #=GF TP Cis-reg; riboswitch; > #=GF BM cmbuild CM SEED > #=GF BM cmsearch -W 175 CM SEQDB > #=GF RN [1] > #=GF RM 15096624 > #=GF RT New RNA motifs suggest an expanded scope for > riboswitches in > #=GF RT bacterial genetic control. > #=GF RA Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, > Collins J, > Lee > #=GF RA M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR; > #=GF RL Proc Natl Acad Sci U S A 2004;101:6421-6426. > #=GF CC This family represents the bacterial ykkC/yxkD element. The > function of > #=GF CC this family is unclear although it has been suggested > that it may > function > #=GF CC to switch on efflux pumps and detoxification systems in > response > to harmful > #=GF CC environmental molecules [1]. 
The Thermoanaerobacter > tengcongensis > sequence > #=GF CC EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that > the two > #=GF CC riboswitches may work in conjunction to regulate the the > upstream > gene > #=GF CC which codes for Swiss:Q8RC62, a member of Pfam:PF00860 > (Personal > obs. Moxon > #=GF CC SJ). > #=GF SQ 16 > > SimpleAlign, as implemented, seemingly doesn't have a way to store > this > information. > > I'll work on getting the core alignment IO working, but would there > be any > interest in having a way to store annotations in Bio::SimpleAlign? > I'm > guessing the methods would be similar to the various Bio::Seq > Annotation > methods. > > Chris > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Fri Oct 27 20:38:17 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 15:38:17 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: Message-ID: <000001c6fa07$d8659990$15327e82@pyrimidine> Hilmar Lapp wrote: > You could make SimpleAlign be a Bio::AnnotationHolderI. (I > suppose this is what you meant by the 'various Bio::Seq Annotation > methods' too.) > > Just to make sure I'm not misunderstanding, I suppose the > annotation pertains to the entire alignment? > > -hilmar ... Yes, that's correct. I would probably use Bio::Seq::Meta for the sequence-specific markup lines. I would have to add another new method to deal with non-sequence-based consensus data (like sec. structure) for now. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. 
of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Fri Oct 27 15:38:05 2006 From: jason at bioperl.org (Jason Stajich) Date: Fri, 27 Oct 2006 08:38:05 -0700 Subject: [Bioperl-l] Caching sequences In-Reply-To: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> References: <454202C7.1040701@sheffield.ac.uk> <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine> Message-ID: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com> Bio::DB::FileCache does one better and lets you cache the data in a persistent file. Not sure this index is shareable among users though - bioperl-db is a better soln when that is desired. -jason On 10/27/06, Chris Fields wrote: > > > I have a script that is capable of downloading sequences from GenBank > > based on GI numbers. I retrieve them if fasta format in order to save > > bandwidth, but I'd like to take this one step further and cache the > > sequences in case the user want to rerun the script using some of the > > GI's they used previously. > > > > Does anyone have any guidance on how best to do this? > > > > Cheers > > Nath > > There is Bio::DB::InMemoryCache, which is really an interface but appears > to > have several methods defined; you could look for modules which implement > it. > Sendu's suggestion of the Bio::Index modules and bioperl-db are also good > starting points. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. 
of Biochemistry > University of Illinois Urbana-Champaign > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Jason Stajich jason at bioperl.org http://www.duke.edu/~jes12/ From cjfields at uiuc.edu Sat Oct 28 01:57:58 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Fri, 27 Oct 2006 20:57:58 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> Message-ID: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu> On Oct 27, 2006, at 3:23 PM, Hilmar Lapp wrote: > You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose > this is what you meant by the 'various Bio::Seq Annotation methods' > too.) > > Just to make sure I'm not misunderstanding, I suppose the annotation > pertains to the entire alignment? > > -hilmar BTW, was that supposed to be Bio::AnnotatableI, or Bio::AnnotationHolderI? The latter isn't present in CVS HEAD. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From eric.ross at neuro.utah.edu Sat Oct 28 21:24:30 2006 From: eric.ross at neuro.utah.edu (Eric Ross) Date: Sat, 28 Oct 2006 15:24:30 -0600 Subject: [Bioperl-l] PAML References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object. I am able to extract other data from the report, but there seems to be a conflict in the documentation. One doc implies that there should be a get_posteriors method. (It's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object. 
I have been trying various methods, in the event I'm just "confused",
but I've had no luck, thus far. Anyone have suggestions?

code:

----begin code-------
#!/usr/bin/perl -w

use strict;

use Bio::Tools::Phylo::PAML;
my $parser = Bio::Tools::Phylo::PAML->new(-file => "mlc");
my $result = $parser->next_result;
my @posteriors = $result->get_posteriors();

print "@posteriors";

exit(0);

---------end code-------------

---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab

From avilella at gmail.com Sun Oct 29 10:52:04 2006
From: avilella at gmail.com (Albert Vilella)
Date: Sun, 29 Oct 2006 10:52:04 +0000
Subject: [Bioperl-l] PAML
In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>

I don't know if this method is implemented. I can't grep-find it.
Maybe it's simply not there yet, but was planned when the
documentation was written.

On 10/28/06, Eric Ross wrote:
> I am trying to extract the "Naive Empirical Bayes (NEB) probabilities"
> from a Bio::Tools::Phylo::PAML::Result object.
>
> I am able to extract other data from the report, but there seems to be
> a conflict in the documentation. One doc implies that there should be a
> get_posteriors method. (It's used as an example in the
> Bio::Tools::Phylo::PAML doc), but the method does not appear to exist
> in the Bio::Tools::Phylo::PAML::Result object.
>
>
> I have been trying various methods, in the event I'm just "confused",
> but I've had no luck, thus far. Anyone have suggestions?
> > > code: > > ----begin code------- > #!/usr/bin/perl -w > > use strict; > > > use Bio::Tools::Phylo::PAML; > my $parser = new Bio::Tools::Phylo::PAML > (-file => "mlc"); > my $result = $parser->next_result; > my @posteriors = $result->get_posteriors(); > > print "@posteriors"; > > exit(0); > > ---------end code------------- > > > > --------------- > Eric Ross > Computer Analyst II > ejr at neuro.utah.edu > Howard Hughes Medical Institute > University of Utah > S?nchez Lab > > > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Sun Oct 29 14:23:45 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 29 Oct 2006 08:23:45 -0600 Subject: [Bioperl-l] PAML In-Reply-To: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> Message-ID: <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu> Does the data show up in the object using Data::Dumper? This should be filed as a bug since the docs imply the method exists. This could be written up fairly quickly if one had test data and and a script to work with (hint hint...) Chris On Oct 29, 2006, at 4:52 AM, Albert Vilella wrote: > I don't know if this method is implemented. I can't grep-find it. > Maybe it's simply not there yet, but was planned when the > documentation was written. > > On 10/28/06, Eric Ross wrote: >> I am trying to extract the "Naive Empirical Bayes (NEB) >> probabilities" from a Bio::Tools::Phylo::PAML::Result object. >> >> I am able to extract other data from the report, but there seems >> to be a conflict in the documentation. One doc implies that there >> should be a get_posteriors method. 
(It's used as an example in the >> Bio::Tools::Phylo::PAML doc), but the method does not appear to >> exist in the Bio::Tools::Phylo::PAML::Result object. >> >> >> I have been trying various methods, in the event I'm just >> "confused", but I've had no luck, thus far. Anyone have suggestions? >> >> >> code: >> >> ----begin code------- >> #!/usr/bin/perl -w >> >> use strict; >> >> >> use Bio::Tools::Phylo::PAML; >> my $parser = new Bio::Tools::Phylo::PAML >> (-file => "mlc"); >> my $result = $parser->next_result; >> my @posteriors = $result->get_posteriors(); >> >> print "@posteriors"; >> >> exit(0); >> >> ---------end code------------- >> >> >> >> --------------- >> Eric Ross >> Computer Analyst II >> ejr at neuro.utah.edu >> Howard Hughes Medical Institute >> University of Utah >> S?nchez Lab >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From eric.ross at neuro.utah.edu Sun Oct 29 17:06:54 2006 From: eric.ross at neuro.utah.edu (Eric Ross) Date: Sun, 29 Oct 2006 10:06:54 -0700 Subject: [Bioperl-l] PAML References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu> Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu> Thanks for all the help. I've been looking at the code for the PAML rst parser. It's a bit tricky. 
We have written a parser specific for our needs, but it looks to be a pretty complicated matter to make it generic. The output of PAML can vary a lot depending upon your options and this section can be repeated multiple times. I'm sure someone with a good grasp of the potential output of PAML could come up with something, but I'll admit to being at a loss. --------------- Eric Ross Computer Analyst II ejr at neuro.utah.edu Howard Hughes Medical Institute University of Utah S?nchez Lab -----Original Message----- From: Chris Fields [mailto:cjfields at uiuc.edu] Sent: Sun 2006-10-29 7:23 AM To: Albert Vilella Cc: Eric Ross; Bioperl-l at lists.open-bio.org Subject: Re: [Bioperl-l] PAML Does the data show up in the object using Data::Dumper? This should be filed as a bug since the docs imply the method exists. This could be written up fairly quickly if one had test data and and a script to work with (hint hint...) Chris On Oct 29, 2006, at 4:52 AM, Albert Vilella wrote: > I don't know if this method is implemented. I can't grep-find it. > Maybe it's simply not there yet, but was planned when the > documentation was written. > > On 10/28/06, Eric Ross wrote: >> I am trying to extract the "Naive Empirical Bayes (NEB) >> probabilities" from a Bio::Tools::Phylo::PAML::Result object. >> >> I am able to extract other data from the report, but there seems >> to be a conflict in the documentation. One doc implies that there >> should be a get_posteriors method. (It's used as an example in the >> Bio::Tools::Phylo::PAML doc), but the method does not appear to >> exist in the Bio::Tools::Phylo::PAML::Result object. >> >> >> I have been trying various methods, in the event I'm just >> "confused", but I've had no luck, thus far. Anyone have suggestions? 
>> >> >> code: >> >> ----begin code------- >> #!/usr/bin/perl -w >> >> use strict; >> >> >> use Bio::Tools::Phylo::PAML; >> my $parser = new Bio::Tools::Phylo::PAML >> (-file => "mlc"); >> my $result = $parser->next_result; >> my @posteriors = $result->get_posteriors(); >> >> print "@posteriors"; >> >> exit(0); >> >> ---------end code------------- >> >> >> >> --------------- >> Eric Ross >> Computer Analyst II >> ejr at neuro.utah.edu >> Howard Hughes Medical Institute >> University of Utah >> S?nchez Lab >> >> >> >> >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From n.haigh at sheffield.ac.uk Sun Oct 29 17:43:20 2006 From: n.haigh at sheffield.ac.uk (Nathan S. Haigh) Date: Sun, 29 Oct 2006 17:43:20 +0000 Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage In-Reply-To: <45421658.5000103@sheffield.ac.uk> References: <45421658.5000103@sheffield.ac.uk> Message-ID: <4544E838.7090400@sheffield.ac.uk> Sorry for the repeat post but I haven't had a response. Just wondered if anyone had any idea about this? Thanks Nath Nathan S. Haigh wrote: > As you may be aware by now, i'm working with Bio::Restriction::Analysis > and friends. > > I'm doing restriction analysis on large sequences - chromosomes. I need > to identify an appropriate enzyme based on the total length of fragments > that are of a certain size (e.g. 100 - 500 bp). However, the amount of > memory used by Bio::Restriction::Analysis::fragments() is prohibative. 
I > have the following code (bottom) which downloads 2 thaliana chromosomes > (mito and chloro - so pretty small) and runs an analysis and then loops > through the fragments for all enzymes in the default collection. > > My memory usage just keep on climbing and none seems to get freed up > even when a $ra goes out of scope (start dealing with the next > sequence). Is this a memory leak of some sort, is there a way to free up > memory as I go? I'd appreciate any help/advice on how to reduce the > amount of memory being consumed as I'd like to use all the thaliana > chromosomes (not just mito and chloro), which at the moment probably > won't work. > > Cheers > Nath > > use strict; > use Bio::DB::GenBank; > use Bio::Restriction::Analysis; > use Bio::Restriction::EnzymeCollection; > > my @seq_objs; > my @gis = ( 7525012, 26556996 ); > > my $db = Bio::DB::GenBank->new(-format => "fasta"); > foreach my $gi (@gis) { > print "Getting GI: $gi\n"; > push @seq_objs, $db->get_Seq_by_id($gi) > } > > my $min_fragment_size = 100; > my $max_fragment_size = 500; > my $enz_Coll = Bio::Restriction::EnzymeCollection->new(); > > foreach my $seq (@seq_objs) { > my $tot_size = 0; > print "Processing ", $seq->primary_id,"\n"; > my $ra = Bio::Restriction::Analysis->new( > -seq=>$seq, > -enzymes=>$enz_Coll, > ); > > my @all_enzymes = $ra->cutters->each_enzyme; > print " Calc total length of fragments in range: $min_fragment_size - > $max_fragment_size\n"; > foreach my $enzyme ( @all_enzymes ) { > # fragments() is a real memory hog > foreach my $frag ($ra->fragments($enzyme)) { > next if $min_fragment_size && (length $frag < $min_fragment_size); > next if $max_fragment_size && (length $frag > $max_fragment_size); > $tot_size += length $frag; > } > # do something based on value of $tot_size > #print " ", $enzyme->name, " total = $tot_size\n"; > } > print "DONE\n"; > } > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > 
http://lists.open-bio.org/mailman/listinfo/bioperl-l > From cjfields at uiuc.edu Sun Oct 29 18:09:54 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 29 Oct 2006 12:09:54 -0600 Subject: [Bioperl-l] PAML In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu> References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu> <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com> <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu> <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu> Message-ID: On Oct 29, 2006, at 11:06 AM, Eric Ross wrote: > Thanks for all the help. > > I've been looking at the code for the PAML rst parser. It's a bit > tricky. > > We have written a parser specific for our needs, but it looks to be > a pretty complicated matter to make it generic. > > The output of PAML can vary a lot depending upon your options and > this section can be repeated multiple times. I'm sure someone with > a good grasp of the potential output of PAML could come up with > something, but I'll admit to being at a loss. Eric, I planned on looking at ways to integrate the protein-based PAML programs but I'm working on a different area at the moment. I agree it may be hard to adequately genericize parsing/methods to accomplish this, but if you have any ideas feel free to post them. Again, I would suggest adding any proposed enhancements or bugs to Bugzilla: http://bugzilla.open-bio.org/ Suggestions or bug reports on the list sometimes get lost in the shuffle, esp. since we're planning on a new developer release soon. Chris Christopher Fields Postdoctoral Researcher Lab of Dr. 
Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Sun Oct 29 18:16:37 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Sun, 29 Oct 2006 12:16:37 -0600 Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage In-Reply-To: <4544E838.7090400@sheffield.ac.uk> References: <45421658.5000103@sheffield.ac.uk> <4544E838.7090400@sheffield.ac.uk> Message-ID: <6D9EAA04-199C-4BDD-AA60-4833BC1CE250@uiuc.edu> On Oct 29, 2006, at 11:43 AM, Nathan S. Haigh wrote: > Sorry for the repeat post but I haven't had a response. Just > wondered if > anyone had any idea about this? > > Thanks > Nath ... I think Warnock applies here. Likely no one is really sure, hence they aren't answering. It probably bears investigating by submitting and tracking as a bug. My guess is something isn't garbage-collected properly (i.e. there are circular references present), leading to a memory leak. Christopher Fields Postdoctoral Researcher Lab of Dr. Robert Switzer Dept of Biochemistry University of Illinois Urbana-Champaign From chhalling at alumni.ls.berkeley.edu Sun Oct 29 19:16:36 2006 From: chhalling at alumni.ls.berkeley.edu (Conrad Halling) Date: Sun, 29 Oct 2006 14:16:36 -0500 Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage In-Reply-To: <4544E838.7090400@sheffield.ac.uk> References: <45421658.5000103@sheffield.ac.uk> <4544E838.7090400@sheffield.ac.uk> Message-ID: <4544FE14.7030701@alumni.ls.berkeley.edu> Nathan S. Haigh wrote: > Sorry for the repeat post but I haven't had a response. Just wondered if > anyone had any idea about this? > > Thanks > Nath > > Nathan S. Haigh wrote: > >> As you may be aware by now, i'm working with Bio::Restriction::Analysis >> and friends. >> >> I'm doing restriction analysis on large sequences - chromosomes. I need >> to identify an appropriate enzyme based on the total length of fragments >> that are of a certain size (e.g. 100 - 500 bp). 
However, the amount of >> memory used by Bio::Restriction::Analysis::fragments() is prohibative. I >> have the following code (bottom) which downloads 2 thaliana chromosomes >> (mito and chloro - so pretty small) and runs an analysis and then loops >> through the fragments for all enzymes in the default collection. >> >> My memory usage just keep on climbing and none seems to get freed up >> even when a $ra goes out of scope (start dealing with the next >> sequence). Is this a memory leak of some sort, is there a way to free up >> memory as I go? I'd appreciate any help/advice on how to reduce the >> amount of memory being consumed as I'd like to use all the thaliana >> chromosomes (not just mito and chloro), which at the moment probably >> won't work. >> >> Cheers >> Nath >> >> use strict; >> use Bio::DB::GenBank; >> use Bio::Restriction::Analysis; >> use Bio::Restriction::EnzymeCollection; >> >> my @seq_objs; >> my @gis = ( 7525012, 26556996 ); >> >> my $db = Bio::DB::GenBank->new(-format => "fasta"); >> foreach my $gi (@gis) { >> print "Getting GI: $gi\n"; >> push @seq_objs, $db->get_Seq_by_id($gi) >> } >> >> my $min_fragment_size = 100; >> my $max_fragment_size = 500; >> my $enz_Coll = Bio::Restriction::EnzymeCollection->new(); >> >> foreach my $seq (@seq_objs) { >> my $tot_size = 0; >> print "Processing ", $seq->primary_id,"\n"; >> my $ra = Bio::Restriction::Analysis->new( >> -seq=>$seq, >> -enzymes=>$enz_Coll, >> ); >> >> my @all_enzymes = $ra->cutters->each_enzyme; >> print " Calc total length of fragments in range: $min_fragment_size - >> $max_fragment_size\n"; >> foreach my $enzyme ( @all_enzymes ) { >> # fragments() is a real memory hog >> foreach my $frag ($ra->fragments($enzyme)) { >> next if $min_fragment_size && (length $frag < $min_fragment_size); >> next if $max_fragment_size && (length $frag > $max_fragment_size); >> $tot_size += length $frag; >> } >> # do something based on value of $tot_size >> #print " ", $enzyme->name, " total = $tot_size\n"; 
>> }
>> print "DONE\n";
>> }
>>
>>

Try this code, which creates a new Bio::Restriction::Analysis object for each digest. On my PowerBook, this doesn't use more than 13 Mb of memory.

Reading the code for Bio::Restriction::Analysis reveals that the fragments() method calls the cut() method. The documentation for the cut method states:

    Note: cut doesn't now re-initialize everything before figuring out cuts. This is so that you can do multiple digests, or add more data or whatever. You'll have to use new to reset everything.

This means there is no memory leak; it's just that the Bio::Restriction::Analysis object is retaining cut information for each enzyme, which takes a lot of memory.

use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012, 26556996 );

my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
    print "Getting GI: $gi\n";
    push @seq_objs, $db->get_Seq_by_id($gi)
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
    print "Processing ", $seq->primary_id, "\n";
    foreach my $enzyme ( $enz_Coll->each_enzyme() ) {
        my $ra = Bio::Restriction::Analysis->new(
            -seq     => $seq,
            -enzymes => $enzyme
        );
        my $tot_size = 0;
        print " Calc total length of fragments in range: $min_fragment_size -"
            . " $max_fragment_size\n";
        foreach my $frag ($ra->fragments($enzyme)) {
            next if $min_fragment_size && (length $frag < $min_fragment_size);
            next if $max_fragment_size && (length $frag > $max_fragment_size);
            $tot_size += length $frag;
        }
        # do something based on value of $tot_size
        print " ", $enzyme->name, " total = $tot_size\n";
    }
    print "DONE\n";
}

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu

From n.haigh at sheffield.ac.uk Mon Oct 30 08:51:49 2006
From: n.haigh at sheffield.ac.uk (Nathan S.
Haigh) Date: Mon, 30 Oct 2006 08:51:49 +0000 Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id() Message-ID: <4545BD25.3030107@sheffield.ac.uk> In my script I retrieve sequences from GenBank in FASTA format by GI numbers and optionally store the sequence in a cache using Bio::DB::Fasta. On subsequent runs of the script, the cache is first checked for the GI and returns the sequence if it is found or the sequence is obtained from GenBank as above. I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have returned a Bio::Seq object but rather it returns a Bio::PrimarySeq object which is defined within the Bio::DB::Fasta file. This is annoying, since $seq_obj in my script would be either a Bio::Seq if it was obtained from GenBank or a Bio::PrimarySeq if obtained from the cache and calling primary_id() on it doesn't do the expected thing with Bio::PrimarySeq: ID: Bio::PrimarySeq::Fasta=HASH(0x89b4508) Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object? Nath From yuhki at ncifcrf.gov Mon Oct 30 13:57:35 2006 From: yuhki at ncifcrf.gov (Naoya Yuhki) Date: Mon, 30 Oct 2006 08:57:35 -0500 Subject: [Bioperl-l] bptutorial.pl 0 Message-ID: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov> Hello, I run perl bptutorial.pl 0 and I got the following error. -------------------- WARNING --------------------- MSG: id (ROA1_HUMAN) does not exist --------------------------------------------------- Can't call method "display_id" on an undefined value at bptutorial.pl line 3945. other tests all worked. I thank any suggestions from you. NAOYA YUHKI. From cjfields at uiuc.edu Mon Oct 30 17:42:21 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Mon, 30 Oct 2006 11:42:21 -0600 Subject: [Bioperl-l] bptutorial.pl 0 In-Reply-To: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov> Message-ID: <000601c6fc4a$c3e43450$15327e82@pyrimidine> > Hello, > I run > > perl bptutorial.pl 0 > > and I got the following error. 
>
> -------------------- WARNING ---------------------
> MSG: id (ROA1_HUMAN) does not exist
> ---------------------------------------------------
> Can't call method "display_id" on an undefined value at bptutorial.pl
> line 3945.
>
> other tests all worked.
>
> I thank any suggestions from you.
>
> NAOYA YUHKI.

What version of Bioperl are you running?

As a warning, the bptutorial.pl script has been removed from CVS and will not be included in future versions of Bioperl. It can be found on the bioperl wiki instead:

http://www.bioperl.org/wiki/Bptutorial

chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

From jason at bioperl.org Mon Oct 30 18:08:15 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 30 Oct 2006 10:08:15 -0800
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <29F47393-D134-4093-8751-E948BF521843@bioperl.org>

Bio::PrimarySeq makes sense because Fasta databases only provide sequences without features. But you are actually getting a Bio::PrimarySeq::Fasta object, which is a proxy object, since the module won't pull a whole sequence into memory unless seq() is requested.

The problem is really why you are getting something useless set for primary_id. What do you want it to be - the GI number? You'll need to explicitly set it because DB::Fasta has no concept of GI numbers encoded in the header line. AFAIK you also cannot set the primary_id to a value of your liking because this is a proxy object.

The best bet is to create a Bio::Seq object out of one of these and set the primary_id and display_id to values that you can compute from the display_id. At least that has been my strategy when using this - maybe someone wants to code something new into the object itself.

-jason

On Oct 30, 2006, at 12:51 AM, Nathan S.
Haigh wrote: > In my script I retrieve sequences from GenBank in FASTA format by GI > numbers and optionally store the sequence in a cache using > Bio::DB::Fasta. On subsequent runs of the script, the cache is first > checked for the GI and returns the sequence if it is found or the > sequence is obtained from GenBank as above. > > I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have > returned a Bio::Seq object but rather it returns a Bio::PrimarySeq > object which is defined within the Bio::DB::Fasta file. This is > annoying, since $seq_obj in my script would be either a Bio::Seq if it > was obtained from GenBank or a Bio::PrimarySeq if obtained from the > cache and calling primary_id() on it doesn't do the expected thing > with > Bio::PrimarySeq: > ID: Bio::PrimarySeq::Fasta=HASH(0x89b4508) > > Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object? > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From golharam at umdnj.edu Mon Oct 30 20:11:51 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Mon, 30 Oct 2006 15:11:51 -0500 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? Message-ID: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> I'm trying to parse some blast output w/o actually creating the output file. Instead, I'm capturing the output in a variable and would like to use IO::String to represent the file: $_ = `megablast -d somedatabase -i somesequence -D 2`; my $blast_file = new IO::String($_); my $searchio = new Bio::SearchIO(-format => 'blast', -fh => $blast_file); my $results = $searchio->next_result; my $hit = $results->next_hit; if (! 
defined($hit)) { warn "No BLAST hit for $accession on chr $chr for Seq/$orth_id/$organism\n\n"; return; } Now, when Bio::SearchIO tries to read the output line by line, instead it reads the entire output as 1 line. If I provide the output in a file and use: my $searchio = new Bio::SearchIO(-format => 'blast', -file => '/tmp/somefile.blast'); This works...so is it possible to use IO::String to provide Bio::SearchIO with BLAST output? Ryan From golharam at umdnj.edu Mon Oct 30 20:54:29 2006 From: golharam at umdnj.edu (Ryan Golhar) Date: Mon, 30 Oct 2006 15:54:29 -0500 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com> Message-ID: <00e801c6fc65$9849aee0$e6028a0a@GOLHARMOBILE1> Thanks. How are you getting the output? system()? BTW- I'm using v1.5.1... > -----Original Message----- > From: Bernd Web [mailto:bernd.web at gmail.com] > Sent: Monday, October 30, 2006 3:45 PM > To: golharam at umdnj.edu > Cc: bioperl-l > Subject: Re: [Bioperl-l] Is it possible to parse BLAST output > using IO:String? > > > Hi Ryan, > > I parse blastn output using IO::String w/o problems: > > my $stringfh = new IO::String($input); > my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh); > > however this is input does not come via backticks. > > > bernd > > On 10/30/06, Ryan Golhar wrote: > > I'm trying to parse some blast output w/o actually creating > the output > > file. Instead, I'm capturing the output in a variable and > would like > > to use IO::String to represent the file: > > > > $_ = `megablast -d somedatabase -i somesequence -D 2`; > > my $blast_file = new IO::String($_); > > my $searchio = new Bio::SearchIO(-format => 'blast', -fh => > > $blast_file); > > my $results = $searchio->next_result; > > my $hit = $results->next_hit; > > if (! 
defined($hit)) { > > warn "No BLAST hit for $accession on chr $chr for > > Seq/$orth_id/$organism\n\n"; > > return; > > } > > > > Now, when Bio::SearchIO tries to read the output line by > line, instead > > it reads the entire output as 1 line. > > > > If I provide the output in a file and use: > > > > my $searchio = new Bio::SearchIO(-format => > 'blast', -file => > > '/tmp/somefile.blast'); > > > > This works...so is it possible to use IO::String to provide > > Bio::SearchIO with BLAST output? > > > > Ryan > > > > _______________________________________________ > > Bioperl-l mailing list > > Bioperl-l at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioperl-l > > > From bix at sendu.me.uk Mon Oct 30 21:27:58 2006 From: bix at sendu.me.uk (Sendu Bala) Date: Mon, 30 Oct 2006 21:27:58 +0000 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> Message-ID: <45466E5E.9000504@sendu.me.uk> Ryan Golhar wrote: > I'm trying to parse some blast output w/o actually creating the output > file. Instead, I'm capturing the output in a variable and would like to > use IO::String to represent the file: > > $_ = `megablast -d somedatabase -i somesequence -D 2`; > my $blast_file = new IO::String($_); > my $searchio = new Bio::SearchIO(-format => 'blast', -fh => > $blast_file); > my $results = $searchio->next_result; > my $hit = $results->next_hit; > if (! defined($hit)) { > warn "No BLAST hit for $accession on chr $chr for > Seq/$orth_id/$organism\n\n"; > return; > } > > Now, when Bio::SearchIO tries to read the output line by line, instead > it reads the entire output as 1 line. > > If I provide the output in a file and use: > > my $searchio = new Bio::SearchIO(-format => 'blast', -file => > '/tmp/somefile.blast'); > > This works...so is it possible to use IO::String to provide > Bio::SearchIO with BLAST output? 
Why must it be IO::String? Why not just open() your megablast and provide $searchio the real filehandle? It would be faster that way as well.

Read the docs for `. Your usage above is inappropriate.

From golharam at umdnj.edu Mon Oct 30 21:54:45 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 16:54:45 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String?
In-Reply-To:
Message-ID: <00f901c6fc6e$03916460$e6028a0a@GOLHARMOBILE1>

Hmmm. Yes, I suppose I could. I did it with the backtick because I based my code off of the "To and From a String" section of the SeqIO HOWTO...

-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason Stajich
Sent: Monday, October 30, 2006 4:44 PM
To: Sendu Bala
Cc: golharam at umdnj.edu; 'bioperl-l'
Subject: Re: [Bioperl-l] Is it possible to parse BLAST output using IO:String?

right - can't you just do:

    my $fh;
    open($fh, "megablast -d ... | ") || die $!;
    my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh);

On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote:

Ryan Golhar wrote:

I'm trying to parse some blast output w/o actually creating the output file. Instead, I'm capturing the output in a variable and would like to use IO::String to represent the file:

    $_ = `megablast -d somedatabase -i somesequence -D 2`;
    my $blast_file = new IO::String($_);
    my $searchio = new Bio::SearchIO(-format => 'blast', -fh => $blast_file);
    my $results = $searchio->next_result;
    my $hit = $results->next_hit;
    if (! defined($hit)) {
        warn "No BLAST hit for $accession on chr $chr for Seq/$orth_id/$organism\n\n";
        return;
    }

Now, when Bio::SearchIO tries to read the output line by line, instead it reads the entire output as 1 line.

If I provide the output in a file and use:

    my $searchio = new Bio::SearchIO(-format => 'blast', -file => '/tmp/somefile.blast');

This works...so is it possible to use IO::String to provide Bio::SearchIO with BLAST output?

Why must it be IO::String?
Why not just open() your megablast and provide $searchio the real filehandle? It would be faster that way as well. Read the docs for `. Your usage above is inappropriate. _______________________________________________ Bioperl-l mailing list Bioperl-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From bernd.web at gmail.com Mon Oct 30 20:44:31 2006 From: bernd.web at gmail.com (Bernd Web) Date: Mon, 30 Oct 2006 21:44:31 +0100 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> Message-ID: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com> Hi Ryan, I parse blastn output using IO::String w/o problems: my $stringfh = new IO::String($input); my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh); however this is input does not come via backticks. bernd On 10/30/06, Ryan Golhar wrote: > I'm trying to parse some blast output w/o actually creating the output > file. Instead, I'm capturing the output in a variable and would like to > use IO::String to represent the file: > > $_ = `megablast -d somedatabase -i somesequence -D 2`; > my $blast_file = new IO::String($_); > my $searchio = new Bio::SearchIO(-format => 'blast', -fh => > $blast_file); > my $results = $searchio->next_result; > my $hit = $results->next_hit; > if (! defined($hit)) { > warn "No BLAST hit for $accession on chr $chr for > Seq/$orth_id/$organism\n\n"; > return; > } > > Now, when Bio::SearchIO tries to read the output line by line, instead > it reads the entire output as 1 line. 
> > If I provide the output in a file and use: > > my $searchio = new Bio::SearchIO(-format => 'blast', -file => > '/tmp/somefile.blast'); > > This works...so is it possible to use IO::String to provide > Bio::SearchIO with BLAST output? > > Ryan > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > From jason at bioperl.org Mon Oct 30 21:44:18 2006 From: jason at bioperl.org (Jason Stajich) Date: Mon, 30 Oct 2006 13:44:18 -0800 Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String? In-Reply-To: <45466E5E.9000504@sendu.me.uk> References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1> <45466E5E.9000504@sendu.me.uk> Message-ID: right - can't you just do: my $fh; open($fh, "megablast -d ... | ") || die $!; my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh); On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote: > Ryan Golhar wrote: >> I'm trying to parse some blast output w/o actually creating the >> output >> file. Instead, I'm capturing the output in a variable and would >> like to >> use IO::String to represent the file: >> >> $_ = `megablast -d somedatabase -i somesequence -D 2`; >> my $blast_file = new IO::String($_); >> my $searchio = new Bio::SearchIO(-format => 'blast', -fh => >> $blast_file); >> my $results = $searchio->next_result; >> my $hit = $results->next_hit; >> if (! defined($hit)) { >> warn "No BLAST hit for $accession on chr $chr for >> Seq/$orth_id/$organism\n\n"; >> return; >> } >> >> Now, when Bio::SearchIO tries to read the output line by line, >> instead >> it reads the entire output as 1 line. >> >> If I provide the output in a file and use: >> >> my $searchio = new Bio::SearchIO(-format => 'blast', -file => >> '/tmp/somefile.blast'); >> >> This works...so is it possible to use IO::String to provide >> Bio::SearchIO with BLAST output? > > Why must it be IO::String? 
Why not just open() your megablast and > provide $searchio the real filehandle? It would be faster that way > as well. > > Read the docs for `. Your usage above is inappropriate. > > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich, PhD Miller Research Fellow University of California Dept of Plant and Microbial Biology 321 Koshland Hall #3102 Berkeley, CA 94720-3102 lab: 510.642.8441 http://pmb.berkeley.edu/~taylor/people/js.html From lstein at cshl.edu Mon Oct 30 18:59:29 2006 From: lstein at cshl.edu (Lincoln Stein) Date: Mon, 30 Oct 2006 13:59:29 -0500 Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase Message-ID: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com> Hi All, I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not to validate. I have committed a new version to live and to the release candidate branch. I hope it isn't too late to get this into the release. Lincoln -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From huangyi1 at hkusua.hku.hk Tue Oct 31 05:46:20 2006 From: huangyi1 at hkusua.hku.hk (Huang Yi) Date: Tue, 31 Oct 2006 13:46:20 +0800 Subject: [Bioperl-l] bioperl1.5 and GD2.35 Message-ID: <200610310546.k9V5kQGT010481@hkusua.hku.hk> Hi, I just installed bioperl 1.4 from CPAN to my Gentoo linux computer. But the installation was failed. I had to install by force. However, the GD module couldn't be installed for some unknown reasons. I therefore use "emerge" tool of Gentoo to get bioperl and GD again. They are fine. The version of bioperl became upgrade to1.5 and GD was 2.35. 
However, when I tested it by using the program in the HOWTO wiki page (http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me:

Can't locate object method "png" via package "GD::Image" at /usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line 799, <> line 9.

On my other computer, bioperl1.4 and GD2.34 work fine. I therefore want to remove the CPAN bioperl from the system and re-install it, but it seems to be impossible.

Would you please give me some advice on how to get my GD and bioperl to work?

Thanks!

Huang Yi

From bix at sendu.me.uk Tue Oct 31 08:20:21 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 31 Oct 2006 08:20:21 +0000
Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase
In-Reply-To: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>
References: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>
Message-ID: <45470745.1050605@sendu.me.uk>

Lincoln Stein wrote:
> Hi All,
>
> I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not
> to validate. I have committed a new version to live and to the release
> candidate branch. I hope it isn't too late to get this into the release.

It isn't too late, thank you.

From avilella at gmail.com Tue Oct 31 13:54:39 2006
From: avilella at gmail.com (Albert Vilella)
Date: Tue, 31 Oct 2006 13:54:39 +0000
Subject: [Bioperl-l] catfile and catdir
Message-ID: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>

Hi,

I was testing the bioperl-run/t/PAML.t and stumbled upon this catdir/catfile error:

Can't locate object method "catdir" via package "Bio::Root::IO" at /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line 113.
BEGIN failed--compilation aborted at /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line 143.
Compilation failed in require at t/PAML.t line 64.
BEGIN failed--compilation aborted at t/PAML.t line 64.

Should we be using File::Spec for catdir and catfile instead of Root::IO?

Cheers,

Albert.
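
A minimal sketch of the File::Spec route raised in the question above, assuming only the core File::Spec module. The directory and file names ('tmp', 'paml_run', 'yn00.ctl') are hypothetical placeholders for illustration and are not taken from Yn00.pm:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical path components, for illustration only (not from Yn00.pm).
my $dir  = File::Spec->catdir('tmp', 'paml_run');
my $file = File::Spec->catfile($dir, 'yn00.ctl');

# catdir/catfile join the components with the separator of the current OS.
print "dir:  $dir\n";
print "file: $file\n";
```

Calling these as class methods on File::Spec sidesteps the "Can't locate object method" failure, since the call no longer depends on Bio::Root::IO providing catdir or catfile.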
From Kevin.M.Brown at asu.edu Tue Oct 31 15:34:34 2006 From: Kevin.M.Brown at asu.edu (Kevin Brown) Date: Tue, 31 Oct 2006 08:34:34 -0700 Subject: [Bioperl-l] bioperl1.5 and GD2.35 Message-ID: <1A4207F8295607498283FE9E93B775B4023B5F3C@EX02.asurite.ad.asu.edu> Not really a Bioperl issue per se, but sounds like when you had Gentoo emerge GD it didn't include libpng and so didn't build the needed parts to create PNG type graphics. > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Huang Yi > Sent: Monday, October 30, 2006 10:46 PM > To: bioperl-l at lists.open-bio.org > Subject: [Bioperl-l] bioperl1.5 and GD2.35 > > Hi, > > > > I just installed bioperl 1.4 from CPAN to my Gentoo linux > computer. But the > installation was failed. I had to install by force. > > > > However, the GD module couldn't be installed for some unknown reasons. > > > > I therefore use "emerge" tool of Gentoo to get bioperl and GD > again. They > are fine. The version of bioperl became upgrade to1.5 and GD was 2.35. > > > > However, when I tested it by using the program in HOWTO wiki page > (http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me: > > > > Can't locate object method "png" via package "GD::Image" at > /usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line > 799, <> line 9. > > > > In my other computer, bioperl1.4 and GD2.34 work fine. I > therefore want to > remove the CPAN bioperl from the system and re-install it, > but it seems to > be impossible. > > > > Would you please give me some advices on how to let my GD and > bioperl work. > > > > Thanks! 
>
>
>
> Huang Yi
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

From hlapp at gmx.net Tue Oct 31 16:21:40 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 11:21:40 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
Message-ID: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>

On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:

> BTW, was that supposed to be Bio::AnnotatableI, or
> Bio::AnnotationHolderI?

Sorry, the former. I guess I got confused with FeatureHolders. Too bad Featureable isn't an English word.

-hilmar

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From hlapp at gmx.net Tue Oct 31 17:01:44 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:01:44 -0500
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net>

The only thing I would add to Jason's reply is that it is easy to do

    if (! $seq->isa("Bio::SeqI")) {
        my $bioseq = Bio::Seq->new();
        $bioseq->primary_seq($seq);
        $seq = $bioseq;
    }

and from that point on all your objects are Bio::SeqI compliant regardless of whether they were obtained that way or not.

Aside from that I wonder why there isn't a -primary_seq option in Bio::Seq::new - this would shorten the above into a (more perl'ish) single line:

    $seq = Bio::Seq->new(-primary_seq=>$seq) unless $seq->isa("Bio::SeqI");

Any takers to add that capability?

-hilmar

On Oct 30, 2006, at 3:51 AM, Nathan S.
Haigh wrote: > In my script I retrieve sequences from GenBank in FASTA format by GI > numbers and optionally store the sequence in a cache using > Bio::DB::Fasta. On subsequent runs of the script, the cache is first > checked for the GI and returns the sequence if it is found or the > sequence is obtained from GenBank as above. > > I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have > returned a Bio::Seq object but rather it returns a Bio::PrimarySeq > object which is defined within the Bio::DB::Fasta file. This is > annoying, since $seq_obj in my script would be either a Bio::Seq if it > was obtained from GenBank or a Bio::PrimarySeq if obtained from the > cache and calling primary_id() on it doesn't do the expected thing > with > Bio::PrimarySeq: > ID: Bio::PrimarySeq::Fasta=HASH(0x89b4508) > > Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object? > > Nath > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 31 17:08:56 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 11:08:56 -0600 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net> Message-ID: <001401c6fd0f$4239aa50$15327e82@pyrimidine> >> BTW, was that supposed to be Bio::AnnotatableI, or >> Bio::AnnotationHolderI? > > Sorry, the former. I guess I got confused with > FeatureHolders. Too bad Featureable isn't an English word. > > -hilmar Having SimpleAlign be AnnotatableI shouldn't be too much of a burden, since the only additional implemented method is annotation(). So, I think all the various Stockholm tags can be placed somewhere. 
A bit OT: were we planning on getting rid of the various *_tag_* methods in AnnotatableI at some point? I'm a bit confused as to why they were added. Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From jason at bioperl.org Tue Oct 31 17:09:26 2006 From: jason at bioperl.org (Jason Stajich) Date: Tue, 31 Oct 2006 09:09:26 -0800 Subject: [Bioperl-l] catfile and catdir In-Reply-To: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com> References: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com> Message-ID: <1AD4DB38-E08D-4E47-8A59-6539068474CB@bioperl.org> Yep. Unless we want this to also exist in Root::IO and delegate to File::Spec. -jason On Oct 31, 2006, at 5:54 AM, Albert Vilella wrote: > Hi, > > I was testing the bioperl-run/t/PAML.t and stumbled upon this a > catdir/catfile error: > > Can't locate object method "catdir" via package "Bio::Root::IO" at > /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line > 113. > BEGIN failed--compilation aborted at > /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line > 143. > Compilation failed in require at t/PAML.t line 64. > BEGIN failed--compilation aborted at t/PAML.t line 64. > > Should be be using File::Spec for catdir and catfile instead of > Root::IO? > > Cheers, > > Albert. 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html

From jason at bioperl.org Tue Oct 31 17:10:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 31 Oct 2006 09:10:51 -0800
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine> <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu> <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
Message-ID: <65F92B54-33FD-4D8F-90B7-49E2697CDBA2@bioperl.org>

It just needs to have an annotation collection - so it would be Bio::AnnotatableI

On Oct 31, 2006, at 8:21 AM, Hilmar Lapp wrote:

>
> On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:
>
>> BTW, was that supposed to be Bio::AnnotatableI, or
>> Bio::AnnotationHolderI?
>
> Sorry, the former. I guess I got confused with FeatureHolders. Too
> bad Featureable isn't an English word.
>
> -hilmar
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html

From hlapp at gmx.net Tue Oct 31 17:44:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:44:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To:
References:
Message-ID:

Well isn't this a result of conflating some of the SeqFeatureI methods into the annotation collection?

If I'm not mistaken on this then those methods were introduced in 1.5.0 and hence can go away without deprecation.

-hilmar

On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote:

> Chris,
>
> I don't think the intent was to remove the methods, rather we'd
> just call
> deprecated(). Example from AnnotatableI:
>
> sub remove_tag {
>     my ($self, @args) = @_;
>
>     #uncomment in 1.6
>     #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');
>
>     return $self->annotation->remove_Annotations(@args);
> }
>
> With regards to "why", I can't reconstruct the entire rationale
> myself but I
> can say that the newer names make more sense. Take that example
> above - its
> function is to remove entire Annotations not just to remove tags, so
> remove_Annotations is a better name.
>
> Brian O.
>
>
> On 10/31/06 1:08 PM, "Chris Fields" wrote:
>
>> A bit OT: were we planning on getting rid of the various *_tag_*
>> methods in
>> AnnotatableI at some point? I'm a bit confused as to why
>> they
>> were added.
> > -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From bosborne11 at verizon.net Tue Oct 31 16:37:01 2006 From: bosborne11 at verizon.net (Brian Osborne) Date: Tue, 31 Oct 2006 12:37:01 -0400 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <001401c6fd0f$4239aa50$15327e82@pyrimidine> Message-ID: Chris, I don't think the intent was to remove the methods, rather we'd just call deprecated(). Example from AnnotatableI: sub remove_tag { my ($self, @args) = @_; #uncomment in 1.6 #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()'); return $self->annotation->remove_Annotations(@args); } With regards to "why", I can't reconstruct the entire rationale myself but I can say that the newer names make more sense. Take that example above - its function is to remove entire Annotations not just to remove tags, so remove_Annotations is a better name. Brian O. On 10/31/06 1:08 PM, "Chris Fields" wrote: > A bit OT: were we planning on getting rid of the various *_tag_* methods in > AnnotatableI at some point? I'm a bit confused as to why they were added. From cjfields at uiuc.edu Tue Oct 31 18:44:02 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 12:44:02 -0600 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: Message-ID: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine> Hilmar Lapp wrote: > Well isn't this a result of conflating some of the > SeqFeatureI methods into the annotation collection? > > If I'm not mistaken on this then those methods were > introduced in 1.5.0 and hence can go away without deprecation. > > -hilmar > > On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote: > >> Chris, >> >> I don't think the intent was to remove the methods, rather we'd just >> call deprecated().
Example from AnnotatableI: >> >> sub remove_tag { >> my ($self, @args) = @_; >> >> #uncomment in 1.6 >> #$self->deprecated('remove_tag() is deprecated, use >> remove_Annotations()'); >> >> return $self->annotation->remove_Annotations(@args); } >> >> With regards to "why", I can't reconstruct the entire rationale >> myself but I can say that the newer names make more sense. Take that >> example above - its function is to remove entire Annotations not >> just to remove tags, so remove_Annotations is a better name. >> >> Brian O. >> >> >> On 10/31/06 1:08 PM, "Chris Fields" wrote: >> >>> A bit OT: were we planning on getting rid of the various *_tag_* >>> methods in AnnotatableI at some point? I'm a bit confused as to why >>> they were added. Sorry Brian, what I meant was, based on CVS history, the various *tag* methods in AnnotatableI were added all at once, with deprecations already present in the commit. So the methods weren't there to begin with, then added only to be deprecated later? Hence the confusion... I think Hilmar's right; the CVS history indicates these were added just prior to rel. 1.5 by Allen and seem to be related to SeqFeatureI. I'm sure the intent was good, but they contradict methods in the Feature/Annotation HOWTO on retrieving Annotation objects via the Annotation::Collection object. I think that agrees with your point about the various Annotation* method names being the more appropriate ones. Does everybody agree we should just remove them? Christopher Fields Postdoctoral Researcher - Switzer Lab Dept.
of Biochemistry University of Illinois Urbana-Champaign From cjfields at uiuc.edu Tue Oct 31 18:53:16 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 12:53:16 -0600 Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id() In-Reply-To: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net> Message-ID: <000001c6fd1d$d4359c80$15327e82@pyrimidine> > -----Original Message----- > From: bioperl-l-bounces at lists.open-bio.org > [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp > Sent: Tuesday, October 31, 2006 11:02 AM > To: n.haigh at sheffield.ac.uk > Cc: Bioperl-l at lists.open-bio.org > Subject: Re: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id() > > The only thing I would add to Jason's reply is that it is easy to do > > if (! $seq->isa("Bio::SeqI")) { > my $bioseq = Bio::Seq->new(); > $bioseq->primary_seq($seq); > $seq = $bioseq; > } > > and from that point on all your objects are Bio::SeqI > compliant regardless of whether they were obtained that way or not. > > Aside from that I wonder why there isn't a -primary_seq > option in Bio::Seq::new - this would shorten the above into a > (more perl'ish) single line: > > $seq = Bio::Seq->new(-primary_seq=>$seq) unless > $seq->isa("Bio::SeqI"); > > Any takers to add that capability? > > -hilmar Sounds good to me! Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From nhansen at nhgri.nih.gov Tue Oct 31 19:51:23 2006 From: nhansen at nhgri.nih.gov (Nancy Hansen) Date: Tue, 31 Oct 2006 14:51:23 -0500 (EST) Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling Message-ID: Hello, As sequencing centers begin to deposit trace data from "Medical Sequencing" projects into the public archives, there is now the need to "anonymize" sequence trace files by removing embedded information which might be used to identify the individual who was the original source of the DNA being sequenced.
I was hoping I might be able to use Bio::SeqIO to manipulate the comments contained in an SCF-formatted trace file, but I'm finding that Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information. Since SCF is a widely-accepted standard for trace files, would it be reasonable to include fields like "scf_comments" and "scf_header" in a Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them? Likewise, it would be great if write_seq could pull these values right from a SequenceTrace object rather than requiring them as arguments. I'd be happy to help in this effort if necessary. Thanks, --Nancy ************************************* Nancy F. Hansen, PhD nhansen at nhgri.nih.gov Bioinformatics Group NIH Intramural Sequencing Center (NISC) 5625 Fishers Lane Rockville, MD 20852 Phone: (301) 435-1560 Fax: (301) 435-6170 From lincoln.stein at gmail.com Tue Oct 31 20:24:17 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 31 Oct 2006 15:24:17 -0500 Subject: [Bioperl-l] Bioperl versioning In-Reply-To: <000001c6f78b$d1c65a30$15327e82@pyrimidine> References: <453E309B.9090007@sendu.me.uk> <000001c6f78b$d1c65a30$15327e82@pyrimidine> Message-ID: <6dce9a0b0610311224x79256b29sf102eb5c35865caf@mail.gmail.com> Are you going to go ahead with 1.52_XX ? If so, I will code GBrowse to look for 1.52 or higher. Lincoln On 10/24/06, Chris Fields wrote: > > .. > > > > 'handle'? I think it shows up as '6.2.13' simply because it was uploaded > > with the filename Perl6-Pugs-6.2.13.tar.gz > > Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is > '6.002013'. So maybe we should follow a similar convention. Seems easier > and less confusing to me, at least. > > > As you point out, the code has the kind of $VERSION number we've been > > suggesting in this thread: > > > > > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013': > > > > > > our $VERSION = 6.002013; > > > > > > That's also a very perlish-way to do it. 
And there are no developer > > > versions of Pugs, since it is always under active development. We > could > > try > > > something like: > > > > > > our $VERSION = 1.005002_01; > > > > Yes, this was already like one of my suggestions (1.0502_01), but I > > brought up the concern that 1.05 might be < 1.4. > > > > So then we have a question: do we try and fumble a 1.4 compatible number > > by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if > > it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no > > room for RC numbering, or 1.006000010 (1.6.0.10) - the first final > > release following some 1.006000_001 (1.6.0.01 == rc1) RCs? > > I would go for the clean break if it follows perl/CPAN convention. > '1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing. > > If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6 > RC1, 1.6 RC2 etc then that would be consistent and perl-compatible. > > BTW, the reason I looked at Pugs was to see what some of the Perl6 > developers were using. Who knows; they'll probably change it! > > .. > > > I don't think it would be a hassle; on the contrary it would be very > > useful to know the CPAN distribution actually works. I'm very happy with > > the idea that a release candidate gets fully tested... > > So you obviously feel strongly about it! ;> > > I don't have a problem as long as we stick with doing this from now on ( > i.e. > have a consistent versioning scheme, release policy, CPAN release policy, > etc). Would be nice for Jason/Brian/Hilmar to chime in as to the > reasoning > behind the older versioning scheme. > > Christopher Fields > Postdoctoral Researcher - Switzer Lab > Dept. of Biochemistry > University of Illinois Urbana-Champaign > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. 
Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu From hlapp at gmx.net Tue Oct 31 21:53:58 2006 From: hlapp at gmx.net (Hilmar Lapp) Date: Tue, 31 Oct 2006 16:53:58 -0500 Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign In-Reply-To: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine> References: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine> Message-ID: On Oct 31, 2006, at 1:44 PM, Chris Fields wrote: > Does everybody agree we should just remove them? I wish you could but I'm afraid that would break stuff? Otherwise why were they added in the first place? I thought Bio::SeqFeature::Annotated needs them maybe? -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From cjfields at uiuc.edu Tue Oct 31 22:41:17 2006 From: cjfields at uiuc.edu (Chris Fields) Date: Tue, 31 Oct 2006 16:41:17 -0600 Subject: [Bioperl-l] AnnotatableI tag methods, was Rfam/Pfam annotations and SimpleAlign In-Reply-To: Message-ID: <000001c6fd3d$ae37c240$15327e82@pyrimidine> > On Oct 31, 2006, at 1:44 PM, Chris Fields wrote: > > > Does everybody agree we should just remove them? > > I wish you could but I'm afraid that would break stuff? > Otherwise why were they added in the first place? I thought > Bio::SeqFeature::Annotated needs them maybe? > > -hilmar > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : > =========================================================== Yep, removing them clobbers a ton of tests, including anything that requires SeqIO::FTHelper. Looks like SeqFeature::Generic and a few others use them. 
I could understand if these were meant to be permanent methods, but why add these in if they were to be deprecated in 1.6? Something that was meant to be a transition but wasn't finished? That seems to be indicated in the commented out lines for all the *tag* methods: #uncomment in 1.6 #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()'); Christopher Fields Postdoctoral Researcher - Switzer Lab Dept. of Biochemistry University of Illinois Urbana-Champaign From lincoln.stein at gmail.com Tue Oct 31 23:18:07 2006 From: lincoln.stein at gmail.com (Lincoln Stein) Date: Tue, 31 Oct 2006 18:18:07 -0500 Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning In-Reply-To: References: Message-ID: <6dce9a0b0610311518l3bec852q5d04a9b488621377@mail.gmail.com> Hi Keith, The current Bio/DB/GFF/Util/Binning.pm file just contains the hierarchical binning system that I implemented some time ago. Where is the R-tree system that you describe? How much of an improvement did the R-tree scheme give over the hierarchical scheme? FYI the GFF3 implementation uses a different binning scheme in which there is a fixed-size bin. Every time a feature overlaps a bin, it creates a new row in a table. So big features will have multiple rows and little features that fit inside a bin will have only one row. The query for this is simpler and seems to give the same relative speedup as the hierarchical binning system. I'd really like to get these queries to go as fast as possible and would love to work with you on this if you're interested. Lincoln On 10/19/06, Keith Player wrote: > > I know that there may be some changes resulting from new GFF3 > implementations, > but thought I would see if the following is useful anyway. > > I implemented the R-tree binning schema as used by > Bio::DB::GFF::Util::Binning > and as mentioned in this article: > > I tested the following query on a normal table (no binning), but it > assumes > that you know the longest range in the table.
So for example with a table > of > human genes, where the longest gene we know of is around 2.4Mb. > > SELECT COUNT(*) as count FROM groups WHERE start > max(0,[start-2.4Mb]) > AND > g.start < [end] AND g.end > [start] AND g.chromosome = '1' > > so for 100Mb:101Mb > > SELECT COUNT(*) as count FROM groups WHERE start > 97600000 AND g.start < > 101000000 AND g.end > 100000000 AND g.chromosome = '1' > > > where [start] and [end] define the region of interest. This query > outperforms > the R-Tree implementation on all tests that I have performed (for lengths > of > 200bp to 10Mb across a whole chromosome). Could this be of some practical > use? > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l > -- Lincoln D. Stein Cold Spring Harbor Laboratory 1 Bungtown Road Cold Spring Harbor, NY 11724 (516) 367-8380 (voice) (516) 367-8389 (fax) FOR URGENT MESSAGES & SCHEDULING, PLEASE CONTACT MY ASSISTANT, SANDRA MICHELSEN, AT michelse at cshl.edu
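For readers following the binning thread: the fixed-size bin scheme Lincoln describes for the GFF3 implementation (one row per bin that a feature overlaps) can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual Bio::DB::GFF or GFF3 store code. The bin width ($BIN_SIZE), the in-memory hash standing in for the database table, and the sub names bins_for_range, index_feature and features_overlapping are all made up for this example, and coordinates are assumed to be 1-based and inclusive.

```perl
use strict;
use warnings;

# Assumed bin width; the thread does not say what the GFF3 store uses.
my $BIN_SIZE = 1_000;

# Bin numbers overlapped by the 1-based, inclusive range [$start, $end].
sub bins_for_range {
    my ($start, $end) = @_;
    die "start must not exceed end\n" if $start > $end;
    return (int(($start - 1) / $BIN_SIZE) .. int(($end - 1) / $BIN_SIZE));
}

# Stand-in for the bin table: bin number => list of feature rows.
my %rows_by_bin;

# Store one row per overlapped bin, so long features get several rows.
sub index_feature {
    my ($id, $start, $end) = @_;
    for my $bin (bins_for_range($start, $end)) {
        push @{ $rows_by_bin{$bin} }, { id => $id, start => $start, end => $end };
    }
    return;
}

# Look only at the bins the query touches, then apply the exact overlap test.
# The hash also removes duplicates for features stored in several bins.
sub features_overlapping {
    my ($qstart, $qend) = @_;
    my %hit;
    for my $bin (bins_for_range($qstart, $qend)) {
        for my $row (@{ $rows_by_bin{$bin} || [] }) {
            $hit{ $row->{id} } = 1
                if $row->{start} <= $qend && $row->{end} >= $qstart;
        }
    }
    return sort keys %hit;
}

index_feature('small', 150,   400);     # fits inside one bin: one row
index_feature('large', 900,   3_200);   # spans four bins: four rows
index_feature('far',   9_500, 9_600);   # never touched by the query below

print join(',', features_overlapping(1_000, 2_000)), "\n";   # prints "large"
```

The trade-off visible here is the one described in the message: long features cost extra rows, but a range query only has to visit the bins it overlaps, so the lookup stays simple compared with walking a hierarchy of bin sizes.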